Fix acronym and abbreviation handling in speed reader #5

Open
Roconx wants to merge 4 commits into TheRedJalapeno:master from Roconx:claude/fix-acronym-splitting-SrG0m

@Roconx Roconx commented Jan 3, 2026

Fixes: #4

Problem

Words like "U.K.", "U.S.", "Mr.", "Dr." were being incorrectly handled:

  1. Sentence splitting: "The U.K. is great" was split into multiple sentences at each dot
  2. Pause detection: Each dot in acronyms triggered an extra pause, making reading choppy

Solution

Use a placeholder technique to temporarily protect dots that aren't real sentence endings during text processing.
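
A minimal sketch of the placeholder idea, not the code in core.js. The placeholder code point (U+E000), the helper names `naiveSplit`, `protect`, and `restore`, and the splitting rule are all assumptions made for illustration.

```javascript
// Assumed placeholder: a private-use code point that should not occur in normal text.
const PLACEHOLDER = "\uE000";

// Naive splitter: a sentence is a run of text up to the next '.', '!' or '?'.
function naiveSplit(text) {
  const parts = text.match(/[^.!?]+[.!?]+/g) || [text];
  return parts.map((s) => s.trim());
}

// Mask every dot inside a dotted acronym such as "U.K." (illustrative rule only;
// it ignores the case where the acronym ends a sentence).
function protect(text) {
  return text.replace(/\b(?:[A-Za-z]\.){2,}/g, (m) => m.replace(/\./g, PLACEHOLDER));
}

// Turn placeholders back into dots, for display only.
function restore(text) {
  return text.split(PLACEHOLDER).join(".");
}

const input = "The U.K. is great. It rains.";
const broken = naiveSplit(input);                      // also splits inside "U.K."
const fixed = naiveSplit(protect(input)).map(restore); // splits only at real sentence ends
console.log(broken.length, fixed);
```

Without protection the input breaks into four fragments; with the dots masked during splitting it stays as two sentences, and the dots come back only when the text is restored for display.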

Why not use regex lookbehind (like numbers)?
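Numbers use (?<!\d) lookbehind because the pattern is simple and fixed-length. Acronyms have variable-length patterns (U.K. vs U.S.A. vs N.A.T.O.). JavaScript lookbehind (ES2018) does accept variable-length patterns, but a single lookbehind covering every acronym and abbreviation form would be hard to read and maintain, so a placeholder pass is used instead.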
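
For comparison, here is a sketch of how a one-character lookbehind can keep decimal points from counting as sentence dots. Only the `(?<!\d)` fragment comes from the description above; the full pattern and the name `sentenceDotIndexes` are assumptions, not the expression used in core.js.

```javascript
// Assumed pattern: a dot counts as a sentence dot unless it sits between two digits.
// "(?<!\d)" is a fixed-length (one character) negative lookbehind.
const SENTENCE_DOT = /\.(?!\d)|(?<!\d)\./g;

// Return the positions of all dots that are not decimal points.
function sentenceDotIndexes(text) {
  return [...text.matchAll(SENTENCE_DOT)].map((m) => m.index);
}

console.log(sentenceDotIndexes("Pi is 3.14. It costs 5.")); // [10, 22]
```

The dot inside "3.14" (index 7) is skipped because it has a digit on both sides, while the dots after "3.14" and "5" are reported.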

Changes

New functions added:

  • protectAcronyms() - Replaces dots in acronyms (U.K., N.A.T.O.) with a placeholder character
  • protectAbbreviations() - Replaces dots in common abbreviations (Mr., Dr., Prof., etc.)
  • restoreProtectedDots() - Restores placeholders back to dots for display
  • endsWithSentencePunctuation() - Detects real sentence-ending punctuation vs protected dots
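
The acronym-related helpers might look roughly like the sketch below. The function names come from the list above, but the bodies, the `DOT_PLACEHOLDER` value, and the end-of-sentence heuristic are assumptions; the real implementations in core.js may differ. Abbreviations are left out here.

```javascript
// Assumed placeholder; the PR only says a Unicode placeholder is used.
const DOT_PLACEHOLDER = "\uE000";

// Replace dots in dotted acronyms. If the acronym appears to end a sentence
// (end of text, or followed by whitespace and a capital letter: a guessed heuristic),
// the final dot stays real so it can still trigger a pause.
function protectAcronyms(text) {
  return text.replace(/\b(?:[A-Za-z]\.){2,}/g, (match, offset) => {
    const rest = text.slice(offset + match.length);
    const atSentenceEnd = /^\s*$/.test(rest) || /^\s+[A-Z]/.test(rest);
    const masked = match.replace(/\./g, DOT_PLACEHOLDER);
    return atSentenceEnd ? masked.slice(0, -1) + "." : masked;
  });
}

// Turn placeholders back into dots for display.
function restoreProtectedDots(text) {
  return text.split(DOT_PLACEHOLDER).join(".");
}

// True only for real '.', '!' or '?' at the end (optionally followed by a closing
// quote or bracket); a trailing placeholder does not count.
function endsWithSentencePunctuation(text) {
  return /[.!?]["')\]]*$/.test(text);
}

const midSentence = protectAcronyms("The U.K. is great");
const endOfSentence = protectAcronyms("Lives in the U.K.");
console.log(restoreProtectedDots(midSentence), restoreProtectedDots(endOfSentence));
```

The capital-letter heuristic is deliberately simple and would misread inputs such as "U.S. Army", so treat it as a placeholder for whatever detection the PR actually uses.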

Key logic:

  • Acronyms at end of sentence ("Lives in the U.K.") → last dot kept real → triggers pause ✓
  • Acronyms mid-sentence ("The U.K. is great") → all dots protected → no pause ✓
  • Abbreviations ("Mr. Smith") → dot protected → no pause ✓
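
The three cases above can be checked word by word once the text has been protected. This sketch assumes the input already contains placeholders (written by hand here), and the `pausePlan` helper and placeholder value are hypothetical, not part of the PR.

```javascript
// Assumed placeholder value, standing in for a protected dot.
const P = "\uE000";

// For each whitespace-separated word, restore the dots for display and
// flag a pause only when the word ends in real sentence punctuation.
function pausePlan(protectedText) {
  return protectedText
    .split(/\s+/)
    .filter(Boolean)
    .map((word) => ({
      word: word.split(P).join("."), // what the reader sees
      pause: /[.!?]$/.test(word),    // protected dots never match here
    }));
}

const plan = pausePlan(`Mr${P} Smith lives in the U${P}K.`);
console.log(plan.filter((w) => w.pause).map((w) => w.word)); // [ 'U.K.' ]
```

Only the final "U.K." pauses, because its last dot was left real; "Mr." is displayed with its dot but does not pause.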

Files modified: core.js

Supported abbreviations

Mr, Mrs, Ms, Dr, Prof, Sr, Jr, vs, etc, approx, Inc, Ltd, Co, Gen, Col, Lt, Sgt, Rev, Hon
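
One way the list above could drive protectAbbreviations() is to build a single alternation from it. This is a sketch under assumptions: the placeholder value, the constant names, and the case-sensitive whole-word matching are guesses rather than the code in core.js.

```javascript
// Assumed placeholder; the PR only says a Unicode placeholder is used.
const DOT_PLACEHOLDER = "\uE000";

// The abbreviations listed in the PR description.
const ABBREVIATIONS = [
  "Mr", "Mrs", "Ms", "Dr", "Prof", "Sr", "Jr", "vs", "etc", "approx",
  "Inc", "Ltd", "Co", "Gen", "Col", "Lt", "Sgt", "Rev", "Hon",
];

// Whole-word, case-sensitive match of any listed abbreviation followed by a dot.
const ABBREVIATION_PATTERN = new RegExp(`\\b(${ABBREVIATIONS.join("|")})\\.`, "g");

// Swap the dot after a known abbreviation for the placeholder.
function protectAbbreviations(text) {
  return text.replace(ABBREVIATION_PATTERN, `$1${DOT_PLACEHOLDER}`);
}

const sample = protectAbbreviations("Mr. Smith vs. Mrs. Jones of Acme Co. and Col. Hart");
console.log(sample.split(DOT_PLACEHOLDER).join("."));
```

Because the pattern requires the dot immediately after the abbreviation, "Mrs." and "Col." are matched in full even though "Mr" and "Co" appear earlier in the list.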

claude and others added 4 commits January 3, 2026 13:50
- Add protectAcronyms() to handle U.K., U.S.A., N.A.T.O. etc.
- Add protectAbbreviations() for Mr., Dr., Prof., etc.
- Intelligently detect if acronym is at end of sentence
- Use Unicode placeholder to protect dots during parsing
- Add endsWithSentencePunctuation() helper
- Update getChunksFromSentences() to use helper
- Update pause detection in startReading() to exclude U.K., Mr., etc.
- Don't restore dots in splitIntoSentences()
- Update endsWithSentencePunctuation() to detect real vs protected dots
- Restore dots only when displaying text
- Fixes: acronyms at end of sentence now correctly pause
