Version 1.0.2
This release improves audio playback with a visual buffer indicator and improves speech synthesis quality through cleaner phoneme processing and a custom pronunciation dictionary.
What's New
New Features
Buffered Progress Indicator
Added visual feedback for audio buffering during TTS playback. The progress bar now displays two states:
- Played portion (darker blue) - Shows what you've already listened to
- Buffered portion (lighter blue) - Shows what's ready to play without interruption
This makes the streaming TTS generation process visible, which is especially helpful for longer summaries, where chunks are synthesized in parallel.
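As a rough illustration of the two-state bar, the sketch below derives the widths of the played and buffered portions from a playback position. It is not the shipped implementation: the names PlaybackState, ProgressSegments, and progressSegments are hypothetical, and it assumes the buffered audio forms one contiguous range starting at the beginning of the clip.

```typescript
// Hypothetical shape: total clip length, playhead position, and the end of
// the contiguous buffered range, all in seconds.
interface PlaybackState {
  durationSec: number;
  playedSec: number;
  bufferedSec: number;
}

// Both widths are percentages measured from the start of the bar.
interface ProgressSegments {
  playedPct: number;   // darker blue portion
  bufferedPct: number; // lighter blue portion, drawn underneath the played one
}

function clamp(value: number, min: number, max: number): number {
  return Math.min(Math.max(value, min), max);
}

function progressSegments(state: PlaybackState): ProgressSegments {
  // Guard against a zero, negative, or NaN duration before dividing.
  if (!(state.durationSec > 0)) {
    return { playedPct: 0, bufferedPct: 0 };
  }
  const played = clamp(state.playedSec, 0, state.durationSec);
  // The buffered bar never ends before the playhead.
  const buffered = clamp(state.bufferedSec, played, state.durationSec);
  return {
    playedPct: (played / state.durationSec) * 100,
    bufferedPct: (buffered / state.durationSec) * 100,
  };
}
```

Rendering the buffered bar underneath the played bar means only two widths need updating as chunks arrive, and the clamping keeps the bar sensible if a buffer update lags behind the playhead.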
Custom Pronunciation Dictionary
Introduced a pronunciation dictionary for commonly mispronounced words. The TTS engine now handles:
- Technical terms and acronyms (API, GPU, CSS, HTML)
- Proper nouns and brand names
- Domain-specific vocabulary
Expect more accurate pronunciation, particularly in articles that contain specialized terminology. Future releases will allow custom dictionary entries via settings.
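One common way to apply such a dictionary is to substitute whole-word matches before the text reaches the synthesizer. The sketch below shows that approach under stated assumptions: the entries, the spelled-out replacement format, and the applyPronunciations helper are all hypothetical, since the actual dictionary format is not described in these notes.

```typescript
// Hypothetical entries: each term maps to a spelling the engine reads correctly.
const pronunciations: Map<string, string> = new Map([
  ["API", "A P I"],
  ["GPU", "G P U"],
  ["CSS", "C S S"],
  ["HTML", "H T M L"],
]);

// Escape characters that have special meaning inside a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function applyPronunciations(text: string, dictionary: Map<string, string>): string {
  if (dictionary.size === 0) return text;
  // Longest keys first so a longer term is tried before a shorter one it contains.
  const keys = Array.from(dictionary.keys()).sort((a, b) => b.length - a.length);
  // Word boundaries keep "API" from matching inside a longer word such as "APIs".
  const pattern = new RegExp(`\\b(?:${keys.map(escapeRegExp).join("|")})\\b`, "g");
  return text.replace(pattern, (match) => dictionary.get(match) ?? match);
}
```

Matching is case-sensitive here, so "css" in lowercase would be left alone. Whether that is desirable depends on the vocabulary the dictionary targets.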
Improvements
Phoneme Generation Cleanup
Refactored the phoneme generation pipeline for more natural-sounding speech:
- Improved text normalization before phoneme conversion
- Better handling of punctuation and pauses
- More accurate stress patterns in multi-syllable words
- Optimized phoneme sequence generation for StreamingKokoroJS
The result is smoother, more human-like intonation, most noticeable in longer sentences and complex grammatical structures.
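To make the normalization and pause-handling steps concrete, here is a minimal sketch of what such a pre-processing pass might look like. It is an illustration only: normalizeText, splitIntoSegments, the SpeechSegment shape, and the pause durations are all assumptions, not the pipeline used with StreamingKokoroJS.

```typescript
interface SpeechSegment {
  text: string;
  pauseMs: number; // hypothetical pause to insert after the segment
}

// Hypothetical pause lengths per punctuation mark; real values would be tuned by ear.
const PAUSE_MS: Record<string, number> = {
  ",": 150, ";": 250, ":": 250, ".": 400, "!": 400, "?": 400,
};

// Replace typographic characters with plain equivalents and collapse whitespace.
function normalizeText(raw: string): string {
  return raw
    .replace(/[\u2018\u2019]/g, "'")
    .replace(/[\u201C\u201D]/g, '"')
    .replace(/\u2026/g, "...")
    .replace(/\s+/g, " ")
    .trim();
}

function splitIntoSegments(raw: string): SpeechSegment[] {
  const text = normalizeText(raw);
  const segments: SpeechSegment[] = [];
  // Each match is a run of non-punctuation followed by any trailing punctuation marks.
  const pattern = /([^,;:.!?]+)([,;:.!?]*)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    const chunk = (match[1] ?? "").trim();
    if (chunk.length === 0) continue;
    const marks = match[2] ?? "";
    // The last mark in a run such as "?!" decides the pause length.
    const lastMark = marks.charAt(marks.length - 1);
    segments.push({ text: chunk + marks, pauseMs: PAUSE_MS[lastMark] ?? 0 });
  }
  return segments;
}
```

This naive splitter also breaks on the period in a decimal such as "3.14" or an abbreviation such as "e.g.", so a real pipeline would need to handle those cases before splitting.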