feat: Add AI Narration (TTS) feature for book reader by Quickkill0 · Pull Request #1 · Quickkill0/Kavita

Quickkill0 · 2026-01-16T03:28:11Z

Summary

This PR implements a comprehensive AI Narration (Text-to-Speech) feature for Kavita, allowing users to generate and listen to AI-narrated audiobooks from their ebooks using Local AI with Kokoro and VibeVoice 0.5 models.

Key Features

- Admin Settings Tab: New "AI Narration" configuration with Local AI server URL, engine selection (Kokoro/VibeVoice), voice selection, and advanced parameters (CFG, speed, temperature, etc.)
- Generation System: Queue-based narration generation with progress tracking, ETA, and chapter-level details
- Integrated Audio Player: Full-featured player within the book reader, including:
  - Play/Pause with seek bar
  - Variable playback speed (0.5x - 2x)
  - Volume control with visual indicator
  - Sleep timer (15/30/45/60 min or end of chapter)
  - Audio bookmarks
  - Skip forward/back controls
- Text-Audio Sync: Paragraph-level highlighting and scrolling during playback
- Streaming Mode: On-demand TTS generation option for immediate playback
- Cross-Device Sync: Real-time listening progress sync via SignalR
- Export: Download narration as MP3/M4A with proper metadata
- Lifecycle Management: Handles source book deletion/modification gracefully
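
For reviewers who want a concrete picture of the paragraph-level Text-Audio Sync, here is a minimal sketch of the lookup involved: given the current playback position, find the paragraph to highlight. It is an illustration only, not the code in this PR. The `SegmentTiming` shape and `findActiveSegment` are hypothetical names, and the sketch assumes each narrated paragraph has a known start and end time and that the list is sorted by start time.

```typescript
// Hypothetical shape (not taken from this PR): one entry per narrated
// paragraph, sorted by startSec, with non-overlapping time ranges.
interface SegmentTiming {
  paragraphIndex: number;
  startSec: number; // inclusive
  endSec: number;   // exclusive
}

// Binary search for the segment whose [startSec, endSec) range contains
// currentSec. Returns undefined when the position falls before the first
// segment, after the last one, or in a silent gap between two segments.
function findActiveSegment(
  segments: SegmentTiming[],
  currentSec: number,
): SegmentTiming | undefined {
  let lo = 0;
  let hi = segments.length - 1;
  let candidate: SegmentTiming | undefined;

  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const seg = segments[mid];
    if (seg.startSec <= currentSec) {
      candidate = seg; // latest segment seen so far that starts at or before currentSec
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }

  return candidate !== undefined && currentSec < candidate.endSec
    ? candidate
    : undefined;
}
```

A player could call this from its time-update handler and only re-highlight and scroll when the returned `paragraphIndex` changes, which avoids touching the DOM on every tick.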

Technical Implementation

Backend:

- NarrationController with comprehensive API endpoints
- NarrationService for business logic
- NarrationGenerationService for background processing
- Entity models: NarrationSettings, NarrationJob, NarrationSegment, NarrationProgress, NarrationBookmark
- SignalR events for progress and sync
- AutoMapper profiles for DTOs
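
To make the progress and ETA reporting easier to reason about, the sketch below shows one simple way an ETA can be derived from segment counts. This is an assumption for illustration, not the calculation used by NarrationGenerationService: the `GenerationProgress` shape and both function names are hypothetical, and the estimate assumes segments take roughly equal time to generate.

```typescript
// Hypothetical progress payload (field names are not taken from this PR).
interface GenerationProgress {
  completedSegments: number;
  totalSegments: number;
  elapsedMs: number; // time spent generating so far
}

// Linear extrapolation: average time per completed segment multiplied by
// the number of segments left. Returns null when no estimate is possible
// yet (nothing completed, or invalid input).
function estimateRemainingMs(p: GenerationProgress): number | null {
  if (p.totalSegments <= 0 || p.completedSegments <= 0 || p.elapsedMs < 0) {
    return null;
  }
  const remaining = Math.max(p.totalSegments - p.completedSegments, 0);
  const msPerSegment = p.elapsedMs / p.completedSegments;
  return Math.round(remaining * msPerSegment);
}

// Percentage for a progress bar, clamped to the 0..100 range.
function percentComplete(p: GenerationProgress): number {
  if (p.totalSegments <= 0) return 0;
  const ratio = p.completedSegments / p.totalSegments;
  return Math.min(Math.max(ratio * 100, 0), 100);
}
```

Returning `null` rather than `0` before the first segment finishes lets the UI show "calculating" instead of a misleading "done" state.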

Frontend:

- ManageNarrationSettingsComponent for admin configuration
- NarrationPlayerComponent integrated in book reader
- NarrationService for API communication
- MessageHub integration for real-time updates
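
One detail worth spelling out for cross-device sync is how a client might treat an incoming progress update that is older than what it already has. The sketch below uses a last-write-wins rule keyed by chapter. It is a sketch under assumptions, not the logic in this PR: `ListeningProgress`, `ProgressStore`, and the timestamp field are hypothetical, and device clocks are assumed to be close enough for timestamps to be comparable.

```typescript
// Hypothetical update payload (not the DTO from this PR).
interface ListeningProgress {
  chapterId: number;
  positionSec: number;
  updatedAtMs: number; // when the position was recorded, in epoch milliseconds
}

// Keeps the most recent position per chapter and drops stale updates.
class ProgressStore {
  private readonly byChapter = new Map<number, ListeningProgress>();

  // Returns true if the update was applied, false if it was stale.
  apply(update: ListeningProgress): boolean {
    const current = this.byChapter.get(update.chapterId);
    if (current !== undefined && update.updatedAtMs <= current.updatedAtMs) {
      return false; // an equally new or newer position is already known
    }
    this.byChapter.set(update.chapterId, update);
    return true;
  }

  get(chapterId: number): ListeningProgress | undefined {
    return this.byChapter.get(chapterId);
  }
}
```

A real-time handler could call `apply` for each incoming event and only seek the local player when it returns `true`.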

UI Naming

Uses "Listen" / "Narration" terminology as specified (not TTS).

Test Plan

- Fix TypeScript null safety issues in narration-player component template
- Add optional chaining for activeJob properties
- Change calculateTotalTime from private to protected for template access
- Fix SCSS darken() incompatibility with CSS variables
- Add DecimalPipe import to manage-narration-settings component
- Add volume control with slider and dynamic icon to audio player

Adds a workflow_dispatch triggered workflow that:

- Allows selecting any branch to build
- Supports custom Docker tags or auto-generates from branch name
- Offers platform selection (amd64 only, arm64, or all three)
- Pushes to ghcr.io/<your-username>/kavita (not the upstream repo)
- Includes a build summary with all image details
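
Since branch names such as feature/tts-implementation contain characters that are not valid in a Docker tag, auto-generating a tag needs a sanitizing step. The sketch below shows one way to do that, written in TypeScript for illustration. It is not the workflow's actual step: `dockerTagFromBranch` is a hypothetical helper, and falling back to `latest` for an empty result is an assumption. The rules it applies are the general Docker tag constraints: letters, digits, `_`, `.`, and `-` only, no leading `.` or `-`, and at most 128 characters.

```typescript
// Hypothetical helper: turn a git branch name into a valid Docker tag.
function dockerTagFromBranch(branch: string): string {
  const cleaned = branch
    .replace(/[^A-Za-z0-9_.-]+/g, "-") // runs of disallowed characters (such as "/") become "-"
    .replace(/^[.-]+/, "")             // a tag may not start with "." or "-"
    .slice(0, 128);                    // tags are limited to 128 characters

  // Assumption: use "latest" when nothing usable remains.
  return cleaned.length > 0 ? cleaned : "latest";
}
```

For example, `dockerTagFromBranch("feature/tts-implementation")` yields `feature-tts-implementation`.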

Quickkill0 added 2 commits January 15, 2026 19:27

Quickkill0 deleted the branch tts-implementation January 18, 2026 06:18

Quickkill0 closed this Jan 18, 2026

Quickkill0 wants to merge 2 commits into tts-implementation from feature/tts-implementation
