SyncScribe takes an audio file and generates a standalone HTML viewer with embedded audio and a synchronized transcription. Words are highlighted as they are spoken, and hovering over any word shows an instant English translation. The generated file is fully self-contained and works offline, with no internet connection required.
Transcription runs locally using OpenAI's Whisper models via faster-whisper; if GPU acceleration is available, it is used automatically. Check out the sample page for a demonstration.
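To see ahead of time whether runs will be GPU-accelerated, you can check for a visible NVIDIA GPU. This is a minimal sketch assuming the standard `nvidia-smi` tool that ships with the NVIDIA driver; faster-whisper falls back to the CPU when no GPU is found:

```shell
# Detect whether an NVIDIA GPU is visible to this shell.
if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
  DEVICE=cuda
else
  DEVICE=cpu
fi
echo "Transcription device: $DEVICE"
```

On machines without a GPU this prints `Transcription device: cpu`; transcription still works, just more slowly.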
```shell
# Basic usage
./create_audio_viewer.sh your_audio.wav

# With a custom model
./create_audio_viewer.sh --model medium
```

Output: a standalone HTML file in the `out/` folder (or the specified directory).
- Python 3.8+
- NVIDIA GPU (recommended) with CUDA support
No manual installation is needed; the scripts automatically:
- Create Python virtual environment
- Install all dependencies (faster-whisper, torch, librosa, etc.)
- Configure GPU/CUDA settings
- Download Whisper models to local cache
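In broad strokes, the setup steps above are equivalent to the following sketch (the `venv` directory name and the exact pip invocation are assumptions; the actual scripts handle all of this for you):

```shell
set -e
python3 -m venv venv      # create an isolated environment in the working directory
. venv/bin/activate       # activate it for this shell
python -V                 # the interpreter now resolves inside venv/
# The scripts then install the dependencies into the venv, roughly:
# pip install faster-whisper torch librosa
```

Because the environment lives next to the code, nothing is written to system-wide Python paths.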
```shell
# Just clone and run
git clone <your-repo>
cd syncscribe
./create_audio_viewer.sh your_audio.wav
```

Everything is created and downloaded in the current directory for easy cleanup.
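Since everything lives in the current directory, cleanup is a single command. The `venv` directory name here is an assumption; `out/` is the default output folder:

```shell
# Remove generated artifacts; -rf keeps this safe even if a directory is absent
rm -rf out venv
```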