A transcription and translation tool built on the ivrit-ai/whisper-v2-d3-e3 model and the Whisper large-v2 model. It processes audio of any length, splits the output into readable paragraphs, and manages its temporary files to keep the workspace clean.
- Unlimited Length Transcription: Transcribe audio files of any length without limitations.
- Support for Multiple Languages: Choose from Hebrew, English, Spanish, French, German, Portuguese, and Arabic for translation.
- General and Hebrew Models: Use either the general model (large-v2) for accurate timestamp generation or the specialized Hebrew model (ivrit-ai/whisper-v2-d3-e3) for improved Hebrew transcription.
- Proportional Timings: Accurate proportional timings based on the general model to ensure alignment with the actual audio length.
- SRT File Generation: Option to generate SRT files with synchronized subtitles.
- HTML Paragraph Formatting: Proper formatting of transcriptions and translations into paragraphs for better readability.
- GPU Support: Leverage GPU for faster transcription and translation if available.
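The proportional timing idea from the feature list can be illustrated with a small sketch. This README does not describe how the app actually computes its timings, so the following is only one plausible reading: segment times are scaled linearly so that the last segment ends at the real audio duration. The name rescale_segments and the (start, end, text) tuple layout are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical segment layout: (start_seconds, end_seconds, text)
Segment = Tuple[float, float, str]


def rescale_segments(segments: List[Segment], audio_duration: float) -> List[Segment]:
    """Scale all segment times so the latest end time equals audio_duration."""
    if not segments:
        return []
    last_end = max(end for _, end, _ in segments)
    if last_end <= 0:
        raise ValueError("segments must have a positive end time")
    scale = audio_duration / last_end
    # Every start and end is multiplied by the same factor, so relative
    # proportions between segments are preserved.
    return [(start * scale, end * scale, text) for start, end, text in segments]


segments = [(0.0, 2.0, "first"), (2.0, 4.0, "second")]
rescaled = rescale_segments(segments, audio_duration=8.0)
```

A linear scale keeps segment order and relative lengths intact, which is why it is a common first approximation when two timelines need to agree on total length.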
- Upload Audio File: Click on the "Upload Audio File" button to select your audio file.
- Select Target Language: Choose the desired language for translation from the dropdown menu. The default is set to Hebrew.
- Select Model Choice: Choose between the "General Model" and the "Hebrew Model". The default is set to the "Hebrew Model".
- Generate SRT File: Check the box if you want to generate an SRT file with subtitles.
- Submit: Click the "Submit" button to start the transcription and translation process.
- Upload an audio file (e.g., example_audio.wav).
- Select "English" as the target language.
- Select "General Model" to ensure accurate timing.
- Optionally, check the "Generate Hebrew SRT File" box.
- Click "Submit" to process the audio.
The output will display the transcription and translation in the chosen language, formatted into paragraphs for easy readability. If the SRT option is selected, an SRT file will also be generated and available for download.
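For readers unfamiliar with the SRT format, here is a minimal sketch of how timed segments can be turned into SRT text. It is not taken from app.py; the function names and the (start, end, text) tuple layout are assumptions. The output format itself (a numbered block, an HH:MM:SS,mmm --> HH:MM:SS,mmm line, then the text, with a blank line between blocks) is the standard SRT layout.

```python
from typing import Iterable, Tuple


def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: Iterable[Tuple[float, float, str]]) -> str:
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # Joining with a newline leaves one blank line between subtitle blocks.
    return "\n".join(blocks)


srt_text = to_srt([(0.0, 1.5, "Hello"), (1.5, 3.0, "World")])
```

Writing srt_text to a file with UTF-8 encoding should produce subtitles that most players can load, including ones containing Hebrew text.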
Installing inside a Python virtual environment is recommended so that the project's dependencies stay isolated from other projects.
Clone the repository
git clone https://github.com/ShmuelRonen/hebrew_whisper.git
cd hebrew_whisper
Double click on:
init_env.bat
It's recommended to create and activate a virtual environment here:
python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt
For PyTorch with CUDA 11.8 support, use the following command:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
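To confirm that the CUDA build was picked up, a short check like the one below can help. It is a sketch rather than part of the project: pick_device is a hypothetical helper, and it falls back to "cpu" when PyTorch is not installed or no GPU is visible. The call torch.cuda.is_available() is the standard PyTorch check.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the check also runs without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Selected device: {device}")
```

If this prints "cpu" on a machine with an NVIDIA GPU, the installed PyTorch wheel is probably a CPU-only build or the driver is too old for CUDA 11.8.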
After the installation, you can run the app by navigating to the directory containing app.py and executing:
python app.py
Special thanks to Kinneret Wm, Yam Peleg, Yair Lifshitz, and Yanir Marmor from ivrit-ai for providing the new, improved Hebrew Whisper model, making high-quality transcription and translation accessible to developers.
Special thanks to the creators of the Whisper Large-v2 model for their contribution to the development of high-quality transcription and translation technologies.
This project is intended for educational and development purposes. It leverages publicly available models and APIs. Please make sure you comply with the terms of use of the underlying models and frameworks.