Parspeak is a real-time speech recognition application that transcribes speech in Persian (Farsi) using the Vosk library. It features a PyQt6 GUI for displaying transcriptions and allows users to start and stop recording with a customizable hotkey.
- Real-time speech recognition in Persian.
- Customizable hotkey for controlling recording.
- System tray integration with quick access to settings.
- GUI overlay for displaying transcribed text on the screen.
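As a rough illustration of how real-time transcription with Vosk tends to work, here is a minimal sketch of the recognition loop. It is not taken from Parspeak's source. The helper name `stream_transcripts` is hypothetical. The `recognizer` argument is assumed to behave like `vosk.KaldiRecognizer`, whose `AcceptWaveform`, `Result`, `PartialResult`, and `FinalResult` methods return JSON strings.

```python
import json
from typing import Iterable, Iterator, Tuple


def stream_transcripts(recognizer, chunks: Iterable[bytes]) -> Iterator[Tuple[str, str]]:
    """Yield ("partial", text) and ("final", text) events for raw PCM chunks.

    `recognizer` is assumed to expose the same methods as vosk.KaldiRecognizer.
    This helper is hypothetical and only illustrates the general pattern.
    """
    for chunk in chunks:
        if recognizer.AcceptWaveform(chunk):
            # An utterance just ended: Result() is JSON with a "text" field.
            text = json.loads(recognizer.Result()).get("text", "")
            if text:
                yield ("final", text)
        else:
            # Still mid-utterance: PartialResult() is JSON with a "partial" field.
            partial = json.loads(recognizer.PartialResult()).get("partial", "")
            if partial:
                yield ("partial", partial)
    # Flush whatever is still buffered once the audio source stops.
    tail = json.loads(recognizer.FinalResult()).get("text", "")
    if tail:
        yield ("final", tail)
```

With the real library, a recognizer could be built as `KaldiRecognizer(Model("models/vosk-model-small-fa-0.42"), 16000)`, where the 16 kHz sample rate is an assumption and must match the audio being captured. Partial events suit a live overlay that updates as the user speaks, while final events are the settled text for each utterance.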
- Clone the repository:
    git clone https://github.com/omid3098/parspeak.git
    cd parspeak
- Create and activate a virtual environment:
    python -m venv .venv
    source .venv/bin/activate  # On Windows, use `.venv\Scripts\activate`
- Install dependencies:
    pip install -r requirements.txt
- Download and set up the Vosk model:
  - You can use the existing model located in the models/ directory, or
  - Download the Persian model from Vosk Models.
  - Extract the model into the models directory. Make sure the model path in main.py matches; the path is models/vosk-model-small-fa-0.42.
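A wrong model path is an easy mistake to make at this step, so it can help to check the directory before handing it to Vosk. The following is a minimal sketch, not Parspeak's actual code. The names `DEFAULT_MODEL_DIR`, `resolve_model_dir`, and `load_model` are hypothetical. Only `vosk.Model(path)` is a real library call, and it is imported lazily so the check itself runs without Vosk installed.

```python
from pathlib import Path

# Path taken from the setup step; adjust it if the model lives elsewhere.
DEFAULT_MODEL_DIR = Path("models") / "vosk-model-small-fa-0.42"


def resolve_model_dir(path=DEFAULT_MODEL_DIR) -> Path:
    """Return `path` as a Path if it is a non-empty directory, else raise."""
    model_dir = Path(path)
    if not model_dir.is_dir():
        raise FileNotFoundError(f"Vosk model directory not found: {model_dir}")
    if not any(model_dir.iterdir()):
        raise FileNotFoundError(f"Vosk model directory is empty: {model_dir}")
    return model_dir


def load_model(path=DEFAULT_MODEL_DIR):
    """Validate the directory, then load it with Vosk (hypothetical helper)."""
    from vosk import Model  # lazy import; requires `pip install vosk`

    return Model(str(resolve_model_dir(path)))
```

Calling `load_model()` with no arguments uses the path from the setup step. Passing another directory is a way to try a different Vosk model, provided main.py is updated to match.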
The Vosk speech recognition models are licensed under the Apache License 2.0. For more information, visit: https://github.com/alphacep/vosk-api/blob/master/LICENSE
PyQt6 is licensed under the GNU General Public License v3. For more information, visit: https://www.riverbankcomputing.com/software/pyqt/license