Spyzer is a speech analysis toolkit that provides audio analysis, diarization, and transcription in one package. The GUI is built with Kivy.
pip install -r requirements.txt
Spyzer uses Vosk under the hood for the transcription feature. For transcription to work, ffmpeg must be installed on your machine; instructions on how to install it can be found here. You also need a Vosk model, which you can find here. Vosk models are language-dependent, so to transcribe English speech files you need to download a model from the English section. Vosk supports more than 20 languages. Once the model is downloaded, just extract the zip file and you are ready to go.
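To make the roles of ffmpeg and the extracted model folder more concrete, here is a minimal sketch of how a Vosk-based transcription step typically works. It is not Spyzer's actual code: the function names (`ffmpeg_available`, `build_ffmpeg_command`, `transcribe`), the 16 kHz sample rate, and the paths in the usage note are assumptions made for illustration. ffmpeg decodes the input file to raw 16-bit mono PCM, and Vosk's `KaldiRecognizer` consumes that stream in chunks.

```python
import json
import shutil
import subprocess

# Assumption: 16 kHz mono is a common input rate for Vosk models.
SAMPLE_RATE = 16000


def ffmpeg_available():
    """Return True if an ffmpeg executable is found on PATH."""
    return shutil.which("ffmpeg") is not None


def build_ffmpeg_command(audio_path, sample_rate=SAMPLE_RATE):
    """Build the ffmpeg call that decodes any audio file to raw 16-bit mono PCM on stdout."""
    return [
        "ffmpeg", "-loglevel", "quiet",
        "-i", str(audio_path),
        "-ar", str(sample_rate),  # resample to the rate the recognizer expects
        "-ac", "1",               # downmix to mono
        "-f", "s16le",            # raw signed 16-bit little-endian samples
        "-",                      # write to stdout
    ]


def transcribe(audio_path, model_dir):
    """Transcribe audio_path using the extracted Vosk model folder model_dir."""
    # Imported lazily so the helpers above work without Vosk installed.
    from vosk import KaldiRecognizer, Model

    if not ffmpeg_available():
        raise RuntimeError("ffmpeg was not found on PATH")

    recognizer = KaldiRecognizer(Model(model_dir), SAMPLE_RATE)
    pieces = []
    with subprocess.Popen(build_ffmpeg_command(audio_path),
                          stdout=subprocess.PIPE) as proc:
        while True:
            data = proc.stdout.read(4000)
            if not data:
                break
            # AcceptWaveform returns True when an utterance is complete.
            if recognizer.AcceptWaveform(data):
                pieces.append(json.loads(recognizer.Result()).get("text", ""))
    # Flush whatever the recognizer still holds at end of stream.
    pieces.append(json.loads(recognizer.FinalResult()).get("text", ""))
    return " ".join(p for p in pieces if p)
```

With Vosk installed and a model extracted, a call would look like `transcribe("speech.mp3", "path/to/extracted-model")`, where both arguments are placeholder paths. The language of the model folder decides which language is recognized.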
python main.py