A nendo plugin for speech transcription, based on Whisper by OpenAI.
- Fast speech transcription with optional word-level timestamps.
Since we depend on transformers, please make sure that you fulfill its requirements.
You also need PyTorch installed on your system; please refer to the PyTorch installation instructions.
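A quick way to confirm the prerequisites before installing the plugin is to check that the packages can be found in the active environment. The following is a minimal sketch that uses only the standard library; the helper name `missing_prerequisites` is hypothetical and not part of Nendo, and the check only confirms that the packages are installed, not that their versions are compatible.

```python
from importlib.util import find_spec

# Top-level import names of the prerequisites mentioned above.
REQUIRED_PACKAGES = ("torch", "transformers")


def missing_prerequisites(packages=REQUIRED_PACKAGES):
    """Return the names of packages that cannot be found in this environment.

    Hypothetical helper for illustration; it does not import the packages,
    it only asks the import system whether they are discoverable.
    """
    return [name for name in packages if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_prerequisites()
    if missing:
        print("Missing prerequisites:", ", ".join(missing))
    else:
        print("All prerequisites found.")
```

If the list comes back non-empty, install the missing packages first and then continue with the steps below.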
- Install Nendo
pip install nendo-plugin-transcribe-whisper
If you have a CUDA GPU on your machine, you can also install flash-attn to get an additional speedup:
pip install flash-attn --no-build-isolation
Then set ATTN_IMPLEMENTATION=flash_attention_2 in your environment variables.
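If you would rather set the variable from Python than from your shell, something like the sketch below may work. It assumes the plugin reads `ATTN_IMPLEMENTATION` when it is initialized, so the variable is set before the `Nendo` instance is created; that timing is an assumption, not something confirmed here. The helper name `configure_attention` is hypothetical.

```python
import os
from importlib.util import find_spec


def configure_attention(env=os.environ):
    """Enable flash attention only when the flash-attn package is installed.

    Hypothetical helper: writes ATTN_IMPLEMENTATION into the given mapping
    (the process environment by default) and returns the resulting value,
    or None if the variable ends up unset.
    """
    # flash-attn is imported as `flash_attn`; find_spec does not import it.
    if find_spec("flash_attn") is not None:
        env["ATTN_IMPLEMENTATION"] = "flash_attention_2"
    return env.get("ATTN_IMPLEMENTATION")


if __name__ == "__main__":
    # Assumption: call this before creating the Nendo instance.
    print("ATTN_IMPLEMENTATION =", configure_attention())
```

Leaving the variable untouched when flash-attn is missing means any value you already exported in your shell is kept.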
>>> from nendo import Nendo
>>> nd = Nendo(plugins=["nendo_plugin_transcribe_whisper"])
>>> track = nd.library.add_track(file_path="path/to/file.mp3")
>>> nd.plugins.transcribe_whisper(track=track)
>>> track.get_plugin_value("transcription")
Visit our docs to learn how to contribute to Nendo: Contributing
Nendo: MIT License
Pretrained models: The weights are released under the Apache 2.0 license.