I wanted to see how good (or not) Whisper is, both in terms of AIQ and ease of use. Whisper is OpenAI's newly released, open-sourced ASR (automatic speech recognition) implementation. ✌️
I decided to use Sam's TWIML AI Podcast as the test bed. 👌
There are a few steps to get this going:
- You need to install all the dependencies
- If using a GPU, make sure it is properly configured for your OS
- You need to install whisper
- We download all the episodes from YouTube (where the podcast is published) and save them as mp3 files (see the first sketch after this list)
- We use whisper to run through each of these episodes and transcribe them (see the second sketch after this list), saving three files for each episode:
  - Text file - This contains the STT (speech-to-text) transcription
  - VTT file - This is a WebVTT (Web Video Text Tracks) file, also known as WebSRT; a time-indexed format used for synchronized video caption playback
  - SRT file - This is a SubRip Subtitle file; essentially subtitle entries, each with start and end timestamps and a sequential subtitle number
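To illustrate the download step, here is a minimal sketch using yt-dlp with ffmpeg to pull the audio and convert it to mp3. The channel URL and output folder are placeholders, and this is not necessarily the exact approach used in the checked-in code.

```python
# Minimal sketch of the download step, assuming yt-dlp and ffmpeg are installed
# (pip install yt-dlp). The URL and output folder below are placeholders.
import yt_dlp

PLAYLIST_URL = "https://www.youtube.com/@twimlai"  # hypothetical; point at the actual TWIML channel/playlist
OUTPUT_DIR = "twiml-episodes"

ydl_opts = {
    "format": "bestaudio/best",            # grab the best available audio stream
    "outtmpl": f"{OUTPUT_DIR}/%(title)s.%(ext)s",
    "ignoreerrors": True,                  # skip episodes that fail instead of aborting the run
    "postprocessors": [{
        "key": "FFmpegExtractAudio",       # convert the downloaded audio to mp3
        "preferredcodec": "mp3",
    }],
}

with yt_dlp.YoutubeDL(ydl_opts) as ydl:
    ydl.download([PLAYLIST_URL])
```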
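And a sketch of the transcription step: load the base model, transcribe one mp3, and write out the .txt, .vtt and .srt files from the returned segments. The file names and the timestamp helper are mine for illustration; whisper also ships its own writer utilities that the real code may use instead.

```python
# Sketch of the transcription step, assuming openai-whisper is installed
# (pip install openai-whisper). Output formatting is hand-rolled here for clarity.
import whisper

def fmt_ts(seconds: float, sep: str) -> str:
    """Format seconds as HH:MM:SS<sep>mmm (sep is ',' for SRT, '.' for VTT)."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d}{sep}{ms:03d}"

model = whisper.load_model("base")            # the transcripts in this repo used the base model
result = model.transcribe("episode-001.mp3")  # placeholder file name

# Plain text transcript
with open("episode-001.txt", "w", encoding="utf-8") as f:
    f.write(result["text"].strip() + "\n")

# WebVTT captions
with open("episode-001.vtt", "w", encoding="utf-8") as f:
    f.write("WEBVTT\n\n")
    for seg in result["segments"]:
        f.write(f"{fmt_ts(seg['start'], '.')} --> {fmt_ts(seg['end'], '.')}\n")
        f.write(seg["text"].strip() + "\n\n")

# SubRip subtitles
with open("episode-001.srt", "w", encoding="utf-8") as f:
    for i, seg in enumerate(result["segments"], start=1):
        f.write(f"{i}\n")
        f.write(f"{fmt_ts(seg['start'], ',')} --> {fmt_ts(seg['end'], ',')}\n")
        f.write(seg["text"].strip() + "\n\n")
```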
If you just want the transcribed files: at the time of writing there were 547 published episodes, all of which I have transcribed. These were done using the base model from Whisper and can be found in 📁 twiml-episodes-whisper-transcribed.
- You can download all of the files as one zip file too -- 🗃️ twiml-episodes-whispered-transcribed.zip
A few things are needed to get whisper deployed and running locally. The first is a local GPU that supports CUDA. At a high level, the OS doesn't matter as long as CUDA support is there; in my case I ran this on WSL2 with Ubuntu.
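As a quick sanity check that the GPU is actually visible from inside WSL2/Ubuntu before kicking off long transcription runs, something like this (PyTorch is installed as a whisper dependency) should report a CUDA device:

```python
# Quick sanity check that CUDA is visible to PyTorch (whisper's backend)
# from inside WSL2/Ubuntu before starting long transcription runs.
import torch

if torch.cuda.is_available():
    print("CUDA OK:", torch.cuda.get_device_name(0))
else:
    print("No CUDA device found - whisper will fall back to (much slower) CPU inference")
```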
✍️ The code is already checked in and hopefully a lot of it is self-explanatory; if you need details, check out the blog post: https://blog.desigeek.com/post/2023/02/openai-whisper-overview/