Real-Time Whisper

Installation Notes

Install the dependencies listed in requirements.txt via pip (pip install -r requirements.txt). On Linux, a shared library for ALSA is required; on other platforms, PortAudio is required. Notes on installing PortAudio can be found on the PyAudio website.

Usage

Output a live transcription of the default audio source using the base.en model with a 3-second period:

$ python interface.py

Behavior Differences

The transcribe method in cli.py and the file's __main__ behavior exactly replicate the whisper repo's pre-existing behavior when transcribing an audio file. The same goal can be achieved with interface.MinimalTranscriber, which has the added benefit of limiting the amount of memory used by preloaded audio data. It performs almost exactly the same set of operations, with a small (vanishing) difference in the preprocessing output.

The log mel spectrogram is clamped from below so that its range is at most 8. The original transcribe method computes the upper bound from the entire file's spectrogram. The memory-friendly version loads the audio data on demand, so the whole file's maximum is not available up front; instead, it clamps the values relative to the maximum of the chunks processed so far.
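The clamping difference can be illustrated with a minimal NumPy sketch. This is not the code from this repo or from whisper: the function names clamp_global and clamp_running, the chunk size, and the synthetic data are all assumptions made for illustration. It only shows how a floor computed from the whole file compares with a floor computed from a running maximum.

```python
import numpy as np

# Hypothetical helpers for illustration; not part of this repo or of whisper.

def clamp_global(log_spec, dynamic_range=8.0):
    """Clamp from below using the maximum of the entire spectrogram."""
    return np.maximum(log_spec, log_spec.max() - dynamic_range)

def clamp_running(chunks, dynamic_range=8.0):
    """Clamp each chunk relative to the maximum seen in the chunks so far."""
    running_max = -np.inf
    out = []
    for chunk in chunks:
        running_max = max(running_max, float(chunk.max()))
        out.append(np.maximum(chunk, running_max - dynamic_range))
    return np.concatenate(out, axis=-1)

# Synthetic "log mel spectrogram": 80 mel bins, 3000 frames (assumed sizes).
rng = np.random.default_rng(0)
log_spec = rng.uniform(-10.0, 0.0, size=(80, 3000))
log_spec[0, 2500] = 2.0  # a loud frame that only appears late in the file

chunks = np.split(log_spec, 10, axis=-1)  # ten chunks of 300 frames each
full = clamp_global(log_spec)
streamed = clamp_running(chunks)

# Fraction of values that differ between the two approaches.
fraction_different = float(np.mean(full != streamed))
```

In this sketch the two outputs agree for every chunk from the one containing the global maximum onward, and differ only in earlier chunks, where the running floor is lower than the whole-file floor. When the loudest frame arrives early, the two methods produce the same output almost everywhere.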
