Additional Speed & Memory Improvements

STREAM WHISPER

Well, I figured out distil-whisper would work better with stream capabilities, i.e ability to transcribe videos/audios directly from the internet, not just file paths. So, stream-whisper was born.

In case you still haven't figured, stream whisper is a fork of huggingface's distil whisper which is fundamentally a fast speech recognition and transcribing engine...(the devil is in the details, lol)

INSTALLATION

Clone or download this repository
install the requirements with pip:
```
 pip install requirements.txt
```

NB: If this fails, try pip3 install requirements.txt instead. 3. from the root directory, run:

    cd stream_whisper

USAGE

the only mandatory argument is argument (file url or link url).

python stream_whisper.py <url>

explicitly specify the output file:

	python run.py <url> --out '/output/path/of/audio'

specify the output the of the (transient) audio N.B: This is only required for a streamed (downloaded) audio. if the audio is a local audio, the out argument is totally unnecessary and might lead to errors

	python run.py <url>  --out '/output/path/of/audio'

specify the output path of the transcription with --text. specifically, --text 'cli' and not specifying --text at all, or --text 'both' will output to both cli and a default file 'output.txt' in working directory. meanwhile, any other value, e.g --text 'texter' or --text '/path/to/output/file' will output to the file path argument.

	python run.py <url> --out '/output/path/of/audio' --text '/transcription/output'

specify --long argument to specify audio length, for chunking while processing. otherwise, automatic length detection is used.

	python run.py <url> --out '/output/path/of/audio' --text '/transcription/output' --long

specify --short argument for audio lesser than 30 seconds length. otherwise, automatic length detection is used.

	python run.py <url> --out '/output/path/of/audio' --text '/transcription/output' --short

Additional Speed & Memory Improvements

Specify --spec to use Speculative Decoding

	python run.py <url> --out '/output/path/of/audio' --text '/transcription/output' --spec

specify --flash to use Flash Attention

	python run.py <url> --out '/output/path/of/audio' --text '/transcription/output' --flash

specify --bt to use better transformer

	python run.py <url> --out '/output/path/of/audio' --text '/transcription/output' --bt

specify --settings to use a custom settings file.

	python run.py <url> --out '/output/path/of/audio' --text '/transcription/output' --settings '/path/to/settings/file'

N.B: The original README.md of distil whisper is here.

TO-DO

add to pip
further tests
further features

Acknowledgements

OpenAI for the Whisper model and original codebase
Hugging Face 🤗 Transformers for the model integration
Google's TPU Research Cloud (TRC) program for Cloud TPU v4s

Citation

If you use this model, please consider citing the Distil-Whisper paper:

@misc{gandhi2023distilwhisper,
      title={Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling}, 
      author={Sanchit Gandhi and Patrick von Platen and Alexander M. Rush},
      year={2023},
      eprint={2311.00430},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

And also the Whisper paper:

@misc{radford2022robust,
      title={Robust Speech Recognition via Large-Scale Weak Supervision}, 
      author={Alec Radford and Jong Wook Kim and Tao Xu and Greg Brockman and Christine McLeavey and Ilya Sutskever},
      year={2022},
      eprint={2212.04356},
      archivePrefix={arXiv},
      primaryClass={eess.AS}
}

EDIT: Stream-whisper will be taking a drastic turn in structure, with a rust cli tool, embedded flite, and other tools to ensure it becomes way powerful tool for the average and enlightened computer user

Name		Name	Last commit message	Last commit date
Latest commit History 57 Commits
.github		.github
.vscode		.vscode
stream_whisper		stream_whisper
Distil_Whisper.pdf		Distil_Whisper.pdf
LICENSE		LICENSE
Original-README.md		Original-README.md
README.md		README.md
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

STREAM WHISPER

INSTALLATION

USAGE

Additional Speed & Memory Improvements

TO-DO

Acknowledgements

Citation

About

Releases

Sponsor this project

Packages

Languages

License

nathfavour/stream-whisper

Folders and files

Latest commit

History

Repository files navigation

STREAM WHISPER

INSTALLATION

USAGE

Additional Speed & Memory Improvements

TO-DO

Acknowledgements

Citation

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Sponsor this project

Packages 0

Languages

Packages