Docker image with Whisper from OpenAI.
You can pull the pre-built image:

```shell
docker pull ghcr.io/lifeosm/whisper:latest # or v20231117
```
or build your own:

```shell
docker build -t whisper:local .
```
The image contains no models, so you need to download one first:

```shell
docker volume create whisper-models
docker run --rm -it \
  --entrypoint python \
  -v whisper-models:/root/.cache/whisper \
  ghcr.io/lifeosm/whisper:latest \
  -c 'import whisper; whisper.load_model("tiny")'
```
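To confirm the download succeeded, you can list the contents of the volume. This is only a sketch, assuming the image's base distribution ships `ls`:

```shell
# the cached checkpoint (e.g. tiny.pt) should appear in the listing
docker run --rm \
  --entrypoint ls \
  -v whisper-models:/root/.cache/whisper \
  ghcr.io/lifeosm/whisper:latest \
  -lh /root/.cache/whisper
```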
The full list of available models and languages can be found here.
With a model cached, you can run the required command, e.g.:

```shell
docker run --rm -it \
  -v whisper-models:/root/.cache/whisper \
  -v "$(pwd)/audio.wav":/usr/src/audio.wav \
  ghcr.io/lifeosm/whisper:latest \
  --model tiny \
  --task transcribe \
  audio.wav
```

Note that Docker bind-mount sources must be absolute paths, hence `$(pwd)/audio.wav` rather than a bare `audio.wav`.
The complete list of options is available via `--help`:

```shell
docker run --rm -it ghcr.io/lifeosm/whisper:latest --help
```
Don't forget about memory limits; e.g., to run the medium model you could use the following command:

```shell
docker run --rm -it \
  -m 8g \
  -v whisper-models:/root/.cache/whisper \
  -v "$(pwd)/audio.wav":/usr/src/audio.wav \
  ghcr.io/lifeosm/whisper:latest \
  --model medium \
  --task transcribe \
  audio.wav
```
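If you are unsure how much memory a given model actually needs, you can watch the container while it transcribes; `docker stats` is a standard Docker command:

```shell
# run in a second terminal while the transcription container is active;
# prints a one-shot snapshot of memory usage vs. the configured limit
docker stats --no-stream
```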
If you prefer simplicity, take a look at the Taskfile:

```shell
run help
run load large
run whisper -f json audio.mp3
```
💼 Real example of usage
Details are here:

```shell
whisper() {
  local model=small memory args=("${@}")
  # scan the arguments for --model to pick a matching memory limit
  while [[ $# -gt 0 ]]; do
    case "${1}" in
      --model) model=${2} && shift 2 ;;
      *) shift 1 ;;
    esac
  done
  case "${model}" in
    tiny | tiny.en) memory=(-m 1g) ;;
    base | base.en) memory=(-m 1g) ;;
    small | small.en) memory=(-m 2g) ;;
    medium | medium.en) memory=(-m 5g) ;;
    large) memory=(-m 10g) ;;
    *) echo "unknown size: ${model}" >&2 && return 1 ;;
  esac
  # mount the current directory so input and output files stay on the host
  docker run --rm -it \
    "${memory[@]}" \
    -v whisper-models:/root/.cache/whisper \
    -v "$(pwd)":/usr/src \
    ghcr.io/lifeosm/whisper:latest "${args[@]}"
}
```
```shell
transcribe() {
  whisper \
    --model small \
    --task transcribe \
    --language ru \
    -f vtt \
    "${@}"
}
```
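With both functions loaded into your shell (e.g. sourced from your shell profile), a typical session might look like this; the directory and file names are only an illustration:

```shell
cd ~/recordings            # directory containing the audio file (example path)
transcribe voice-note.mp3  # output lands next to the source, since the
                           # current directory is mounted at /usr/src
```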