Docker for Whisper

Docker image with Whisper from OpenAI.

Quick start

You can pull the pre-built image:

docker pull ghcr.io/lifeosm/whisper:latest # or v20231117

or build your own:

docker build -t whisper:local .

The image contains no models, so you need to download one first:

docker volume create whisper-models
docker run --rm -it \
  --entrypoint python \
  -v whisper-models:/root/.cache/whisper \
  ghcr.io/lifeosm/whisper:latest \
    -c 'import whisper; whisper.load_model("tiny")'
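To check which models are already cached, you can list the contents of the volume; a sketch reusing the volume and image names from the example above:

```shell
# List the model files cached in the named volume by overriding
# the image entrypoint with a plain `ls`.
docker run --rm \
  --entrypoint ls \
  -v whisper-models:/root/.cache/whisper \
  ghcr.io/lifeosm/whisper:latest \
    -lh /root/.cache/whisper
```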

The full list of available models and languages can be found in the OpenAI Whisper README.

Once a model is cached, you can run the desired command, e.g.:

docker run --rm -it \
  -v whisper-models:/root/.cache/whisper \
  -v "$(pwd)/audio.wav":/usr/src/audio.wav \
  ghcr.io/lifeosm/whisper:latest \
    --model tiny \
    --task transcribe \
    audio.wav
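Note that Whisper writes its transcript files (.txt, .srt, .vtt, etc.) to the working directory inside the container, so with only the audio file mounted they are discarded when the container exits. A sketch that mounts the whole current directory instead, assuming audio.wav sits in it:

```shell
# Mount the current directory over the container's working directory
# so the generated transcript files are kept on the host.
docker run --rm -it \
  -v whisper-models:/root/.cache/whisper \
  -v "$(pwd)":/usr/src \
  ghcr.io/lifeosm/whisper:latest \
    --model tiny \
    --task transcribe \
    audio.wav
```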

The complete list of options can be printed with --help:

docker run --rm -it ghcr.io/lifeosm/whisper:latest --help

Don't forget about memory limits; e.g., to run the medium model you could use the following command:

docker run --rm -it \
  -m 8g \
  -v whisper-models:/root/.cache/whisper \
  -v "$(pwd)/audio.wav":/usr/src/audio.wav \
  ghcr.io/lifeosm/whisper:latest \
    --model medium \
    --task transcribe \
    audio.wav

Advanced

If you prefer shorter commands, take a look at the Taskfile:

run help
run load large
run whisper -f json audio.mp3
💼 A real-world usage example

Details are here:

whisper() {
  local model=small memory args=("${@}")

  # Scan the arguments for an explicit --model flag.
  while [[ $# -gt 0 ]]; do
    case "${1}" in
    --model) model=${2} && shift 2 ;;
    *) shift 1 ;;
    esac
  done

  # Pick a container memory limit that fits the chosen model.
  case "${model}" in
  tiny | tiny.en) memory=(-m 1g) ;;
  base | base.en) memory=(-m 1g) ;;
  small | small.en) memory=(-m 2g) ;;
  medium | medium.en) memory=(-m 5g) ;;
  large) memory=(-m 10g) ;;
  *) echo "unknown size: ${model}" >&2 && return 1 ;;
  esac

  # Mount the model cache and the current directory, then pass all
  # original arguments through to the whisper CLI.
  docker run --rm -it \
    "${memory[@]}" \
    -v whisper-models:/root/.cache/whisper \
    -v "$(pwd)":/usr/src \
    ghcr.io/lifeosm/whisper:latest "${args[@]}"
}

transcribe() {
  whisper \
    --model small \
    --task transcribe \
    --language ru \
    -f vtt \
    "${@}"
}
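With these helpers sourced in your shell, a session might look like this (the file names are illustrative):

```shell
# Transcribe a Russian recording to WebVTT subtitles; the output file
# lands next to the audio because the current directory is mounted.
transcribe interview.mp3

# Or call the lower-level wrapper directly with other options:
whisper --model medium --task translate talk.wav
```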

Resources

Alternatives