transcribe

Transcribe audio recordings with speaker recognition (diarization) and timestamps.

Powered by OpenAI Whisper and pyannote-audio.
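One common way to combine a transcription model with a diarization model is to label each transcribed segment with the speaker whose turn overlaps it the most. The sketch below illustrates that idea in plain Python. It is an assumption about the general technique, not a description of how this tool is implemented; the function names and the tuple layouts are hypothetical.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns):
    """Label each transcript segment with the best-overlapping speaker.

    segments: list of (start, end, text) tuples, e.g. from a speech-to-text model.
    turns:    list of (start, end, speaker) tuples, e.g. from a diarization model.
    Both layouts are hypothetical and chosen only for this illustration.
    """
    labelled = []
    for seg_start, seg_end, text in segments:
        best_speaker, best_overlap = "Unknown", 0.0
        for turn_start, turn_end, speaker in turns:
            shared = overlap(seg_start, seg_end, turn_start, turn_end)
            if shared > best_overlap:
                best_speaker, best_overlap = speaker, shared
        labelled.append((best_speaker, seg_start, text))
    return labelled
```

Segments that overlap no speaker turn at all are labelled "Unknown" in this sketch.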

Setup

Prerequisites

Python 3.11+ (tested with 3.11.5)
Hugging Face account

Steps

Install via pip:

pip install git+https://github.com/sueskind/transcribe

Accept the user conditions of pyannote/segmentation on Hugging Face
Accept the user conditions of pyannote/speaker-diarization on Hugging Face
Create a Hugging Face access token (passed to transcribe with -t)
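The prerequisites above can be checked before a long run with a small preflight script. This is a minimal sketch: the environment variable name HF_TOKEN is a hypothetical convention for storing the access token and is not something the tool itself reads, and find_problems is an illustrative helper.

```python
import os
import sys

MIN_PYTHON = (3, 11)        # from the prerequisites above
TOKEN_ENV_VAR = "HF_TOKEN"  # hypothetical variable name, not read by the tool


def find_problems(version=None, environ=None):
    """Return a list of human-readable problems; an empty list means ready."""
    version = sys.version_info[:2] if version is None else tuple(version[:2])
    environ = os.environ if environ is None else environ
    problems = []
    if version < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required, "
            f"found {version[0]}.{version[1]}"
        )
    if not environ.get(TOKEN_ENV_VAR):
        problems.append(f"{TOKEN_ENV_VAR} is not set")
    return problems
```

Calling find_problems() with no arguments inspects the running interpreter and the current environment. The script cannot verify that the model conditions were accepted; that still has to be done on the Hugging Face website.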

Usage

Example:

$ transcribe pulpfiction.mp3 -t <token>
Speaker 1 (00:00:00):
They don't call it a quarter pounder with cheese?

Speaker 2 (00:00:06):
No, they got the metric system there, they wouldn't know what a quarter pounder is.

Speaker 1 (00:00:13):
What do they call it?

Speaker 2 (00:00:17):
Royale with cheese.
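The output shown above is easy to post-process. The following sketch parses it into structured speaker turns, assuming every turn starts with a header line of the form "Speaker N (HH:MM:SS):" as in the example. Turn and parse_transcript are hypothetical names and are not part of the tool.

```python
import re
from dataclasses import dataclass

# Header line as it appears in the example output, e.g. "Speaker 1 (00:00:06):"
HEADER = re.compile(r"^(?P<speaker>.+?) \((?P<h>\d{2}):(?P<m>\d{2}):(?P<s>\d{2})\):$")


@dataclass
class Turn:
    speaker: str
    start_seconds: int
    text: str


def parse_transcript(raw: str) -> list[Turn]:
    """Split transcript text into turns; blank lines are ignored."""
    turns: list[Turn] = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        match = HEADER.match(line)
        if match:
            start = int(match["h"]) * 3600 + int(match["m"]) * 60 + int(match["s"])
            turns.append(Turn(match["speaker"], start, ""))
        elif turns:
            # Text lines belong to the most recent header.
            turns[-1].text = (turns[-1].text + " " + line).strip()
    return turns
```

The input can come from a file written with -o or from captured stdout.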

Use --help to show all options:

$ transcribe --help
Usage: transcribe [OPTIONS] AUDIO

  Transcribe and diarize (recognize speakers) recorded audio.

Arguments:
  AUDIO  Path to the input audio file.  [required]

Options:
  -t, --token TEXT  Hugging face access token.  [required]
  -o, --out TEXT    Path to the output file. Print to stdout if not set.
  --device TEXT     Device to run the models on.  [default: cuda]
  --language TEXT   Spoken language in the recording.  [default: en]
  --help            Show this message and exit.
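The options listed above can also be driven from a script. The sketch below builds the command line from the documented flags and runs it with subprocess. The helper names, the HF_TOKEN environment variable, and example values such as "cpu" for --device or "de" for --language are assumptions for illustration; check which values the tool accepts before relying on them.

```python
import os
import subprocess


def build_command(audio, token, out=None, device=None, language=None):
    """Assemble the argument list using only the flags shown in --help."""
    cmd = ["transcribe", audio, "-t", token]
    if out:
        cmd += ["-o", out]
    if device:
        cmd += ["--device", device]
    if language:
        cmd += ["--language", language]
    return cmd


def run_transcribe(audio, **options):
    """Run the CLI and return its stdout (empty if -o redirects to a file)."""
    token = os.environ["HF_TOKEN"]  # hypothetical variable holding the access token
    result = subprocess.run(
        build_command(audio, token, **options),
        check=True,            # raise CalledProcessError on a non-zero exit code
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces. Since the documented default device is cuda, a machine without a GPU would presumably need an explicit --device value.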
