This project is a service that generates timestamped transcripts for YouTube videos. You provide a YouTube URL, and the service returns the video's text with timestamps. It offers two modes: fast cloud-based transcription and a private, local-only mode.
- Dual Transcription Modes:
- ☁️ Cloud-Powered (Groq): Uses the Groq API for incredibly fast and accurate transcription with OpenAI's Whisper models.
- 💻 Local-Only: Runs a private, on-device transcription service using faster-whisper for offline use and data privacy.
- Automatic Fallback: If one transcription service fails, it can automatically switch to the other, ensuring high availability.
- Smart Chunking: Automatically splits large audio files into smaller chunks to meet API limits and improve reliability for both local and cloud processing.
- Easy Deployment: Get started in minutes with Docker Compose.
- Multiple Output Formats: Get your transcripts in JSON, SRT, VTT, or plain TXT.
- Smart Rate Limiting: Automatically manages API usage to prevent hitting Groq's rate limits.
- Flexible API: Submit transcription jobs via query parameters or a JSON body.
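The Smart Chunking feature above can be pictured with a minimal sketch. It assumes chunks are cut at fixed time intervals and that each transcribed segment carries `start`, `end`, and `text` fields; the function names, the field names, and the splitting strategy are all hypothetical, and the service itself may split by file size or use overlapping chunks.

```python
from typing import List, Tuple


def plan_chunks(total_seconds: float, max_chunk_seconds: float) -> List[Tuple[float, float]]:
    """Split [0, total_seconds) into consecutive (start, end) windows.

    Hypothetical helper: fixed-length windows are an assumption, not the
    service's documented strategy.
    """
    if max_chunk_seconds <= 0:
        raise ValueError("max_chunk_seconds must be positive")
    chunks = []
    start = 0.0
    while start < total_seconds:
        end = min(start + max_chunk_seconds, total_seconds)
        chunks.append((start, end))
        start = end
    return chunks


def shift_segment(segment: dict, offset: float) -> dict:
    """Move a chunk-relative segment onto the full-video timeline.

    The "start"/"end" keys are assumed field names for illustration.
    """
    return {**segment, "start": segment["start"] + offset, "end": segment["end"] + offset}
```

Each chunk is transcribed on its own, so its timestamps start at zero; adding the chunk's start offset puts every segment back on the timeline of the original video.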
The easiest way to get the service running is with Docker.
First, clone the project and create your environment file from the example:
git clone https://github.com/devtitus/YouTube-Transcripts-Using-Whisper.git
cd YouTube-Transcripts-Using-Whisper
cp .env.docker .env
Next, open the .env file in a text editor and add your Groq API key. If you don't have one, you can get it from the Groq Console.
# .env
GROQ_API_KEY=your_groq_api_key_here
Note: If you leave GROQ_API_KEY blank, the service will run in local-only mode.
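The effect of a blank key can be expressed as a small decision function. This is only an illustration of the rule stated in the note, not the service's actual code: `resolve_default_mode` is a hypothetical name, and returning `auto` when a key is present is an assumption based on `auto` being the documented default for `model_type`.

```python
import os
from typing import Mapping, Optional


def resolve_default_mode(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "local" when GROQ_API_KEY is missing or blank, else "auto".

    Hypothetical helper that mirrors the note above; the service's own
    startup logic may differ.
    """
    env = os.environ if env is None else env
    key = env.get("GROQ_API_KEY", "").strip()
    return "auto" if key else "local"
```

Passing a plain dictionary instead of reading `os.environ` makes the rule easy to check without touching the real environment.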
With Docker running, start the services using Docker Compose:
# This command builds the images and starts the services in the background.
docker-compose up --build -d
The service is now running! The main API is available at http://localhost:5685.
You can test the service by sending a curl request. Here’s how to transcribe a video and get the result directly (synchronously):
# Example: Transcribe a video using the default "auto" mode
curl "http://localhost:5685/v1/transcripts?url=https://www.youtube.com/watch?v=dQw4w9WgXcQ&sync=true"
You should see a JSON response containing the full transcript.
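If you would rather call the service from a script, the same synchronous request can be sketched in Python using only the standard library. The helper names and the 600-second timeout are assumptions for illustration. Like the curl command, this sends a GET request; the video URL is percent-encoded so that its own `?v=...` part cannot be mistaken for a separate query parameter.

```python
import json
import urllib.request
from urllib.parse import urlencode

BASE_URL = "http://localhost:5685"  # address from the Docker setup above


def build_sync_url(video_url: str, base_url: str = BASE_URL) -> str:
    """Build the request URL with the video URL percent-encoded."""
    query = urlencode({"url": video_url, "sync": "true"})
    return f"{base_url}/v1/transcripts?{query}"


def fetch_transcript(video_url: str, timeout: float = 600.0) -> dict:
    """Send the request and decode the JSON response.

    The timeout value is an assumption; long videos may need more time.
    """
    with urllib.request.urlopen(build_sync_url(video_url), timeout=timeout) as resp:
        return json.load(resp)
```

With the service running, `fetch_transcript("https://www.youtube.com/watch?v=dQw4w9WgXcQ")` should return the parsed JSON response as a dictionary.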
You can create a new transcription job by sending a POST request to the /v1/transcripts endpoint.
POST /v1/transcripts
You can provide the YouTube URL and options in two ways:
- Query Parameters (for simple requests):
curl "http://localhost:5685/v1/transcripts?url=<YOUTUBE_URL>&model_type=cloud&model=whisper-large-v3"
- JSON Body (for more control):
curl -X POST http://localhost:5685/v1/transcripts \
  -H "Content-Type: application/json" \
  -d '{
    "youtubeUrl": "<YOUTUBE_URL>",
    "options": {
      "model_type": "local",
      "model": "base.en"
    }
  }'
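The JSON body variant can also be sketched in Python with the standard library. `build_job_request` is a hypothetical helper. The `youtubeUrl` and `options` keys come from the curl example; placing `language` inside `options` is an assumption, since the example only shows `model_type` and `model` there.

```python
import json
import urllib.request
from typing import Optional


def build_job_request(
    youtube_url: str,
    model_type: str = "auto",
    model: Optional[str] = None,
    language: Optional[str] = None,
    base_url: str = "http://localhost:5685",
) -> urllib.request.Request:
    """Build a POST request that carries the job as a JSON body."""
    options = {"model_type": model_type}
    if model is not None:
        options["model"] = model
    if language is not None:
        # Assumption: language sits next to model_type and model.
        options["language"] = language
    body = json.dumps({"youtubeUrl": youtube_url, "options": options}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/transcripts",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send the request, pass the result to `urllib.request.urlopen` and decode the response with `json.load`.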
| Parameter | Location | Description | Example |
| :--- | :--- | :--- | :--- |
| youtubeUrl or url | Body / Query | Required. The URL of the YouTube video. | https://youtube.com/watch?v=... |
| model_type | Body / Query | cloud, local, or auto (default). Chooses the transcription service. | cloud |
| model | Body / Query | The specific model to use. See below for options. | whisper-large-v3 |
| language | Body / Query | A hint for the audio language (e.g., "en", "es"). | en |
- Cloud (Groq): whisper-large-v3-turbo (default), whisper-large-v3, distil-whisper-large-v3-en
- Local (faster-whisper): base.en (default), small.en, tiny.en, large-v3
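A client can check a model name against these lists before submitting a job. The sketch below is a client-side convenience under stated assumptions, not the service's own validation: `MODELS` and `pick_model` are hypothetical names, and `auto` is deliberately left out because the text above does not say which model that mode selects.

```python
from typing import Optional

# Model names copied from the lists above; the first entry is the default.
MODELS = {
    "cloud": ["whisper-large-v3-turbo", "whisper-large-v3", "distil-whisper-large-v3-en"],
    "local": ["base.en", "small.en", "tiny.en", "large-v3"],
}


def pick_model(model_type: str, model: Optional[str] = None) -> str:
    """Return the requested model, or the default for the given model_type."""
    if model_type not in MODELS:
        raise ValueError(f"unsupported model_type for this helper: {model_type!r}")
    choices = MODELS[model_type]
    if model is None:
        return choices[0]
    if model not in choices:
        raise ValueError(f"{model!r} is not a listed {model_type} model")
    return model
```

Rejecting a mismatched pair early, such as `base.en` with `cloud`, avoids submitting a job that would presumably fail on the server.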
If you prefer to run the service without Docker, see the Local Setup Guide.
For more detailed information on Docker deployment, including multi-container setups and troubleshooting, see the Docker Guide.
- EXPLANATION.md: A detailed look at how the project works internally.
- WORKFLOW.md: A diagram and explanation of the data flow.
- SETUP_GUIDE.md: Instructions for setting up a local development environment.