Transform video URLs into clean, readable transcripts with one command.
auto-trans is an AI-powered command-line tool that automatically downloads audio, transcribes it using Whisper, and copies the text (with source URL) to your clipboard, ready for pasting, organizing, or prompting your favorite LLM.
With a single command, you can extract transcripts from most online video platforms, including YouTube, Bilibili, Twitter, and TikTok, and start using them for notes, summaries, idea generation, or research.
From there, plug your transcripts into ChatGPT, Claude, or any LLM to summarize, translate, annotate, or brainstorm. Build your own automated information capture and organization workflow, turbocharged by AI.
Whether you're a content creator, student, researcher, or curious mind, auto-trans helps you go from video to insight in seconds.
New: auto-trans now also supports transcribing local video and audio files.
- One-command operation: auto-trans <url>, and that's it!
- Universal platform support: YouTube, Bilibili, Twitter, TikTok, and 1000+ sites via yt-dlp
- AI-powered transcription: OpenAI Whisper with multilingual support
- Parallel processing: Download and transcribe multiple videos simultaneously
- Smart clipboard integration: Auto-copy transcripts with source URL for easy reference
- Auto cleanup: Temporary files deleted automatically to save disk space
- Format optimization: Automatically selects the best audio quality, or uses a custom format you specify
- Language detection: Supports Chinese, English, and 90+ languages
- Progress tracking: Real-time status updates and detailed logging
- Highly configurable: Customize workers, models, formats, and more
Traditional Workflow | auto-trans Workflow |
---|---|
1. Find video URL | 1. Copy video URL |
2. Check available formats | 2. Run auto-trans <url> |
3. Download audio manually | 3. Done! Text in clipboard |
4. Convert audio format | |
5. Run transcription tool | |
6. Clean up files | |
7. Copy/paste results | |
Time per video: from 5-10 minutes down to about 30 seconds
- Linux/WSL (Ubuntu/Debian recommended)
- Python 3.8+
- FFmpeg for audio processing
# Ubuntu/Debian
sudo apt update && sudo apt install python3 python3-pip python3-venv ffmpeg git
# CentOS/RHEL (ffmpeg is not in the base repositories; enable a third-party repository such as RPM Fusion first)
sudo yum install python3 python3-pip ffmpeg git
# Arch Linux
sudo pacman -S python python-pip ffmpeg git
# Clone the repository
git clone https://github.com/Polumm/auto-trans.git
cd auto-trans
# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate
# Install Python dependencies
pip install -r requirements.txt
# Make the wrapper script executable and install it
sudo cp auto-trans /usr/local/bin/
sudo chmod +x /usr/local/bin/auto-trans
# Update the script paths (replace with your actual paths)
sudo nano /usr/local/bin/auto-trans
# Edit SCRIPT_DIR and VENV_PATH to match your installation
Edit /usr/local/bin/auto-trans to set your preferred defaults:
DEFAULT_WORKERS=4 # Number of parallel jobs
DEFAULT_MODEL="base" # Whisper model (tiny/base/small/medium/large)
DEFAULT_LANGUAGE="zh" # Default language (zh/en/auto)
DEFAULT_FORMAT="" # Audio format (leave empty for auto)
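As a rough illustration of how these defaults could feed into option parsing, here is a minimal Python sketch. It is not the actual code of transcribe.py: the function name build_parser and the attribute names are assumptions, and only the short flags (-w, -m, -l, -f) and the default values come from this README.

```python
import argparse

# Defaults mirroring the wrapper-script variables shown above.
DEFAULT_WORKERS = 4
DEFAULT_MODEL = "base"
DEFAULT_LANGUAGE = "zh"
DEFAULT_FORMAT = ""  # empty means "let the tool pick the audio format"

MODELS = ["tiny", "base", "small", "medium", "large"]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser; only the short flags are taken from this README."""
    parser = argparse.ArgumentParser(prog="auto-trans")
    parser.add_argument("urls", nargs="*", help="video URLs to transcribe")
    parser.add_argument("-w", dest="workers", type=int, default=DEFAULT_WORKERS)
    parser.add_argument("-m", dest="model", choices=MODELS, default=DEFAULT_MODEL)
    parser.add_argument("-l", dest="language", default=DEFAULT_LANGUAGE)
    parser.add_argument("-f", dest="format", default=DEFAULT_FORMAT)
    return parser


if __name__ == "__main__":
    # Flags given on the command line override the defaults; the rest fall back.
    args = build_parser().parse_args(["https://example.com/video", "-m", "tiny"])
    print(args.workers, args.model, args.language, repr(args.format))
```

Keeping the defaults in one place means a flag only needs to be passed when a single run should differ from the usual setup.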
auto-trans --help
If you see the help message, you're ready to go!
# Transcribe any video with one command
auto-trans https://www.youtube.com/watch?v=dQw4w9WgXcQ
auto-trans https://www.bilibili.com/video/BV1ZMNQziEJn
auto-trans https://twitter.com/user/status/123456789
What happens:
- Downloads best quality audio
- Transcribes with AI (Whisper)
- Copies transcript + URL to clipboard
- Cleans up temporary files
- Ready to paste anywhere!
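The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the tool's real implementation: run_pipeline and its three injected callables are hypothetical names, and the clipboard layout (transcript followed by a "Source:" line) is a guess at the format. In a real setup the callables would wrap yt-dlp, Whisper, and pyperclip.copy.

```python
import tempfile
from pathlib import Path
from typing import Callable


def run_pipeline(
    url: str,
    download: Callable[[str, Path], Path],
    transcribe: Callable[[Path], str],
    copy: Callable[[str], None],
) -> str:
    """Download, transcribe, clean up, then copy (all steps are injected)."""
    # TemporaryDirectory deletes the folder and its contents on exit,
    # even if a step raises, which covers the cleanup step.
    with tempfile.TemporaryDirectory(prefix="auto-trans-") as tmp:
        audio_path = download(url, Path(tmp))
        text = transcribe(audio_path)
    # Assumed clipboard layout: transcript first, source URL on the last line.
    clip = f"{text.strip()}\n\nSource: {url}"
    copy(clip)
    return clip


if __name__ == "__main__":
    def fake_download(url: str, folder: Path) -> Path:
        path = folder / "audio.m4a"
        path.write_bytes(b"")  # stand-in for the downloaded audio
        return path

    copied = []
    run_pipeline("https://example.com/video", fake_download,
                 lambda path: "hello world", copied.append)
    print(copied[0])
```

Passing the steps in as functions keeps the sketch runnable without any of the heavy dependencies installed.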
# Chinese content
auto-trans https://www.bilibili.com/video/BV1ZMNQziEJn -l zh
# English content
auto-trans https://www.youtube.com/watch?v=dQw4w9WgXcQ -l en
# Auto-detect language
auto-trans https://example.com/video -l auto
# Process multiple videos simultaneously
auto-trans \
https://www.youtube.com/watch?v=video1 \
https://www.youtube.com/watch?v=video2 \
https://www.bilibili.com/video/BV1234567890
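For readers curious how several URLs might be processed at once, here is a minimal sketch using a thread pool. Whether auto-trans uses threads or processes internally is not stated here, so treat this as one possible approach; transcribe_all and transcribe_one are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable


def transcribe_all(
    urls: Iterable[str],
    transcribe_one: Callable[[str], str],
    workers: int = 4,
) -> Dict[str, str]:
    """Run transcribe_one for each URL on a pool of worker threads."""
    results: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(transcribe_one, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:  # one failed job should not stop the rest
                results[url] = f"ERROR: {exc}"
    return results


if __name__ == "__main__":
    # Stand-in worker; a real one would download and transcribe the URL.
    print(transcribe_all(["video1", "video2"], lambda url: f"text of {url}"))
```

Collecting errors per URL, rather than raising on the first failure, lets the remaining jobs finish.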
# List all available audio/video formats
auto-trans --list-formats https://www.bilibili.com/video/BV1ZMNQziEJn
Output:
Available formats for https://www.bilibili.com/video/BV1ZMNQziEJn:
ID EXT ABR SIZE NOTE
30216 m4a 42k 7.96MB audio only
30232 m4a 89k 16.68MB audio only
30032 mp4 282k 52.99MB video + audio
# Use high-quality audio format
auto-trans https://www.bilibili.com/video/BV1ZMNQziEJn -f 30232
# Use format ID from --list-formats output
auto-trans https://www.youtube.com/watch?v=dQw4w9WgXcQ -f 140
# Use more CPU cores for faster processing
auto-trans https://example.com/video -w 8
# Use different Whisper models
auto-trans https://example.com/video -m tiny # Fastest, least accurate
auto-trans https://example.com/video -m base # Good balance (default)
auto-trans https://example.com/video -m large # Most accurate, slowest
# Combine options
auto-trans https://example.com/video -w 8 -m large -l zh -f 30232
# Save to file instead of just clipboard
auto-trans https://example.com/video -o transcript
# This creates: transcript_job_123456789_0.txt
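The file name in the example above appears to follow a prefix_jobid.txt pattern. A small sketch of that naming rule, assuming the job ID is simply appended to the -o prefix (output_path and save_transcript are hypothetical helpers, not part of the tool):

```python
from pathlib import Path


def output_path(prefix: str, job_id: str, folder: Path = Path(".")) -> Path:
    """Build '<prefix>_<job_id>.txt' inside folder, as in the example above."""
    return folder / f"{prefix}_{job_id}.txt"


def save_transcript(path: Path, text: str) -> Path:
    """Write the transcript as UTF-8 so non-English text is stored safely."""
    path.write_text(text, encoding="utf-8")
    return path


if __name__ == "__main__":
    print(output_path("transcript", "job_123456789_0").name)
```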
# Launch interactive mode for batch operations
auto-trans -i
Interactive commands:
> add https://www.bilibili.com/video/BV1ZMNQziEJn 30232 zh
> add https://www.youtube.com/watch?v=dQw4w9WgXcQ
> list # Show all jobs
> process # Start transcription
> copy job_123456789_0 # Copy specific transcript
> save job_123456789_0 output.txt
> quit
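To make the add syntax concrete, here is a hedged sketch of how such a line could be parsed. The argument order (URL, then format ID, then language) is inferred from the example above, and AddRequest and parse_add are hypothetical names rather than the tool's own.

```python
from typing import NamedTuple, Optional


class AddRequest(NamedTuple):
    url: str
    format_id: Optional[str] = None  # e.g. "30232"; None lets the tool choose
    language: Optional[str] = None   # e.g. "zh"; None falls back to the default


def parse_add(line: str) -> AddRequest:
    """Parse 'add <url> [format] [language]' (argument order is assumed)."""
    parts = line.split()
    if len(parts) < 2 or parts[0] != "add":
        raise ValueError("expected: add <url> [format] [language]")
    if len(parts) > 4:
        raise ValueError("too many arguments")
    return AddRequest(*parts[1:])


if __name__ == "__main__":
    print(parse_add("add https://www.bilibili.com/video/BV1ZMNQziEJn 30232 zh"))
    print(parse_add("add https://www.youtube.com/watch?v=dQw4w9WgXcQ"))
```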
Model | Speed | Accuracy | Parameters | Best For |
---|---|---|---|---|
tiny | ⚡⚡⚡⚡⚡ | ⭐⭐ | 39M | Quick drafts, real-time |
base | ⚡⚡⚡⚡ | ⭐⭐⭐ | 74M | General use (default) |
small | ⚡⚡⚡ | ⭐⭐⭐⭐ | 244M | Good quality |
medium | ⚡⚡ | ⭐⭐⭐⭐⭐ | 769M | High quality |
large | ⚡ | ⭐⭐⭐⭐⭐ | 1550M | Best quality |
Language | Code | Example |
---|---|---|
Auto-detect | auto | auto-trans <url> -l auto |
Chinese | zh | auto-trans <url> -l zh |
English | en | auto-trans <url> -l en |
Japanese | ja | auto-trans <url> -l ja |
Korean | ko | auto-trans <url> -l ko |
Spanish | es | auto-trans <url> -l es |
French | fr | auto-trans <url> -l fr |
German | de | auto-trans <url> -l de |
For the full list of supported languages, see the OpenAI Whisper documentation.
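In Whisper's Python API, passing language=None to transcribe() asks the model to detect the language itself. A plausible way to connect the -l auto value to that behavior is sketched below; the helper name whisper_language is an assumption, and auto-trans may handle this differently.

```python
from typing import Optional


def whisper_language(code: str) -> Optional[str]:
    """Map a -l value to Whisper's language argument (None means auto-detect)."""
    code = code.strip().lower()
    return None if code in ("", "auto") else code


if __name__ == "__main__":
    print(whisper_language("auto"))  # None
    print(whisper_language("ZH"))    # zh
    # With Whisper installed, the result could be passed along like this:
    #   model = whisper.load_model("base")
    #   model.transcribe("audio.m4a", language=whisper_language("auto"))
```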
# Adjust based on your system
auto-trans <url> -w 2 # Low-end systems
auto-trans <url> -w 4 # Default (quad-core)
auto-trans <url> -w 8 # High-end systems
auto-trans <url> -w 16 # Server environments
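If you would rather derive the -w value from the machine than pick it by hand, a small sketch like the following could help. The cap of 16 mirrors the largest value shown above; suggest_workers is a hypothetical helper and not part of auto-trans.

```python
import os


def suggest_workers(cap: int = 16) -> int:
    """Return a worker count between 1 and cap based on the CPU count."""
    cpus = os.cpu_count() or 1  # cpu_count() can return None
    return max(1, min(cpus, cap))


if __name__ == "__main__":
    print(f"auto-trans <url> -w {suggest_workers()}")
```

Memory can matter as much as core count with larger Whisper models, so the suggested number is a starting point rather than a guarantee.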
auto-trans/
├── transcribe.py # Main Python script
├── auto-trans # System wrapper script
├── requirements.txt # Python dependencies
├── README.md # This file
├── LICENSE
└── .gitignore
Thanks to yt-dlp, auto-trans supports 1000+ platforms including:
- YouTube - All video types, playlists, live streams
- Bilibili - Chinese video platform
- Twitter/X - Video tweets
- TikTok - Short videos
- Instagram - Video posts, stories, reels
- Facebook - Video posts, live streams
- Twitch - VODs, clips
- Vimeo - Professional videos
- Coursera - Course videos
- edX - Educational content
- Khan Academy - Learning videos
- LinkedIn Learning - Professional courses
- Udemy - Course materials
- Youku (China)
- Niconico (Japan)
- VK (Russia)
- Dailymotion (France)
- And many more...
which auto-trans
# If empty, reinstall:
sudo cp auto-trans /usr/local/bin/
sudo chmod +x /usr/local/bin/auto-trans
# Reinstall dependencies
cd /path/to/auto-trans
source .venv/bin/activate
pip install -r requirements.txt
# Ubuntu/Debian
sudo apt install ffmpeg
# Check installation
ffmpeg -version
# First run downloads models (may take time)
# Check internet connection and disk space
df -h # Check disk space
# Update yt-dlp to latest version
pip install --upgrade yt-dlp
# Some platforms may require specific extractors
# Use smaller Whisper model
auto-trans <url> -m tiny
# Reduce worker count
auto-trans <url> -w 2
- SSD Storage: Store temp files on SSD for faster processing
- RAM: 8GB+ recommended for the large model
- CPU: More cores = faster parallel processing
- Network: Stable connection for reliable downloads
# Check logs for detailed error information
tail -f ~/auto-trans/transcription.log
# Enable verbose output
auto-trans <url> --verbose
We welcome contributions! Here's how you can help:
- Use GitHub Issues
- Include system info, error logs, and reproduction steps
- Suggest new platforms, languages, or features
- Provide use cases and examples
# Fork the repository on GitHub, then clone your fork (replace <your-username>)
git clone https://github.com/<your-username>/auto-trans.git
cd auto-trans
# Create feature branch
git checkout -b feature/your-feature-name
# Make changes and test
python transcribe.py --help
# Submit pull request
- Improve README.md
- Add usage examples
- Translate to other languages
- OpenAI Whisper - State-of-the-art speech recognition
- yt-dlp - Universal video downloader
- pyperclip - Cross-platform clipboard functionality
- Documentation: Check this README and examples/
- Bug Reports: GitHub Issues
- Discussions: GitHub Discussions
If auto-trans saves you time, please consider giving it a star! ⭐
Made with ❤️ for content creators, researchers, and productivity enthusiasts worldwide.