Live Native Captions for macOS
Real-time, AI-powered captions for everything you hear on your Mac. 100% offline. Total privacy.
- Real-time transcription using Apple's Speech Framework
- 2-Line Rolling Captions - smart rolling display (max 2 lines, ~70 chars each)
- AI-formatted captions - clean, readable text (removes filler words, adds punctuation)
- Raw Transcription Mode - bypasses AI formatting for the lowest latency (< 500ms)
- 100% offline - no data leaves your Mac
- Ultra-low latency - < 500ms from audio to caption (Raw mode)
- Beautiful overlay - floating captions over any app
- English (US/UK) support, with more languages planned
- Fully customizable - font size, position, transparency, formatting mode
- Native macOS - optimized for Apple Silicon
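The 2-line rolling display described above can be sketched as a small buffer that wraps words greedily and drops the oldest line. This is an illustrative sketch only, not Voxly's actual implementation: the type name `RollingCaptionBuffer` and its members are hypothetical, and the 2-line / 70-character limits are taken from the feature list.

```swift
import Foundation

/// Hypothetical sketch of a rolling caption buffer (not Voxly's real type).
/// Keeps at most `maxLines` lines, each at most `maxChars` characters.
struct RollingCaptionBuffer {
    let maxLines: Int
    let maxChars: Int
    private(set) var lines: [String] = []

    init(maxLines: Int = 2, maxChars: Int = 70) {
        self.maxLines = maxLines
        self.maxChars = maxChars
    }

    /// Appends text word by word, wrapping greedily.
    /// A single word longer than `maxChars` is kept whole on its own line.
    mutating func append(_ text: String) {
        for word in text.split(whereSeparator: { $0.isWhitespace }) {
            if let last = lines.last, last.count + 1 + word.count <= maxChars {
                // The word still fits on the current line.
                lines[lines.count - 1] = last + " " + word
            } else {
                // Start a new line.
                lines.append(String(word))
            }
        }
        // Roll: discard the oldest lines beyond the limit.
        if lines.count > maxLines {
            lines.removeFirst(lines.count - maxLines)
        }
    }
}

// Usage: feed recognized text in as it arrives and render `lines`.
var captions = RollingCaptionBuffer()
captions.append("Captions will appear at the bottom of your screen.")
print(captions.lines.joined(separator: "\n"))
```

Greedy wrapping keeps each update cheap, which matters when partial results arrive several times per second.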
Perfect for:
- Language learners - see subtitles for YouTube, podcasts, courses
- International meetings - follow along in Zoom, Meet, Teams
- Online learning - capture every detail in lectures
- Accessibility - real-time captions for any audio content
- Streaming - add captions to Netflix, videos, presentations
- macOS 15+ (Sequoia)
- Apple Silicon (M1, M2, M3, M4)
- ~200MB disk space
- Download Voxly.dmg from Releases
- Open DMG and drag Voxly to Applications
- Launch Voxly
- Grant permissions:
  - Screen Recording (to capture system audio)
  - Speech Recognition (to transcribe audio)
- Click the menu bar icon and select "Start Capturing"
That's it! Captions will appear at the bottom of your screen.
- BUILD.md - Build instructions and setup guide
- TESTING.md - Testing guide
- APPLE_APIS_GUIDE.md - Apple APIs reference
- Xcode 16+
- macOS 15+
- Apple Silicon Mac
# Clone the repository
git clone https://github.com/iamjoaovytor/voxly.git
cd voxly
# Open in Xcode
open Voxly.xcodeproj
# Build and run (⌘R)
Voxly uses the MVVM + Coordinator pattern with the following stack:
- Swift 6.0+
- SwiftUI (UI)
- AppKit (Menu bar, window management)
- Combine (Reactive programming)
- ScreenCaptureKit (Audio capture)
- Speech Framework (Transcription)
- Foundation Models (AI formatting)
See APPLE_APIS_GUIDE.md for Apple APIs reference.
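To illustrate how the MVVM and Combine pieces of this stack might fit together, here is a minimal sketch of a view model that turns a stream of transcripts into a caption for the overlay. The name `CaptionViewModel` and the `transcripts` publisher are assumptions made for this example; the actual types in the Voxly codebase may differ.

```swift
import Combine
import Foundation

/// Hypothetical view model: exposes the latest caption for a SwiftUI view.
final class CaptionViewModel: ObservableObject {
    @Published private(set) var caption: String = ""
    private var cancellables = Set<AnyCancellable>()

    /// `transcripts` is assumed to emit recognized text from the speech layer.
    init(transcripts: AnyPublisher<String, Never>) {
        transcripts
            .map { $0.trimmingCharacters(in: .whitespacesAndNewlines) }
            .filter { !$0.isEmpty }            // ignore empty updates
            .sink { [weak self] text in
                self?.caption = text           // triggers a view refresh
            }
            .store(in: &cancellables)
    }
}

// Usage: a PassthroughSubject stands in for the transcription service.
let subject = PassthroughSubject<String, Never>()
let viewModel = CaptionViewModel(transcripts: subject.eraseToAnyPublisher())
subject.send("  Hello from Voxly  ")
print(viewModel.caption)
```

In a real app the pipeline would likely also hop to the main thread (for example with `receive(on: DispatchQueue.main)`) before updating UI state; that step is left out here to keep the sketch synchronous.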
# Run tests in Xcode: ⌘U
# Run tests from the command line
xcodebuild test -scheme Voxly -destination 'platform=macOS'
- System audio capture
- Real-time speech-to-text
- AI caption formatting
- Floating overlay window
- Menu bar app
- Settings panel
- English (US/UK) support
- More languages (Spanish, Portuguese, French, German, Japanese)
- Save & export transcripts
- Caption history
- Improved AI formatting with context awareness
- Real-time translation
- Custom themes
- Integration with learning apps (Anki, etc.)
- Presentation mode (highlight technical terms)
- iOS/iPadOS companion app
- iCloud sync
- Collaborative captions
- Advanced analytics
Contributions are welcome! Please read CONTRIBUTING.md for details.
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'feat: add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Apple for providing amazing on-device ML frameworks
- The open-source community
- All beta testers and early adopters
- Issues: GitHub Issues
- GitHub: @iamjoaovytor
Voxly takes privacy seriously:
- All processing happens on-device
- No data is sent to external servers
- No tracking or analytics
- No network requests during operation
- Open source - audit the code yourself
Your audio never leaves your Mac. Period.
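For readers auditing the on-device claim, the sketch below shows one way a Speech framework request can be pinned to on-device recognition. It assumes the `SFSpeechRecognizer` family of APIs; whether Voxly uses exactly this configuration is not stated here, and the helper names `makeOnDeviceRequest` and `canRecognizeOffline` are hypothetical.

```swift
import Foundation
import Speech

/// Hypothetical helper: builds a streaming request that must stay on-device.
func makeOnDeviceRequest() -> SFSpeechAudioBufferRecognitionRequest {
    let request = SFSpeechAudioBufferRecognitionRequest()
    // Recognition fails rather than falling back to Apple's servers.
    request.requiresOnDeviceRecognition = true
    // Deliver partial results so captions can update while someone speaks.
    request.shouldReportPartialResults = true
    return request
}

/// Hypothetical helper: checks whether a locale has an on-device model.
func canRecognizeOffline(localeIdentifier: String = "en-US") -> Bool {
    guard let recognizer = SFSpeechRecognizer(locale: Locale(identifier: localeIdentifier)) else {
        return false  // locale not supported at all
    }
    return recognizer.supportsOnDeviceRecognition
}

// Usage: create the request, then append captured audio buffers to it.
let request = makeOnDeviceRequest()
print("On-device only:", request.requiresOnDeviceRecognition)
```

Setting `requiresOnDeviceRecognition` makes the offline guarantee explicit in code: if no local model is available, the request errors out instead of silently using the network.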
Other solutions upload your audio to cloud services, add noticeable latency, or charge a monthly fee. Voxly is:
- Free (open source)
- Fast (< 500ms latency)
- Private (100% offline)
- Native (optimized for Apple Silicon)
Built for language learners and accessibility.
Made by João Vitor Sousa