Extract text from handwritten images and Excalidraw drawings using AI vision models.
# Pull the pre-built image
docker pull ghcr.io/cloonix/excalidraw-ocr:latest
# Extract text from an image (using .env file)
docker run --rm -v ./data:/data \
--env-file .env \
ghcr.io/cloonix/excalidraw-ocr:latest \
python ocr.py /data/image.png
# Or pass API key directly
docker run --rm -v ./data:/data \
-e OPENAI_API_KEY=your_key_here \
ghcr.io/cloonix/excalidraw-ocr:latest \
python ocr.py /data/image.png
# Extract text from Excalidraw drawing
docker run --rm -v ./data:/data \
--env-file .env \
ghcr.io/cloonix/excalidraw-ocr:latest \
python excalidraw_ocr.py /data/drawing.excalidraw.md
# Watch mode - automatically process new files
docker run -d --name ocr-watch \
-v ./watch:/watch \
--env-file .env \
ghcr.io/cloonix/excalidraw-ocr:latest \
python excalidraw_ocr.py /watch -w
# Install dependencies
pip install -r requirements.txt
npm install
./install_cairo.sh # For Excalidraw support
# Configure API key
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY or OPENROUTER_API_KEY
# Run OCR
python ocr.py image.png
python ocr.py --clipboard # From clipboard
# Run Excalidraw OCR
python excalidraw_ocr.py drawing.excalidraw.md
python excalidraw_ocr.py folder/ -w # Watch mode
- 📝 Extract text from handwritten images
- 🎨 Extract text from Excalidraw drawings
- 📋 Clipboard support (copy image → extract text → copy result)
- 👁️ Watch mode for continuous processing
- 🐳 Docker support with pre-built images
- 🔄 Supports OpenAI and OpenRouter APIs
- 💾 Smart caching to avoid reprocessing
- 🌍 Multi-platform: x86_64 and ARM64
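The caching feature can be pictured with a minimal sketch. It assumes a content-hash cache stored in a JSON file; the function name `needs_processing`, the cache layout, and the use of SHA-256 are all hypothetical and may differ from what the project actually does.

```python
import hashlib
import json
from pathlib import Path


def needs_processing(source: Path, cache_file: Path, force: bool = False) -> bool:
    """Return True if `source` changed since it was last recorded, or if forced.

    Hypothetical helper: the cache is a JSON object mapping file paths to
    SHA-256 digests of their contents.
    """
    digest = hashlib.sha256(source.read_bytes()).hexdigest()
    cache = json.loads(cache_file.read_text()) if cache_file.exists() else {}
    if not force and cache.get(str(source)) == digest:
        return False  # contents unchanged since the last recorded run
    # Record the new digest right away to keep the sketch short; a real tool
    # would probably do this only after OCR has succeeded.
    cache[str(source)] = digest
    cache_file.write_text(json.dumps(cache, indent=2))
    return True
```

Passing `force=True` mirrors the idea behind the `-f` flag: the cached digest is ignored and the file is treated as new.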
Get an API key from:
- OpenAI (recommended): https://platform.openai.com/api-keys
- OpenRouter (alternative): https://openrouter.ai/keys
Set in .env file:
OPENAI_API_KEY=your_key_here
OPENAI_MODEL=gpt-4o
# OR
OPENROUTER_API_KEY=your_key_here
OPENROUTER_MODEL=google/gemini-flash-1.5
The docker-compose.yml is configured for watch mode, continuously monitoring a folder for new Excalidraw files:
# Setup
make setup # Creates directories and .env file
# Start watch mode
docker compose up -d
docker compose logs -f # View logs
docker compose down # Stop
# Or use make targets
make watch-start # Start watch mode
make watch-logs # View logs
make watch-stop # Stop watch mode
For one-shot processing, use docker run directly (see Quick Start above).
python ocr.py image.png # Basic usage
python ocr.py --clipboard # From clipboard
python ocr.py image.png -o output.txt # Save to file
python ocr.py image.png -m anthropic/claude-3.5-sonnet # Use specific model
python ocr.py --list-models # Show available models
python excalidraw_ocr.py drawing.excalidraw.md # Basic usage (auto-saves as drawing.md)
python excalidraw_ocr.py drawing.excalidraw.md -o output.txt # Custom output
python excalidraw_ocr.py drawing.excalidraw.md -c # Copy to clipboard
python excalidraw_ocr.py folder/ -w # Watch mode (15 min delay by default)
python excalidraw_ocr.py folder/ -w --no-delay # Watch mode (immediate processing)
python excalidraw_ocr.py folder/ -w --delay 30 # Watch mode (30 min delay)
python excalidraw_ocr.py drawing.excalidraw.md -f # Force reprocess (ignore cache)
Watch mode stabilization delay: By default, watch mode waits 15 minutes after the last file modification before processing. This prevents processing files that are being actively edited (e.g., during meetings). Use --no-delay for immediate processing or --delay MINUTES to customize.
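The stabilization delay amounts to a check on the file's last-modified time. The following is a minimal sketch of that idea, not the project's actual implementation; the function name `is_stable` and its parameters are hypothetical.

```python
import os
import time
from typing import Optional


def is_stable(path: str, delay_minutes: float = 15, now: Optional[float] = None) -> bool:
    """Return True once `path` has gone unmodified for at least `delay_minutes`.

    Hypothetical helper: `now` is injectable so the check can be tested
    without waiting; it defaults to the current time.
    """
    if now is None:
        now = time.time()
    age_seconds = now - os.path.getmtime(path)
    return age_seconds >= delay_minutes * 60
```

In this sketch, `delay_minutes=0` corresponds to immediate processing (the `--no-delay` case) and `delay_minutes=30` to `--delay 30`.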
Fast & Cheap:
- google/gemini-flash-1.5 (default for OpenRouter)
- gpt-4o-mini (OpenAI)
High Quality:
- gpt-4o (default for OpenAI)
- anthropic/claude-3.5-sonnet
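One way to picture how the API key and model settings interact is the sketch below. It uses the environment variable names and default models listed above, but the function `pick_provider`, the order in which providers are checked (OpenAI first), and the error message are assumptions, not the project's documented behavior.

```python
import os
from typing import Mapping, Tuple

# provider -> (API key variable, model variable, default model)
# Variable names and defaults come from this README; checking OpenAI
# before OpenRouter is an assumption.
DEFAULTS = {
    "openai": ("OPENAI_API_KEY", "OPENAI_MODEL", "gpt-4o"),
    "openrouter": ("OPENROUTER_API_KEY", "OPENROUTER_MODEL", "google/gemini-flash-1.5"),
}


def pick_provider(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (provider, model) for the first provider that has an API key set."""
    for provider, (key_var, model_var, default_model) in DEFAULTS.items():
        if env.get(key_var):
            # Fall back to the provider's default when no model is configured.
            return provider, env.get(model_var) or default_model
    raise RuntimeError("No OPENAI_API_KEY or OPENROUTER_API_KEY found")
```

For example, `pick_provider({"OPENROUTER_API_KEY": "k"})` yields `("openrouter", "google/gemini-flash-1.5")` in this sketch.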
"OPENAI_API_KEY not found"
- Create a .env file with your API key
"cairosvg not available" (Excalidraw only)
- Run ./install_cairo.sh
- Or install manually: brew install cairo pkg-config (macOS) or sudo apt-get install libcairo2-dev pkg-config python3-dev (Ubuntu)
"No text extracted"
- Try a better model: --model anthropic/claude-3.5-sonnet
- Check image quality
- Verify API credits
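For the cairosvg error in particular, a quick check from Python can show whether the package and the native Cairo library both load. This is a small diagnostic sketch; the helper name `cairosvg_available` is hypothetical and not part of the project.

```python
def cairosvg_available() -> bool:
    """Return True if cairosvg and the native Cairo library both load."""
    try:
        import cairosvg  # noqa: F401
    except (ImportError, OSError):
        # ImportError: the Python package is not installed.
        # OSError: the package is installed, but the native Cairo
        # library could not be loaded.
        return False
    return True
```

If this returns False after `pip install -r requirements.txt`, the native library is the likely culprit, and the install steps above apply.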
MIT License - See LICENSE
Issues and pull requests welcome!