Excalidraw OCR

Extract text from handwritten images and Excalidraw drawings using AI vision models.

Quick Start

Using Docker (Recommended)

# Pull the pre-built image
docker pull ghcr.io/cloonix/excalidraw-ocr:latest

# Extract text from an image (using .env file)
docker run --rm -v ./data:/data \
  --env-file .env \
  ghcr.io/cloonix/excalidraw-ocr:latest \
  python ocr.py /data/image.png

# Or pass API key directly
docker run --rm -v ./data:/data \
  -e OPENAI_API_KEY=your_key_here \
  ghcr.io/cloonix/excalidraw-ocr:latest \
  python ocr.py /data/image.png

# Extract text from Excalidraw drawing
docker run --rm -v ./data:/data \
  --env-file .env \
  ghcr.io/cloonix/excalidraw-ocr:latest \
  python excalidraw_ocr.py /data/drawing.excalidraw.md

# Watch mode - automatically process new files
docker run -d --name ocr-watch \
  -v ./watch:/watch \
  --env-file .env \
  ghcr.io/cloonix/excalidraw-ocr:latest \
  python excalidraw_ocr.py /watch -w

Local Installation

# Install dependencies
pip install -r requirements.txt
npm install
./install_cairo.sh  # For Excalidraw support

# Configure API key
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY or OPENROUTER_API_KEY

# Run OCR
python ocr.py image.png
python ocr.py --clipboard  # From clipboard

# Run Excalidraw OCR
python excalidraw_ocr.py drawing.excalidraw.md
python excalidraw_ocr.py folder/ -w  # Watch mode

Features

📝 Extract text from handwritten images
🎨 Extract text from Excalidraw drawings
📋 Clipboard support (copy image → extract text → copy result)
👁️ Watch mode for continuous processing
🐳 Docker support with pre-built images
🔄 Supports OpenAI and OpenRouter APIs
💾 Smart caching to avoid reprocessing
🌍 Multi-platform: x86_64 and ARM64
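
This README does not describe how the cache decides what to skip. One common approach, shown here purely as an illustrative sketch, is to store a content hash per file and reprocess only when the hash changes. The helper names (`file_digest`, `needs_processing`, `mark_processed`) and the JSON cache file are assumptions and are not taken from this project's code; the `force` argument mirrors the documented `-f` flag.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of the file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def _load_cache(cache_file):
    """Read the JSON cache, or return an empty dict if it does not exist yet."""
    cache_path = Path(cache_file)
    return json.loads(cache_path.read_text()) if cache_path.exists() else {}


def needs_processing(path, cache_file, force=False):
    """True if `path` is new or changed since it was last recorded."""
    if force:  # comparable to running with -f
        return True
    return _load_cache(cache_file).get(str(path)) != file_digest(path)


def mark_processed(path, cache_file):
    """Record the current digest of `path` so later runs can skip it."""
    cache = _load_cache(cache_file)
    cache[str(path)] = file_digest(path)
    Path(cache_file).write_text(json.dumps(cache, indent=2))
```

Hashing the content rather than comparing modification times means a file that is saved again without changes is not sent to the API a second time.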

API Keys

Get an API key from:

OpenAI (recommended): https://platform.openai.com/api-keys
OpenRouter (alternative): https://openrouter.ai/keys

Set in .env file:

OPENAI_API_KEY=your_key_here
OPENAI_MODEL=gpt-4o

# OR

OPENROUTER_API_KEY=your_key_here
OPENROUTER_MODEL=google/gemini-flash-1.5
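
To make the two configurations concrete, here is a minimal sketch of how a script could choose a provider from these variables. `pick_provider` is a hypothetical helper, not a function from this project, and checking OpenAI before OpenRouter when both keys are set is an assumption. The default models are the ones listed under Recommended Models.

```python
import os

# Defaults as listed in the "Recommended Models" section of this README.
DEFAULT_MODELS = {
    "openai": "gpt-4o",
    "openrouter": "google/gemini-flash-1.5",
}


def pick_provider(env=None):
    """Return (provider, api_key, model) from environment-style settings.

    Hypothetical helper; trying OpenAI before OpenRouter is an assumption.
    """
    env = os.environ if env is None else env
    for provider in ("openai", "openrouter"):
        prefix = provider.upper()
        api_key = env.get(f"{prefix}_API_KEY", "").strip()
        if api_key:
            # Fall back to the documented default when no model is set.
            model = env.get(f"{prefix}_MODEL") or DEFAULT_MODELS[provider]
            return provider, api_key, model
    raise RuntimeError("Set OPENAI_API_KEY or OPENROUTER_API_KEY in .env")
```

Called without arguments it reads the process environment, so it works with both `--env-file .env` and `-e OPENAI_API_KEY=...` as used in the Docker examples.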

Docker Compose (Watch Mode)

The docker-compose.yml is configured for watch mode - continuously monitoring a folder for new Excalidraw files:

# Setup
make setup        # Creates directories and .env file

# Start watch mode
docker compose up -d
docker compose logs -f   # View logs
docker compose down      # Stop

# Or use make targets
make watch-start  # Start watch mode
make watch-logs   # View logs
make watch-stop   # Stop watch mode

For one-shot processing, use docker run directly (see Quick Start above).

Command Line Options

General OCR (`ocr.py`)

python ocr.py image.png                           # Basic usage
python ocr.py --clipboard                         # From clipboard
python ocr.py image.png -o output.txt             # Save to file
python ocr.py image.png -m anthropic/claude-3.5-sonnet  # Use specific model
python ocr.py --list-models                       # Show available models
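
The options above handle one image at a time. If you want to process a whole folder, a small wrapper along these lines may help. It is a sketch that uses only the documented `-o` flag; the helper names and the set of image extensions are assumptions, since this README only shows `.png` input.

```python
import subprocess
import sys
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}  # assumed set of input formats


def build_commands(folder, script="ocr.py"):
    """Build one `ocr.py IMAGE -o IMAGE.txt` command per image in `folder`."""
    commands = []
    for image in sorted(Path(folder).iterdir()):
        if image.suffix.lower() in IMAGE_SUFFIXES:
            output = image.with_suffix(".txt")
            commands.append(
                [sys.executable, script, str(image), "-o", str(output)]
            )
    return commands


def run_batch(folder):
    """Run the commands one by one, stopping at the first failure."""
    for command in build_commands(folder):
        subprocess.run(command, check=True)
```

Building the command list separately from running it lets you print and review the commands before any API calls are made.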

Excalidraw OCR (`excalidraw_ocr.py`)

python excalidraw_ocr.py drawing.excalidraw.md    # Basic usage (auto-saves as drawing.md)
python excalidraw_ocr.py drawing.excalidraw.md -o output.txt  # Custom output
python excalidraw_ocr.py drawing.excalidraw.md -c # Copy to clipboard
python excalidraw_ocr.py folder/ -w               # Watch mode (15 min delay by default)
python excalidraw_ocr.py folder/ -w --no-delay    # Watch mode (immediate processing)
python excalidraw_ocr.py folder/ -w --delay 30    # Watch mode (30 min delay)
python excalidraw_ocr.py drawing.excalidraw.md -f # Force reprocess (ignore cache)

Watch mode stabilization delay: By default, watch mode waits 15 minutes after the last file modification before processing. This prevents processing files that are being actively edited (e.g., during meetings). Use --no-delay for immediate processing or --delay MINUTES to customize.
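
The stabilization rule can be pictured as a check on the file's modification time. The following is a minimal sketch of that idea, not the project's actual implementation; `is_stable` and its parameters are hypothetical names.

```python
import os
import time


def is_stable(path, delay_minutes=15, now=None):
    """Return True if `path` was last modified at least `delay_minutes` ago."""
    now = time.time() if now is None else now
    age_seconds = now - os.path.getmtime(path)
    return age_seconds >= delay_minutes * 60
```

With `delay_minutes=0` every existing file counts as stable, which corresponds to `--no-delay`. A watcher would call this periodically and process a file only once it returns True.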

Recommended Models

Fast & Cheap:

google/gemini-flash-1.5 (default for OpenRouter)
gpt-4o-mini (OpenAI)

High Quality:

gpt-4o (default for OpenAI)
anthropic/claude-3.5-sonnet

Troubleshooting

"OPENAI_API_KEY not found"

Create a .env file that sets OPENAI_API_KEY (or OPENROUTER_API_KEY); see API Keys above.

"cairosvg not available" (Excalidraw only)

Run ./install_cairo.sh
Or install manually: brew install cairo pkg-config (macOS) or sudo apt-get install libcairo2-dev pkg-config python3-dev (Ubuntu)
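
To see whether the Python package or the native Cairo library is the missing piece, you can try the import yourself. This is a diagnostic sketch with a hypothetical helper, `can_import`; it assumes that a missing native library surfaces as an `OSError` at import time, which is typical for bindings loaded this way.

```python
import importlib


def can_import(module_name):
    """Return (ok, detail); ok is False if the module or a native library is missing."""
    try:
        importlib.import_module(module_name)
    except (ImportError, OSError) as exc:
        # ImportError: the Python package is not installed.
        # OSError: the package is installed but a shared library failed to load.
        return False, f"{type(exc).__name__}: {exc}"
    return True, "ok"


if __name__ == "__main__":
    ok, detail = can_import("cairosvg")
    print("cairosvg available" if ok else f"cairosvg not available ({detail})")
```

A `ModuleNotFoundError` points to a missing pip package, while an `OSError` suggests the system Cairo library still needs to be installed.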

"No text extracted"

Try a higher-quality model: --model anthropic/claude-3.5-sonnet
Check the image quality (resolution, contrast, legibility)
Verify that your API account has remaining credits

License

MIT License - See LICENSE

Contributing

Issues and pull requests welcome!
