Docsray is a powerful Model Context Protocol (MCP) server that gives AI assistants like Claude advanced document perception capabilities. Extract text, navigate pages, analyze structure, and understand any document with ease.
Status: Published to PyPI and TestPyPI. Works in Cursor, Claude Desktop, and other MCP clients.
- docsray_peek - Quick document overview with format detection and provider capabilities
- docsray_map - Generate comprehensive document structure maps with caching
- docsray_xray - AI-powered deep analysis extracting entities, relationships, and insights
- docsray_extract - Extract content in multiple formats (markdown, text, JSON, tables)
- docsray_seek - Navigate to specific pages, sections, or search for content
- docsray_fetch - Unified document retrieval from web URLs or filesystem with caching
- docsray_search - Intelligent filesystem search using coarse-to-fine methodology
- PyMuPDF4LLM - Lightning-fast PDF processing (Implemented)
  - Fast markdown extraction
  - Basic table detection
  - Multi-page support
  - Always enabled as fallback
- LlamaParse - Deep document understanding with LLMs (Implemented)
  - AI-powered entity extraction
  - Custom analysis instructions
  - Comprehensive caching in .docsray directories
  - Rich format preservation (markdown, images, tables)
- IBM.Docling - Advanced document understanding (Implemented)
  - Best-in-class layout understanding
  - Visual Language Model integration
  - Advanced table and figure detection
  - Multi-format support (PDF, DOCX, HTML, images)
  - Reading order preservation
  - Structured extraction capabilities
- MIMIC.DocsRay - Coarse-to-fine search methodology (Implemented)
  - Semantic search with RAG capabilities
  - Hybrid OCR engine (AI + traditional)
  - Document chunking and embedding
  - Multimodal analysis
  - Filesystem search optimization
  - Context-aware analysis
- PyTesseract - OCR for scanned documents (Planned)
- Mistral OCR - AI-powered OCR and analysis (Planned)
- Universal Input Support - Local files (./path, ../path, /absolute) and URLs (https://)
- Intelligent Provider Selection - Automatically chooses the best tool for each task
- Smart Caching - LlamaParse results cached in .docsray directories for instant access
- Dynamic Discovery - Tools report actual capabilities based on what's enabled
- Production Ready - Comprehensive error handling, logging, and 56 tests
- Self-Documenting - Built-in resources for discovery by MCP clients
# Run directly without installation
uvx docsray-mcp start
# Or install globally
uv tool install docsray-mcp
# Then run with:
docsray start
# or
docsray-mcp start
# Basic installation (PyMuPDF4LLM only)
pip install docsray-mcp
# With LlamaParse for AI analysis
pip install "docsray-mcp[ai]"
# Development installation
pip install -e ".[dev]"
# Pull from Docker Hub
docker pull xingh/docsray-mcp:latest
# Run in stdio mode
docker run -it --rm xingh/docsray-mcp:latest
# Run in HTTP mode
docker run -it --rm -p 3000:3000 -e DOCSRAY_TRANSPORT=http xingh/docsray-mcp:latest
# Pull from GHCR
docker pull ghcr.io/xingh/docsray-mcp:latest
# Run (same commands as above, just different image)
docker run -it --rm ghcr.io/xingh/docsray-mcp:latest
Available Tags:
- latest - Latest stable release
- 0.6.0 - Specific version
- dev - Development builds from main branch
Development with VS Code DevContainer:
- Install the "Dev Containers" extension
- Open project in VS Code
- Click "Reopen in Container"
- Includes Claude Desktop pre-configured!
See Docker Guide for complete documentation.
Create a .env file in your project:
# For AI-powered analysis with LlamaParse
# Either use the Docsray-specific env var (preferred):
DOCSRAY_LLAMAPARSE_API_KEY=llx-your-key-here
# Or use the standard LlamaParse env var (also supported):
# LLAMAPARSE_API_KEY=llx-your-key-here
# Note: DOCSRAY_LLAMAPARSE_API_KEY takes precedence if both are set
# Or use environment variables
export DOCSRAY_LLAMAPARSE_API_KEY=llx-your-key-here
# export LLAMAPARSE_API_KEY=llx-your-key-here # Alternative
Get your free LlamaParse API key at cloud.llamaindex.ai
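The precedence rule stated above (DOCSRAY_LLAMAPARSE_API_KEY wins when both variables are set) can be sketched in a few lines of Python. This is an illustration of the documented behavior, not Docsray's actual code: the function name `resolve_llamaparse_key` is hypothetical, and treating an empty value as unset is an assumption.

```python
import os
from typing import Mapping, Optional

# Variable names in priority order, as described in the configuration notes.
KEY_VARIABLES = ("DOCSRAY_LLAMAPARSE_API_KEY", "LLAMAPARSE_API_KEY")


def resolve_llamaparse_key(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first non-empty key, preferring the Docsray-specific variable.

    Hypothetical helper: empty or whitespace-only values are treated as unset,
    which is an assumption rather than documented behavior.
    """
    for name in KEY_VARIABLES:
        value = env.get(name, "").strip()
        if value:
            return value
    return None
```

Passing a plain dictionary instead of `os.environ` makes it easy to check which key would be used for a given combination of variables.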
Add to your Cursor settings:
{
  "mcpServers": {
    "docsray": {
      "command": "uvx",
      "args": ["docsray-mcp"],
      "env": {
        "LLAMAPARSE_API_KEY": "llx-your-key-here"
      }
    }
  }
}
Note: You can use either LLAMAPARSE_API_KEY (shown above) or DOCSRAY_LLAMAPARSE_API_KEY in the MCP client configuration.
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "docsray": {
      "command": "uvx",
      "args": ["docsray-mcp"],
      "env": {
        "LLAMAPARSE_API_KEY": "llx-your-key-here"
      }
    }
  }
}
Note: You can use either LLAMAPARSE_API_KEY (shown above) or DOCSRAY_LLAMAPARSE_API_KEY in the MCP client configuration.
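If the client configuration file already lists other MCP servers, the docsray entry has to be merged in rather than pasted over the file. The following is a minimal sketch of that merge using only the standard library. The helper name `add_docsray_server` is hypothetical, and the entry it writes simply mirrors the JSON shown above.

```python
import json
from pathlib import Path


def add_docsray_server(config_path: Path, api_key: str) -> dict:
    """Merge a docsray entry into an MCP client config, keeping other servers.

    Hypothetical helper: assumes the file, if present, contains valid JSON
    with an optional top-level "mcpServers" object.
    """
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["docsray"] = {
        "command": "uvx",
        "args": ["docsray-mcp"],
        "env": {"LLAMAPARSE_API_KEY": api_key},
    }
    # Create the parent directory on first use, then write the merged config.
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For Claude Desktop on macOS, `config_path` would be `Path.home() / "Library/Application Support/Claude/claude_desktop_config.json"`. Restart the client after editing so it picks up the new server.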
Peek at ./document.pdf to see its structure and available formats
Xray ./contract.pdf and extract all parties, dates, payment terms, and obligations
Map the complete structure of ./manual.pdf including all sections and subsections
Extract pages 10-20 from ./report.pdf as markdown
Analyze https://arxiv.org/pdf/2301.00234.pdf for methodology and key findings
Fetch https://example.com/document.pdf with processed format
Fetch ./local/document.pdf with metadata-only format
Search for "machine learning" in ./research/ with coarse-to-fine strategy
Find documents about "contracts" in /legal/ using semantic search
Extract text from document.pdf with provider pymupdf4llm (fast)
Xray document.pdf with provider llama-parse (AI analysis)
Analyze document.pdf with provider ibm-docling (advanced layout)
Search documents with provider mimic-docsray (semantic)
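Behind a prompt such as "Extract pages 10-20 from ./report.pdf as markdown", the MCP client sends a JSON-RPC 2.0 `tools/call` request to the server. The sketch below builds such a request by hand to show its shape. The argument names follow the docsray_extract parameters documented in this README. How a client maps the prompt to arguments is decided by the model, so the specific values here (including an explicit, 1-based page list) are assumptions, and `build_tool_call` is a hypothetical helper.

```python
import json
from itertools import count

# Monotonically increasing request ids, as JSON-RPC expects unique ids.
_request_ids = count(1)


def build_tool_call(tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request (hypothetical helper)."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(request)


# Assumed mapping of the example prompt onto docsray_extract arguments.
message = build_tool_call(
    "docsray_extract",
    {
        "document_url": "./report.pdf",
        "output_format": "markdown",
        "pages": list(range(10, 21)),  # pages 10 through 20 inclusive
    },
)
```

In practice an MCP client library handles this framing; the sketch only makes visible what travels over the stdio or HTTP transport.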
# Provider Configuration
DOCSRAY_PYMUPDF4LLM_ENABLED=true # Always true by default
DOCSRAY_LLAMAPARSE_ENABLED=true
LLAMAPARSE_API_KEY=llx-your-key
# IBM.Docling Provider
DOCSRAY_IBM_DOCLING_ENABLED=false
DOCSRAY_IBM_DOCLING_USE_VLM=true
DOCSRAY_IBM_DOCLING_USE_ASR=false
DOCSRAY_IBM_DOCLING_OCR_ENABLED=true
DOCSRAY_IBM_DOCLING_TABLE_DETECTION=true
DOCSRAY_IBM_DOCLING_FIGURE_DETECTION=true
DOCSRAY_IBM_DOCLING_DEVICE=cpu # or cuda
# MIMIC.DocsRay Provider
DOCSRAY_MIMIC_ENABLED=false
DOCSRAY_MIMIC_RAG_ENABLED=true
DOCSRAY_MIMIC_SEMANTIC_RANKING=true
DOCSRAY_MIMIC_MULTIMODAL=true
DOCSRAY_MIMIC_HYBRID_OCR=true
DOCSRAY_MIMIC_COARSE_TO_FINE=true
DOCSRAY_MIMIC_CHUNK_SIZE=1000
DOCSRAY_MIMIC_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
# Performance Tuning
DOCSRAY_CACHE_ENABLED=true
DOCSRAY_CACHE_TTL=3600
DOCSRAY_MAX_CONCURRENT_REQUESTS=5
DOCSRAY_TIMEOUT_SECONDS=30
# Logging
DOCSRAY_LOG_LEVEL=INFO
PyMuPDF4LLM:
- Fast text extraction
- Markdown formatting
- Basic table detection
- Multi-page support
- No AI analysis
- No OCR
LlamaParse:
- AI-powered analysis
- Entity extraction
- Custom instructions
- Table extraction
- Image extraction
- Layout preservation
- Relationship mapping
- Result caching
IBM.Docling:
- Advanced layout understanding
- Visual Language Model integration
- Best-in-class table detection
- Figure classification and understanding
- Multi-format support (PDF, DOCX, HTML, images)
- Reading order preservation
- Structured information extraction
- Document classification
- OCR with layout understanding
- Form field detection
- Multi-language support
MIMIC.DocsRay:
- Coarse-to-fine search methodology
- Semantic search with RAG
- Document chunking and embedding
- Hybrid OCR (AI + traditional)
- Multimodal analysis
- Context-aware analysis
- Filesystem search optimization
- Semantic ranking
- Entity extraction
- Relationship mapping
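To make the idea of automatic provider selection concrete, the capability lists above and the DOCSRAY_*_ENABLED flags from the configuration block can be combined into a small lookup. This is only a sketch under stated assumptions: the capability labels, the preference order, and the function names are all hypothetical, and Docsray's real selection logic may differ. The flag defaults mirror the values shown in the configuration block.

```python
from typing import Mapping, Optional, Set

# Illustrative capability labels condensed from the lists above (hypothetical).
CAPABILITIES = {
    "pymupdf4llm": {"text", "markdown", "tables"},
    "llama-parse": {"entities", "relationships", "custom-instructions",
                    "tables", "images", "layout"},
    "ibm-docling": {"tables", "figures", "ocr", "forms", "layout"},
    "mimic-docsray": {"semantic-search", "ocr", "multimodal",
                      "entities", "relationships"},
}

# Enable flag and default per provider, mirroring the configuration block.
ENABLE_FLAGS = {
    "pymupdf4llm": ("DOCSRAY_PYMUPDF4LLM_ENABLED", "true"),
    "llama-parse": ("DOCSRAY_LLAMAPARSE_ENABLED", "true"),
    "ibm-docling": ("DOCSRAY_IBM_DOCLING_ENABLED", "false"),
    "mimic-docsray": ("DOCSRAY_MIMIC_ENABLED", "false"),
}

# Assumed preference: local and fast providers before API-backed ones.
PREFERENCE = ["pymupdf4llm", "ibm-docling", "mimic-docsray", "llama-parse"]


def is_enabled(provider: str, env: Mapping[str, str]) -> bool:
    """Read a provider's enable flag, falling back to its default."""
    name, default = ENABLE_FLAGS[provider]
    return env.get(name, default).strip().lower() == "true"


def pick_provider(needs: Set[str], env: Mapping[str, str]) -> Optional[str]:
    """Return the first enabled provider covering every requested capability."""
    for provider in PREFERENCE:
        if is_enabled(provider, env) and needs <= CAPABILITIES[provider]:
            return provider
    return None
```

With the default flags, a request that only needs tables would resolve to pymupdf4llm, while one that needs entity extraction would fall through to llama-parse; a request for OCR returns `None` until IBM.Docling or MIMIC.DocsRay is enabled.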
# Run all tests
pytest tests/
# Run only unit tests (no API calls)
pytest tests/unit/
# Run integration tests
pytest tests/integration/
# Run with coverage
pytest tests/ --cov=src/docsray --cov-report=html
Current test coverage: 52 tests passing with comprehensive coverage across all components
docsray_peek: Get a quick document overview and metadata.
{
  "document_url": "path/to/document.pdf",
  "depth": "structure",      # metadata | structure | preview
  "provider": "auto"         # auto | pymupdf4llm | llama-parse
}
docsray_map: Generate a comprehensive document structure map.
{
  "document_url": "path/to/document.pdf",
  "include_content": false,
  "analysis_depth": "deep",  # basic | deep | comprehensive
  "provider": "auto"
}
docsray_xray: Deep AI-powered document analysis.
{
  "document_url": "path/to/document.pdf",
  "analysis_type": ["entities", "key-points"],
  "custom_instructions": "Extract all dates and amounts",
  "provider": "llama-parse"
}
docsray_extract: Extract content in various formats.
{
  "document_url": "path/to/document.pdf",
  "extraction_targets": ["text", "tables"],
  "output_format": "markdown",  # markdown | text | json
  "pages": [1, 2, 3],           # Optional: specific pages
  "provider": "auto"
}
docsray_seek: Navigate to specific document locations.
{
  "document_url": "path/to/document.pdf",
  "target": {"page": 5},  # or {"section": "Introduction"} or {"query": "search text"}
  "extract_content": true,
  "provider": "auto"
}
docsray_fetch: Unified document retrieval from web URLs or filesystem.
{
  "source": "https://example.com/doc.pdf",  # or "./local/path.pdf"
  "fetch_options": {"timeout": 30000, "headers": {}},
  "cache_strategy": "use-cache",            # use-cache | bypass-cache | refresh-cache
  "return_format": "processed",             # raw | processed | metadata-only
  "provider": "auto"
}
docsray_search: Intelligent filesystem search with coarse-to-fine methodology.
{
  "query": "machine learning algorithms",
  "searchPath": "./research/",
  "searchStrategy": "coarse-to-fine",  # coarse-to-fine | semantic | keyword | hybrid
  "fileTypes": ["pdf", "docx", "md"],
  "maxResults": 10,
  "provider": "mimic-docsray"
}
docsray-mcp/
├── src/docsray/
│   ├── server.py               # FastMCP server with discovery resources
│   ├── providers/              # Provider implementations
│   │   ├── base.py             # Provider interface
│   │   ├── pymupdf4llm.py      # Fast PDF extraction
│   │   └── llamaparse.py       # AI-powered analysis
│   ├── tools/                  # MCP tool implementations
│   │   ├── peek.py             # Document overview
│   │   ├── map.py              # Structure mapping
│   │   ├── xray.py             # Deep analysis
│   │   ├── extract.py          # Content extraction
│   │   └── seek.py             # Navigation
│   └── utils/                  # Utilities
│       ├── cache.py            # Document caching
│       └── llamaparse_cache.py # LlamaParse .docsray cache
├── tests/
│   ├── unit/                   # Fast isolated tests
│   ├── integration/            # Component interaction tests
│   └── manual/                 # Debugging scripts
└── PROMPTS.md                  # Example prompts for all use cases
We welcome contributions! See CONTRIBUTING.md for guidelines.
# Clone the repository
git clone https://github.com/docsray/docsray-mcp.git
cd docsray-mcp
# Install in development mode
pip install -e ".[dev]"
# Run tests
pytest tests/
# Run linting
ruff check src/
This project is licensed under the Apache License 2.0. See the LICENSE file for details.
- Built on FastMCP framework
- Document processing powered by PyMuPDF4LLM
- AI analysis powered by LlamaParse
- Inspired by the Model Context Protocol specification
- Documentation
- Issue Tracker
- Discussions
Made with ❤️ for the MCP ecosystem