From 386318ea6d506a1c1a867d9a0e482761a2a6987b Mon Sep 17 00:00:00 2001
From: THUAUD Simon
Date: Tue, 3 Feb 2026 18:21:51 +0100
Subject: [PATCH] feat: better readme

---
 README.md | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/README.md b/README.md
index 8365b63..6e84b2a 100644
--- a/README.md
+++ b/README.md
@@ -2,7 +2,7 @@

 **AI-powered document data extraction toolkit**

-Extract structured data from documents (invoices, receipts, forms) using Claude's vision API. Easily integrate into your Python applications with flexible input options and built-in cost tracking.
+Extract structured data from documents (invoices, receipts, forms) using any supported provider. Easily integrate into your Python applications with flexible input options and built-in cost tracking.

 > ⚠️ **Early Development**: This project is in active development. Core functionality is working, but many features are still being built.

@@ -10,9 +10,9 @@ Extract structured data from documents (invoices, receipts, forms) using Claude'

 - ✅ **Vision API Integration**: Extract data from images (.jpg, .png, .gif, .webp)
 - ✅ **Flexible Input**: Accepts file paths, bytes, or file-like objects (like PIL, requests)
-- ✅ **Cost Tracking**: Built-in monitoring and limits for API usage
+- ✅ **Cost Tracking**: Built-in monitoring and limits for API usage (needs to be improved)
 - ✅ **Structured Output**: Returns Pydantic-validated data models that you can define
-- 🚧 **Multi-strategy Extraction**: Cost-optimized cascade to reduce api calls (planned)
+- ✅ **Providers**: Currently supports Anthropic, OpenAI and local with Ollama

 ## Quick Start

@@ -22,7 +22,7 @@ uv sync

 # Setup environment
 cp .env.template .env
-# Add your Anthropic API key to .env
+# Add your Anthropic or OpenAI API key to .env

 # Run a test
 uv run python example.py
@@ -69,11 +69,12 @@ make test-cov
 ## Requirements

 - Python 3.13
-- Anthropic API key (for Claude vision API)
+- Anthropic API key or OpenAI API key
+- Optional: Ollama for local model support

 ## Citation

-For testing and evaluation, we are using the following dataset:
+For testing and evaluation, we are currently using the following dataset:

 > Limam, M., et al. FATURA Dataset. Zenodo, 13 Dec. 2023, https://doi.org/10.5281/zenodo.10371464.