Menu Text Detection System

Extract structured menu information from images into JSON using either a fine-tuned end-to-end (E2E) model or an LLM.

demo.mp4

🚀 Features

Overview

Currently extracts the following information from menu images:

Restaurant Name
Business Hours
Address
Phone Number
Dish Information
- Name
- Price

For the JSON schema, see the tools directory.
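As a rough illustration of the fields listed above, the extracted result could be modeled like this. This is only a sketch: the class names and keys (`MenuInfo`, `Dish`, `restaurant_name`, `dishes`, and so on) are assumptions made for this example, and the actual schema in the tools directory may differ.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Dish:
    # Hypothetical field names; the real schema may use different keys.
    name: str
    price: float


@dataclass
class MenuInfo:
    # One attribute per item in the overview list above.
    restaurant_name: str = ""
    business_hours: str = ""
    address: str = ""
    phone_number: str = ""
    dishes: list[Dish] = field(default_factory=list)


def menu_to_json(menu: MenuInfo) -> str:
    """Serialize the menu, including nested dishes, to a JSON string."""
    return json.dumps(asdict(menu), ensure_ascii=False, indent=2)


example = MenuInfo(
    restaurant_name="Example Cafe",
    dishes=[Dish(name="Latte", price=4.5)],
)
print(menu_to_json(example))
```

Fields that are not visible on a menu image stay as empty strings here, so downstream code always sees the same keys.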

Supported Methods to Extract Menu Information

Fine-tuned E2E model and Training metrics

Donut (Document Parsing Task) - Base model by Clova AI (ECCV ’22)

LLM Function Calling

Google Gemini API
OpenAI GPT API
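To make the function-calling approach concrete, here is a minimal sketch of a tool definition in the shape accepted by the OpenAI Chat Completions `tools` parameter, plus a helper that parses the JSON arguments the model returns. The function name `extract_menu_data`, the property names, and the helper `parse_tool_arguments` are hypothetical and not taken from this repository; no API call is made, and the model reply is simulated.

```python
import json

# Hypothetical tool definition; names and properties are assumptions.
EXTRACT_MENU_TOOL = {
    "type": "function",
    "function": {
        "name": "extract_menu_data",
        "description": "Extract structured information from a menu image.",
        "parameters": {
            "type": "object",
            "properties": {
                "restaurant_name": {"type": "string"},
                "business_hours": {"type": "string"},
                "address": {"type": "string"},
                "phone_number": {"type": "string"},
                "dishes": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "name": {"type": "string"},
                            "price": {"type": "number"},
                        },
                        "required": ["name", "price"],
                    },
                },
            },
            "required": ["dishes"],
        },
    },
}


def parse_tool_arguments(arguments: str) -> dict:
    """Decode the tool-call arguments string and check required keys."""
    data = json.loads(arguments)
    required = EXTRACT_MENU_TOOL["function"]["parameters"]["required"]
    missing = [key for key in required if key not in data]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return data


# Simulated arguments string, standing in for a real model reply.
simulated = '{"restaurant_name": "Example Cafe", "dishes": [{"name": "Latte", "price": 4.5}]}'
menu = parse_tool_arguments(simulated)
print(menu["dishes"][0]["name"])
```

Describing the output as a function signature lets the model return JSON that follows a fixed structure instead of free-form text, which is easier to validate before use.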

💻 Training / Fine-Tuning

Setup

Use uv to set up the development environment:

uv sync

If uv sync fails, fall back to pip install -r requirements.txt.

Training Script (Datasets collecting, Fine-Tuning)

Please refer to train.ipynb, which covers dataset collection and fine-tuning. Start Jupyter Notebook to run it:

uv run jupyter-notebook

VS Code users should install the Jupyter extension, then select .venv/bin/python as the kernel.

Run Demo Locally

uv run python app.py
