OCR-EVAL

Extracting tables from images is a challenging task that involves not just recognizing the text (OCR), but also accurately reconstructing the table's structure — including rows, columns, headers, merged cells, and nested sub-tables. Traditional OCR systems often struggle with preserving this spatial layout, especially when tables have complex formats or multiple nested structures.

To address this, the project evaluates how well various Large Language Models (LLMs) extract Markdown-formatted tables from table images. The workflow includes:

  • Sending table images to different LLMs using a variety of carefully designed prompts.
  • Receiving and parsing the Markdown table outputs generated by the models.
  • Comparing these outputs with ground-truth Markdown representations of the original tables.
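A Promptfoo run along these lines pairs a prompt with several providers and test cases. The snippet below is only an illustrative sketch, not the repository's actual configuration: the model IDs, image path, and prompt text are placeholders.

```yaml
# Illustrative promptfooconfig.yaml sketch (placeholder values).
prompts:
  - "Extract the table in this image as a GitHub-flavored Markdown table. Output only the table."

providers:
  - openrouter:openai/gpt-4o          # placeholder model IDs
  - openrouter:anthropic/claude-3.5-sonnet

tests:
  - vars:
      image: file://images/table1.png  # placeholder path to a table image
    assert:
      - type: contains                 # minimal sanity check on the output
        value: "|"
```

The real config in the repository (promptfooconfig.yaml) defines the actual prompts, models, and ground-truth comparisons used in the evaluation.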

This evaluation is automated and powered by Promptfoo, a framework for benchmarking LLM responses. The models are assessed based on three key criteria:

  1. Content Accuracy – Whether the correct cell values are extracted.
  2. Positional Correctness – Whether values appear in the right row and column.
  3. Structural Integrity – Whether the layout of the table (rows and columns) matches the ground truth.
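The three criteria above can be sketched as a simple cell-level comparison between a model's Markdown output and the ground truth. This is a minimal illustration of the idea, not the repository's actual scoring code; the parsing is naive (it treats any row of dashes and colons as the header separator) and the function names are invented for this example.

```python
def parse_markdown_table(md: str) -> list[list[str]]:
    """Parse a Markdown table into rows of cell strings (naive sketch)."""
    rows = []
    for line in md.strip().splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [c.strip() for c in line.strip("|").split("|")]
        # Skip the header separator row, e.g. |---|:---:|
        if all(c and set(c) <= set("-: ") for c in cells):
            continue
        rows.append(cells)
    return rows


def score_tables(predicted: str, truth: str) -> dict:
    """Score a predicted table against ground truth on the three criteria."""
    pred = parse_markdown_table(predicted)
    gold = parse_markdown_table(truth)
    gold_cells = [c for row in gold for c in row]
    pred_cells = [c for row in pred for c in row]
    # Content accuracy: fraction of ground-truth values found anywhere.
    content = sum(c in pred_cells for c in gold_cells) / max(len(gold_cells), 1)
    # Positional correctness: fraction of cells matching at the same (row, col).
    hits = sum(
        1
        for i, row in enumerate(gold)
        for j, cell in enumerate(row)
        if i < len(pred) and j < len(pred[i]) and pred[i][j] == cell
    )
    positional = hits / max(len(gold_cells), 1)
    # Structural integrity: same row count and per-row column counts.
    structure = len(pred) == len(gold) and all(
        len(p) == len(g) for p, g in zip(pred, gold)
    )
    return {"content": content, "positional": positional, "structure": structure}
```

For example, a prediction with the right values in swapped positions scores full content accuracy but only partial positional correctness.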

By leveraging Promptfoo, this project provides insights into how well different LLMs understand and reconstruct table data from images — a crucial step in reliable OCR-based data extraction pipelines.

Setup

To run the evaluation, first get an OpenRouter API key, then run:

git clone https://github.com/Nitin399-maker/Table_OCR.git
cd Table_OCR
$env:OPENROUTER_API_KEY=...   # PowerShell; on Linux/macOS use: export OPENROUTER_API_KEY=...
npx promptfoo eval -c promptfooconfig.yaml --output output/result.json --no-cache

About

Table Evals using Promptfoo
