Conformer OCR

Introduction

Conformer OCR is an Optical Character Recognition toolkit built for researchers working on both OCR for both Vietnamese and English. This project only focused on variants of vanilla Transformer and Feature Extraction (CNN-based approach).

This is also the first repo to utilize ConformerNet (https://arxiv.org/abs/2005.08100) for OCR.

Architecture

Key Features

Variants of Transformer (e.g., Vanilla, Conformer) encoder with CTC decoder.
Both naive Pytorch and Pytorch Lightning are provided
Beam search with N-gram Language model
Accumulation gradient training

Install dependencies

cd transformer_ocr
pip install -r requirements/requirements.txt

Directory structure

To modulize the repo, the current structure is adopted as follows:

├── conf # configurations
│   ├── dataset
│   ├── model
│   ├── optimizer
│   ├── pl_params
│   └── config.yaml
├── requirements # Where store different requirements if needed
│   └── requirements.txt
├── scripts # Where start your training/evaluation/testing models 
│   ├── train.py
│   └── train_PT.py
├── transformer_ocr # Main resource
└── README.md

Tutorials

Quick start

Train with naive Pytorch mode

cd scripts
python train.py

Train with Pytorch Lightning mode

cd scripts
python train_PT.py

Pre-trained models

Coming soon...

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Conformer OCR

Introduction

Architecture

Key Features

Install dependencies

Directory structure

Tutorials

Quick start

Pre-trained models

Files

README.md

Latest commit

History

README.md

File metadata and controls

Conformer OCR

Introduction

Architecture

Key Features

Install dependencies

Directory structure

Tutorials

Quick start

Pre-trained models