Conformer OCR

Introduction

Conformer OCR is an Optical Character Recognition toolkit built for researchers working on OCR for both Vietnamese and English. The project focuses only on variants of the vanilla Transformer combined with CNN-based feature extraction.

This is also the first repo to utilize ConformerNet (https://arxiv.org/abs/2005.08100) for OCR.

Architecture

Key Features

  • Variants of the Transformer encoder (e.g., Vanilla, Conformer) with a CTC decoder
  • Training scripts for both plain PyTorch and PyTorch Lightning
  • Beam search with an N-gram language model
  • Gradient accumulation training
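
To illustrate what a CTC decoder does with the encoder output, here is a minimal sketch of greedy (best-path) CTC decoding in pure Python. It is not the repo's implementation: the function name `ctc_greedy_decode`, the blank index `0`, and the toy vocabulary are all assumptions made for this example. The rule itself is standard CTC: merge consecutive repeated labels, then drop blanks.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol (hypothetical choice)


def ctc_greedy_decode(frame_ids, blank=BLANK):
    """Collapse consecutive repeated labels, then remove blank labels."""
    collapsed = [label for label, _ in groupby(frame_ids)]
    return [label for label in collapsed if label != blank]


# Hypothetical per-frame argmax output over a toy vocabulary.
vocab = {1: "c", 2: "a", 3: "t"}
frames = [0, 1, 1, 0, 2, 2, 2, 0, 0, 3, 3]
text = "".join(vocab[i] for i in ctc_greedy_decode(frames))
print(text)
```

A blank between two identical labels keeps them separate, so `[1, 1, 0, 1]` decodes to `[1, 1]` rather than `[1]`. Beam search with a language model replaces the per-frame argmax with a search over several candidate label sequences, but the collapse rule stays the same.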

Install dependencies

cd transformer_ocr
pip install -r requirements/requirements.txt

Directory structure

To keep the repo modular, it is organized as follows:

├── conf # configurations
│   ├── dataset
│   ├── model
│   ├── optimizer
│   ├── pl_params
│   └── config.yaml
├── requirements # requirements files, split by purpose if needed
│   └── requirements.txt
├── scripts # entry points for training/evaluation/testing
│   ├── train.py
│   └── train_PT.py
├── transformer_ocr # main source package
└── README.md 

Tutorials

Quick start

Train with plain PyTorch

cd scripts
python train.py

Train with PyTorch Lightning

cd scripts
python train_PT.py

Pre-trained models

Coming soon...