Vietnamese_Handwriting_Recognition

An OCR model for Vietnamese handwriting recognition that combines a CNN with an LSTM, implemented in the PyTorch deep learning framework.

Idea

This model is based on the architecture proposed in this paper: https://arxiv.org/pdf/1507.05717.pdf

- I used a pretrained VGG16 as the CNN backbone and a bidirectional LSTM for the recurrent layers.

Requirements

I highly recommend using a conda virtual environment:

pip install -r requirements.txt

Dataset

This dataset is provided by Cinamon AI for Cinamon's AI Challenge.

Preprocessing

In this step, we have to:

- Binarize the image by applying Otsu's thresholding method
- Remove noise
- Smooth boundaries by applying a contour filter
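
As an illustration of the binarization step, the sketch below shows how Otsu's method chooses a threshold: it tries every gray level and keeps the one that maximizes the between-class variance of the two resulting pixel groups. The function names and the flat-list pixel representation are assumptions chosen to keep the example dependency-free; in practice the same result is usually obtained with OpenCV's cv2.threshold and the cv2.THRESH_OTSU flag, and this README does not show which implementation the project uses.

```python
def otsu_threshold(pixels):
    """Return the gray level (0..255) that maximizes between-class variance.

    pixels: iterable of integer gray values in 0..255 (hypothetical flat layout).
    """
    hist = [0] * 256
    for value in pixels:
        hist[value] += 1

    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels at or below the candidate threshold
    sum0 = 0    # sum of gray values of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # no pixels in the dark class yet
        w1 = total - w0
        if w1 == 0:
            break             # no pixels left in the bright class
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, threshold):
    """Map values above the threshold to 255 (paper) and the rest to 0 (ink)."""
    return [255 if value > threshold else 0 for value in pixels]
```

For a page with dark ink on light paper, the ink pixels end up as 0 and the background as 255.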

Training

I divided the training process into 2 phases:

Phase 1: Train the LSTM only: 40 epochs, frozen VGG, lr = 1e-3

python train.py --epoch [num of epochs] --img_path [path to img directory] --label_path [path to label directory] --lr [learning rate] --batch_size [batchsize] --ft [finetune: true or false] --mode [decode mode: 'greedy' or 'beam']

Phase 2: Fine-tune the VGG16 backbone: 30 epochs, unfrozen VGG, lr = 1e-4

python finetune.py --epoch [num of epochs] --img_path [path to img directory] --label_path [path to label directory] --lr [learning rate] --batch_size [batchsize] --ft [finetune: true or false] --mode [decode mode: 'greedy' or 'beam']
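
The two phases differ only in whether the backbone receives gradients and in the learning rate. The sketch below shows one way to express that in PyTorch. The helper names, the backbone attribute, the tiny stand-in model, and the choice of Adam are all assumptions for illustration; train.py and finetune.py may do this differently.

```python
import torch
from torch import nn


class TinyModel(nn.Module):
    """Stand-in with a 'backbone' and an 'rnn' attribute (hypothetical)."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Conv2d(3, 8, kernel_size=3)
        self.rnn = nn.LSTM(8, 4, bidirectional=True)


def set_backbone_trainable(model, trainable):
    """Freeze (False) or unfreeze (True) every backbone parameter."""
    for param in model.backbone.parameters():
        param.requires_grad = trainable


def make_optimizer(model, lr):
    """Optimize only parameters that currently require gradients.

    Adam is an assumption; the README does not name the optimizer.
    """
    params = [p for p in model.parameters() if p.requires_grad]
    return torch.optim.Adam(params, lr=lr)


model = TinyModel()

# Phase 1: backbone frozen, recurrent layers trained with lr = 1e-3.
set_backbone_trainable(model, False)
phase1_optimizer = make_optimizer(model, lr=1e-3)

# Phase 2: backbone unfrozen, whole model fine-tuned with lr = 1e-4.
set_backbone_trainable(model, True)
phase2_optimizer = make_optimizer(model, lr=1e-4)
```

Rebuilding the optimizer between phases matters here: an optimizer created while the backbone was frozen does not track the backbone parameters.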

Decoding

I used CTC as the loss function. There are two decoding strategies, selected with --mode: a greedy decoder ('greedy') or a beam search decoder ('beam').
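
The difference between the two strategies can be sketched in plain Python. This is not the code in decoder/: the function names, the per-step probability lists, and the convention that class 0 is the CTC blank are assumptions. Greedy decoding takes the best class at each time step, merges repeats, and drops blanks; the beam variant shown here is a standard CTC prefix beam search without a language model, which sums the probability of all alignments that lead to the same text.

```python
from collections import defaultdict


def ctc_greedy_decode(probs, blank=0):
    """Best class per time step, then merge repeats and drop blanks."""
    decoded, prev = [], blank
    for step in probs:
        idx = max(range(len(step)), key=step.__getitem__)
        if idx != blank and idx != prev:
            decoded.append(idx)
        prev = idx
    return decoded


def ctc_beam_decode(probs, beam_width=3, blank=0):
    """CTC prefix beam search over per-step class probabilities."""
    # Each prefix keeps two scores: paths ending in blank, paths ending in non-blank.
    beams = {(): (1.0, 0.0)}
    for step in probs:
        candidates = defaultdict(lambda: [0.0, 0.0])
        for prefix, (p_blank, p_non_blank) in beams.items():
            for cls, p in enumerate(step):
                if cls == blank:
                    # A blank keeps the prefix unchanged.
                    candidates[prefix][0] += (p_blank + p_non_blank) * p
                elif prefix and prefix[-1] == cls:
                    # Repeating the last symbol without a blank merges into it...
                    candidates[prefix][1] += p_non_blank * p
                    # ...while a repeat after a blank starts a new symbol.
                    candidates[prefix + (cls,)][1] += p_blank * p
                else:
                    candidates[prefix + (cls,)][1] += (p_blank + p_non_blank) * p
        ranked = sorted(candidates.items(), key=lambda kv: sum(kv[1]), reverse=True)
        beams = {prefix: tuple(scores) for prefix, scores in ranked[:beam_width]}
    best_prefix = max(beams.items(), key=lambda kv: sum(kv[1]))[0]
    return list(best_prefix)
```

A small case where the two disagree: with two time steps of [0.6, 0.4] (blank, then one character class), greedy decoding returns an empty sequence because blank wins at every step, while beam search returns [1] because the three alignments that spell that character have a combined probability of 0.64 against 0.36 for the empty output. The returned class indices still need to be mapped to characters with the project's own character set.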
