Attention_ocr_recognition

This repository contains a unified PyTorch implementation of 1-dimensional and 2-dimensional seq2seq-attention networks for OCR (Simplified Chinese, Traditional Chinese, and English). It can recognize both horizontal and vertical text, using either a beam-search or a greedy-search decoding strategy.
We trained the network on samples generated with text_data_generator (https://github.com/yangsuhui/text_data_generator), which I also modified; that data generator can produce variable-length, multi-language, horizontal and vertical text images with random backgrounds.
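For orientation, the core of such a seq2seq-attention recognizer is a decoder that, at each output step, attends over the encoder's CNN feature map (a 1-D sequence in 1-D mode, a flattened H×W grid in 2-D mode). The snippet below is a minimal, hypothetical sketch of one additive (Bahdanau-style) attention step; it is not the code in this repository, and the names (AttentionStep, feat_dim, attn_dim, etc.) are illustrative only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionStep(nn.Module):
    """One additive (Bahdanau-style) attention step over encoder features.

    Hypothetical sketch; the repository's actual modules may differ.
    In 1-D mode the features form a length-W sequence; in 2-D mode the
    H x W feature map is flattened to H*W positions before attention.
    """
    def __init__(self, feat_dim, hidden_dim, attn_dim):
        super().__init__()
        self.enc_proj = nn.Linear(feat_dim, attn_dim)
        self.dec_proj = nn.Linear(hidden_dim, attn_dim)
        self.score = nn.Linear(attn_dim, 1)

    def forward(self, dec_hidden, enc_feats):
        # dec_hidden: (B, hidden_dim)   current decoder state
        # enc_feats:  (B, N, feat_dim)  N = W (1-D) or H*W (2-D) positions
        energy = self.score(torch.tanh(
            self.enc_proj(enc_feats) + self.dec_proj(dec_hidden).unsqueeze(1)))
        alpha = F.softmax(energy.squeeze(-1), dim=1)                   # (B, N) attention weights
        context = torch.bmm(alpha.unsqueeze(1), enc_feats).squeeze(1)  # (B, feat_dim) context vector
        return context, alpha
```

The returned attention weights are what the visualization GIFs in the Inference section render over the input image.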

Requirements

PyTorch 1.1.0

cd Attention_ocr_recognition
pip install -r requirements.txt

Train

1-D mode

python train.py --trainlist ./data/ch_train.txt --vallist ./data/ch_test.txt --imgH 32 --imgW 280 --experiment ./expr/attention1dcnn --niter 40 --saveInterval 10 --mode 1D --use_beam_search

2-D mode

python train.py --trainlist ./data/train_v.txt --vallist ./data/test_v.txt --imgH 420 --imgW 420 

Datasets

1-D mode: a small dataset from the Synthetic_Chinese_String_Dataset, about 270,000+ images for training and 20,000 images for testing. Download the image data from Baidu. With this 1-D dataset, the model achieves 0.98 accuracy.

2-D mode: the dataset is generated with the repository https://github.com/yangsuhui/text_data_generator, about 100,000 (10W) training images and 5,000 test images. With this 2-D dataset, the model achieves 0.997 accuracy on vertical text.

The train.txt and test.txt files are created in the following form:

path/to/image_name.jpg label
path/AttentionData/images/00000554.png 一份「設計構想」──用的就是
path/AttentionData/images/00001027.png 一排椅子上的那兩個中學生,他就
path/AttentionData/images/00000319.png 來,太激動了!多年來
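Since each line is just an image path, a single space, and the label, parsing it is straightforward. The following is a minimal, hypothetical PyTorch Dataset that reads this list format; the class name OCRListDataset and the transform argument are illustrative and not part of this repository.

```python
from PIL import Image
from torch.utils.data import Dataset

class OCRListDataset(Dataset):
    """Minimal reader for the 'path/to/image.jpg label' list format (sketch)."""

    def __init__(self, list_file, transform=None):
        self.samples = []
        with open(list_file, encoding='utf-8') as f:
            for line in f:
                line = line.rstrip('\n')
                if not line:
                    continue
                # split on the first space only: the label itself may contain spaces
                path, label = line.split(' ', 1)
                self.samples.append((path, label))
        self.transform = transform

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        path, label = self.samples[idx]
        img = Image.open(path).convert('RGB')
        if self.transform is not None:
            img = self.transform(img)
        return img, label
```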

Inference

The following command is for 2-D mode; 1-D mode can use the same command once the corresponding hyperparameters (imgH, imgW, etc.) are adjusted.

python demo.py --encoder ./expr/attention2dcnn_v_res18/encoder_160.pth --decoder ./expr/attention2dcnn_v_res18/decoder_160.pth --imgH 420 --imgW 420 --img_path ./test_img/2dimg/00004747.png --use_beam_search

Download the 2-D model (Baidu, password: q8kp), put it in the folder ./expr/attention2dcnn_v_res18/, and run the command above; GIFs visualizing the attention results will be saved in the vis folder.
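The --use_beam_search flag switches decoding from greedy (take the arg-max token at every step) to beam search (keep the k highest-scoring partial sequences). The sketch below illustrates the difference under the assumption of a decoder_step(hidden, token) function that returns per-token log-probabilities and the next hidden state; it is a hedged illustration, not this repository's exact implementation.

```python
def greedy_decode(decoder_step, hidden, sos_id, eos_id, max_len=64):
    """Greedy decoding: pick the most likely token at every step (sketch)."""
    tokens, tok = [], sos_id
    for _ in range(max_len):
        log_probs, hidden = decoder_step(hidden, tok)   # log_probs: (vocab,)
        tok = int(log_probs.argmax())
        if tok == eos_id:
            break
        tokens.append(tok)
    return tokens

def beam_decode(decoder_step, hidden, sos_id, eos_id, beam=5, max_len=64):
    """Beam search: keep the `beam` best partial hypotheses at each step (sketch)."""
    beams = [(0.0, [sos_id], hidden)]            # (score, tokens, decoder state)
    finished = []
    for _ in range(max_len):
        candidates = []
        for score, toks, h in beams:
            log_probs, h2 = decoder_step(h, toks[-1])
            top_lp, top_id = log_probs.topk(beam)
            for lp, tid in zip(top_lp.tolist(), top_id.tolist()):
                candidates.append((score + lp, toks + [tid], h2))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for cand in candidates[:beam]:
            # hypotheses ending in <eos> are complete; the rest stay on the beam
            (finished if cand[1][-1] == eos_id else beams).append(cand)
        if not beams:
            break
    best_toks = max(finished + beams, key=lambda c: c[0])[1]
    if best_toks[-1] == eos_id:
        best_toks = best_toks[:-1]
    return best_toks[1:]                          # drop the leading <sos>
```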

Reference

  1. text_data_generator
  2. Attention-OCR
  3. Seq2Seq-PyTorch
  4. MaskTextSpotter
  5. LaTeX_OCR
