Short presentation of our project
-
Installation
2.a Install conda environment
2.b Download databases- IAM dataset
- ICFHR 2014 dataset
-
How to use
3.a Make predictions on unlabelled data using our best networks
3.b Train and test a network from scratch
3.c Test a model without retraining it
This work was an internship project under Mathieu Aubry's supervision, at the LIGM lab, located in Paris.
In HTR, the task is to predict a transcript from an image of a handwritten text. A commonly used structure for this task is Convolutional Recurrent Neural Networks (CRNN). One CRNN network consists of a feature extractor (often with convolutional layers), followed by a recurrent network (LSTM).
This github provides a framework to train and test CRNN networks on handwritten grayscale line-level datasets. This github also provides code to generate predictions on an unlabelled, line-level, grayscale line-level dataset. There are several options for the structure of the CRNN used, image preprocessing, dataset used, data augmentation.
Make sure you have Anaconda installed (version >= to 4.7.10, you may not be able to install correct dependencies if older). If not, follow the installation instructions provided at https://docs.anaconda.com/anaconda/install/.
Also pull the git.
Once in the git folder on your machine, run the command lines :
conda env create -f HTR_environment.yml
conda activate HTR
You will only need to download these databases if you want to train your own network from scratch. The framework is built to train a network on one of these 2 datasets : IAM and ICFHR2014 HTR competition. [ADD REF TO SLIDES]
-
Before downloading IAM dataset, you need to register on this website. Once that's done, you need to download :
-
For ICFHR2014 dataset, you need to download the 'BenthamDatasetR0-GT' folder at this link.
Make sure to download the two databases in the same folder. Structure must be
Your data folder /
IAM/
lines.txt
lines/
split/
trainset.txt
testset.txt
validationset1.txt
validationset2.txt
ICFHR2014/
BenthamDatasetR0-GT/
Your own dataset/
Running this code will use model stored at model_path
to make predictions on images stored in data_path
.
The predictions will be stored in predictions.txt
in data_path
folder.
python lines_predictor.py --data_path datapath --model_path ./trained_networks/IAM_model_imgH64.pth --imgH 64
/!\ Make sure that each image in the data folder has a unique file name and all images are in .jpg form. When you use our trained model with imgH as 64 (i.e. IAM_model_imgH64.pth), you have to set the argument --imgH as 64.
python train.py --dataset dataset --tr_data_path data_dir --save_model_path path
Before running the code, make sure that you change ROOT_PATH
variable at the beginning of params.py
to the path of the folder you want to save your models in.
Main arguments :
--dataset
: name of the dataset to train and test on. Supported values areICFHR2014
andIAM
.--tr_data_path
: location of the train dataset folder on local machine. See section [??] for downloading datasets.--save_model_path
: path of the folder where model will be saved ifparams.save
is set to True.
Main learning arguments :
-
--data_aug
: If set toTrue
, will apply random affine data transformation to the training images. -
--optimizer
: Which optimizer to use. Supported values arermsprop
,adam
,adadelta
, andsgd
. We recommend using RMSprop, which got best results in our experiments. Seeparams.py
for optimizer-specific parameters. -
--epochs
: Number of training epochs -
--lr
: Learning rate at the beginning of training. -
--milestones
: List of the epochs at which the learning rate will be divided by 10. -
feat_extractor
: Structure to use for the feature extractor. Supported values areresnet18
,custom_resnet
, andconv
.resnet18
: standard structure of resnet18.custom_resnet
: variant of resnet18 that we tuned for our experiments.conv
: Use this option if you want to use a purely convolutional feature extractor and not a residual one. See conv parameters inparams.py
to choose conv structure.
Running this code will compute the average CER and WER of model stored at pretrained_model
path on the testing set of chosen dataset
.
python train.py --train '' --save '' --pretrained_model model_path --dataset dataset --tr_data_path data_path
Main arguments :
--pretrained_model
: path to state_dict of pretrained model.--dataset
: Which dataset to test on. Supported values areICFHR2014
andIAM
.--tr_data_path
: path to the dataset folder (see section [??])
Graves et al. Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks
Sánchez et al. A set of benchmarks for Handwritten Text Recognition on historical documents
Dutta et al. Improving CNN-RNN Hybrid Networks for
Handwriting Recognition
U.-V. Marti, H. Bunke The IAM-database: an English sentence database for offline handwriting recognition
https://github.com/Holmeyoung/crnn-pytorch
https://github.com/georgeretsi/HTR-ctc
Synthetic line generator : https://github.com/monniert/docExtractor (see paper for more information)
If you have questions or remarks about this project, please email us at virginie.loison@eleves.enpc.fr and xiwei.hu@telecom-paris.fr.