Multimodal approach to optical character recognition
git clone https://github.com/JoshuaPlacidi/semantic-ocr
- Set the path to COCO 2014 training image folder in config.py
- Download pretrained model from (TPS-ResNet-BiLSTM-Atn): https://www.dropbox.com/sh/j3xmli4di1zuv3s/AAArdcPgz7UFxIHUuKNOeKv_a?dl=0&preview=TPS-ResNet-BiLSTM-Attn.pth
python train.py