This is an unofficial implementation of FOTS: Fast Oriented Text Spotting with a Unified Network, a unified end-to-end trainable network for simultaneous text detection and recognition that shares computation and visual information between the two complementary tasks. The code borrows heavily from E2E-MLT, an end-to-end text detection and recognition network.
- python 3.x
- opencv-python
- pytorch 0.4.1
- torchvision
- warp-ctc (https://github.com/SeanNaren/warp-ctc/)
- gcc 6.3 or 7.3 (to compile the NMS extension)
```shell
# optional: activate your conda environment
source activate conda_env
cd $project_path/rroi_align
sh make.sh  # compile the RRoI Align extension
```
- EAST NMS: gcc 6.3 works for me to compile the EAST NMS extension; other versions are untested. If you run into problems, see MichalBusta/E2E-MLT#21 or argman/EAST.
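warp-ctc above supplies the CTC loss used to train the recognition branch. At inference time, the per-timestep predictions still need CTC decoding: collapse consecutive repeats, then drop blanks. A minimal, dependency-free greedy (best-path) decoder can be sketched as follows (a hypothetical helper, not part of this repo):

```python
def ctc_greedy_decode(argmax_labels, blank=0):
    """Best-path CTC decoding: collapse repeated labels, then remove blanks.

    argmax_labels: list of per-timestep argmax label indices from the
    recognition head (a hypothetical input format for illustration).
    """
    decoded = []
    prev = None
    for idx in argmax_labels:
        # Keep a label only when it changes and is not the blank symbol
        if idx != prev and idx != blank:
            decoded.append(idx)
        prev = idx
    return decoded

# With blank=0: [1, 1, 0, 2, 2, 0, 2] decodes to [1, 2, 2]
```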
First, download the pretrained model from Baidu (password: ndav), which was trained on ICDAR2015. Put the model in the weights
folder; then you can test on some ICDAR2015 test samples:
```shell
cd $project_path
python test.py
```
Some examples:

*(Figures 1–6: sample detection and recognition results on ICDAR2015)*
RoIRotate applies a transformation to oriented feature regions to obtain axis-aligned feature maps, using bilinear interpolation to compute the output values.
*(Figures 1–6: examples of oriented regions transformed into axis-aligned feature maps by RoIRotate)*
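The RoIRotate step described above can be sketched in NumPy: each output pixel's offset from the region center is rotated back into input-feature coordinates, and the value there is read off with bilinear interpolation. This is a simplified illustration under assumed conventions (single-channel feature map, rotation only; the actual FOTS op also scales regions to a fixed height), not the repo's CUDA implementation:

```python
import numpy as np

def bilinear_sample(feat, x, y):
    """Bilinearly interpolate a 2-D feature map `feat` at continuous (x, y)."""
    h, w = feat.shape
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    x0, y0 = max(x0, 0), max(y0, 0)
    dx, dy = x - x0, y - y0
    top = feat[y0, x0] * (1 - dx) + feat[y0, x1] * dx
    bot = feat[y1, x0] * (1 - dx) + feat[y1, x1] * dx
    return top * (1 - dy) + bot * dy

def roi_rotate(feat, center, angle, out_h, out_w):
    """Sample an axis-aligned out_h x out_w map from a region of `feat`
    centered at `center` and rotated by `angle` radians (simplified sketch)."""
    cx, cy = center
    cos_a, sin_a = np.cos(angle), np.sin(angle)
    out = np.zeros((out_h, out_w), dtype=feat.dtype)
    for i in range(out_h):
        for j in range(out_w):
            # Offset of this output pixel from the region center
            u = j - (out_w - 1) / 2.0
            v = i - (out_h - 1) / 2.0
            # Rotate the offset back into input-feature coordinates
            x = cx + u * cos_a - v * sin_a
            y = cy + u * sin_a + v * cos_a
            out[i, j] = bilinear_sample(feat, x, y)
    return out
```

With `angle=0` this reduces to an ordinary axis-aligned crop, which makes the rotation-only case easy to verify.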
Download the ICDAR2015 data and the train_list from Baidu (password: q1au).
```shell
# train_list.txt lists the training image paths, one per line
/home/yangna/deepblue/OCR/data/ICDAR2015/icdar-2015-Ch4/img_546.jpg
/home/yangna/deepblue/OCR/data/ICDAR2015/icdar-2015-Ch4/img_277.jpg
/home/yangna/deepblue/OCR/data/ICDAR2015/icdar-2015-Ch4/img_462.jpg
/home/yangna/deepblue/OCR/data/ICDAR2015/icdar-2015-Ch4/img_237.jpg
```
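Given that format (one image path per line), a minimal loader can be sketched as below. `load_train_list` is a hypothetical helper for illustration, not a function in this repo:

```python
def load_train_list(path):
    """Read one image path per line from a train_list file,
    skipping empty lines (hypothetical helper mirroring the
    train_list.txt format shown above)."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]
```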
Training:

```shell
python train.py -train_list=$path_to/ICDAR2015.txt
```
This code borrows from MichalBusta/E2E-MLT.