This repository contains an implementation of RF-Learning: "Reciprocal Feature Learning via Explicit and Implicit Tasks in Scene Text Recognition" (ICDAR 2021). The method exploits an implicit task, character counting, that is already contained in conventional text recognition labels and therefore needs no additional annotation. The implicit task serves as an auxiliary branch that complements sequential recognition, the explicit task. A two-branch reciprocal feature learning framework makes full use of the features from both tasks. By exploiting the complementary effect between the explicit and implicit tasks, the features are reliably enhanced.
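To illustrate why the counting task needs no extra annotation, here is a minimal sketch that derives counting targets directly from a transcription. The helper name `counting_targets` and the case-insensitive per-character variant are assumptions for illustration, not the repository's actual label pipeline.

```python
from collections import Counter
from typing import Dict, Tuple


def counting_targets(text: str) -> Tuple[int, Dict[str, int]]:
    """Derive counting labels from a recognition label (hypothetical helper).

    Returns the total number of characters and a per-character count.
    Lower-casing is an assumption made for this sketch only.
    """
    per_char = Counter(text.lower())
    return len(text), dict(per_char)


total, per_char = counting_targets("Hello")
print(total, per_char)
```

Both values come from the text label that the recognition branch already uses, so no additional labeling step is involved.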
Dataset | Samples | Description | Release |
---|---|---|---|
MJSynth | 8919257 | Synthetic scene text recognition dataset | Link |
SynText | 7266164 | Synthetic scene text dataset; the text instances are cropped from larger images | Link |
Test Set | Instance Number | Note |
---|---|---|
IIIT5K | 3000 | regular |
SVT | 647 | regular |
IC03_860 | 860 | regular |
IC13_857 | 857 | regular |
IC15_1811 | 1811 | irregular |
SVTP | 645 | irregular |
CUTE80 | 288 | irregular |
As a quick start, use the LMDB-formatted datasets above, which contain the full benchmarks for scene text recognition, organized as follows.
Data Type: LMDB
File storage format:
|-- train
| |-- MJ
| |-- ST
|-- validation
| |-- mixture
|-- evaluation
| |-- mixture
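Before launching training, it can help to confirm that the data root matches this layout. The following is a small sketch using only the standard library; the function name `missing_dataset_dirs` is hypothetical and not part of this repository.

```python
from pathlib import Path

# Sub-directories taken from the storage layout shown above.
EXPECTED_SUBDIRS = [
    "train/MJ",
    "train/ST",
    "validation/mixture",
    "evaluation/mixture",
]


def missing_dataset_dirs(root):
    """Return the expected sub-directories that do not exist under `root`."""
    root = Path(root)
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]
```

Calling `missing_dataset_dirs("/path/to/data")` returns an empty list when all four folders are present; it does not check the LMDB contents themselves.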
Run the following commands in a shell:
cd .
bash ./train.sh
Online validation is run during training. To disable it and save training time, add the following flag to the startup script:
--no-validate
1. Visual Stage
cd .
bash ./test_scripts/test_rfl_visual.sh
2. Total Stage
cd .
bash ./test_scripts/test_rfl_total.sh
Regular text benchmarks: IIIT5K, SVT, IC03, IC13. Irregular text benchmarks: IC15, SVTP, CUTE80.
Methods | IIIT5K | SVT | IC03 | IC13 | IC15 | SVTP | CUTE80 | Config | Model |
---|---|---|---|---|---|---|---|---|---|
RF-Learning visual (Report) | 95.7 | 94.0 | 96.0 | 95.2 | 84.2 | 87.0 | 85.8 | - | - |
RF-Learning visual | 96.0 | 94.7 | 96.2 | 95.9 | 88.7 | 86.7 | 88.2 | | pth [Link] (Access Code: 04OV) |
RF-Learning total (Report) | 94.1 | 88.6 | 94.9 | 94.5 | 82.4 | 82.0 | 82.6 | - | - |
RF-Learning total | 94.5 | 90.0 | 94.0 | 94.1 | 81.5 | 82.0 | 84.7 | | pth [Link] (Access Code: 49z1) |
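The per-benchmark accuracies can be combined into a single instance-weighted figure, using the instance numbers from the test set table above. This summary is not reported by the authors; the sketch below only shows the arithmetic, and the function name `weighted_accuracy` is an assumption.

```python
# Instance numbers from the test set table above.
INSTANCES = {
    "IIIT5K": 3000, "SVT": 647, "IC03": 860, "IC13": 857,
    "IC15": 1811, "SVTP": 645, "CUTE80": 288,
}


def weighted_accuracy(acc):
    """Average the accuracies in `acc`, weighted by benchmark size."""
    total = sum(INSTANCES[name] for name in acc)
    return sum(acc[name] * INSTANCES[name] for name in acc) / total


# Accuracies of the "RF-Learning visual" row.
visual = {
    "IIIT5K": 96.0, "SVT": 94.7, "IC03": 96.2, "IC13": 95.9,
    "IC15": 88.7, "SVTP": 86.7, "CUTE80": 88.2,
}
print(f"{weighted_accuracy(visual):.2f}")
```

Weighting by instance count keeps the small benchmarks such as CUTE80 from dominating the summary the way a plain mean of the seven columns would.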
Here is the picture for result visualization.
@article{rflearning,
author={Hui Jiang and Yunlu Xu and Zhanzhan Cheng and Shiliang Pu and Yi Niu and Wenqi Ren and Fei Wu and Wenming Tan},
title={Reciprocal Feature Learning via Explicit and Implicit Tasks in Scene Text Recognition},
journal={CoRR},
volume={abs/2105.06229},
year={2021},
}
This project is released under the Apache 2.0 license.
If you have any suggestions or problems, please feel free to contact the author at qiaoliang6@hikvision.com.