This is an implementation of the paper "Query by Strings and Return Ranking Word Regions with Only One Look". The complete code will be released soon.
- Python 3.5
- PyTorch v1.1.0
- shapely
- pillow
- opencv-python
- scipy
- tqdm
- scikit-image
- numpy
pip install -r requirements.txt
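After installation, a quick import check can confirm that the dependencies listed above are visible to the interpreter. The following is a minimal sketch, not part of this repository: the helper name `missing_modules` is hypothetical, and the import names (`torch` for PyTorch, `PIL` for pillow, `cv2` for opencv-python, `skimage` for scikit-image) are the usual module names for those packages rather than anything taken from `requirements.txt`.

```python
import importlib.util

# Usual import names for the packages listed above (an assumption;
# requirements.txt itself is not shown in this README).
REQUIRED_MODULES = [
    "torch", "shapely", "PIL", "cv2", "scipy", "tqdm", "skimage", "numpy",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

The check only looks modules up without importing them, so it does not verify versions such as PyTorch v1.1.0.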
- Download the training, validation and testing datasets
The Konzilsprotokolle dataset can be downloaded from [ICFHR 2016 Handwritten Keyword Spotting Competition (H-KWS2016)](https://www.prhlt.upv.es/contests/icfhr2016-kws/data.html).
The BH2M dataset can be downloaded from the [IEHHR2017 competition](https://rrc.cvc.uab.es/?ch=10&com=downloads).
- Convert the downloaded datasets into the required format
python ./tools/tools_Konzilsprotokolle.py
python ./tools/tools_BH2M.py
- Augment the training data offline
python ./tools/tools_Konzilsprotokolle_docaug.py
python ./tools/tools_BH2M_docaug.py
python train.py
python predict.py
By default, the cropped word images for each query and the visualized document images are saved to ./output/~/QbS_word_res/ and ./output/~/QbS_res/, respectively.
The downloaded trained models need to be placed in the ./output/ folder.
Method | MAP(overlap=0.25) (%) | MAP(overlap=0.50) (%) | Model |
---|---|---|---|
ResNet50 + FPN | 95.30 | 95.09 | baiduyun(extract code: n0ax) |
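The two MAP columns differ only in the overlap threshold used to decide whether a retrieved word region counts as correct. The sketch below illustrates that idea under stated assumptions: overlap is taken to be intersection over union (IoU) of axis-aligned boxes given as `(x1, y1, x2, y2)`, and average precision is computed per query from a ranked list of hit flags. The function names are hypothetical, and the evaluation code in this repository may differ (for example, it may use shapely polygons or handle duplicate detections differently).

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(ranked_hits, num_relevant):
    """Average precision for one query.

    ranked_hits: booleans in rank order, True where the retrieved region
    overlaps a relevant ground-truth word by at least the threshold.
    num_relevant: number of relevant ground-truth words for the query.
    """
    if num_relevant == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, is_hit in enumerate(ranked_hits, start=1):
        if is_hit:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / num_relevant


def mean_average_precision(per_query_ap):
    """Mean of the per-query average precision values."""
    return sum(per_query_ap) / len(per_query_ap) if per_query_ap else 0.0
```

With a threshold of 0.25, a region whose IoU with the ground truth is 0.3 counts as a hit, while at 0.50 the same region counts as a miss, which is why the second column can only be equal to or lower than the first.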
Fig. 1. The visualization results of several queries for the proposed method on BH2M. The figure shows the top 7 results starting from the left. The correct search results are highlighted in green. "CD" means the cosine distance between the predicted word embedding of the word area and the ground truth. The smaller the cosine distance, the greater the similarity.
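The ranking described in the caption can be sketched as follows. This is an illustrative example only, written in plain Python: the function names and the toy embeddings are hypothetical, and the actual embedding type and dimensionality used by the model are not specified here.

```python
import math


def cosine_distance(u, v):
    """Cosine distance, 1 - cos(u, v); smaller means more similar."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Cosine similarity is undefined for a zero vector; this sketch
        # treats it as "no similarity" and returns a distance of 1.0.
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def rank_regions(query_embedding, region_embeddings, top_k=7):
    """Return (distance, region_index) pairs, closest regions first."""
    scored = [
        (cosine_distance(query_embedding, embedding), index)
        for index, embedding in enumerate(region_embeddings)
    ]
    scored.sort()
    return scored[:top_k]


if __name__ == "__main__":
    query = [1.0, 0.0]
    regions = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
    for distance, index in rank_regions(query, regions):
        print(f"region {index}: CD = {distance:.4f}")
```

The default `top_k=7` mirrors the seven results shown per query in the figure.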
If you find our method useful for your research, please cite:
Suggestions and discussions are very welcome. Please contact the authors by sending an email to 18120456@bjtu.edu.cn