FOTS: Fast Oriented Text Spotting with a Unified Network text detection branch reimplementation (PyTorch)
Train with SynthText for 9 epochs
time python3 --train-folder SynthText/ --batch-size 21 --batches-before-train 2
At this point the result was
Epoch 8: 100%|█████████████| 390/390 [08:28<00:00, 1.00it/s, Mean loss=0.98050]
Train with ICDAR15
Replace a data set in
data_set = datasets.SynthText(args.train_folder, datasets.transform)
and run
time python3 --train-folder icdar15/ --continue-training --batch-size 21 --batches-before-train 2
It is expected that the provided training folder contains unzipped ch4_training_images. To avoid saving the model at each epoch, the line
if True:
can be replaced with
if epoch > 60 and epoch % 6 == 0:
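The effect of that condition can be checked in isolation. The sketch below is a standalone illustration, not the repository's training loop; `should_save` is a hypothetical helper name, and the epoch range is chosen only for the example.

```python
def should_save(epoch: int) -> bool:
    """Return True for the epochs at which a checkpoint would be written."""
    # Same condition as the suggested replacement for `if True:`.
    return epoch > 60 and epoch % 6 == 0


# Epochs 0..99, an arbitrary range used only for illustration.
saved_epochs = [epoch for epoch in range(100) if should_save(epoch)]
print(saved_epochs)
```

With this condition nothing is saved up to and including epoch 60; after that, a checkpoint is written at every epoch divisible by 6.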
The result was
Epoch 582: 100%|█████████████| 48/48 [01:05<00:00, 1.04s/it, Mean loss=0.11290]
Epoch 175: reducing learning rate of group 0 to 5.0000e-04.
Epoch 264: reducing learning rate of group 0 to 2.5000e-04.
Epoch 347: reducing learning rate of group 0 to 1.2500e-04.
Epoch 412: reducing learning rate of group 0 to 6.2500e-05.
Epoch 469: reducing learning rate of group 0 to 3.1250e-05.
Epoch 525: reducing learning rate of group 0 to 1.5625e-05.
Epoch 581: reducing learning rate of group 0 to 7.8125e-06.
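Each logged rate is half of the previous one, which is consistent with a reduce-on-plateau schedule that uses a factor of 0.5. The sketch below only checks that arithmetic. The initial learning rate of 1e-3 and the factor are inferred from the log, not taken from the training script, so treat both as assumptions.

```python
# Learning rates copied from the log above, in order of appearance.
logged_rates = [5.0e-04, 2.5e-04, 1.25e-04, 6.25e-05,
                3.125e-05, 1.5625e-05, 7.8125e-06]

# Assumptions inferred from the log (not confirmed by the source).
initial_rate = 1.0e-03
factor = 0.5

# The rate after the k-th reduction is initial_rate * factor**k.
expected_rates = [initial_rate * factor ** k
                  for k in range(1, len(logged_rates) + 1)]

for logged, expected in zip(logged_rates, expected_rates):
    # Relative tolerance sidesteps floating-point noise.
    assert abs(logged - expected) <= 1e-9 * expected
```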
python3 --images-folder ch4_test_images/ --output-folder res/ --checkpoint && zip -jmq runs/ res/* && python2 -s=runs/
and ch4_training_localization_transcription_gt
are available in Task 4.4: End to End (2015 edition).
and ch4_test_images
can be found in My Methods (Script: IoU and test set samples).
It gives
Calculated!{"precision": 0.8694968553459119, "recall": 0.7987481945113144, "hmean": 0.8326223337515684, "AP": 0}
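The reported hmean is the harmonic mean of precision and recall (the F1 score). A quick check of that relationship with the numbers above; the variable names are illustrative only:

```python
# Values copied from the evaluation output above.
precision = 0.8694968553459119
recall = 0.7987481945113144

# Harmonic mean of precision and recall.
hmean = 2 * precision * recall / (precision + recall)
print(hmean)
```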
The pretrained models are here:
has commented-out code to visualize the results.
- The model is different compared to what the paper describes. An explanation is in
- The authors of FOTS could not train on clipped words because their model also has a recognition branch: a whole word must be present in an image to be recognized correctly. This reimplementation has only the detection branch, which allows training on crops that cut through words.
- The paper suggests using some additional data sets. Training on SynthText is simplified in this reimplementation.
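To illustrate what training on crops means for the detection targets, here is a minimal sketch that clips word boxes to a crop window. It is not the repository's augmentation code: `clip_boxes_to_crop` is a hypothetical helper, and it assumes axis-aligned boxes for simplicity, whereas ICDAR15 annotations are quadrilaterals.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def clip_boxes_to_crop(boxes: List[Box], crop: Box) -> List[Box]:
    """Clip word boxes to a crop window and shift them into crop coordinates.

    Boxes that do not overlap the crop are dropped. Partially visible words
    are kept as clipped boxes, which is acceptable for detection-only training.
    """
    cx0, cy0, cx1, cy1 = crop
    clipped = []
    for x0, y0, x1, y1 in boxes:
        nx0, ny0 = max(x0, cx0), max(y0, cy0)
        nx1, ny1 = min(x1, cx1), min(y1, cy1)
        if nx0 < nx1 and ny0 < ny1:  # non-empty intersection with the crop
            clipped.append((nx0 - cx0, ny0 - cy0, nx1 - cx0, ny1 - cy0))
    return clipped


# One word fully inside the crop, one cut by its right edge, one outside.
words = [(10, 10, 50, 30), (90, 20, 140, 40), (200, 200, 240, 220)]
print(clip_boxes_to_crop(words, crop=(0, 0, 100, 100)))
```

A model with a recognition branch would have to discard the second word here, since only part of it is visible; a detection-only model can still use the clipped box as a training target.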