A Deep Deformable Network for Instance Segmentation of Dense and Uneven Layouts in Handwritten Manuscripts
To appear at ICDAR 2021
[ Paper ] | [ Website ]
The PALMIRA code is tested with:

- Python 3.7.x
- PyTorch 1.7.1
- Detectron2 0.4
- CUDA 10.0
- CudNN 7.3-CUDA-10.0
For setting up Detectron2, please follow the official documentation.
We provide environment files for both Conda and Pip. Use either one of the following:
conda env create -f environment.yml
pip install -r requirements.txt
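After installing with either method, a quick version check along these lines can confirm the environment resolved correctly (a minimal sketch using only the standard library; note that `importlib.metadata` requires Python 3.8+, so on 3.7 the `importlib_metadata` backport provides the same API):

```python
import sys
from importlib import metadata

def report_versions(packages=("torch", "detectron2")):
    """Return a mapping of package name -> installed version (or None if absent)."""
    versions = {"python": ".".join(map(str, sys.version_info[:3]))}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package not installed in this environment
    return versions

if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version or 'not installed'}")
```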
- Download the Indiscapes-v2 dataset [ Dataset Link ]
- Place the:
  - Dataset under the `images` directory
  - COCO-pretrained model weights in the `init_weights` directory
    - Weights used: [ Mask RCNN R50-FPN-1x Link ]
  - JSON in the `doc_v2` directory (a sample JSON has been provided here)
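Putting the above together, the repository layout should look roughly like this (a sketch; the file names shown inside each directory are illustrative, only the directory names come from the steps above):

```
PALMIRA/
├── images/            # Indiscapes-v2 images
├── doc_v2/            # dataset JSONs (a sample is provided)
├── init_weights/      # COCO-pretrained Mask R-CNN R50-FPN-1x weights
└── configs/
    └── palmira/
        └── Palmira.yaml
```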
More information can be found in folder-specific READMEs.
If your compute cluster uses SLURM workloads, please load these (or equivalent) modules at the start of your experiments. Ensure that all other modules are unloaded.
module add cuda/10.0
module add cudnn/7.3-cuda-10.0
Train the presented network
python train_palmira.py \
--config-file configs/palmira/Palmira.yaml \
--num-gpus 4
- Any required hyper-parameter changes can be made in the `Palmira.yaml` file.
- To resume from a checkpoint, add `--resume` to the above command.
Please refer to the README.md under the `configs` directory for ablative variants and baselines.
To perform inference and get quantitative results on the test set:
python train_palmira.py \
--config-file configs/palmira/Palmira.yaml \
--eval-only \
MODEL.WEIGHTS <path-to-model-file>
- This outputs 2 JSON files in the output directory specified in the config:
  - `coco_instances_results.json`: an encoded format which must be parsed to obtain the qualitative results
  - `indiscapes_test_coco_format.json`: regular COCO-encoded format, which is human-parsable
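For downstream analysis, the human-parsable file can be read with the standard `json` module. A minimal sketch, assuming the file follows the standard COCO annotation schema (`images`, `annotations`, `image_id`, `file_name`; the exact keys Detectron2 emits may include extras):

```python
import json
from collections import defaultdict

def annotations_per_image(coco_path):
    """Group COCO-format annotations by image and return a count per file name."""
    with open(coco_path) as f:
        coco = json.load(f)
    by_image = defaultdict(list)
    for ann in coco.get("annotations", []):
        by_image[ann["image_id"]].append(ann)
    id_to_name = {img["id"]: img["file_name"] for img in coco.get("images", [])}
    return {id_to_name[i]: len(anns) for i, anns in by_image.items()}
```

For example, calling it on the test-set output gives a quick per-page instance count without any Detectron2 dependency.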
Visualisation can be executed only after quantitative inference, or on the validation outputs produced at the end of each training epoch.
This parses the output JSON and overlays predictions on the images.
python visualise_json_results.py \
--inputs <path-to-output-file-1.json> [... <path-to-output-file-2.json>] \
--output outputs/qualitative/ \
--dataset indiscapes_test
NOTE: To compare multiple models, multiple input JSON files can be passed. This produces a single vertically stitched image combining the predictions from each JSON.
To run the model on your own images without training, please download the provided weights from [ here ].
A Google Colab notebook to perform the same experiment can be found at this link.
python demo.py \
--input <path-to-image-directory-*.jpg> \
--output <path-to-output-directory> \
--config configs/palmira/Palmira.yaml \
--opts MODEL.WEIGHTS <init-weights.pth>
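`demo.py` expects a directory of images as input; when scripting around it, the input set can be gathered with `pathlib` first (a sketch; the extension filter and sorting here are assumptions for illustration, not what `demo.py` itself does internally):

```python
from pathlib import Path

def collect_images(directory, exts=(".jpg", ".jpeg", ".png")):
    """Return sorted image paths in `directory` matching the given extensions."""
    root = Path(directory)
    return sorted(p for p in root.iterdir() if p.suffix.lower() in exts)
```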
If you use PALMIRA/Indiscapes-v2, please use the following BibTeX entry.
@inproceedings{sharan2021palmira,
title = {PALMIRA: A Deep Deformable Network for Instance Segmentation of Dense and Uneven Layouts in Handwritten Manuscripts},
author = {Sharan, S P and Aitha, Sowmya and Amandeep, Kumar and Trivedi, Abhishek and Augustine, Aaron and Sarvadevabhatla, Ravi Kiran},
booktitle = {International Conference on Document Analysis and Recognition, {ICDAR} 2021},
year = {2021},
}
For any queries, please contact Dr. Ravi Kiran Sarvadevabhatla.
This project is open sourced under the MIT License.