This script converts VOC-format XML annotations to a COCO-format JSON file (e.g. coco_eval.json).
Once the annotations are in COCO format, you can use the COCO API, which is useful for tasks such as calculating mAP.
labels.txt is needed to build the dictionary that maps each label to an id.
Sample labels.txt
Label1
Label2
...
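The label-to-id dictionary mentioned above can be sketched roughly as follows. This is a minimal illustration, not necessarily what voc2coco.py does internally: the function name `load_label2id` is hypothetical, and starting the ids at 1 is an assumption.

```python
def load_label2id(lines):
    """Map each non-empty label line to an integer id.

    Assumption: ids start at 1 and follow the order of the lines.
    """
    labels = [line.strip() for line in lines if line.strip()]
    return {label: idx for idx, label in enumerate(labels, start=1)}


# Usage with the sample contents; for a real file, pass the open file object.
label2id = load_label2id(["Label1\n", "Label2\n"])
print(label2id)
```

Because the ids depend on line order under this assumption, reordering labels.txt would change the category ids in the output.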
$ python voc2coco.py \
--ann_dir /path/to/annotation/dir \
--ann_ids /path/to/annotations/ids/list.txt \
--labels /path/to/labels.txt \
--output /path/to/output.json \
--ext xml  # optional
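One plausible reading of the `--ann_dir`, `--ann_ids` and `--ext` options is that each id in the list is joined with the directory and the extension to form an annotation path. The sketch below shows that idea only; `build_ann_paths` is a hypothetical helper, and the default extension of `xml` is an assumption.

```python
import os


def build_ann_paths(ann_dir, ann_ids, ext="xml"):
    """Join a directory, annotation ids and an extension into file paths.

    Assumption: one id per entry, and ext is given without a leading dot.
    """
    suffix = "." + ext.lstrip(".")
    return [os.path.join(ann_dir, ann_id + suffix) for ann_id in ann_ids]


# Usage: ids would normally be read line by line from the ids list file.
paths = build_ann_paths("sample/Annotations", ["BloodImage_00007"])
print(paths)
```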
Sample paths.txt
/path/to/annotation/file.xml
/path/to/annotation/file2.xml
...
$ python voc2coco.py \
--ann_paths_list /path/to/annotation/paths.txt \
--labels /path/to/labels.txt \
--output /path/to/output.json \
--ext xml  # optional
As an example, you can use this script to convert Shenggan/BCCD_Dataset (BCCD Dataset is a small-scale dataset for blood cells detection).
$ python voc2coco.py \
--ann_dir sample/Annotations \
--ann_ids sample/dataset_ids/test.txt \
--labels sample/labels.txt \
--output sample/bccd_test_cocoformat.json \
--ext xml
# Check that the output file exists
$ ls sample/ | grep bccd_test_cocoformat.json
bccd_test_cocoformat.json
# Check the first fields of the output
$ cut -f -4 -d , sample/bccd_test_cocoformat.json
{"images": [{"file_name": "BloodImage_00007.jpg", "height": 480, "width": 640, "id": "BloodImage_00007"}
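For a slightly deeper check than `cut`, the output can be loaded with Python's `json` module and summarized. This is a sketch that assumes the file follows the usual COCO layout with top-level `images`, `annotations` and `categories` lists; `summarize_coco` is a hypothetical helper, and the in-memory annotation and category entries below are made up for illustration.

```python
import json
from collections import Counter


def summarize_coco(coco):
    """Count images, annotations and categories in a COCO-style dict."""
    annotations = coco.get("annotations", [])
    per_category = Counter(ann["category_id"] for ann in annotations)
    return {
        "images": len(coco.get("images", [])),
        "annotations": len(annotations),
        "categories": len(coco.get("categories", [])),
        "annotations_per_category": dict(per_category),
    }


# Illustrative data only: the image entry matches the output shown above,
# while the annotation and category entries are invented for this example.
example = json.loads("""
{
  "images": [{"file_name": "BloodImage_00007.jpg", "height": 480,
              "width": 640, "id": "BloodImage_00007"}],
  "annotations": [{"id": 1, "image_id": "BloodImage_00007",
                   "category_id": 1, "bbox": [10, 20, 30, 40]}],
  "categories": [{"id": 1, "name": "Label1"}]
}
""")
summary = summarize_coco(example)
print(summary)
```

For a real file, replace the inline string with `json.load` on the opened output, for example sample/bccd_test_cocoformat.json.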