This repo provides convenient object annotation files for the MS-COCO dataset. Bounding boxes, class ids and labels are provided for each image.
Annotations are collected and combined from 3 different sources:
- coco: MS-COCO ground truths taken directly from (https://cocodataset.org/#home)
- vg: Detections generated using a Faster R-CNN trained on the Visual Genome dataset (https://github.com/shilrley6/Faster-R-CNN-with-model-pretrained-on-Visual-Genome)
- vinvl: Detections taken from Microsoft's VinVL (https://arxiv.org/abs/2101.00529)
Due to the size of the combined object file (628KB), it must be downloaded from this Google Drive link.
The annotations are stored as a JSON file that can be easily loaded as a Python dictionary with image ids as keys. Demo.ipynb shows how to load and visualise object tags.
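As a rough illustration of the loading step, the sketch below reads an annotation file with Python's standard `json` module and collects the labels for one image. The file name `toy_objects.json`, the helper `load_annotations`, and the per-image layout (source name mapping to a list of objects with `box`, `class_id` and `label` fields) are assumptions made for this example, not the repo's confirmed schema; Demo.ipynb remains the reference for the real structure.

```python
import json
import os
import tempfile


def load_annotations(path):
    """Load an annotation JSON file into a dict keyed by image id."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


# Toy stand-in for the downloaded file. The layout and all values here are
# ASSUMED for illustration only; the real file may be organised differently.
toy = {
    "139": {
        "coco": [{"box": [10.0, 20.0, 110.0, 220.0], "class_id": 1, "label": "person"}],
        "vg": [{"box": [12.0, 18.0, 108.0, 215.0], "class_id": 50, "label": "man"}],
        "vinvl": [],
    }
}

# Write the toy data to a temporary file so the example is self-contained.
toy_path = os.path.join(tempfile.mkdtemp(), "toy_objects.json")
with open(toy_path, "w", encoding="utf-8") as f:
    json.dump(toy, f)

annotations = load_annotations(toy_path)

# JSON object keys are always strings, so image ids are looked up as str.
image_id = "139"
labels_by_source = {
    source: [obj["label"] for obj in objects]
    for source, objects in annotations[image_id].items()
}
print(labels_by_source)
```

With the real file, replace `toy_path` with the path to the downloaded JSON and adjust the field names to match what Demo.ipynb shows.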
Below are some examples of how tags vary between the sources: