We built the averaged backbone, Transformer block, and contrastive branch, and evaluated their impact on detection through experiments. Our final model leverages the memory bank to collect rare samples and resample them randomly, generates new rare samples with the attention mechanism of the Transformer block, distinguishes different foreground classes via contrastive learning, and is trained in a multi-task fashion. Trained on the LVIS dataset, the final model surpasses the other models, gaining nearly 3 points of mAP over the plain backbone. Finally, we analyze the shortcomings of the current model and outline future directions.
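For intuition, below is a minimal, self-contained sketch of the memory-bank resampling idea described above. The class name, buffer size, and sampling policy are illustrative assumptions for exposition; they are not the implementation used in this repository.

```python
import random
from collections import defaultdict


class RareSampleMemoryBank:
    """Sketch: keep a small per-class buffer of features from rare categories
    and randomly resample from it to rebalance a training batch."""

    def __init__(self, rare_class_ids, max_per_class=64):
        self.rare_class_ids = set(rare_class_ids)
        self.max_per_class = max_per_class
        self.bank = defaultdict(list)  # class id -> stored feature vectors

    def update(self, features, labels):
        # Store only features belonging to rare classes, capped FIFO-style per class.
        for feat, label in zip(features, labels):
            label = int(label)
            if label in self.rare_class_ids:
                buf = self.bank[label]
                buf.append(feat.detach() if hasattr(feat, "detach") else feat)
                if len(buf) > self.max_per_class:
                    buf.pop(0)

    def sample(self, num_samples):
        # Randomly resample stored rare features (with replacement).
        pool = [(label, feat) for label, feats in self.bank.items() for feat in feats]
        if not pool:
            return [], []
        picks = random.choices(pool, k=num_samples)
        labels = [label for label, _ in picks]
        feats = [feat for _, feat in picks]
        return feats, labels
```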
- Python >= 3.6
- PyTorch 1.6.0 with CUDA 10.2 (refer to the installation guidelines on the PyTorch website)
- Detectron2 v0.4
- OpenCV (optional, but needed for visualizations)
Please refer to the installation instructions in Detectron2.
Dataset downloads are available at the official LVIS website. Please follow Detectron2's guidelines on the expected LVIS dataset structure.
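For reference, Detectron2's datasets README expects roughly the layout below under its datasets/ directory (LVIS annotations reuse the COCO 2017 images); consult that README for the authoritative structure.

```
datasets/
  coco/
    train2017/
    val2017/
  lvis/
    lvis_v0.5_train.json
    lvis_v0.5_val.json
    lvis_v1_train.json
    lvis_v1_val.json
```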
Install lvis-api by:
pip install git+https://github.com/lvis-dataset/lvis-api.git
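To confirm the environment is set up, a quick optional sanity check like the following can be run (assuming all of the packages above, including OpenCV, are installed):

```python
# Optional sanity check of the environment described above.
import torch
import detectron2
import cv2
from lvis import LVIS  # provided by lvis-api; the import itself verifies installation

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("detectron2:", detectron2.__version__)
print("opencv:", cv2.__version__)
```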
Our code is located under projects/long-tail-detection.
Our training and evaluation follow Detectron2's standard workflow. Config files for both LVISv0.5 and LVISv1.0 are provided.
Example: Training on LVISv0.5 with Mask R-CNN ResNet-50
- For multi-GPU training (recommended)
cd projects/long-tail-detection
python dual_train_net.py \
--num-gpus 4 \
--config-file ./configs/Dual-RCNN-sample.yaml OUTPUT_DIR ./outputs
- For single-GPU training, the learning rate and batch size need to be adjusted accordingly
python dual_train_net.py \
--num-gpus 1 \
--config-file ./configs/Dual-RCNN-sample.yaml \
SOLVER.BASE_LR 0.0025 SOLVER.IMS_PER_BATCH 2 OUTPUT_DIR ./outputs
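The same linear scaling rule can be applied to other GPU counts; for example, a 2-GPU run would double both values from the single-GPU command above (these numbers simply follow the scaling rule and are not a tuned setting):
python dual_train_net.py \
--num-gpus 2 \
--config-file ./configs/Dual-RCNN-sample.yaml \
SOLVER.BASE_LR 0.005 SOLVER.IMS_PER_BATCH 4 OUTPUT_DIR ./outputs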
Example: Evaluating on LVISv0.5 with Mask R-CNN ResNet-50
cd projects/long-tail-detection
python dual_train_net.py \
--eval-only MODEL.WEIGHTS /path/to/model_checkpoint \
--config-file ./configs/Dual-RCNN-sample.yaml OUTPUT_DIR ./outputs
By default, LVIS evaluation follows immediately after training.
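Evaluation can also be run programmatically with Detectron2's LVISEvaluator. The sketch below assumes the config file can be merged with the stock get_cfg(); if the project adds custom config keys, the setup code from dual_train_net.py should be used instead. The checkpoint path and dataset split are placeholders.

```python
from detectron2.checkpoint import DetectionCheckpointer
from detectron2.config import get_cfg
from detectron2.data import build_detection_test_loader
from detectron2.evaluation import LVISEvaluator, inference_on_dataset
from detectron2.modeling import build_model

cfg = get_cfg()
cfg.merge_from_file("./configs/Dual-RCNN-sample.yaml")  # may require the project's own cfg setup
cfg.MODEL.WEIGHTS = "/path/to/model_checkpoint"  # placeholder path

# Build the model and load the trained weights.
model = build_model(cfg)
DetectionCheckpointer(model).load(cfg.MODEL.WEIGHTS)

dataset_name = "lvis_v0.5_val"  # one of Detectron2's built-in LVIS splits
evaluator = LVISEvaluator(dataset_name, output_dir="./outputs")
loader = build_detection_test_loader(cfg, dataset_name)
print(inference_on_dataset(model, loader, evaluator))
```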
Memory Bank | Transformer Block | Contrastive Branch | box AP | box AP.r | box AP.c | box AP.f | mask AP
---|---|---|---|---|---|---|---
✓ | | | 22.035 | 16.573 | 19.456 | 27.445 | 22.606
 | ✓ | | 21.860 | 11.003 | 20.673 | 27.682 | 22.663
✓ | ✓ | | 23.029 | 14.389 | 21.793 | 28.028 | 23.399
✓ | | ✓ | 22.015 | 18.457 | 18.873 | 27.371 | 22.536
✓ | ✓ | ✓ | 23.551 | 15.454 | 22.532 | 28.060 | 23.935
Detectron2 has built-in visualization tools. Under the tools folder, visualize_json_results.py can be used to visualize the JSON instance detection/segmentation results produced by LVISEvaluator.
python visualize_json_results.py --input x.json --output dir/ --dataset lvis
Further information can be found in Detectron2's tools README.
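For quick qualitative checks on a single image, Detectron2's DefaultPredictor and Visualizer can also be used directly. The sketch below uses placeholder paths, assumes the config can be merged with the stock get_cfg(), and uses a built-in LVIS split only for class-name metadata.

```python
import cv2
from detectron2.config import get_cfg
from detectron2.data import MetadataCatalog
from detectron2.engine import DefaultPredictor
from detectron2.utils.visualizer import Visualizer

cfg = get_cfg()
cfg.merge_from_file("./configs/Dual-RCNN-sample.yaml")  # may require the project's own cfg setup
cfg.MODEL.WEIGHTS = "/path/to/model_checkpoint"  # placeholder path
predictor = DefaultPredictor(cfg)

image = cv2.imread("input.jpg")  # any test image
outputs = predictor(image)

# Visualizer expects RGB; OpenCV loads BGR, hence the channel flips.
vis = Visualizer(image[:, :, ::-1], MetadataCatalog.get("lvis_v0.5_val"))
result = vis.draw_instance_predictions(outputs["instances"].to("cpu"))
cv2.imwrite("prediction.jpg", result.get_image()[:, :, ::-1])
```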