We built the averaged backbone, Transformer block, and contrastive branch, and evaluated their impact on detection through experiments. Our final model leverages the memory bank to collect rare samples and resample them randomly, generates new rare samples with the attention mechanism of the Transformer block, distinguishes different foreground classes via contrastive learning, and is trained in a multi-task fashion. Trained on the LVIS dataset, the final model surpasses the other models, gaining nearly 3 points of mAP over the plain backbone. Finally, we analyze the shortcomings of the current model and outline future directions.
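For intuition, below is a minimal, self-contained sketch of the memory-bank resampling idea described above. The class name, buffer size, and sampling policy are illustrative assumptions for exposition; they are not the implementation used in this repository.

```python
import random
from collections import defaultdict


class RareSampleMemoryBank:
    """Sketch: keep a small per-class buffer of features from rare categories
    and randomly resample from it to rebalance a training batch."""

    def __init__(self, rare_class_ids, max_per_class=64):
        self.rare_class_ids = set(rare_class_ids)
        self.max_per_class = max_per_class
        self.bank = defaultdict(list)  # class id -> stored feature vectors

    def update(self, features, labels):
        # Store only features belonging to rare classes, capped FIFO-style per class.
        for feat, label in zip(features, labels):
            label = int(label)
            if label in self.rare_class_ids:
                buf = self.bank[label]
                buf.append(feat.detach() if hasattr(feat, "detach") else feat)
                if len(buf) > self.max_per_class:
                    buf.pop(0)

    def sample(self, num_samples):
        # Randomly resample stored rare features (with replacement).
        pool = [(label, feat) for label, feats in self.bank.items() for feat in feats]
        if not pool:
            return [], []
        picks = random.choices(pool, k=num_samples)
        labels = [label for label, _ in picks]
        feats = [feat for _, feat in picks]
        return feats, labels
```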
- Python >= 3.6
- PyTorch 1.6.0 with CUDA 10.2 (refer to the installation guidelines on the PyTorch website)
- Detectron2 v0.4
- OpenCV (optional, but needed for visualizations)
Please refer to the installation instructions in Detectron2.
Dataset downloads are available at the official LVIS website. Please follow Detectron2's guidelines on the expected LVIS dataset structure.
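For reference, Detectron2's datasets README expects roughly the layout below under its datasets/ directory (LVIS annotations reuse the COCO 2017 images); consult that README for the authoritative structure.

```
datasets/
  coco/
    train2017/
    val2017/
  lvis/
    lvis_v0.5_train.json
    lvis_v0.5_val.json
    lvis_v1_train.json
    lvis_v1_val.json
```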
Install lvis-api by:
pip install git+https://github.com/lvis-dataset/lvis-api.git
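To confirm the environment is set up, a quick optional sanity check like the following can be run (assuming all of the packages above, including OpenCV, are installed):

```python
# Optional sanity check of the environment described above.
import torch
import detectron2
import cv2
from lvis import LVIS  # provided by lvis-api; the import itself verifies installation

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("detectron2:", detectron2.__version__)
print("opencv:", cv2.__version__)
```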
Our code is located under projects/long-tail-detection.
Our training and evaluation follow Detectron2's standard workflow. Config files for both LVISv0.5 and LVISv1.0 are provided.
Example: Training on LVISv0.5 with Mask R-CNN ResNet-50
- For multi-GPU training (recommended)
cd projects/long-tail-detection
python dual_train_net.py \
--num-gpus 4 \
--config-file ./configs/Dual-RCNN-sample.yaml OUTPUT_DIR ./outputs
- For single-GPU training, the learning rate and batch size need to be adjusted accordingly
python dual_train_net.py \
--num-gpus 1 \
--config-file ./configs/Dual-RCNN-sample.yaml \
SOLVER.BASE_LR 0.0025 SOLVER.IMS_PER_BATCH 2 OUTPUT_DIR ./outputs
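The same linear scaling rule can be applied to other GPU counts; for example, a 2-GPU run would double both values from the single-GPU command above (these numbers simply follow the scaling rule and are not a tuned setting):
python dual_train_net.py \
--num-gpus 2 \
--config-file ./configs/Dual-RCNN-sample.yaml \
SOLVER.BASE_LR 0.005 SOLVER.IMS_PER_BATCH 4 OUTPUT_DIR ./outputs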
Example: Evaluating on LVISv0.5 with Mask R-CNN ResNet-50
cd projects/long-tail-detection
python dual_train_net.py \
--eval-only MODEL.WEIGHTS /path/to/model_checkpoint \
--config-file ./configs/Dual-RCNN-sample.yaml OUTPUT_DIR ./outputs
By default, LVIS evaluation follows immediately after training.
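Evaluation can also be run programmatically with Detectron2's LVISEvaluator. The sketch below assumes the config file can be merged with the stock get_cfg(); if the project adds custom config keys, the setup code from dual_train_net.py should be used instead. The checkpoint path and dataset split are placeholders.

```python
from detectron2.checkpoint import DetectionCheckpointer
from detectron2.config import get_cfg
from detectron2.data import build_detection_test_loader
from detectron2.evaluation import LVISEvaluator, inference_on_dataset
from detectron2.modeling import build_model

cfg = get_cfg()
cfg.merge_from_file("./configs/Dual-RCNN-sample.yaml")  # may require the project's own cfg setup
cfg.MODEL.WEIGHTS = "/path/to/model_checkpoint"  # placeholder path

# Build the model and load the trained weights.
model = build_model(cfg)
DetectionCheckpointer(model).load(cfg.MODEL.WEIGHTS)

dataset_name = "lvis_v0.5_val"  # one of Detectron2's built-in LVIS splits
evaluator = LVISEvaluator(dataset_name, output_dir="./outputs")
loader = build_detection_test_loader(cfg, dataset_name)
print(inference_on_dataset(model, loader, evaluator))
```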
Memory Bank | Transformer Block | Contrastive Branch | box AP | box AP.r | box AP.c | box AP.f | mask AP
---|---|---|---|---|---|---|---
✓ | | | 22.035 | 16.573 | 19.456 | 27.445 | 22.606
 | ✓ | | 21.860 | 11.003 | 20.673 | 27.682 | 22.663
✓ | ✓ | | 23.029 | 14.389 | 21.793 | 28.028 | 23.399
✓ | | ✓ | 22.015 | 18.457 | 18.873 | 27.371 | 22.536
✓ | ✓ | ✓ | 23.551 | 15.454 | 22.532 | 28.060 | 23.935
Detectron2 has built-in visualization tools. Under the tools folder, visualize_json_results.py can be used to visualize the JSON instance detection/segmentation results produced by LVISEvaluator.
python visualize_json_results.py --input x.json --output dir/ --dataset lvis
Further information can be found in Detectron2's tools README.
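For quick qualitative checks on a single image, Detectron2's DefaultPredictor and Visualizer can also be used directly. The sketch below uses placeholder paths, assumes the config can be merged with the stock get_cfg(), and uses a built-in LVIS split only for class-name metadata.

```python
import cv2
from detectron2.config import get_cfg
from detectron2.data import MetadataCatalog
from detectron2.engine import DefaultPredictor
from detectron2.utils.visualizer import Visualizer

cfg = get_cfg()
cfg.merge_from_file("./configs/Dual-RCNN-sample.yaml")  # may require the project's own cfg setup
cfg.MODEL.WEIGHTS = "/path/to/model_checkpoint"  # placeholder path
predictor = DefaultPredictor(cfg)

image = cv2.imread("input.jpg")  # any test image
outputs = predictor(image)

# Visualizer expects RGB; OpenCV loads BGR, hence the channel flips.
vis = Visualizer(image[:, :, ::-1], MetadataCatalog.get("lvis_v0.5_val"))
result = vis.draw_instance_predictions(outputs["instances"].to("cpu"))
cv2.imwrite("prediction.jpg", result.get_image()[:, :, ::-1])
```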