Wasserstein Distance-based Expansion of Low-Density Latent Regions for Unknown Class Detection
OpenDet_CWA is implemented on top of detectron2 and [Opendet2](https://github.com/csuhan/opendet2).
[arXiv paper](https://arxiv.org/pdf/2401.05594.pdf)
This paper addresses a significant challenge in open-set object detection (OSOD): the tendency of state-of-the-art detectors to erroneously classify unknown objects as known categories with high confidence. We present a novel approach that effectively identifies unknown objects by distinguishing between high- and low-density regions in latent space. Our method builds upon the OpenDet (OD) framework, introducing two new elements to the loss function. These elements enhance the clustering of the known embedding space and expand the low-density regions of the unknown space. The first addition is the Class Wasserstein Anchor (CWA), a new function that refines the classification boundaries. The second is a spectral normalisation step that improves the robustness of the model. Together, these augmentations to the existing Contrastive Feature Learner (CFL) and Unknown Probability Learner (UPL) loss functions significantly improve OSOD performance. Our proposed OpenDet-CWA (OD-CWA) method demonstrates: a) a reduction in open-set errors by approximately 17%-22%, b) an enhancement in novelty detection capability by 1.5%-16%, and c) a decrease in the wilderness index by 2%-20% across various open-set scenarios. These results represent a substantial advancement in the field, showcasing the potential of our approach in managing the complexities of open-set object detection.
Figure: Qualitative comparison between OD (top) and the proposed OD-CWA (bottom). Both models are trained on VOC and the detections are visualised on COCO images. White annotations correspond to known classes seen during training; purple annotations correspond to unknown classes.
The code is based on detectron2 v0.5.
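As a rough illustration of the CWA idea, the sketch below pulls each known class's proposal embeddings toward a learnable per-class anchor under a Sinkhorn (entropic Wasserstein) cost, using geomloss (installed in the setup below). The module name, shapes, and anchor parameterisation are illustrative assumptions, not the repo's actual loss implementation.

```python
# Hypothetical sketch of a Class Wasserstein Anchor (CWA) style loss term.
import torch
import torch.nn as nn
from geomloss import SamplesLoss


class ClassWassersteinAnchor(nn.Module):
    def __init__(self, num_classes: int, feat_dim: int, blur: float = 0.05):
        super().__init__()
        # One learnable anchor point per known class.
        self.anchors = nn.Parameter(torch.randn(num_classes, feat_dim))
        # Sinkhorn divergence: an entropic approximation of Wasserstein-2.
        self.sinkhorn = SamplesLoss(loss="sinkhorn", p=2, blur=blur)

    def forward(self, feats: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        """feats: (N, D) proposal embeddings; labels: (N,) known-class ids."""
        classes = labels.unique()
        loss = feats.new_zeros(())
        for c in classes:
            class_feats = feats[labels == c]        # (n_c, D) one class's embeddings
            anchor = self.anchors[c].unsqueeze(0)   # (1, D) its anchor point
            loss = loss + self.sinkhorn(class_feats, anchor)
        return loss / max(len(classes), 1)
```

In the full method this term complements, rather than replaces, the existing CFL and UPL losses.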
- Installation
Here is a from-scratch setup script.
```bash
conda create -n opendet_cwa python=3.8 -y
conda activate opendet_cwa
pip install torch==2.0.0 torchvision torchaudio torchtext
pip install 'detectron2 @ git+https://github.com/facebookresearch/detectron2.git@5aeb252b194b93dc2879b4ac34bc51a31b5aee13'
pip install geomloss pillow==9.4 opencv-python
git clone https://github.com/proxymallick/OpenDet_CWA.git
cd OpenDet_CWA
```
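A quick sanity check that the key packages import cleanly:

```python
# Sanity check: core dependencies import and report their versions.
import torch
import detectron2
import geomloss  # noqa: F401  (imported only to confirm installation)

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("detectron2:", detectron2.__version__)
```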
- Prepare datasets
Please follow the [README.md](https://github.com/csuhan/opendet2/blob/main/datasets/README.md) of [Opendet2](https://github.com/csuhan/opendet2) to prepare the datasets. The script below builds the open-set combinations from VOC20{07,12} and COCO:
```bash
bash datasets/opendet2_utils/prepare_openset_voc_coco.sh
# use the data splits provided by us
cp -rf datasets/voc_coco_ann datasets/voc_coco
```
We report results on VOC and VOC-COCO-20 and provide pretrained models. Please refer to the corresponding log file for full results.
Faster R-CNN
Method | backbone | mAPK↑(VOC) | WI↓ | AOSE↓ | mAPK↑(VOC-COCO-20) | APU↑ | Download |
---|---|---|---|---|---|---|---|
FR-CNN | R-50 | 80.10 | 18.39 | 15118 | 58.45 | - | config model |
CAC | R-50 | 79.70 | 19.99 | 16033 | 57.76 | - | config model |
PROSER | R-50 | 79.42 | 20.44 | 14266 | 56.72 | 16.99 | config model |
ORE | R-50 | 79.80 | 18.18 | 12811 | 58.25 | 2.60 | config model |
DS | R-50 | 79.70 | 16.76 | 13062 | 58.46 | 8.75 | config model |
OD | R-50 | 80.02 | 12.50 | 10758 | 58.64 | 14.38 | config model |
OD-SN | R-50 | 79.66 | 12.96 | 9432 | 57.86 | 14.78 | config model |
OD-CWA | R-50 | 79.20 | 11.70 | 8748 | 57.58 | 15.36 | config model |
Swin-T
Method | backbone | mAPK↑(VOC) | WI↓ | AOSE↓ | mAPK↑(VOC-COCO-20) | APU↑ | Download |
---|---|---|---|---|---|---|---|
OpenDet (OD) | Swin-T | 83.29 | 12.51 | 9875 | 63.17 | 15.77 | config model |
OD-SN | Swin-T | 82.49 | 14.39 | 7306 | 61.59 | 16.45 | config model |
OD-CWA | Swin-T | 83.34 | 10.35 | 8946 | 63.58 | 18.22 | config model |
Note:
- The code in this repo is modified from OpenDet2.
- The original Opendet2 instructions did not install and run cleanly; this repo provides the fixed, modified code.
- The results above are taken from the paper, not from the reimplementation reported at https://github.com/csuhan/opendet2.
- The Download column contains new models for CAC, OD-SN, and OD-CWA with ResNet-50 and Swin-T backbones (see the spectral normalisation sketch below for the idea behind OD-SN); the remaining comparison models are taken from OpenDet2.
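The spectral normalisation used by the OD-SN variant can be illustrated with PyTorch's built-in utility, which renormalises a layer's weight by its largest singular value at every forward pass. This is a minimal sketch under assumed dimensions, not the repo's exact head definition or layer placement:

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

# Assumed dimensions: 1024-d RoI features, 20 VOC classes + 1 background logit.
feat_dim, num_classes = 1024, 21
cls_head = spectral_norm(nn.Linear(feat_dim, num_classes))

x = torch.randn(4, feat_dim)   # a batch of RoI features
logits = cls_head(x)           # weight is renormalised on each forward pass
```

Constraining the head's Lipschitz constant in this way smooths the learned feature space, which is consistent with the robustness improvement described in the abstract.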
- Evaluation and Visualisation
The embedding space can be visualised by running the Jupyter notebook inference file. It loads the embeddings stored during the evaluation phase on the holdout test set. The notebook also contains code to compute inter-cluster and intra-cluster distances according to six different metrics; the sketch below illustrates two representative ones.
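A minimal sketch of two such cluster statistics: mean intra-cluster distance to each class centroid and mean pairwise distance between centroids. The file names and metric choices here are illustrative assumptions, not the notebook's exact code.

```python
# Hypothetical sketch: two representative cluster-quality statistics over
# embeddings saved during evaluation. File names are assumptions.
import numpy as np
from scipy.spatial.distance import cdist

feats = np.load("embeddings.npy")   # (N, D) per-proposal embeddings
labels = np.load("labels.npy")      # (N,) integer class ids

classes = np.unique(labels)
centroids = np.stack([feats[labels == c].mean(axis=0) for c in classes])

# Intra-cluster: average distance of samples to their own class centroid.
intra = np.mean([
    cdist(feats[labels == c], centroids[i:i + 1]).mean()
    for i, c in enumerate(classes)
])

# Inter-cluster: mean pairwise distance between distinct class centroids.
pairwise = cdist(centroids, centroids)
inter = pairwise[np.triu_indices_from(pairwise, k=1)].mean()

print(f"intra-cluster: {intra:.3f}  inter-cluster: {inter:.3f}")
```

Lower intra-cluster and higher inter-cluster distances indicate tighter, better-separated known-class clusters, which is what the CWA term is designed to encourage.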
- Testing
First, download the pretrained weights from the model zoo above (e.g., OpenDet).
Then, run the following command:
```bash
python tools/train_net.py --num-gpus 8 --config-file configs/faster_rcnn_R_50_FPN_3x_opendet.yaml \
    --eval-only MODEL.WEIGHTS output/faster_rcnn_R_50_FPN_3x_opendet/model_final.pth
```
- Training
The training process is the same as in detectron2:
```bash
python tools/train_net.py --num-gpus 8 --config-file configs/faster_rcnn_R_50_FPN_3x_opendet.yaml
```
To train with the Swin-T backbone, download swin_tiny_patch4_window7_224.pth and convert it to detectron2's format using tools/convert_swin_to_d2.py:
```bash
wget https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_tiny_patch4_window7_224.pth
python tools/convert_swin_to_d2.py swin_tiny_patch4_window7_224.pth swin_tiny_patch4_window7_224_d2.pth
```
If you use this repository, please cite:
```bibtex
@misc{mallick2024wasserstein,
      title={Wasserstein Distance-based Expansion of Low-Density Latent Regions for Unknown Class Detection},
      author={Prakash Mallick and Feras Dayoub and Jamie Sherrah},
      year={2024},
      eprint={2401.05594},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```
Contact
If you have any questions or comments, please contact Prakash Mallick.