Dynamic Residual Filtering With Laplacian Pyramid for Instance Segmentation
This repository is an official PyTorch implementation of the paper "Dynamic Residual Filtering With Laplacian Pyramid for Instance Segmentation".
Minsoo Song and Wonjun Kim*
IEEE Transactions on Multimedia (TMM)
- The overall architecture of the proposed method for instance segmentation. Mask features are decomposed by utilizing the Laplacian pyramid and corresponding residuals are convolved with deformable filters to restore the global layout and local details of the segmentation map.
- Conceptual difference between the previous SOLOv2 and the proposed method. (a) SOLOv2. (b) Ours.
- We design spatially-aware convolution filters to progressively capture the residual form of mask features at each level of the Laplacian pyramid while holding deformable receptive fields with dynamic offset information.
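The Laplacian decomposition described above can be sketched as follows. This is a minimal illustration of decomposing a feature map into per-level residuals and reconstructing it, under standard Laplacian-pyramid assumptions; it is not the repository's actual implementation, which additionally convolves each residual with dynamically generated deformable filters.

```python
import torch
import torch.nn.functional as F

def laplacian_pyramid(x, num_levels=3):
    """Decompose a feature map x of shape (N, C, H, W) into Laplacian
    residuals, ordered fine to coarse, plus the coarsest Gaussian level."""
    residuals = []
    current = x
    for _ in range(num_levels - 1):
        down = F.avg_pool2d(current, kernel_size=2)        # next coarser level
        up = F.interpolate(down, size=current.shape[-2:],
                           mode='bilinear', align_corners=False)
        residuals.append(current - up)                     # band-pass residual
        current = down
    residuals.append(current)                              # low-frequency base
    return residuals

def reconstruct(residuals):
    """Invert the decomposition: upsample each coarser level and add the
    stored residual back, recovering the original feature map."""
    current = residuals[-1]
    for res in reversed(residuals[:-1]):
        current = F.interpolate(current, size=res.shape[-2:],
                                mode='bilinear', align_corners=False) + res
    return current
```

Because each residual stores exactly what upsampling loses, `reconstruct(laplacian_pyramid(x))` recovers `x` up to floating-point error; the coarse levels carry the global layout and the fine residuals carry local details.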
- Python >= 3.8
- Pytorch >= 1.11.0
- Ubuntu >= 16.04
- CUDA >= 10.2
- cuDNN (if CUDA available)
- Other packages: Detectron2, AdelaiDet, OpenCV
python setup.py build develop
Note.
- If you use PyTorch 1.10 with CUDA <= 11.1, an error will occur during the build process. In that case, replace the file "workspace/Lapmask_source/adet/layers/csrc/ml_nms/ml_nms.cu" with the file inside "ml_nms_old.zip" attached to the corresponding folder.
We provide pre-trained ResNet-101 and VoVNet-57 weights for the COCO dataset. These models were trained on 4x Titan X GPUs. This is a reimplementation, so the quantitative results differ slightly from those in our original paper.
| Name | mask AP | AP50 | AP75 | APS | APM | APL |
|------|---------|------|------|-----|-----|-----|
| LapMask_R101 | 41.2 | 62.3 | 44.6 | 20.5 | 44.7 | 56.0 |
| LapMask_V-57 | 41.6 | 62.5 | 45.2 | 21.9 | 44.8 | 55.2 |
You can test our model with your image and visualize the results.
- Make sure you have downloaded the pre-trained model and placed it in './pretrained' before running the demo code.
- Demo Command Line:
############### Example of argument usage #####################
## Running demo using a whole folder of images
# --input: a folder containing the input image to be tested
# --output: a folder where visualization of output images will be stored
OMP_NUM_THREADS=1 python demo/demo.py --config-file configs/LapMask/R101_3x.yaml --input demo_input --output demo_output --confidence-threshold 0.3 --opts MODEL.WEIGHTS ./pretrained/LapMask_R101_pretrained.pth
OMP_NUM_THREADS=1 python demo/demo.py --config-file configs/LapMask/V_57_3x.yaml --input demo_input --output demo_output --confidence-threshold 0.3 --opts MODEL.WEIGHTS ./pretrained/LapMask_V_57_3x_pretrained.pth
## Running demo using a specified image (jpg or png)
# --input: a file path of a single input image to be tested
# --output: a file path where visualization of an output image will be stored
OMP_NUM_THREADS=1 python demo/demo.py --config-file configs/LapMask/R101_3x.yaml --input ./xxxx.jpg --output ./yyyy.jpg --confidence-threshold 0.3 --opts MODEL.WEIGHTS ./pretrained/LapMask_R101_pretrained.pth
OMP_NUM_THREADS=1 python demo/demo.py --config-file configs/LapMask/V_57_3x.yaml --input ./xxxx.jpg --output ./yyyy.jpg --confidence-threshold 0.3 --opts MODEL.WEIGHTS ./pretrained/LapMask_V_57_3x_pretrained.pth
We used the COCO dataset for model training/validation on the Detectron2 platform.
- Download the official COCO dataset and create the directory structure below.
- You need train2017, val2017 for training and validation. test2017 is also required to submit the results to the evaluation server.
The COCO data should be organized as below:
|-- (Working Directory)
|-- datasets
|-- coco
|-- annotations
|-- instances_val2017.json
|-- instances_train2017.json
|-- image_info_test-dev2017.json
|-- image_info_test2017.json
|-- train2017
|-- 000000xxxxxx.jpg
|-- ... (all images in train2017)
|-- val2017
|-- 000000xxxxxx.jpg
|-- ... (all images in val2017)
|-- test2017
|-- 000000xxxxxx.jpg
|-- ... (all images in test2017)
Make sure you have downloaded the pre-trained model and placed it in './pretrained' before running the evaluation code.
- Evaluation Command Line:
# Running evaluation using pre-trained models
## ResNet-101 evaluation
OMP_NUM_THREADS=1 python tools/train_net.py --num-gpus 1 --eval-only --config-file configs/LapMask/R101_3x.yaml OUTPUT_DIR training_dir/R101_3x MODEL.WEIGHTS ./pretrained/LapMask_R101_pretrained.pth
## VoVNet-57 evaluation
OMP_NUM_THREADS=1 python tools/train_net.py --num-gpus 1 --eval-only --config-file configs/LapMask/V_57_3x.yaml OUTPUT_DIR training_dir/V_57_3x MODEL.WEIGHTS ./pretrained/LapMask_V_57_3x_pretrained.pth
# 4-GPU setting
# COCO
## ResNet-101 training
OMP_NUM_THREADS=1 python tools/train_net.py --num-gpus 4 --config-file configs/LapMask/R101_3x.yaml OUTPUT_DIR training_dir/R101_3x
## VoVNet-57 training
OMP_NUM_THREADS=1 python tools/train_net.py --num-gpus 4 --config-file configs/LapMask/V_57_3x.yaml OUTPUT_DIR training_dir/LapMask_V_57_3x
- Results of instance segmentation on the COCO test-dev subset. 1st row: results by SOLOv2. 2nd row: results by SOTR. 3rd row: results by the proposed method. Note that the performance comparison is conducted based on the Detectron2 platform by using the same backbone, i.e., ResNet-101-FPN. Best viewed in color.
- More results of the proposed method on the COCO test-dev subset. Note that our masks show reliable performance under diverse real-world scenarios. Best viewed in color.
When using this code in your research, please cite the following paper:
Minsoo Song, G-M. Um, H. K. Lee, J. Seo, and Wonjun Kim*, "Dynamic residual filtering with Laplacian pyramid for instance segmentation," in IEEE Transactions on Multimedia, Early Access, doi: 10.1109/TMM.2022.3215306.
@ARTICLE{9921322,
author={M. {Song} and G.-M. {Um} and H. K. {Lee} and J. {Seo} and W. {Kim}},
journal={IEEE Transactions on Multimedia},
title={Dynamic Residual Filtering With Laplacian Pyramid for Instance Segmentation},
month={Oct.},
year={2022},
pages = {1-12},
doi={10.1109/TMM.2022.3215306}}