This project has been accepted to ECCV 2024! For more information about the project, please refer to our project homepage.
Please make sure you have CUDA==11.8, GCC==9, and G++==9, since some operators need to be compiled. You can check your CUDA, GCC, and G++ versions with the following commands:
nvcc --version
gcc --version
g++ --version
Then install all necessary packages:
conda create -n OPS python=3.9 -y
conda activate OPS
conda install pytorch==2.0.1 torchvision==0.15.2 pytorch-cuda=11.7 -c pytorch -c nvidia -y
pip install lit==18.1.8 numpy==1.23.1 cmake==3.30.4
pip install openmim==0.3.9
mim install mmengine==0.9.0
mim install mmcv==2.1.0
mim install mmsegmentation==1.2.2
pip install timm==0.9.8 einops==0.7.0 ftfy==6.1.1 pkbar==0.5 prettytable==3.9.0 py360convert==0.1.0 regex==2023.10.3 six==1.16.0
cd ops/models/dcnv3 && bash make.sh
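Before moving on, you can optionally verify the environment; the checks below only read standard version attributes and the CUDA availability flag:
# confirm PyTorch sees the GPU and the mm* stack imports cleanly
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
python -c "import mmcv, mmengine, mmseg; print(mmcv.__version__, mmengine.__version__, mmseg.__version__)"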
Download this repository first, since it contains a Dockerfile; you then build the Docker image from that Dockerfile.
Make sure you have installed the NVIDIA Container Toolkit according to this link; otherwise, you will not be able to use GPUs when running a Docker container.
# build the Docker Image
docker build -t ops:ubuntu18.04 .
# run a Docker container with GPUs
docker run --gpus all -it --shm-size 10gb --name ops ops:ubuntu18.04
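Once inside the container, a quick optional sanity check confirms that the GPUs are visible:
# inside the Docker container
nvidia-smi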
If you are unfamiliar with Docker, it is highly recommended that you use the commands above to create a working environment and then follow the steps below; otherwise, you will run into permission problems.
# you are now in the Docker container
git clone https://github.com/JunweiZheng93/OPS.git
# compile DCNv3
cd OPS/ops/models/dcnv3 && bash make.sh && cd /OPS
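To verify that the extension compiled successfully, you can try importing it. The module name DCNv3 is an assumption based on common DCNv3 builds; adjust it if your build exposes a different name:
# optional: check the compiled extension imports (module name is an assumption)
python -c "import DCNv3; print('DCNv3 extension OK')"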
Now you're ready to go. If you're unfamiliar with Docker, please download the datasets and the pretrained CLIP model inside the Docker container to avoid permission problems.
We train our model on the COCO-Stuff164k dataset and test it on the WildPASS, Matterport3D, and Stanford2D3D datasets. The dataset folder structure is as follows:
OPS
├── ops
├── configs
├── pretrains
│ ├── ViT-B-16.pt
├── data
│ ├── coco_stuff164k
│ │ ├── images
│ │ │ ├── train2017
│ │ │ ├── val2017
│ │ ├── annotations
│ │ │ ├── train2017
│ │ │ ├── val2017
│ ├── matterport3d
│ │ ├── val
│ │ │ ├── rgb
│ │ │ ├── semantic
│ ├── s2d3d
│ │ ├── area1_3_6
│ │ │ ├── rgb
│ │ │ ├── semantic
│ │ ├── area2_4
│ │ │ ├── rgb
│ │ │ ├── semantic
│ │ ├── area5
│ │ │ ├── rgb
│ │ │ ├── semantic
│ ├── WildPASS
│ │ ├── images
│ │ │ ├── val
│ │ ├── annotations
│ │ │ ├── val
├── tools
├── README.md
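If you prefer to set up this layout before downloading anything, a minimal sketch like the following (run from the OPS repository root, using bash brace expansion) creates the same skeleton:
# create the expected dataset/pretrain skeleton (directories only)
mkdir -p pretrains \
         data/coco_stuff164k/{images,annotations}/{train2017,val2017} \
         data/matterport3d/val/{rgb,semantic} \
         data/s2d3d/{area1_3_6,area2_4,area5}/{rgb,semantic} \
         data/WildPASS/{images,annotations}/val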
Please follow this link to download and preprocess the COCO-Stuff164k dataset. For the RERP data augmentation, use the following command:
python tools/dataset_converters/add_erp.py --shuffle
Please follow the WildPASS official repository to download and preprocess the WildPASS dataset.
Please follow the 360BEV official repository to download and preprocess the Matterport3D dataset.
Please follow the 360BEV official repository to download and preprocess the Stanford2D3D dataset.
Please download the pretrained CLIP model using this link. Then use tools/model_converters/clip2mmseg.py to convert the model into the mmseg style:
# the first argument is the path of the downloaded model, the second is the output path of the converted model
python tools/model_converters/clip2mmseg.py path/to/the/downloaded/pretrained/model/ViT-B-16.pt pretrains/ViT-B-16.pt
The converted model should be placed in the pretrains folder (see the dataset folder structure above).
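To confirm the conversion produced a loadable checkpoint, you can run a minimal check; this only deserializes the file and makes no assumption about its key layout:
# optional: make sure the converted checkpoint deserializes
python -c "import torch; ckpt = torch.load('pretrains/ViT-B-16.pt', map_location='cpu'); print(type(ckpt))"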
The checkpoints can be downloaded from:
Checkpoint without RERP
Checkpoint with RERP
Please use the following command to train the model:
bash tools/dist_train.sh <CONFIG_PATH> <GPU_NUM>
<CONFIG_PATH> should be the path of the COCO-Stuff164k config file.
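For example, a 4-GPU training run looks like the following; the config filename is a placeholder, so substitute the actual COCO-Stuff164k config shipped under configs/:
# <coco_stuff164k_config>.py is a placeholder for the real config file
bash tools/dist_train.sh configs/<coco_stuff164k_config>.py 4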
Please use the following command to test the model:
bash tools/dist_test.sh <CONFIG_PATH> <CHECKPOINT_PATH> <GPU_NUM>
<CONFIG_PATH> should be the path of the WildPASS, Matterport3D, or Stanford2D3D config file. <CHECKPOINT_PATH> should be the path of the COCO-Stuff164k checkpoint file.
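For example, evaluating a trained checkpoint on one of the panoramic datasets with 4 GPUs looks like this (both paths are placeholders):
# <panoramic_config>.py and <checkpoint>.pth are placeholders
bash tools/dist_test.sh configs/<panoramic_config>.py work_dirs/<checkpoint>.pth 4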
If you are interested in this work, please cite it as follows:
@inproceedings{zheng2024open,
title={Open Panoramic Segmentation},
author={Zheng, Junwei and Liu, Ruiping and Chen, Yufan and Peng, Kunyu and Wu, Chengzhi and Yang, Kailun and Zhang, Jiaming and Stiefelhagen, Rainer},
booktitle={European Conference on Computer Vision (ECCV)},
year={2024}
}