This repository contains the official implementation of VPOcc: Exploiting Vanishing Point for 3D Semantic Occupancy Prediction.
- Installation
- Preparing Dataset
- Results
- Training
- Evaluation
- Test
- Visualization
- Acknowledgements
- BibTeX
## Installation

Our code is based on CUDA 11.3 and PyTorch 1.11.0.
a. Download the source code:

```bash
git clone https://github.com/vision3d-lab/VPOcc.git
cd VPOcc
```

b. Create a conda environment and install PyTorch 1.11.0 with CUDA 11.3:

```bash
conda create -n vpocc python=3.8 -y
conda activate vpocc
conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch
```

c. Install MMCV and MMDet with OpenMIM:

```bash
pip install -U openmim
mim install mmengine==0.9.0
mim install mmcv==2.0.1
mim install mmdet==3.2.0
```

d. Install additional requirements:

```bash
pip install -r requirements.txt
```

e. Initialize the environment:

```bash
export PYTHONPATH=`pwd`:$PYTHONPATH
wandb init
```
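Optionally, you can sanity-check the setup before moving on; this is just a convenience check, not part of the official instructions:

```bash
# Optional sanity check: confirm GPU visibility and the installed versions.
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"
python -c "import mmcv, mmdet, mmengine; print(mmcv.__version__, mmdet.__version__, mmengine.__version__)"
```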
## Preparing Dataset

We follow the steps from Symphonies.
a. Dataset
- SemanticKITTI: Download RGB images, calibration files, and preprocess the labels (refer to VoxFormer or MonoScene documentation).
- SSCBench-KITTI360: Refer to SSCBench-KITTI360 Dataset Guide.
b. Depth Prediction
- SemanticKITTI: Generate depth predictions with pre-trained MobileStereoNet (see VoxFormer Preprocess Guide).
- SSCBench-KITTI360: Follow the same procedure as SemanticKITTI, but adapt the disparity values as described in the related issue (see the sketch below).
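As a rough, unofficial sketch of such a disparity-to-depth conversion (assuming `.npy` disparity maps; the focal length, baseline, and file paths below are placeholders to be replaced with the actual KITTI-360 calibration and your own paths):

```bash
# Unofficial sketch: depth = focal * baseline / disparity.
# FOCAL and BASELINE are placeholders; read the real values from the KITTI-360 calibration files.
python - <<'EOF'
import numpy as np

FOCAL = 552.55     # placeholder focal length in pixels
BASELINE = 0.6     # placeholder stereo baseline in meters

disp = np.load("disparity/000000.npy")    # hypothetical input path
depth = np.zeros_like(disp, dtype=np.float32)
valid = disp > 0                          # skip invalid (zero) disparities
depth[valid] = FOCAL * BASELINE / disp[valid]
np.save("depth/000000.npy", depth)
EOF
```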
c. Vanishing Point Extraction
- Use pre-trained NeurVPS to extract vanishing points.
- Alternatively, download the pre-extracted vanishing points and extraction code from the Hugging Face dataset (see the sketch below).
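For reference, Hugging Face datasets can be fetched from the command line like this; `<HF_DATASET_ID>` is a placeholder for the dataset linked above:

```bash
# <HF_DATASET_ID> is a placeholder; substitute the dataset ID from the link above.
pip install -U "huggingface_hub[cli]"
huggingface-cli download <HF_DATASET_ID> --repo-type dataset --local-dir ./vanishing_points_download
```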
d. Data Structure (softlinked under `./data`; see the example below)

```
./data
├── SemanticKITTI
│   ├── dataset
│   ├── labels
│   ├── depth
│   └── vanishing_points
└── SSCBench-KITTI360
    ├── data_2d_raw
    ├── depth
    ├── monoscene_preprocess
    └── vanishing_points
```
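For example, the softlinks can be created as follows (the source paths are placeholders for wherever the datasets are stored):

```bash
# Source paths are placeholders; point them at your actual dataset locations.
mkdir -p data
ln -s /path/to/SemanticKITTI ./data/SemanticKITTI
ln -s /path/to/SSCBench-KITTI360 ./data/SSCBench-KITTI360
```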
e. Pretrained Weights

- Download the pre-trained MaskDINO weights and place them under `./backups` (see the example below).
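For instance (the checkpoint filename is a placeholder; keep whatever name the configs expect):

```bash
# Placeholder filename; use the actual MaskDINO checkpoint you downloaded.
mkdir -p backups
mv /path/to/maskdino_checkpoint.pth ./backups/
```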
The resulting directory structure should look like this:

```
./VPOcc
├── backups
├── ckpts
├── configs
├── data
├── maskdino
├── outputs
├── ssc_pl
└── tools
```
## Results

| Dataset | Validation (IoU / mIoU) | Test (IoU / mIoU) | Model |
|---|---|---|---|
| SemanticKITTI | 44.98 / 16.36 📄 log | 44.58 / 16.15 📄 log | model |
| SSCBench-KITTI360 | 46.35 / 20.06 📄 log | 46.39 / 19.80 📄 log | model |
## Training

a. SemanticKITTI

```bash
python tools/train.py --config-name config.yaml trainer.devices=4 \
+data_root=./data/SemanticKITTI \
+label_root=./data/SemanticKITTI/labels \
+depth_root=./data/SemanticKITTI/depth \
+log_name=train_semantickitti \
+model_name=vpocc \
+seed=53
```

b. KITTI-360
```bash
python tools/train.py --config-name config_kitti_360.yaml trainer.devices=4 \
+data_root=./data/SSCBench-KITTI360 \
+label_root=./data/SSCBench-KITTI360/monoscene_preprocess/labels \
+depth_root=./data/SSCBench-KITTI360/depth \
+log_name=train_kitti360 \
+model_name=vpocc \
+seed=53
```

## Evaluation

a. SemanticKITTI
```bash
python tools/evaluate.py --config-name config.yaml trainer.devices=1 \
+ckpt_path=./ckpts/semantickitti.ckpt \
+data_root=./data/SemanticKITTI \
+label_root=./data/SemanticKITTI/labels \
+depth_root=./data/SemanticKITTI/depth \
+log_name=eval_semantickitti \
+model_name=vpocc \
+seed=53
```

b. KITTI-360
```bash
python tools/evaluate.py --config-name config_kitti_360.yaml trainer.devices=1 \
+ckpt_path=./ckpts/kitti360.ckpt \
+data_root=./data/SSCBench-KITTI360 \
+label_root=./data/SSCBench-KITTI360/monoscene_preprocess/labels \
+depth_root=./data/SSCBench-KITTI360/depth \
+log_name=eval_kitti360 \
+model_name=vpocc \
+seed=53
```

## Test

a. SemanticKITTI (hidden test set)
```bash
python tools/test_semantickitti.py --config-name config.yaml trainer.devices=1 \
+ckpt_path=./ckpts/semantickitti.ckpt \
+data_root=./data/SemanticKITTI \
+label_root=./data/SemanticKITTI/labels \
+depth_root=./data/SemanticKITTI/depth \
+log_name=test_semantickitti \
+model_name=vpocc \
+seed=53
```

b. KITTI-360
```bash
python tools/test_kitti360.py --config-name config_kitti_360.yaml trainer.devices=1 \
+ckpt_path=./ckpts/kitti360.ckpt \
+data_root=./data/SSCBench-KITTI360 \
+label_root=./data/SSCBench-KITTI360/monoscene_preprocess/labels \
+depth_root=./data/SSCBench-KITTI360/depth \
+log_name=test_kitti360 \
+model_name=vpocc \
+seed=53
```

## Visualization

- Outputs of the validation set are saved in `./outputs`.
a. SemanticKITTI
```bash
python tools/generate_outputs.py --config-name config.yaml trainer.devices=1 \
+ckpt_path=./ckpts/semantickitti.ckpt \
+data_root=./data/SemanticKITTI \
+label_root=./data/SemanticKITTI/labels \
+depth_root=./data/SemanticKITTI/depth \
+log_name=vis_semantickitti \
+model_name=vpocc
```

b. KITTI-360
```bash
python tools/generate_outputs.py --config-name config_kitti360.yaml trainer.devices=1 \
+ckpt_path=./ckpts/kitti360.ckpt \
+data_root=./data/SSCBench-KITTI360 \
+label_root=./data/SSCBench-KITTI360/monoscene_preprocess/labels \
+depth_root=./data/SSCBench-KITTI360/depth \
+log_name=vis_kitti360 \
+model_name=vpocc
```

- You can visualize the predicted data. In `auto` mode, all visualizations are saved automatically, while `manual` mode opens an interactive window.
c. SemanticKITTI

```bash
python tools/visualize.py --config-name config.yaml \
+path=PATH/TO/PKL/DIR \
+output_dir=PATH/TO/OUTPUT/DIR \
+save_mode={auto/manual}
```

d. KITTI-360

```bash
python tools/visualize.py --config-name config_kitti360.yaml \
+path=PATH/TO/PKL/DIR \
+output_dir=PATH/TO/OUTPUT/DIR \
+save_mode={auto/manual}
```

## Acknowledgements

Special thanks to Symphonies and many thanks to the following excellent projects:
- Coming soon :D