Authors: Jinyuan Qu, Hongyang Li, Xingyu Chen, Shilong Liu, Yukai Shi, Tianhe Ren, Ruitao Jing and Lei Zhang.
Please follow our installation guide to prepare the dependencies.
After downloading and processing the data, place it in the ./data/ directory.
For ScanNet and ScanNet200 dataset preprocessing, please follow the instructions.
We provide the DINO-X features required for training and evaluation, which are available for download on Hugging Face. After downloading, please place them in the ./data/features_2d/ directory.
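As a minimal sketch of this step (the Hugging Face repository id is a placeholder, and the `huggingface-cli` invocation is an assumption, not part of this repo):

```shell
#!/bin/sh
# Create the target directory for the provided 2D features.
mkdir -p data/features_2d
# Download the features from Hugging Face and unpack them here, e.g.:
# huggingface-cli download <repo_id> --local-dir data/features_2d
# (replace <repo_id> with the actual repository; this line is illustrative only)
```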
The directory structure after data preparation should be as follows:
```
data
├── features_2d/
│   ├── scannet/
│   └── scannet200/
├── scannet/
├── scannet200/
└── readme.md
```
First, download our provided checkpoints and put them in ./checkpoint.
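Before running training or evaluation, the expected data layout can be sanity-checked with a short shell sketch (paths taken from the tree above; this helper is not part of the codebase):

```shell
#!/bin/sh
# Report any expected data directory that is still missing.
for d in data/features_2d/scannet data/features_2d/scannet200 \
         data/scannet data/scannet200; do
  [ -d "$d" ] || echo "missing: $d"
done
```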
```shell
# Select the dataset you want to evaluate in eval.sh manually.
bash scripts/eval.sh
```
For training on ScanNet200, please prepare the pretrained backbone "mask3d_scannet200_aligned.pth" and put it in ./pretrained_backbone before training. The backbone is initialized from the Mask3D checkpoint and can be downloaded here.
For training on ScanNet, please prepare the pretrained backbone "aligned_sstnet_scannet.pth" and put it in ./pretrained_backbone before training. The backbone is initialized from the SSTNet checkpoint and can be downloaded here.
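The placement steps above can be sketched as follows (directory names are taken from the instructions; the .pth files themselves must still be downloaded manually from the links above):

```shell
#!/bin/sh
# Create the directories that the checkpoints and pretrained backbones go into.
mkdir -p checkpoint pretrained_backbone
# The weight files are then placed manually, e.g.:
#   pretrained_backbone/mask3d_scannet200_aligned.pth   (for ScanNet200 training)
#   pretrained_backbone/aligned_sstnet_scannet.pth      (for ScanNet training)
```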
```shell
# Select the dataset used for training in train.sh manually.
bash scripts/train.sh
```
We provide the configuration files and checkpoints for the ScanNet and ScanNet200 benchmarks (validation set), using DINO-X as the 2D detection model to provide 2D features.
| Dataset | mAP | mAP50 | mAP25 | Model | Config |
|---|---|---|---|---|---|
| ScanNet (val) | 64.0 | 81.5 | 88.9 | model | config |
| ScanNet200 (val) | 40.2 | 52.4 | 58.6 | model | config |
Additionally, our performance on the ScanNet200 hidden test set is shown below:
| Dataset | mAP | mAP50 | mAP25 | Details |
|---|---|---|---|---|
| ScanNet200 (test) | 34.6 | 45.4 | 51.1 | details |
If you find this work helpful for your research, please cite:
```bibtex
@inproceedings{qu2025segdino3d,
  title={{SegDINO3D: 3D Instance Segmentation Empowered by Both Image-Level and Object-Level 2D Features}},
  author={Qu, Jinyuan and Li, Hongyang and Chen, Xingyu and Liu, Shilong and Shi, Yukai and Ren, Tianhe and Jing, Ruitao and Zhang, Lei},
  booktitle={Association for the Advancement of Artificial Intelligence (AAAI)},
  year={2026},
}
```
We would like to thank the authors of the following projects for their excellent work:
- DINO-X - DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding
- Grounding DINO - Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection
- OneFormer3D - OneFormer3D: One Transformer for Unified Point Cloud Segmentation
- DAB-DETR - DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR
- SPFormer - Superpoint Transformer for 3D Scene Instance Segmentation
- Mask3D - Mask3D: Mask Transformer for 3D Instance Segmentation
- MAFT - Mask-Attention-Free Transformer for 3D Instance Segmentation
- 3DETR - 3DETR: An End-to-End Transformer Model for 3D Object Detection

