Uni3R: Unified 3D Reconstruction and Semantic Understanding via Generalizable Gaussian Splatting from Unposed Multi-View Images
Xiangyu Sun*, Haoyi Jiang*, Liu Liu, Seungtae Nam, Gyeongjin Kang, Xinjie Wang,
Wei Sui, Zhizhong Su, Wenyu Liu, Xinggang Wang, Eunbyung Park
Both the training and inference code have been released, including the geometry loss code.
-
Download repo:
git clone --recurse-submodules https://github.com/HorizonRobotics/Uni3R
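A clone made without `--recurse-submodules` leaves the submodule directories empty, which makes the install steps below fail. A minimal sketch to detect this; the two submodule paths are assumptions taken from the install commands in this README:

```python
from pathlib import Path

# Submodule paths assumed from the install steps in this README.
EXPECTED_SUBMODULES = (
    "submodules/dust3r",
    "submodules/diff-gaussian-rasterization",
)

def missing_submodules(repo_root, expected=EXPECTED_SUBMODULES):
    """Return the expected submodule paths that are absent or empty."""
    root = Path(repo_root)
    missing = []
    for rel in expected:
        p = root / rel
        if not p.is_dir() or not any(p.iterdir()):
            missing.append(rel)
    return missing
```

If the returned list is non-empty, run `git submodule update --init --recursive` inside the repo to fetch the missing submodules.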
-
Create and activate conda environment:
conda create -n uni3r python=3.10
conda activate uni3r
-
Install PyTorch and related packages:
conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia -y
conda install pytorch-cluster pytorch-scatter pytorch-sparse -c pyg -y
-
Install other Python dependencies:
pip install -r requirements.txt
pip install flash-attn --no-build-isolation
-
Install 3D Gaussian Splatting modules:
pip install submodules/diff-gaussian-rasterization
-
Install OpenAI CLIP:
pip install git+https://github.com/openai/CLIP.git
-
Build croco model:
cd submodules/dust3r/croco/models/curope
python setup.py build_ext --inplace
cd ../../../../..
-
Install PyTorch3D:
pip install git+https://github.com/facebookresearch/pytorch3d.git
-
Install gsplat:
pip install git+https://github.com/nerfstudio-project/gsplat.git
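After completing the steps above, a quick sanity check is to try importing each installed package. A minimal sketch; the module names are assumptions inferred from the packages installed above:

```python
import importlib

# Module names assumed from the packages installed in the steps above.
EXPECTED_MODULES = (
    "torch",
    "clip",
    "pytorch3d",
    "gsplat",
    "diff_gaussian_rasterization",
    "flash_attn",
)

def check_imports(modules=EXPECTED_MODULES):
    """Try to import each module and report 'ok' or the failure type."""
    report = {}
    for name in modules:
        try:
            importlib.import_module(name)
            report[name] = "ok"
        except Exception as exc:  # ImportError, or e.g. CUDA init errors
            report[name] = f"failed: {type(exc).__name__}"
    return report

if __name__ == "__main__":
    for name, status in check_imports().items():
        print(f"{name:30s} {status}")
```

Any `failed` entry points at the corresponding install step to repeat; compiled extensions (flash-attn, diff-gaussian-rasterization) most often fail due to a CUDA/PyTorch version mismatch.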
-
Download pre-trained models:
The following three model weights need to be downloaded:
# 1. Create directory for checkpoints
mkdir -p checkpoints/pretrained_models
# 2. LSEG demo model weights
gdown 1FTuHY1xPUkM-5gaDtMfgCl3D0gR89WV7 -O checkpoints/pretrained_models/demo_e200.ckpt
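Downloads via gdown occasionally produce a truncated file or an HTML quota page instead of the actual weights. A minimal existence/size check; the 1 MB floor is an arbitrary assumption, not a property of these checkpoints:

```python
from pathlib import Path

def checkpoint_ok(path, min_bytes=1_000_000):
    """True if the file exists and is at least min_bytes (arbitrary floor)."""
    p = Path(path)
    return p.is_file() and p.stat().st_size >= min_bytes

# Example:
# checkpoint_ok("checkpoints/pretrained_models/demo_e200.ckpt")
```

If the check fails, re-run the gdown command; repeated failures usually mean the Google Drive download quota was hit.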
Please download the final Uni3R checkpoints, including the 2-, 8-, and 16-view variants, from the Checkpoints link.
-
For training: The model can be trained on ScanNet and ScanNet++ datasets.
- Both datasets require signing agreements to access
- Detailed data preparation instructions are available in data_process/data.md
-
For testing: Refer to data_process/data.md for details on the test dataset.
After preparing the datasets, you can train the model using the following command:
bash scripts/train.sh
The training results will be saved to SAVE_DIR. By default, it is set to checkpoints/output.
Optional parameters in scripts/train.sh:
# Directory to save training outputs
--output_dir "checkpoints/output"
Run the following scripts to evaluate on the ScanNet dataset with 2, 8, and 16 views:
bash scripts/test.sh
bash scripts/test_8views.sh
bash scripts/test_16views.sh
- Release inference code.
- Release 2, 8 and 16 views checkpoints.
- Release the training code w/ geometric loss.
- Verify the multi-view training code.
This work builds on many amazing research works and open-source projects; thanks to all the authors for sharing!
- Gaussian-Splatting and diff-gaussian-rasterization
- DUSt3R
- Language-Driven Semantic Segmentation (LSeg)
- LSM
If you find our work useful in your research, please consider giving a star ⭐ and citing the following paper 📝.
@misc{sun2025uni3runified3dreconstruction,
title={Uni3R: Unified 3D Reconstruction and Semantic Understanding via Generalizable Gaussian Splatting from Unposed Multi-View Images},
author={Xiangyu Sun and Haoyi Jiang and Liu Liu and Seungtae Nam and Gyeongjin Kang and Xinjie Wang and Wei Sui and Zhizhong Su and Wenyu Liu and Xinggang Wang and Eunbyung Park},
year={2025},
eprint={2508.03643},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2508.03643},
}