Our Depth Anything models primarily focus on robust relative depth estimation. For metric depth estimation, we follow ZoeDepth and fine-tune from our Depth Anything pre-trained encoder with metric depth supervision from NYUv2 or KITTI.
Indoor: NYUv2

| Method | δ1 | δ2 | δ3 | AbsRel | RMSE | log10 |
|---|---|---|---|---|---|---|
| ZoeDepth | 0.951 | 0.994 | 0.999 | 0.077 | 0.282 | 0.033 |
| Depth Anything | 0.984 | 0.998 | 1.000 | 0.056 | 0.206 | 0.024 |

Outdoor: KITTI

| Method | δ1 | δ2 | δ3 | AbsRel | RMSE | log10 |
|---|---|---|---|---|---|---|
| ZoeDepth | 0.971 | 0.996 | 0.999 | 0.054 | 2.281 | 0.082 |
| Depth Anything | 0.982 | 0.998 | 1.000 | 0.046 | 1.896 | 0.069 |
| Method | SUN AbsRel | SUN δ1 | iBims AbsRel | iBims δ1 | HyperSim AbsRel | HyperSim δ1 | vKITTI AbsRel | vKITTI δ1 | DIODE Outdoor AbsRel | DIODE Outdoor δ1 |
|---|---|---|---|---|---|---|---|---|---|---|
| ZoeDepth | 0.520 | 0.545 | 0.169 | 0.656 | 0.407 | 0.302 | 0.106 | 0.844 | 0.814 | 0.237 |
| Depth Anything | 0.500 | 0.660 | 0.150 | 0.714 | 0.363 | 0.361 | 0.085 | 0.913 | 0.794 | 0.288 |
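For reference, AbsRel, RMSE, log10, and the δ thresholds in the tables above follow the standard depth-estimation definitions. Below is a minimal NumPy sketch of these metrics; it is our own helper for illustration, not part of the evaluation code.

```python
import numpy as np

def depth_metrics(pred, gt, eps=1e-6):
    """Standard metric-depth errors, computed over valid (gt > 0) pixels."""
    mask = gt > eps
    pred, gt = pred[mask], gt[mask]
    pred = np.maximum(pred, eps)  # guard the log against non-positive predictions

    abs_rel = np.mean(np.abs(pred - gt) / gt)                # AbsRel
    rmse = np.sqrt(np.mean((pred - gt) ** 2))                # RMSE
    log10 = np.mean(np.abs(np.log10(pred) - np.log10(gt)))   # log10
    ratio = np.maximum(pred / gt, gt / pred)
    d1, d2, d3 = (np.mean(ratio < 1.25 ** k) for k in (1, 2, 3))  # δ1, δ2, δ3
    return {"AbsRel": abs_rel, "RMSE": rmse, "log10": log10,
            "d1": d1, "d2": d2, "d3": d3}
```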
We provide two pre-trained models, one for indoor metric depth estimation trained on NYUv2, and the other for outdoor metric depth estimation trained on KITTI.
```bash
conda env create -n depth_anything_metric --file environment.yml
conda activate depth_anything_metric
```
Please follow ZoeDepth to prepare the training and test datasets.
Make sure you have downloaded our pre-trained metric-depth models here (for evaluation) and the pre-trained relative-depth model here (for initializing the encoder), and put them under the `checkpoints` directory.
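If you want to sanity-check a downloaded checkpoint before running the full evaluation, a quick inspection with plain PyTorch is enough. This is only a sketch; whether the weights are nested under a `"model"` key is our assumption, so both cases are handled.

```python
import torch

# Load on CPU just to confirm the file is readable and peek at parameter names.
ckpt = torch.load("./checkpoints/depth_anything_metric_depth_indoor.pt", map_location="cpu")

# The exact checkpoint layout is an assumption: unwrap a "model" entry if present.
state_dict = ckpt["model"] if isinstance(ckpt, dict) and "model" in ckpt else ckpt

print(f"{len(state_dict)} entries")
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))
```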
Indoor:

```bash
python evaluate.py -m zoedepth --pretrained_resource="local::./checkpoints/depth_anything_metric_depth_indoor.pt" -d <nyu | sunrgbd | ibims | hypersim_test>
```

Outdoor:

```bash
python evaluate.py -m zoedepth --pretrained_resource="local::./checkpoints/depth_anything_metric_depth_outdoor.pt" -d <kitti | vkitti2 | diode_outdoor>
```
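To sweep all test sets for one checkpoint in a single run, the commands above can be wrapped in a small driver script. This is just a convenience sketch that shells out to `evaluate.py` with the same arguments as the indoor command.

```python
import subprocess

CKPT = "local::./checkpoints/depth_anything_metric_depth_indoor.pt"
INDOOR_DATASETS = ["nyu", "sunrgbd", "ibims", "hypersim_test"]

for dataset in INDOOR_DATASETS:
    # Same invocation as the indoor command above, repeated per test set.
    subprocess.run(
        ["python", "evaluate.py", "-m", "zoedepth",
         f"--pretrained_resource={CKPT}", "-d", dataset],
        check=True,
    )
```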
Please first download our Depth Anything pre-trained model here, and put it under the `checkpoints` directory.
```bash
python train_mono.py -m zoedepth -d <nyu | kitti> --pretrained_resource=""
```
This will automatically use our Depth Anything pre-trained ViT-L encoder.
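For intuition, "using the pre-trained ViT-L encoder" amounts to copying only the encoder weights from the relative-depth checkpoint into the new metric-depth model before fine-tuning. The sketch below only illustrates that idea: the checkpoint file name and the `pretrained.` key prefix are assumptions, not taken from the training code.

```python
import torch

# Illustrative only: filter a relative-depth checkpoint down to its encoder
# weights. Both the file name and the "pretrained." prefix are assumptions.
relative_ckpt = torch.load("./checkpoints/depth_anything_vitl14.pth", map_location="cpu")
encoder_weights = {k: v for k, v in relative_ckpt.items() if k.startswith("pretrained.")}
print(f"selected {len(encoder_weights)} encoder tensors")

# These would then be loaded non-strictly into the metric-depth model, e.g.:
# model.load_state_dict(encoder_weights, strict=False)
```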
If you find this project useful, please consider citing:
```bibtex
@article{depthanything,
  title={Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data},
  author={Yang, Lihe and Kang, Bingyi and Huang, Zilong and Xu, Xiaogang and Feng, Jiashi and Zhao, Hengshuang},
  journal={arXiv:2401.10891},
  year={2024}
}
```