Robust Multi-view Depth (`robustmvd`) is a benchmark and framework for depth estimation from multiple input views, with a focus on robust application independent of the target data.
This repository contains evaluation code for the Robust Multi-view Depth Benchmark.
Further, it contains implementations and weights of recent models together with inference scripts.
In the following, we first describe the setup, structure, and usage of the `rmvd` framework. Following this, we describe the usage of the Robust Multi-view Depth Benchmark.
The code was tested with Python 3.8 and PyTorch 1.9 on Ubuntu 20.04.
For the setup, either the `requirements.txt` or the `setup.py` can be used.

To install the requirements, run:
```bash
pip install -r requirements.txt
```

To install the package using the `setup.py`, run:
```bash
python setup.py install
```
The package can then be imported via `import rmvd`.
To use the dataloaders from `rmvd`, datasets need to be downloaded and some need to be preprocessed before they can be used. For details, see `rmvd/data/README.md`.
The `rmvd` framework contains dataloaders, models, and evaluation/inference scripts for multi-view depth estimation.
The setup and interface of the dataloaders are explained in `rmvd/data/README.md`.
The setup and interface of the models are explained in `rmvd/models/README.md`.
Evaluation is done with the script `eval.py`, for example on ETH3D:
```bash
python eval.py --model robust_mvd --dataset eth3d --eval_type mvd --inputs poses intrinsics --output /tmp/eval_output --input_size 768 1152
```
On KITTI:
```bash
python eval.py --model robust_mvd --dataset kitti --eval_type mvd --inputs poses intrinsics --output /tmp/eval_output --input_size 384 1280
```
On DTU:
```bash
python eval.py --model robust_mvd --dataset dtu --eval_type mvd --inputs poses intrinsics --output /tmp/eval_output --input_size 896 1216
```
On ScanNet:
```bash
python eval.py --model robust_mvd --dataset scannet --eval_type mvd --inputs poses intrinsics --output /tmp/eval_output --input_size 448 640
```
On Tanks and Temples:
```bash
python eval.py --model robust_mvd --dataset tanks_and_temples --eval_type mvd --inputs poses intrinsics --output /tmp/eval_output --input_size 704 1280
```
The parameters `model`, `dataset`, and `eval_type` are required. For further parameters, execute `python eval.py --help`.
It is also possible to run the evaluation from Python code, for example with:
```python
import rmvd
model = rmvd.create_model("robust_mvd", num_gpus=1)  # call with num_gpus=0 for CPU usage
eval = rmvd.create_evaluation(evaluation_type="mvd", out_dir="/tmp/eval_output", inputs=["intrinsics", "poses"])
dataset = rmvd.create_dataset("kitti", "mvd", input_size=(384, 1280))
results = eval(dataset=dataset, model=model)
```
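The same pattern extends to multiple datasets. The following is a minimal sketch that loops over the five test sets, using the dataset names and input sizes from the commands above; the per-dataset output directories are our own choice for illustration, not something the API requires:
```python
import rmvd

# dataset names and (height, width) input sizes from the eval.py commands above
dataset_sizes = {
    "eth3d": (768, 1152),
    "kitti": (384, 1280),
    "dtu": (896, 1216),
    "scannet": (448, 640),
    "tanks_and_temples": (704, 1280),
}

model = rmvd.create_model("robust_mvd", num_gpus=1)  # num_gpus=0 for CPU usage

for name, input_size in dataset_sizes.items():
    # separate output directory per dataset to keep the results apart
    evaluation = rmvd.create_evaluation(evaluation_type="mvd", out_dir=f"/tmp/eval_output/{name}",
                                        inputs=["intrinsics", "poses"])
    dataset = rmvd.create_dataset(name, "mvd", input_size=input_size)
    results = evaluation(dataset=dataset, model=model)
```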
For further details (e.g. additional function parameters, an overview of available models, evaluations, and datasets), see the READMEs mentioned above.
Inference is done with the script `inference.py`, for example:
```bash
python inference.py --model robust_mvd
```
Inference with `rmvd` models can be done programmatically, e.g.:
```python
import rmvd
model = rmvd.create_model("robust_mvd")
dataset = rmvd.create_dataset("eth3d", "mvd", input_size=(384, 576))
sample = dataset[0]
pred, aux = model.run(**sample)
```
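The returned `pred` and `aux` hold the model outputs. Assuming `pred` is a dict of numpy arrays that contains a `depth` key in 1HW format (consistent with the conventions listed below; the exact output format is described in `rmvd/models/README.md`), it can be inspected like this:
```python
import numpy as np

depth = pred["depth"]            # assumption: 1HW float32 depth map
print(depth.shape, depth.dtype)

valid = depth > 0                # predicted depth values of 0 mark invalid pixels
print("mean predicted depth:", np.mean(depth[valid]))
```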
Within this package, we use the following conventions (a small example that follows them is sketched after the list):
- all data is in float32 format
- all data on an image grid uses CHW format (e.g. images are 3HW, depth maps 1HW); batches use NCHW format (e.g. images are N3HW, depth maps N1HW)
- models output predictions potentially at a downscaled resolution
- resolutions are indicated as (height, width) everywhere
- if the depth range of a scene is unknown, we consider a default depth range of (0.1m, 100m)
- all evaluations use numpy arrays as input and outputs
- all evaluations use a batch size of 1 for consistent runtime measurements
- GT depth values / inverse depth values of <=0 indicate invalid values
- predicted depth / inverse depth values of ==0 indicate invalid values
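A hypothetical sample following these conventions could be constructed as below; the variable names are illustrative, and the actual sample layout is documented in `rmvd/data/README.md`:
```python
import numpy as np

height, width = 384, 1280                                # resolutions are (height, width)
image = np.zeros((3, height, width), dtype=np.float32)   # images are 3HW float32
depth = np.ones((1, height, width), dtype=np.float32)    # depth maps are 1HW float32

depth[0, :10, :10] = 0.0         # GT depth values <= 0 indicate invalid pixels
valid = depth > 0

depth_range = (0.1, 100.0)       # default range if the scene depth range is unknown

image_batch = image[None]        # batches use NCHW format, here (1, 3, height, width)
```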
The Robust Multi-view Depth Benchmark (Robust MVD) aims to evaluate robust multi-view depth estimation on arbitrary real-world data. As a proxy for this, it uses test sets based on multiple diverse, existing datasets and evaluates in a zero-shot fashion.
It supports multiple input modalities:
- images
- intrinsics
- ground truth poses
- ground truth depth range (minimum and maximum values of the ground truth depth map)
It optionally supports alignment of predicted and ground truth depth maps to account for scale-ambiguity of some models.
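A common way to do such an alignment (not necessarily the exact scheme used by the benchmark; see the evaluation code for that) is median scaling of the prediction to the ground truth:
```python
import numpy as np

def median_align(pred_depth, gt_depth):
    # scale the prediction so that its median matches the GT median,
    # considering only valid pixels (GT > 0, prediction != 0)
    valid = (gt_depth > 0) & (pred_depth != 0)
    scale = np.median(gt_depth[valid]) / np.median(pred_depth[valid])
    return pred_depth * scale
```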
Depth and uncertainty estimation performance is measured with the following metrics (the two depth metrics are sketched in code after the list):
- Absolute relative error (rel)
- Inliers with a threshold of 1.03
- Sparsification Error Curves
- Area Under Sparsification Error (AUSE)
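A minimal sketch of the two depth metrics named above; the reference implementation lives in the `rmvd` evaluation code:
```python
import numpy as np

def depth_metrics(pred_depth, gt_depth):
    # evaluate only pixels where both GT and prediction are valid
    valid = (gt_depth > 0) & (pred_depth > 0)
    pred, gt = pred_depth[valid], gt_depth[valid]

    rel = np.mean(np.abs(pred - gt) / gt)        # absolute relative error
    ratio = np.maximum(pred / gt, gt / pred)
    inliers = np.mean(ratio < 1.03)              # fraction of pixels within factor 1.03
    return rel, inliers
```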
The following describes how to evaluate on the benchmark.
Evaluation on the benchmark is done with the script `eval.py`, e.g.:
```bash
python eval.py --model robust_mvd --eval_type robustmvd --inputs poses intrinsics --output /tmp/eval_benchmark --eth3d_size 768 1152 --kitti_size 384 1280 --dtu_size 896 1216 --scannet_size 448 640 --tanks_and_temples_size 704 1280
```
The script `eval_all.sh` allows evaluation of all models in the `rmvd` framework on the benchmark:
```bash
./eval_all.sh -o /tmp/eval_benchmark
```
It is also possible to run evaluation on the benchmark from Python code, for example with:
```python
import rmvd
model = rmvd.create_model("robust_mvd", num_gpus=1)  # call with num_gpus=0 for CPU usage
eval = rmvd.create_evaluation(evaluation_type="robustmvd", out_dir="/tmp/eval_benchmark", inputs=["intrinsics", "poses"])
results = eval(model=model, eth3d_size=(768, 1152), kitti_size=(384, 1280), dtu_size=(896, 1216), scannet_size=(448, 640), tanks_and_temples_size=(704, 1280))
```
With the programmatic evaluation described above, it is possible to evaluate custom models on the Robust MVD Benchmark. The only constraint is that the custom model that is passed to `eval(model=model, ..)` needs to have the following functions:
- an `input_adapter` function
- a `__call__` function (in `torch`, basically equivalent to the `forward` function)
- an `output_adapter` function
These functions are used to convert data between the formats of the `rmvd` framework and the model-specific format, and to call the model. For details about these functions, see `rmvd/models/README.md`.
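As a rough sketch of what such a wrapper could look like (the signatures and sample keys below are assumptions for illustration; the actual interface is specified in `rmvd/models/README.md`):
```python
import torch

class CustomModelWrapper(torch.nn.Module):
    # hypothetical wrapper around an existing PyTorch model; subclassing
    # nn.Module provides the required __call__, which routes to forward

    def __init__(self, model):
        super().__init__()
        self.model = model  # the custom model to be evaluated

    def input_adapter(self, images, keyview_idx, poses=None, intrinsics=None, depth_range=None):
        # convert rmvd-format inputs (numpy arrays) to the model-specific format
        images = [torch.from_numpy(image) for image in images]
        return {"images": images, "keyview_idx": keyview_idx}

    def forward(self, images, keyview_idx):
        return self.model(images, keyview_idx)

    def output_adapter(self, model_output):
        # convert the model-specific output back to rmvd format:
        # a pred dict of numpy arrays and an aux dict
        depth = model_output.detach().cpu().numpy()
        return {"depth": depth}, {}
```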
Planned future additions to this repository:
- add models that were evaluated in the publication to the `rmvd` framework
- add code to gather and visualize benchmark results
- add a project page including an overview of the benchmark and a leaderboard
- improve Tanks and Temples GT depth
- add code to train the models
- add code to run inference directly from COLMAP or Meshroom outputs
This is the official repository for the publication:
A Benchmark and a Baseline for Robust Multi-view Depth Estimation
Philipp Schröppel, Jan Bechtold, Artemij Amiranashvili, Thomas Brox
3DV 2022
If you find our work useful, please consider citing:
```
@inproceedings{schroeppel2022robust,
  author = {Philipp Schr\"oppel and Jan Bechtold and Artemij Amiranashvili and Thomas Brox},
  booktitle = {Proceedings of the International Conference on {3D} Vision ({3DV})},
  title = {A Benchmark and a Baseline for Robust Multi-view Depth Estimation},
  year = {2022}
}
```