Published in: 2024 IEEE International Conference on Robotics and Automation (ICRA)
Conference link: ICRA 2024.
In short, this depth prediction model estimates depth maps from underwater RGB images. To resolve scale ambiguity, the network additionally fuses sparse depth priors, e.g. from a SLAM pipeline.
Paper document: IEEE Xplore, arXiv, arXiv PDF, Google Drive
Paper video (3min): Google Drive
Paper graphic: Google Drive
Video: RGB (left) vs. Depth Prediction (right) on FLSea dataset at 10 Hz (Full Video: Google Drive).
Clone the repository, and navigate into its root folder. From there:
# create venv and activate
python3 -m venv venv
source venv/bin/activate
# install pip dependencies
pip3 install -r dependencies.txt
# add repo root to pythonpath
export PYTHONPATH="$PWD:$PYTHONPATH"
While in the repository root folder, run
python3 inference.py
The results will be available under data/out.
The training, test, and inference scripts are available in the repository root folder and serve as examples of how you can train and monitor your own training runs.
The training script produces logfiles in the current working directory for use with tensorboard, a visualization tool for monitoring what is going on during training, test, and inference. To inspect the logfiles (at runtime or post-training), run tensorboard --logdir your_logfile_path; the visualizations can then be viewed in your browser.
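For illustration, logging works as in the following minimal sketch, assuming you log via torch.utils.tensorboard; the tag name, log directory, and dummy loss are placeholders, not the repo's actual training code:

# minimal tensorboard logging sketch (placeholder values, not repo code)
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir="logs/example_run")  # point --logdir here

for epoch in range(10):
    dummy_loss = 1.0 / (epoch + 1)  # stand-in for your actual training loss
    writer.add_scalar("loss/train", dummy_loss, epoch)

writer.close()

Afterwards, tensorboard --logdir logs/example_run serves the plots in your browser at http://localhost:6006.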
The depth_estimation module contains Python packages with the code for setting up the model, as well as utils to load data, compute losses, and visualize data during training.
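For orientation, putting the model together could look roughly like the sketch below; the import path, class name UDFNet, constructor argument, and forward signature are assumptions made for illustration, so check the depth_estimation package for the actual API:

# minimal model usage sketch (names and shapes are assumptions, not verbatim repo code)
import torch
from depth_estimation.model.model import UDFNet  # assumed import path

model = UDFNet(n_bins=80)  # assumed constructor argument
model.eval()

rgb = torch.rand(1, 3, 240, 320)    # batch of RGB images (assumed resolution)
prior = torch.rand(1, 2, 240, 320)  # sparse depth prior parametrization (assumed shape)

with torch.no_grad():
    prediction = model(rgb, prior)  # assumed forward signature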
The data/example_dataset folder contains an example dataset which can be used to run the demo, and which can serve as inspiration for setting up your own custom dataset. Inside, the dataset.py script provides a convenient get_example_dataset() method which reads a list of path tuples from dataset.csv.
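If you mirror this setup for a custom dataset, reading the path tuples can be as simple as the following sketch; the three-column layout (rgb, features, depth) is an assumption here, and this is not the repo's get_example_dataset() implementation:

# minimal csv reading sketch (column layout is an assumption)
import csv

def read_path_tuples(csv_file):
    """Read (rgb, features, depth) path tuples from a dataset csv."""
    with open(csv_file, newline="") as f:
        rows = [tuple(row) for row in csv.reader(f) if row]
    return rows  # drop rows[0] here if your csv has a header

for rgb_path, features_path, depth_path in read_path_tuples("data/example_dataset/dataset.csv"):
    print(rgb_path, features_path, depth_path)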
The helper_scripts folder contains useful scripts for dataset preprocessing, such as extracting visual features for use as sparse depth measurements or creating train/test splits. In general, every data point in a dataset needs:
- RGB image (see data/example_dataset/rgb)
- keypoint locations with corresponding depth (see data/example_dataset/features) *
- depth image ground truth (for training / evaluation only, see data/example_dataset/depth)
* Check out helper_scripts/extract_dataset_features.py for a simple example of how such features can be generated if ground truth depth is available. If not, you could use e.g. SLAM.
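If ground truth is available, the core idea of such a script is simple, as in this minimal sketch; the (row, col, depth) csv layout and the random sampling strategy are assumptions, so see the helper script for the actual implementation:

# minimal feature extraction sketch (file format and sampling are assumptions)
import csv
import numpy as np

def extract_sparse_depth(depth_map, n_points=200):
    """Sample (row, col, depth) triplets at valid pixels of a depth map."""
    rows, cols = np.nonzero(depth_map > 0.0)  # valid depth pixels only
    idx = np.random.choice(len(rows), size=min(n_points, len(rows)), replace=False)
    return [(int(rows[i]), int(cols[i]), float(depth_map[rows[i], cols[i]])) for i in idx]

depth_map = np.random.rand(240, 320).astype(np.float32)  # stand-in for a real depth image
with open("features.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["row", "col", "depth"])  # assumed header
    writer.writerows(extract_sparse_depth(depth_map))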
Then, the .csv file defines the path tuples; see data/example_dataset/dataset.csv.
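For illustration, such a file could contain one comma-separated path tuple per line, e.g. (hypothetical filenames, check the actual file for the exact layout):

data/example_dataset/rgb/img_0001.tif,data/example_dataset/features/img_0001.csv,data/example_dataset/depth/img_0001.tif
data/example_dataset/rgb/img_0002.tif,data/example_dataset/features/img_0002.csv,data/example_dataset/depth/img_0002.tif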
Make sure that you also load your data correctly via the dataloader; e.g. depending on your dataset, images can be in uint8, uint16, or float format (see data/example_dataset/dataset.py).
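A dataloader might normalize such inputs as in the following minimal sketch; the scale factors are the usual dtype maxima and are assumptions here, so adjust them to your sensor:

# minimal dtype normalization sketch (scale factors are assumptions)
import numpy as np

def to_float_image(img):
    """Bring uint8/uint16/float images into a common float32 range."""
    if img.dtype == np.uint8:
        return img.astype(np.float32) / 255.0
    if img.dtype == np.uint16:
        return img.astype(np.float32) / 65535.0
    return img.astype(np.float32)  # already float, assume correct range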