📟 PaGeR: Panorama Geometry Estimation using Single-Step Diffusion Models

This project implements PaGeR, a computer vision method for estimating geometry from monocular panoramic ERP images, introduced in the paper Panorama Geometry Estimation using Single-Step Diffusion Models.

[Website](website here) [Paper](paper here) Demo [Dataset (coming soon)](dataset link here) Depth Model Normals Model Metric Depth Model

Team: Vukasin Bozic, Isidora Slavkovic, Dominik Narnhofer, Nando Metzger, Denis Rozumny, Konrad Schindler, Nikolai Kalischek

We present PaGeR, a diffusion-based model for panoramic geometry reconstruction that extends monocular depth estimation to full 360° scenes. PaGeR is a one-step diffusion model trained directly in pixel space, capable of predicting high-resolution panoramic depth and surface normals with strong generalization to unseen environments. Leveraging advances in panorama generation and diffusion fine-tuning, PaGeR is trained on PanoInfinigen, a newly introduced synthetic dataset of indoor and outdoor scenes with metric depth and normals, producing coherent, metrically accurate geometry. It outperforms prior approaches across standard, few-shot, and zero-shot scenarios.


📢 News

05-02-2026: Released the full training, inference, and evaluation code, along with the arXiv paper, the interactive demo, and the depth, metric-depth, and normals model checkpoints. The full dataset release is coming soon.

🚀 Usage

There are several ways to interact with PaGeR:

  1. For a quick start, use our HF-hosted demo.

  2. Run the demo locally (requires a GPU with 24 GB of VRAM); see the instructions below.

  3. Interactive examples are also available on our project page.

  4. Finally, instructions for local development with this codebase are given below.

πŸ› οΈ Setup

The code was tested on:

  • Debian GNU/Linux 12, Python 3.10.16, PyTorch 2.2.0, and CUDA 12.1.

📦 Repository

Clone the repository (requires git):

git clone https://github.com/prs-eth/PaGeR.git
cd PaGeR

💻 Dependencies

Create the Conda environment and install the dependencies:

conda env create -f environment.yaml
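
Then activate the environment. Its name is defined in environment.yaml; `pager` below is an assumption, so adjust it if the file defines a different name:

conda activate pager  # assumed name; check the `name:` field in environment.yaml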

🏁 Prepare the checkpoints

The model checkpoints are hosted on Hugging Face:

  • Depth: prs-eth/PaGeR-depth
  • Metric Depth: prs-eth/PaGeR-metric-depth
  • Normals: prs-eth/PaGeR-normals

You can either download them automatically by specifying the HF checkpoint name in the arguments, or download them manually and load them from a local path. If you choose the latter, please preserve the original folder structure of the Hugging Face repository.
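
For manual downloads, here is a minimal sketch using the huggingface_hub library (the local directory is an arbitrary choice, not a path required by the code):

from huggingface_hub import snapshot_download

# Downloads the full checkpoint repository, preserving its folder structure.
local_path = snapshot_download(
    repo_id="prs-eth/PaGeR-depth",        # or prs-eth/PaGeR-metric-depth / prs-eth/PaGeR-normals
    local_dir="checkpoints/PaGeR-depth",  # illustrative target directory
)
print(local_path)  # pass this path as the checkpoint argument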

📥 Download the datasets

For training, testing, or evaluation, you will need to choose and download one or more of the following datasets:

For download instructions, terms of use, and dataset descriptions, please refer to the webpages of the respective datasets. We provide dataloaders for all of these datasets; you only need to select the respective dataset in the config file or via a command-line argument.

📷 Local Gradio Demo

The easiest way to test PaGeR locally is to run the Gradio demo. Make sure you have installed the dependencies as described above, then run:

python app.py

Now you can test the model, explore interactive 3D visualizations of both the provided examples and your own images, or download the results.

🔧 Configuration settings

We use OmegaConf and argparse for configuration management in all our scripts and models. A parameter can be set either in the config file or directly on the command line; the latter always takes precedence. Note that model-loading parameters are always read from the YAML config file stored alongside the model checkpoint, and they are not overridden by the local config or CLI arguments.
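
To illustrate this precedence, here is a minimal sketch (not the repo's actual loading code; the config path is hypothetical):

from omegaconf import OmegaConf

# Base configuration from a YAML file (hypothetical path).
cfg = OmegaConf.load("config/inference.yaml")
# Overrides passed on the command line, e.g. `python script.py dataset=Stanford2D3DS`.
cli_overrides = OmegaConf.from_cli()
# In a merge, later values win, so CLI arguments take precedence over the file.
cfg = OmegaConf.merge(cfg, cli_overrides)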

🚀 Run inference

To test the models in the regular inference regime, run:

# Depth
python inference.py \
    --configs "path/to/config" \
    --checkpoint_path "path/to/checkpoint" \
    --enable_xformers \
    --data_path "path/to/dataset" \
    --dataset "dataset-choice" \
    --results_path "path/to/save/results" \
    --pred_only

TODO: generate_point_cloud explanation
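
Until that explanation is added, here is a minimal sketch of how an ERP depth map can be unprojected into a point cloud. The axis convention and the assumption that depth stores per-pixel radial distance are ours and may differ from the repo's generate_point_cloud:

import numpy as np

def erp_depth_to_point_cloud(depth: np.ndarray) -> np.ndarray:
    """Unproject an (H, W) ERP depth map to (H, W, 3) camera-frame points."""
    h, w = depth.shape
    # Pixel-center longitudes in [-pi, pi) and latitudes in (-pi/2, pi/2).
    lon = ((np.arange(w) + 0.5) / w) * 2.0 * np.pi - np.pi
    lat = np.pi / 2.0 - ((np.arange(h) + 0.5) / h) * np.pi
    lon, lat = np.meshgrid(lon, lat)
    # Unit ray directions on the sphere (y-up convention assumed here).
    dirs = np.stack(
        [np.cos(lat) * np.sin(lon), np.sin(lat), np.cos(lat) * np.cos(lon)],
        axis=-1,
    )
    # Assumes depth is the radial (Euclidean) distance along each ray.
    return dirs * depth[..., None]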

βš™οΈ Inference settings

The behavior of the code can be customized in the following ways:

| Argument | Description |
|---|---|
| `config` | Path to the YAML configuration file. |
| `checkpoint_path` | Model checkpoint to load (local path or Hugging Face repo ID). |
| `results_path` | Output directory where predictions are saved. |
| `dataset` | Dataset to use (list given above). |
| `data_path` | Root directory of the dataset. |
| `scenes` | Scene type to use: indoor, outdoor, or both (if supported). |
| `img_report_frequency` | Save an example output image every N samples. |
| `pred_only` | Save only the prediction image (otherwise an RGB + prediction mosaic is saved). |
| `generate_eval` | Save predictions as `.npz` files for later evaluation. |
| `enable_xformers` | Enable memory-efficient attention (recommended). |

📊 Run Evaluation (for academic comparisons)

To evaluate the inference results of our model (or another model) with the standard set of depth-estimation metrics, run:

# Depth
python evaluation/depth_evaluation.py \
    --pred_path "path/to/preds/folder" \
    --data_path "path/to/dataset" \
    --dataset "dataset-choice" \
    --alignment_type "alignment-type-to-apply" \
    --save_error_maps
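
For reference, the standard depth metrics usually include the absolute relative error and the δ-threshold accuracies. A minimal sketch of these definitions (common practice, not necessarily the exact implementation in depth_evaluation.py):

import numpy as np

def depth_metrics(pred: np.ndarray, gt: np.ndarray, mask: np.ndarray) -> dict:
    """Absolute relative error and delta accuracies over valid pixels."""
    p, g = pred[mask], gt[mask]
    abs_rel = float(np.mean(np.abs(p - g) / g))
    # delta_i: fraction of pixels with max(p/g, g/p) below 1.25^i.
    ratio = np.maximum(p / g, g / p)
    deltas = {f"delta{i}": float(np.mean(ratio < 1.25 ** i)) for i in (1, 2, 3)}
    return {"abs_rel": abs_rel, **deltas}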

Surface-normal estimation can be evaluated, similarly to the PanoNormal paper, by running the following command:

# Normals
python evaluation/normals_estimation.py \
    --pred_path "path/to/preds/folder" \
    --data_path "path/to/dataset" \
    --dataset "dataset-choice" \
    --alignment_type "alignment-type-to-apply" \
    --save_error_maps
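
Surface-normal evaluation is conventionally based on the per-pixel angular error between predicted and ground-truth normals. A minimal sketch (our formulation, not necessarily the script's exact code):

import numpy as np

def mean_angular_error(pred: np.ndarray, gt: np.ndarray, mask: np.ndarray) -> float:
    """Mean angle in degrees between (H, W, 3) unit normal maps over valid pixels."""
    cos = np.clip(np.sum(pred * gt, axis=-1), -1.0, 1.0)
    angles = np.degrees(np.arccos(cos))
    return float(angles[mask].mean())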

Evaluation Settings

The behavior of the code can be customized in the following ways:

| Argument | Type | Choices | Description |
|---|---|---|---|
| `--data_path` | str | – | Root directory of the dataset containing ground-truth depth and metadata. |
| `--dataset` | str | PanoInfinigen, Matterport3D360, Stanford2D3DS, Structured3D, Structured3D_ScannetPP | Dataset to evaluate on. Use PanoInfinigen for the synthetic dataset. |
| `--pred_path` | str | – | Directory containing the predicted depth maps to be evaluated. |
| `--alignment_type` | str | metric, scale, scale_and_shift | Alignment strategy applied between prediction and ground truth before evaluation. |
| `--save_error_maps` | flag | – | If set, saves per-sample error maps during evaluation. |
| `--error_maps_saving_frequency` | int | – | Frequency (in number of batches) at which error maps are saved. |
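
For intuition, the scale_and_shift alignment is conventionally a closed-form least-squares fit of a scale s and shift t between prediction and ground truth before computing metrics. A minimal sketch (conventions may differ from the repo's implementation):

import numpy as np

def align_scale_and_shift(pred: np.ndarray, gt: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Fit s, t minimizing ||s * pred + t - gt||^2 over valid pixels, then apply them."""
    p, g = pred[mask], gt[mask]
    A = np.stack([p, np.ones_like(p)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, g, rcond=None)
    return s * pred + t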

πŸ‹πŸ» Run training

🦿 Evaluation on test datasets

Install additional dependencies:

pip install -r requirements+.txt -r requirements.txt

Set the data directory variable (also needed in the evaluation scripts) and download the evaluation datasets (depth, normals) into the corresponding subfolders:

export BASE_DATA_DIR=<YOUR_DATA_DIR>  # Set target data directory

# Depth
wget -r -np -nH --cut-dirs=4 -R "index.html*" -P ${BASE_DATA_DIR} https://share.phys.ethz.ch/~pf/bingkedata/marigold/evaluation_dataset/

# Normals
wget -r -np -nH --cut-dirs=4 -R "index.html*" -P ${BASE_DATA_DIR} https://share.phys.ethz.ch/~pf/bingkedata/marigold/marigold_normals/evaluation_dataset.zip
unzip ${BASE_DATA_DIR}/evaluation_dataset.zip -d ${BASE_DATA_DIR}/
rm -f ${BASE_DATA_DIR}/evaluation_dataset.zip

For download instructions for the intrinsic image decomposition test data, please refer to the iid-appearance instructions and iid-lighting instructions.

Run inference and evaluation scripts, for example:

# Depth
bash script/depth/eval/11_infer_nyu.sh  # Run inference
bash script/depth/eval/12_eval_nyu.sh   # Evaluate predictions
# Normals
bash script/normals/eval/11_infer_scannet.sh  # Run inference
bash script/normals/eval/12_eval_scannet.sh   # Evaluate predictions
# IID
bash script/iid/eval/11_infer_appearance_interiorverse.sh  # Run inference
bash script/iid/eval/12_eval_appearance_interiorverse.sh   # Evaluate predictions

bash script/iid/eval/21_infer_lighting_hypersim.sh  # Run inference
bash script/iid/eval/22_eval_lighting_hypersim.sh   # Evaluate predictions
# Depth (the original CVPR version)
bash script/depth/eval_old/11_infer_nyu.sh  # Run inference
bash script/depth/eval_old/12_eval_nyu.sh   # Evaluate predictions

Note: although the seed has been set, the results might still be slightly different on different hardware.

πŸ‹οΈ Training

Based on the previously created environment, install extended requirements:

pip install -r requirements++.txt -r requirements+.txt -r requirements.txt

Set environment parameters for the data directory:

export BASE_DATA_DIR=YOUR_DATA_DIR        # directory of training data
export BASE_CKPT_DIR=YOUR_CHECKPOINT_DIR  # directory of pretrained checkpoint

Download the Stable Diffusion v2 checkpoint into ${BASE_CKPT_DIR} (backup link).

Prepare training data

Depth

Prepare the Hypersim and Virtual KITTI 2 datasets and save them into ${BASE_DATA_DIR}. Please refer to this README for Hypersim preprocessing.

Normals

Prepare the Hypersim, Interiorverse, and Sintel datasets and save them into ${BASE_DATA_DIR}. Please refer to this README for Hypersim preprocessing, this README for Interiorverse, and this README for Sintel.

Intrinsic Image Decomposition

Appearance model: Prepare the Interiorverse dataset and save it into ${BASE_DATA_DIR}. Please refer to this README for Interiorverse preprocessing.

Lighting model: Prepare the Hypersim dataset and save it into ${BASE_DATA_DIR}. Please refer to this README for Hypersim preprocessing.

Run training script

# Depth
python script/depth/train.py --config config/train_marigold_depth.yaml
# Normals
python script/normals/train.py --config config/train_marigold_normals.yaml
# IID (appearance model)
python script/iid/train.py --config config/train_marigold_iid_appearance.yaml

# IID (lighting model)
python script/iid/train.py --config config/train_marigold_iid_lighting.yaml

Resume from a checkpoint, e.g.:

# Depth
python script/depth/train.py --resume_run output/marigold_base/checkpoint/latest
# Normals
python script/normals/train.py --resume_run output/train_marigold_normals/checkpoint/latest
# IID (appearance model)
python script/iid/train.py --resume_run output/train_marigold_iid_appearance/checkpoint/latest

# IID (lighting model)
python script/iid/train.py --resume_run output/train_marigold_iid_lighting/checkpoint/latest

Compose checkpoint:

Only the U-Net and the scheduler config are updated during training; they are saved in the training directory. To use the inference pipeline with your training result:

  • replace the unet folder in the Marigold checkpoints with the one from your checkpoint output folder;
  • replace the scheduler/scheduler_config.json file in the Marigold checkpoints with the checkpoint/scheduler_config.json generated during training (a sketch of both steps is given below). Then refer to this section for evaluation.
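
A minimal sketch of the two replacements above (all paths are illustrative):

import shutil

base = "checkpoints/marigold-base"   # illustrative base checkpoint directory
run = "output/train_run/checkpoint"  # illustrative training output directory

# 1) Swap in the fine-tuned U-Net weights.
shutil.rmtree(f"{base}/unet")
shutil.copytree(f"{run}/unet", f"{base}/unet")

# 2) Swap in the scheduler config produced during training.
shutil.copyfile(f"{run}/scheduler_config.json",
                f"{base}/scheduler/scheduler_config.json")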

Note: Although random seeds have been set, the training result might differ slightly on different hardware. It is recommended to train without interruption.

✏️ Contributing

Please refer to these instructions.

🎓 Citation

Please cite our paper:

Put citations here

🎫 License

The code of this work is licensed under the Apache License, Version 2.0 (as defined in LICENSE).

The models are licensed under the RAIL++-M License (as defined in LICENSE-MODEL).

By downloading and using the code and models, you agree to the terms in LICENSE and LICENSE-MODEL, respectively.

Acknowledgements

This project builds upon and is inspired by the following repositories and works:

We thank the authors and maintainers for making their code publicly available.
