Javier Tirado-Garín Javier Civera
I3A, University of Zaragoza

Camera calibration from a single perspective/edited/distorted image using a freely chosen camera model
The only requirements are Python (≥3.10) and PyTorch. The project, in development mode, can be installed with:
git clone https://github.com/javrtg/AnyCalib.git && cd AnyCalib
pip install -e .
Alternatively, a compatible version of xformers can also be installed for better efficiency by running the following instead of pip install -e .:
pip install -e .[eff]
import numpy as np
import torch
from PIL import Image # the library of choice to load images
from anycalib import AnyCalib
dev = torch.device("cuda")
# load input image and convert it to a (3, H, W) tensor with RGB values in [0, 1]
image = np.array(Image.open("path/to/image.jpg").convert("RGB"))
image = torch.tensor(image, dtype=torch.float32, device=dev).permute(2, 0, 1) / 255
# instantiate AnyCalib according to the desired model_id. Options:
# "anycalib_pinhole": trained with *only* perspective (pinhole) images,
# "anycalib_gen": trained with perspective, distorted and strongly distorted images,
# "anycalib_dist": trained with distorted and strongly distorted images,
# "anycalib_edit": trained with edited (stretched and cropped) perspective images.
model = AnyCalib(model_id="anycalib_pinhole").to(dev)
# Alternatively, the weights can be loaded from the huggingface hub as follows:
# NOTE: huggingface_hub (https://pypi.org/project/huggingface-hub/) needs to be installed
# model = AnyCalib().from_pretrained(model_id=<model_id>).to(dev)
# predict according to the desired camera model. Implemented camera models are detailed further below.
output = model.predict(image, cam_id="pinhole")
# output is a dictionary with the following key-value pairs:
# {
# "intrinsics": (D,) tensor with the estimated intrinsics for the selected camera model,
# "fov_field": (N, 2) tensor with the regressed FoV field by the network. N≈320^2 (resolution close to the one seen during training),
# "tangent_coords": alias for "fov_field",
# "rays": (N, 3) tensor with the corresponding (via the exponential map) ray directions in the camera frame (x right, y down, z forward),
# "pred_size": (H, W) tuple with the image size used by the network. It can be used e.g. for resizing the FoV/ray fields to the original image size.
# }
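The pred_size entry can be used, for instance, to resize the predicted ray field back to the original image resolution. A minimal sketch of such post-processing, assuming (for illustration only; the description above does not guarantee it) that the N rays are laid out in row-major order over the pred_size grid:

import torch.nn.functional as F

H0, W0 = image.shape[-2:]  # original image size
Hp, Wp = output["pred_size"]  # image size used by the network
# reshape the (N, 3) ray field into an image-like (1, 3, Hp, Wp) tensor
rays = output["rays"].reshape(Hp, Wp, 3).permute(2, 0, 1).unsqueeze(0)
# bilinearly upsample to the original resolution and re-normalize to unit norm
rays_full = F.interpolate(rays, size=(H0, W0), mode="bilinear", align_corners=False)
rays_full = F.normalize(rays_full, dim=1)  # (1, 3, H0, W0)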
The weights of the selected model_id, if not already downloaded, will be automatically downloaded to:
- the torch hub cache directory (torch.hub.get_dir()) if AnyCalib(model_id=<model_id>) is used, or
- the huggingface cache directory if AnyCalib().from_pretrained(model_id=<model_id>) is used.
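For instance, the torch hub cache location on a given machine can be checked with:

import torch
print(torch.hub.get_dir())  # typically ~/.cache/torch/hub; configurable via torch.hub.set_dir()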
Additional configuration options are indicated in the docstring of AnyCalib:
help(AnyCalib)
"""AnyCalib class.
Args for instantiation:
model_id: one of {'anycalib_pinhole', 'anycalib_gen', 'anycalib_dist', 'anycalib_edit'}.
Each model differs in the type of images seen during training:
* 'anycalib_pinhole': Perspective (pinhole) images,
* 'anycalib_gen': General images, including perspective, distorted and
strongly distorted images,
* 'anycalib_dist': Distorted images, using the Brown-Conrady camera model,
and strongly distorted images, using the EUCM camera model, and
* 'anycalib_edit': Edited (stretched and cropped) perspective
images.
Default: 'anycalib_pinhole'.
nonlin_opt_method: nonlinear optimization method: 'gauss_newton' or 'lev_mar'.
Default: 'gauss_newton'.
nonlin_opt_conf: nonlinear optimization configuration.
This config can be used to control the number of iterations and the space
where the residuals are minimized. See the classes `GaussNewtonCalib` or
`LevMarCalib` under anycalib/optim for details. Default: None.
init_with_sac: use RANSAC instead of nonminimal fit for initializing the
intrinsics. Default: False.
fallback_to_sac: use RANSAC if nonminimal fit fails. Default: True.
ransac_conf: RANSAC configuration. This config can be used to control e.g. the
inlier threshold or the number of minimal samples to try. See the class
`RANSAC` in anycalib/ransac.py for details. Default: None.
rm_borders: border size of the dense FoV fields to ignore during fitting.
Default: 0.
sample_size: approximate number of 2D-3D correspondences to use for fitting the
intrinsics. Negative value -> no subsampling. Default: -1.
"""
AnyCalib can also be executed in batch mode, optionally with a different camera model for each image. For example:
images = ... # (B, 3, H, W)
# NOTE: if cam_ids is a list, then len(cam_ids) must be equal to B
cam_ids = ["pinhole", "radial:1", "kb:4"] # different camera models for each image
cam_ids = "pinhole" # same camera model across images
output = model.predict(images, cam_id=cam_ids)
# corresponding batched output dictionary:
# {
# "intrinsics": List[(D_i,) tensors] for each camera model "i",
# "fov_field": (B, N, 2) tensor,
# "tangent_coords": alias for "fov_field",
# "rays": (B, N, 3) tensor,
# "pred_size": (H, W).
# }
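Since each camera model can have a different number of intrinsics D_i, the batched "intrinsics" entry is a list of tensors rather than a single stacked tensor. A small sketch for unpacking it, assuming the list form of cam_ids above:

for i, (cam_id, intrins) in enumerate(zip(cam_ids, output["intrinsics"])):
    # e.g. "pinhole" -> shape (4,), "radial:1" -> shape (5,), "kb:4" -> shape (8,)
    print(f"image {i}: {cam_id} -> intrinsics of shape {tuple(intrins.shape)}")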
cam_id represents the camera model identifier(s) that can be used in the predict method. D corresponds to the number of intrinsics of the camera model and determines the length of each intrinsics tensor in the output dictionary.
cam_id | Description | D | Intrinsics
---|---|---|---
pinhole | Pinhole camera model | 4 | fx, fy, cx, cy
simple_pinhole | pinhole with one focal length | 3 | f, cx, cy
radial:k | Radial (Brown-Conrady) [1] camera model with k distortion coefficients | 4+k | fx, fy, cx, cy, k1, ..., kk
simple_radial:k | radial:k with one focal length | 3+k | f, cx, cy, k1, ..., kk
kb:k | Kannala-Brandt [2] camera model with k distortion coefficients | 4+k | fx, fy, cx, cy, k1, ..., kk
simple_kb:k | kb:k with one focal length | 3+k | f, cx, cy, k1, ..., kk
ucm | Unified Camera Model [3] | 5 | fx, fy, cx, cy, ξ
simple_ucm | ucm with one focal length | 4 | f, cx, cy, ξ
eucm | Enhanced Unified Camera Model [4] | 6 | fx, fy, cx, cy, α, β
simple_eucm | eucm with one focal length | 5 | f, cx, cy, α, β
division:k | Division camera model [5] with k distortion coefficients | 4+k | fx, fy, cx, cy, k1, ..., kk
simple_division:k | division:k with one focal length | 3+k | f, cx, cy, k1, ..., kk
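The D column can be checked programmatically. For example, for the radial (Brown-Conrady) model with two distortion coefficients, D = 4 + 2:

# sanity-check sketch: "radial:2" should yield 6 intrinsics
out = model.predict(image, cam_id="radial:2")
assert out["intrinsics"].shape == (6,)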
In addition to the original works, we recommend the works of Usenko et al. [6] and Lochman et al. [7] for a comprehensive comparison of the different camera models.
The evaluation and training code is built upon the siclib library from GeoCalib, which can be installed as:
pip install -e siclib
Running the evaluation commands will write the results to outputs/results/.
Running the evaluation commands will download the dataset to data/lamar2k, which will take around 400 MB of disk space.
AnyCalib trained on perspective images:
python -m siclib.eval.lamar2k_rays --conf anycalib_pretrained --tag anycalib_p --overwrite
AnyCalib trained on perspective, distorted and strongly distorted images:
python -m siclib.eval.lamar2k_rays --conf anycalib_pretrained --tag anycalib_g --overwrite model.model_id=anycalib_gen
Running the evaluation commands will download the dataset to data/megadepth2k, which will take around 2 GB of disk space.
AnyCalib trained on perspective images:
python -m siclib.eval.megadepth2k_rays --conf anycalib_pretrained --tag anycalib_p --overwrite
AnyCalib trained on perspective, distorted and strongly distorted images:
python -m siclib.eval.megadepth2k_rays --conf anycalib_pretrained --tag anycalib_g --overwrite model.model_id=anycalib_gen
Running the evaluation commands will download the dataset to data/tartanair, which will take around 1.7 GB of disk space.
AnyCalib trained on perspective images:
python -m siclib.eval.tartanair_rays --conf anycalib_pretrained --tag anycalib_p --overwrite
AnyCalib trained on perspective, distorted and strongly distorted images:
python -m siclib.eval.tartanair_rays --conf anycalib_pretrained --tag anycalib_g --overwrite model.model_id=anycalib_gen
Running the evaluation commands will download the dataset to data/stanford2d3d, which will take around 844 MB of disk space.
AnyCalib trained on perspective images:
python -m siclib.eval.stanford2d3d_rays --conf anycalib_pretrained --tag anycalib_p --overwrite
AnyCalib trained on perspective, distorted and strongly distorted images:
python -m siclib.eval.stanford2d3d_rays --conf anycalib_pretrained --tag anycalib_g --overwrite model.model_id=anycalib_gen
Running the evaluation commands will download the dataset to data/megadepth2k-radial, which will take around 1.4 GB of disk space.
AnyCalib trained on perspective, distorted and strongly distorted images:
python -m siclib.eval.megadepth2k_radial_rays --conf anycalib_pretrained --tag anycalib_g --overwrite model.model_id=anycalib_gen
Running the evaluation commands will download the dataset to data/monovo2k, which will take around 445 MB of disk space.
AnyCalib trained on distorted and strongly distorted images:
python -m siclib.eval.monovo2k_rays --conf anycalib_pretrained --tag anycalib_d --overwrite model.model_id=anycalib_dist data.cam_id=ucm
AnyCalib trained on perspective, distorted and strongly distorted images:
python -m siclib.eval.monovo2k_rays --conf anycalib_pretrained --tag anycalib_g --overwrite model.model_id=anycalib_gen data.cam_id=ucm
To comply with the ScanNet++ license, we cannot directly share its data.
Please download the ScanNet++ dataset following the official instructions and indicate the path to the root of the dataset in the following evaluation commands.
This path needs to be provided only the first time the evaluation is run. On this first run, the command will automatically copy the evaluation images under data/scannetpp2k, which will take around 760 MB of disk space.
AnyCalib trained on distorted and strongly distorted images:
python -m siclib.eval.scannetpp2k_rays --conf anycalib_pretrained --tag anycalib_d --overwrite model.model_id=anycalib_dist scannetpp_root=<path_to_scannetpp>
AnyCalib trained on perspective, distorted and strongly distorted images:
python -m siclib.eval.scannetpp2k_rays --conf anycalib_pretrained --tag anycalib_g --overwrite model.model_id=anycalib_gen scannetpp_root=<path_to_scannetpp>
Running the evaluation commands will download the dataset to data/lamar2k_edit, which will take around 224 MB of disk space.
AnyCalib trained following WildCam [8] training protocol:
python -m siclib.eval.lamar2k_rays --conf anycalib_pretrained --tag anycalib_e --overwrite model.model_id=anycalib_edit eval.eval_on_edit=True
Running the evaluation commands will download the dataset to data/tartanair_edit, which will take around 488 MB of disk space.
AnyCalib trained following WildCam [8] training protocol:
python -m siclib.eval.tartanair_rays --conf anycalib_pretrained --tag anycalib_e --overwrite model.model_id=anycalib_edit eval.eval_on_edit=True
Running the evaluation commands will download the dataset to data/stanford2d3d_edit, which will take around 420 MB of disk space.
AnyCalib trained following WildCam [8] training protocol:
python -m siclib.eval.stanford2d3d_rays --conf anycalib_pretrained --tag anycalib_e --overwrite model.model_id=anycalib_edit eval.eval_on_edit=True
We extend the OpenPano dataset from GeoCalib with panoramas that need not be aligned with the gravity direction. This extended version consists of tonemapped panoramas from The Laval Photometric Indoor HDR Dataset, PolyHaven, HDRMaps, AmbientCG and BlenderKit.
Before sampling images from the panoramas, first download the Laval dataset following the instructions on the corresponding project page and place the panoramas in data/indoorDatasetCalibrated. Then, tonemap the HDR images using the following command:
python -m siclib.datasets.utils.tonemapping --hdr_dir data/indoorDatasetCalibrated --out_dir data/laval-tonemap
To download the rest of the panoramas and organize all of them in their corresponding splits (data/openpano_v2/panoramas/{split}), execute:
python -m siclib.datasets.utils.download_openpano --name openpano_v2 --laval_dir data/laval-tonemap
Alternatively, the panoramas from PolyHaven, HDRMaps, AmbientCG and BlenderKit can be downloaded manually.
Afterwards, the different training datasets mentioned in the paper can be created with the commands below. We recommend keeping the flag device=cuda, as it significantly speeds up the creation of the datasets, but if no GPU is available, the flag can be omitted.
Perspective training dataset (data/openpano_v2/openpano_v2):
python -m siclib.datasets.create_dataset_from_pano --config-name openpano_v2 device=cuda
General training dataset with perspective, distorted and strongly distorted images (data/openpano_v2/openpano_v2_gen):
python -m siclib.datasets.create_dataset_from_pano_rays --config-name openpano_v2_gen device=cuda
Radially distorted training dataset (data/openpano_v2/openpano_v2_radial):
python -m siclib.datasets.create_dataset_from_pano_rays --config-name openpano_v2_radial device=cuda
Distorted and strongly distorted training dataset (data/openpano_v2/openpano_v2_dist):
python -m siclib.datasets.create_dataset_from_pano_rays --config-name openpano_v2_dist device=cuda
As with the evaluation, the training code is built upon the siclib library from GeoCalib. Here we adapt their instructions to AnyCalib. siclib can be installed by executing:
pip install -e siclib
Once the extended OpenPano dataset (openpano_v2), or at least one of its variants, has been downloaded and prepared, we can train AnyCalib with it.
For training with perspective images:
python -m siclib.train anycalib_op_p --conf anycalib --distributed
Feel free to use any other experiment name. By default, the checkpoints will be written to outputs/training/. The default batch size is 24, which requires at least one NVIDIA Tesla V100 GPU with 32GB of VRAM. If only one GPU is used, the flag --distributed can be omitted. Configurations are managed by Hydra and can be overwritten from the command line.
For example, for training with perspective, distorted and strongly distorted images:
python -m siclib.train anycalib_op_g --conf anycalib --distributed data.dataset_dir='data/openpano_v2/openpano_v2_gen'
For training with distorted and strongly distorted images:
python -m siclib.train anycalib_op_d --conf anycalib --distributed data.dataset_dir='data/openpano_v2/openpano_v2_dist'
For training with radially distorted images:
python -m siclib.train anycalib_op_r --conf anycalib --distributed data.dataset_dir='data/openpano_v2/openpano_v2_radial'
For training with edited (stretched and cropped) perspective images:
python -m siclib.train anycalib_op_e --conf anycalib --distributed \
data.dataset_dir='data/openpano_v2/openpano_v2' \
data.im_geom_transform.change_pixel_ar=true \
data.im_geom_transform.crop=0.5
After training, the model can be evaluated using its experiment name:
python -m siclib.eval.<benchmark> --checkpoint <experiment_name> --tag <experiment_tag> --conf anycalib
Thanks to the authors of GeoCalib for open-sourcing the comprehensive and easy-to-use siclib, which we use as the base of our evaluation and training code.
Thanks to the authors of The Laval Photometric Indoor HDR Dataset for allowing us to release the weights of AnyCalib under a permissive license.
Thanks also to the authors of The Laval Photometric Indoor HDR Dataset, PolyHaven, HDRMaps, AmbientCG and BlenderKit for providing high-quality, freely available panoramas that made the training of AnyCalib possible.
If you use any ideas from the paper or code from this repo, please consider citing:
@InProceedings{tirado2025anycalib,
author={Javier Tirado-Gar{\'\i}n and Javier Civera},
title={{AnyCalib: On-Manifold Learning for Model-Agnostic Single-View Camera Calibration}},
booktitle={ICCV},
year={2025}
}
Code and weights are provided under the Apache 2.0 license.
[1] Close-Range Camera Calibration. D.C. Brown, 1971.
[2] A Generic Camera Model and Calibration Method for Conventional, Wide-Angle, and Fish-Eye Lenses. J. Kannala and S.S. Brandt, TPAMI, 2006.
[3] Single View Point Omnidirectional Camera Calibration from Planar Grids. C. Mei and P. Rives, ICRA, 2007.
[4] An Enhanced Unified Camera Model. B. Khomutenko et al., IEEE RA-L, 2016.
[5] Simultaneous Linear Estimation of Multiple View Geometry and Lens Distortion. A.W. Fitzgibbon, CVPR, 2001.
[6] The Double Sphere Camera Model. V. Usenko et al., 3DV, 2018.
[7] BabelCalib: A Universal Approach to Calibrating Central Cameras. Y. Lochman et al., ICCV, 2021.
[8] Tame a Wild Camera: In-the-Wild Monocular Camera Calibration. S. Zhu et al., NeurIPS, 2023.