This is the official repository for the CVPR 2024 paper "CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition".
This repo follows the framework of GSV-Cities for training and the Visual Geo-localization Benchmark for evaluation. You can download the GSV-Cities datasets HERE, and refer to VPR-datasets-downloader to prepare the test datasets.
The test dataset should be organized in a directory tree as follows:
```
├── datasets_vg
│   └── datasets
│       └── pitts30k
│           └── images
│               ├── train
│               │   ├── database
│               │   └── queries
│               ├── val
│               │   ├── database
│               │   └── queries
│               └── test
│                   ├── database
│                   └── queries
```
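As an optional sanity check (not part of this repo), a short Python snippet like the following can verify that a prepared dataset matches the layout above; the root path is a placeholder:

```python
from pathlib import Path

# Placeholder path: point this at your own datasets_vg location.
root = Path("/path/to/your/datasets_vg/datasets/pitts30k/images")

# Every split must contain a database folder and a queries folder.
for split in ("train", "val", "test"):
    for sub in ("database", "queries"):
        folder = root / split / sub
        assert folder.is_dir(), f"missing folder: {folder}"
print("pitts30k layout OK")
```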
Before training, you should download the pre-trained foundation model DINOv2 (ViT-B/14) HERE.
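For reference, here is a minimal sketch of how the downloaded checkpoint could be restored, assuming the official DINOv2 torch.hub entry; this is illustrative, not the repo's exact loading code:

```python
import torch

# Build the ViT-B/14 architecture via the official DINOv2 hub entry without
# fetching weights, then load the locally downloaded checkpoint into it.
model = torch.hub.load("facebookresearch/dinov2", "dinov2_vitb14", pretrained=False)
state_dict = torch.load("/path/to/pre-trained/dinov2_vitb14_pretrain.pth", map_location="cpu")
model.load_state_dict(state_dict)
model.eval()
```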
To train the model:

```
python3 train.py --eval_datasets_folder=/path/to/your/datasets_vg/datasets --eval_dataset_name=pitts30k --foundation_model_path=/path/to/pre-trained/dinov2_vitb14_pretrain.pth --epochs_num=10
```
To evaluate the trained model:
```
python3 eval.py --eval_datasets_folder=/path/to/your/datasets_vg/datasets --eval_dataset_name=pitts30k --resume=/path/to/trained/model/CricaVPR.pth
```
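For context, the Visual Geo-localization Benchmark reports Recall@N: a query counts as correct if any of its top-N retrieved database images is a true positive (typically within 25 m of the query's GPS position). The sketch below shows this standard protocol with FAISS; the descriptor arrays and helper function are illustrative placeholders, not this repo's API:

```python
import faiss
import numpy as np

def recall_at_n(db_desc, q_desc, positives_per_query, n_values=(1, 5, 10)):
    """Fraction of queries with at least one true positive in the top-N retrievals."""
    # Exact L2 nearest-neighbor search over the database descriptors.
    index = faiss.IndexFlatL2(db_desc.shape[1])
    index.add(db_desc.astype(np.float32))
    _, predictions = index.search(q_desc.astype(np.float32), max(n_values))
    return {
        n: np.mean([
            len(np.intersect1d(pred[:n], pos)) > 0
            for pred, pos in zip(predictions, positives_per_query)
        ])
        for n in n_values
    }
```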
To evaluate with PCA dimensionality reduction:

```
python3 eval.py --eval_datasets_folder=/path/to/your/datasets_vg/datasets --eval_dataset_name=pitts30k --resume=/path/to/trained/model/CricaVPR.pth --pca_dim=4096 --pca_dataset_folder=pitts30k/images/train
```
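Conceptually, the PCA step fits a projection on descriptors extracted from the folder given by --pca_dataset_folder and then reduces every global descriptor to pca_dim dimensions. A hedged sketch with scikit-learn, where the descriptor file names are placeholders:

```python
import numpy as np
from sklearn.decomposition import PCA

# Fit PCA (with whitening) on descriptors from the PCA training folder.
train_desc = np.load("train_descriptors.npy")  # placeholder: descriptors from pitts30k/images/train
pca = PCA(n_components=4096, whiten=True)
pca.fit(train_desc)

# Project the descriptors used at test time down to 4096 dimensions.
test_desc = np.load("test_descriptors.npy")    # placeholder: descriptors to reduce
test_desc_reduced = pca.transform(test_desc)
```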
You can directly download the trained model HERE.
Another work of ours, SelaVPR (a two-stage VPR method based on DINOv2), achieved SOTA performance on several datasets. The code is released HERE.
Parts of this repo are inspired by the following repositories:
Visual Geo-localization Benchmark
If you find this repo useful for your research, please consider leaving a star ⭐️ and citing the paper:
```
@inproceedings{lu2024cricavpr,
  title={CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition},
  author={Lu, Feng and Lan, Xiangyuan and Zhang, Lijun and Jiang, Dongmei and Wang, Yaowei and Yuan, Chun},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month={June},
  year={2024}
}
```