
🆕 [Feb 2026] The code for obtaining the unified dataset has been released HERE.

This is the official repository for the NeurIPS 2025 paper "Towards Implicit Aggregation: Robust Image Representation for Place Recognition in the Transformer Era".

[Paper on ArXiv | Paper on HF | Model on HF]

ImAge is an implicit aggregation method that produces robust global image descriptors for visual place recognition, without modifying the backbone or attaching an extra aggregator. It simply adds a few aggregation tokens before a specific block of the transformer backbone, leveraging the inherent self-attention mechanism to implicitly aggregate patch features. This offers a novel perspective distinct from the previous paradigm and achieves SOTA performance effectively and efficiently.
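To make the idea concrete, below is a minimal PyTorch sketch of the general technique. It is not the official implementation; the class and parameter names (ImplicitAggregation, num_agg_tokens, insert_at) are illustrative, and details such as token initialization and descriptor readout may differ from the paper:

import torch
import torch.nn as nn

class ImplicitAggregation(nn.Module):
    """Toy sketch: insert learnable aggregation tokens before block
    `insert_at` of a ViT-style backbone and read them out as the
    global descriptor (illustrative, not the official ImAge code)."""
    def __init__(self, blocks, embed_dim=768, num_agg_tokens=8, insert_at=8):
        super().__init__()
        self.blocks = blocks          # nn.ModuleList of transformer blocks
        self.insert_at = insert_at
        self.agg_tokens = nn.Parameter(torch.zeros(1, num_agg_tokens, embed_dim))
        nn.init.trunc_normal_(self.agg_tokens, std=0.02)

    def forward(self, x):             # x: (B, N_patches, D) patch tokens
        for i, blk in enumerate(self.blocks):
            if i == self.insert_at:   # prepend aggregation tokens once
                agg = self.agg_tokens.expand(x.shape[0], -1, -1)
                x = torch.cat([agg, x], dim=1)
            x = blk(x)                # self-attention mixes patch info into agg tokens
        num_agg = self.agg_tokens.shape[1]
        desc = x[:, :num_agg].flatten(1)   # read out the aggregation tokens
        return nn.functional.normalize(desc, dim=-1)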

The difference between ImAge and the previous paradigm is shown in this figure:

To quickly get started, you can load our model via Torch Hub:

import torch
model = torch.hub.load("Lu-Feng/ImAge", "ImAge")
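Once loaded, the model can extract global descriptors. The snippet below is a hedged example: it assumes the model takes RGB batches whose height and width are multiples of 14 (the DINOv2 patch size) and returns one descriptor per image; check the repo for the exact preprocessing:

import torch

model = torch.hub.load("Lu-Feng/ImAge", "ImAge")
model.eval()

# Dummy batch; in practice, use properly normalized RGB images
# whose sides are multiples of 14 (assumption based on ViT-B/14).
images = torch.randn(2, 3, 224, 224)
with torch.no_grad():
    descriptors = model(images)
print(descriptors.shape)  # expected: (2, descriptor_dim)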

Getting Started

This repo follows the framework of GSV-Cities for training and the Visual Geo-localization Benchmark for evaluation. You can download the GSV-Cities dataset HERE, and refer to VPR-datasets-downloader to prepare the test datasets.

Each test dataset should be organized in the following directory tree:

├── datasets_vg
    └── datasets
        └── pitts30k
            └── images
                ├── train
                │   ├── database
                │   └── queries
                ├── val
                │   ├── database
                │   └── queries
                └── test
                    ├── database
                    └── queries
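As a quick sanity check of the layout, a small (hypothetical) Python snippet that verifies the pitts30k splits exist under your dataset root:

from pathlib import Path

# Adjust to your own dataset root.
root = Path("/path/to/your/datasets_vg/datasets/pitts30k/images")
for split in ("train", "val", "test"):
    for part in ("database", "queries"):
        folder = root / split / part
        assert folder.is_dir(), f"missing folder: {folder}"
print("pitts30k layout looks good")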

Before training, you should download the pre-trained foundation model DINOv2-register (ViT-B/14) HERE.

Train

python3 train.py \
    --eval_datasets_folder=/path/to/your/datasets_vg/datasets \
    --eval_dataset_name=pitts30k \
    --backbone=dinov2 \
    --freeze_te=8 \
    --num_learnable_aggregation_tokens=8 \
    --train_batch_size=120 \
    --lr=0.00005 \
    --epochs_num=20 \
    --patience=20 \
    --initialization_dataset=msls_train \
    --training_dataset=gsv_cities \
    --foundation_model_path=/path/to/pre-trained/dinov2_vitb14_reg4_pretrain.pth

If you don't have the MSLS-train dataset, you can also set --initialization_dataset=gsv_cities. Additionally, --training_dataset can be set to gsv_cities or unified_dataset (see Here to get it).

Test

python3 eval.py \
    --eval_datasets_folder=/path/to/your/datasets_vg/datasets \
    --eval_dataset_name=pitts30k \
    --backbone=dinov2 \
    --freeze_te=8 \
    --num_learnable_aggregation_tokens=8 \
    --resume=/path/to/trained/model/ImAge_GSV.pth

Trained Model

| Training set    | Pitts30k | MSLS-val | Nordland | Download |
|-----------------|----------|----------|----------|----------|
| GSV-Cities      | 94.0     | 93.0     | 93.2     | LINK     |
| Unified dataset | 94.1     | 94.5     | 97.7     | LINK     |

Others

This repository also supports training NetVLAD, SALAD, and BoQ on the GSV-Cities dataset using plain PyTorch (rather than the pytorch-lightning used in other repos) with Automatic Mixed Precision (AMP).
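For reference, a typical PyTorch AMP training step follows the pattern below. This is a generic sketch, not this repo's exact training loop; model, optimizer, criterion, and loader are hypothetical placeholders:

import torch

scaler = torch.cuda.amp.GradScaler()
for images, labels in loader:
    optimizer.zero_grad(set_to_none=True)
    with torch.cuda.amp.autocast():        # run forward pass in mixed precision
        descriptors = model(images.cuda())
        loss = criterion(descriptors, labels.cuda())
    scaler.scale(loss).backward()          # scale loss to avoid fp16 underflow
    scaler.step(optimizer)
    scaler.update()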

Acknowledgements

Parts of this repo are inspired by the following repositories:

GSV-Cities

Visual Geo-localization Benchmark

DINOv2

Citation

If you find this repo useful for your research, please consider leaving a star ⭐️ and citing the paper:

@inproceedings{ImAge,
  title={Towards Implicit Aggregation: Robust Image Representation for Place Recognition in the Transformer Era},
  author={Feng Lu and Tong Jin and Canming Ye and Xiangyuan Lan and Yunpeng Liu and Chun Yuan},
  booktitle={The Annual Conference on Neural Information Processing Systems},
  year={2025}
}

@article{selavprpp,
  author={Lu, Feng and Jin, Tong and Lan, Xiangyuan and Zhang, Lijun and Liu, Yunpeng and Wang, Yaowei and Yuan, Chun},
  title={SelaVPR++: Towards Seamless Adaptation of Foundation Models for Efficient Place Recognition},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2026},
  volume={48},
  number={3},
  pages={2731-2748},
  doi={10.1109/TPAMI.2025.3629287}
}
