This repository provides the implementation of the SNSM approach. Some of the code snippets are borrowed from the AnyLoc and Patch-NetVLAD repositories.
No need to worry about collecting training data, tedious preprocessing, long and frustrating training hours, or the even more frustrating hyper-parameter tuning. Just sit back, relax, and run the command below. Ideas like this come to mind when you are too lazy to do all that work. Fewer worries, less stress, and better health 😄.
We introduce a new training-free technique, called SNSM, for the all-day Visual Place Recognition (VPR) problem, specifically addressing illumination variation. The SNSM function accepts a feature map from the backbone model and aggregates it into a modality-invariant (RGB and thermal) feature map called an SNSM map. Essentially, it captures the support value of each selected patch from its neighbourhood, which retains the homogeneous structural details and suppresses the heterogeneous modality-specific features. The support value is the correlation between a selected patch and its neighbouring patches. For further details, please refer to the full paper. Interestingly, the simple and training-free SNSM improves upon popular VPR models and various unsupervised methods by a considerable margin.
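The support-value idea above can be sketched as follows. This is a minimal, illustrative NumPy sketch, not the repository's actual implementation: the function name `snsm_map`, the neighbourhood radius `k`, and the choice of cosine correlation averaged over the neighbourhood are assumptions for illustration; please see `All-Backbones-VLAD_bl.py` and the paper for the exact formulation.

```python
import numpy as np

def snsm_map(feat, k=1):
    """Illustrative support-value aggregation (hypothetical sketch, not the repo's code).

    feat: (H, W, C) feature map from a frozen backbone.
    k: neighbourhood radius; each patch is compared with up to (2k+1)^2 - 1 neighbours.
    Returns an (H, W) map of support values.
    """
    H, W, C = feat.shape
    # L2-normalise patch descriptors so dot products become cosine correlations
    f = feat / (np.linalg.norm(feat, axis=-1, keepdims=True) + 1e-8)
    support = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            i0, i1 = max(0, i - k), min(H, i + k + 1)
            j0, j1 = max(0, j - k), min(W, j + k + 1)
            neigh = f[i0:i1, j0:j1].reshape(-1, C)
            sims = neigh @ f[i, j]  # correlation of the patch with each neighbour
            # average over neighbours, excluding the patch's self-similarity (= 1)
            support[i, j] = (sims.sum() - 1.0) / (len(sims) - 1)
    return support
```

In this reading, patches whose structure is shared across the RGB and thermal modalities correlate strongly with their neighbourhood and receive high support, while modality-specific details receive low support and are suppressed.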
The SNSM directory contains All-Backbones-VLAD_bl.py, which produces recall rates for the chosen dataset and aggregator. Currently, only unsupervised feature extraction techniques are included; the VPR models reported in the paper are off-the-shelf and open-source.
Inference datasets: RGB-T Datasets Drive link
Run the command below for inference:

sh All-Backbones-VLAD_bl.sh

The SNSM aggregator is activated by default. Please provide the appropriate arguments in the bash file for your choice of aggregator. The available aggregators are VLAD, VLAD-API, GeM, GAP, GMP, and SNSM; more information about them is available in the main function of All-Backbones-VLAD_bl.py.
Please use the BibTeX entry below to cite this work if you use the code.
@INPROCEEDINGS{10889993,
author={Uggi, Anuradha and Channappayya, Sumohana},
booktitle={ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
title={Training-free Adapter for Multi-Modal Image Matching for All-Day Visual Place Recognition},
year={2025},
volume={},
number={},
pages={1-5},
keywords={Computer vision;Adaptation models;Image recognition;Correlation;Source coding;Speech recognition;Signal processing;Acoustics;Speech processing;Visual place recognition;Multi-modal image retrieval;RGB;thermal;and visual place recognition},
doi={10.1109/ICASSP49660.2025.10889993}}