Bridging Mini-Batch and Asymptotic Analysis in Contrastive Learning: From InfoNCE to Kernel-Based Losses
InfoNCE variants exhibit direct and indirect coupling between their alignment and uniformity terms, which hurts optimisation. We introduce the Decoupled Hyperspherical Energy Loss (DHEL), which completely decouples alignment from uniformity. We also revisit Kernel Contrastive Losses (KCL), which decouple these terms as well.
DHEL and KCL:
- outperform other InfoNCE variants, such as SimCLR and DCL, even with smaller batch sizes
- are robust to the choice of hyperparameters
- effectively utilize more dimensions, mitigating the dimensionality collapse problem
Also, KCL possesses several intriguing properties:
- the expected loss remains unaffected by the number of negative samples
- its minima can be identified non-asymptotically
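To make the decoupling concrete, below is a minimal, unofficial PyTorch sketch of a DHEL-style objective. The function name, the single-view (unsymmetrised) form, and the default temperature are illustrative assumptions rather than this repository's API; refer to the official implementation for the exact definition. The alignment term depends only on positive pairs, while the uniformity term is a hyperspherical energy computed over same-view negatives, so the two terms share no samples.

```python
import torch
import torch.nn.functional as F


def dhel_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """Illustrative DHEL-style objective for two (N, d) batches of view embeddings."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    n = z1.size(0)
    # Alignment term: depends only on the positive pairs.
    alignment = -(z1 * z2).sum(dim=-1) / temperature
    # Uniformity term: hyperspherical energy over same-view negatives only,
    # so it shares no samples with the alignment term above.
    sim = (z1 @ z1.T) / temperature
    eye = torch.eye(n, dtype=torch.bool, device=z1.device)
    uniformity = torch.logsumexp(sim.masked_fill(eye, float("-inf")), dim=-1)
    return (alignment + uniformity).mean()
```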
This repository provides implementations of DHEL and KCL, as presented in the paper available here.
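In the same spirit, here is a rough sketch of a kernel contrastive loss with a Gaussian (RBF) kernel. The kernel choice, the bandwidth `t`, and the weighting `lam` are assumptions made for illustration; the kernels and weighting used in the paper and in this repository may differ.

```python
import torch
import torch.nn.functional as F


def kcl_loss(z1: torch.Tensor, z2: torch.Tensor, t: float = 2.0, lam: float = 1.0) -> torch.Tensor:
    """Illustrative kernel contrastive loss with a Gaussian (RBF) kernel."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    n = z1.size(0)
    # Alignment term: maximise the expected kernel value between positive pairs.
    alignment = -torch.exp(-t * (z1 - z2).pow(2).sum(dim=-1)).mean()
    # Uniformity term: minimise the expected kernel value between distinct same-view samples.
    sq_dists = torch.cdist(z1, z1).pow(2)
    off_diag = ~torch.eye(n, dtype=torch.bool, device=z1.device)
    uniformity = torch.exp(-t * sq_dists[off_diag]).mean()
    return alignment + lam * uniformity
```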
Additionally, it includes the metrics applied to the learned representations: the effective rank and the introduced Wasserstein-based uniformity metric. The latter measures the Wasserstein distance between the learned similarity distribution and the optimal one; unlike the conventional uniformity metric, it estimates uniformity accurately, without underestimating it.
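For orientation, the sketch below approximates these two metrics: it estimates the Wasserstein-based uniformity metric by comparing the pairwise-similarity distribution of the learned embeddings with that of points sampled uniformly on the same hypersphere, and computes the effective rank as the exponential of the entropy of the normalised singular values. The sampling-based reference distribution and all names are illustrative assumptions; the repository contains the exact implementations used in the paper.

```python
import numpy as np
from scipy.stats import wasserstein_distance


def _offdiag_similarities(z: np.ndarray) -> np.ndarray:
    """Pairwise inner products of L2-normalised rows, excluding the diagonal."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sims = z @ z.T
    return sims[~np.eye(len(z), dtype=bool)]


def wasserstein_uniformity(embeddings: np.ndarray, n_ref: int = 2048, seed: int = 0) -> float:
    """Distance between the learned similarity distribution and that of a uniform
    sample on the same hypersphere (for large batches, subsample the embeddings first)."""
    rng = np.random.default_rng(seed)
    # Gaussian samples become uniform on the sphere after the normalisation in the helper.
    reference = rng.standard_normal((n_ref, embeddings.shape[1]))
    return wasserstein_distance(_offdiag_similarities(embeddings), _offdiag_similarities(reference))


def effective_rank(embeddings: np.ndarray) -> float:
    """Exponential of the entropy of the normalised singular values (Roy & Vetterli, 2007)."""
    s = np.linalg.svd(embeddings, compute_uv=False)
    p = s / s.sum()
    return float(np.exp(-(p * np.log(p + 1e-12)).sum()))
```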
The experiments in our paper were conducted using the codebase provided in this repository. If you find this work useful, please cite:
@InProceedings{pmlr-v235-koromilas24a,
title = {Bridging Mini-Batch and Asymptotic Analysis in Contrastive Learning: From {I}nfo{NCE} to Kernel-Based Losses},
author = {Koromilas, Panagiotis and Bouritsas, Giorgos and Giannakopoulos, Theodoros and Nicolaou, Mihalis and Panagakis, Yannis},
booktitle = {Proceedings of the 41st International Conference on Machine Learning},
pages = {25276--25301},
year = {2024},
editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
volume = {235},
series = {Proceedings of Machine Learning Research},
month = {21--27 Jul},
publisher = {PMLR},
pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/koromilas24a/koromilas24a.pdf},
url = {https://proceedings.mlr.press/v235/koromilas24a.html}
}