
Mind the Gap Between Prototypes and Images in Cross-domain Finetuning

Paper | Conf | License | Slides | Poster | CN_Video | EN_Video

This repository contains the source code for reproducing the results of the NeurIPS'24 paper: Mind the Gap Between Prototypes and Images in Cross-domain Finetuning.

Author List: Hongduan Tian, Feng Liu, Zhanke Zhou, Tongliang Liu, Chengqi Zhang, Bo Han.

Introduction

In cross-domain few-shot classification (CFC), recent works mainly focus on adapting a simple transformation head on top of a frozen pre-trained backbone with few labeled data to project embeddings into a task-specific metric space, where classification is performed by measuring similarities between image instance and prototype representations. Technically, such a framework implicitly assumes that the prototype and image instance embeddings share the same representation transformation. However, in this paper, we find that there naturally exists a gap, which resembles the modality gap, between the prototype and image instance embeddings extracted from the frozen pre-trained backbone, and that simply applying the same transformation during the adaptation phase constrains the exploration of optimal representations and shrinks the gap between prototype and image representations. To solve this problem, we propose a simple yet effective method, contrastive prototype-image adaptation (CoPA), which adapts different transformations for prototypes and images, in a manner similar to CLIP, by treating prototypes as text prompts. Extensive experiments on Meta-Dataset demonstrate that CoPA achieves state-of-the-art performance more efficiently. Meanwhile, further analyses indicate that CoPA learns better representation clusters, enlarges the gap, and achieves the minimal validation loss at the enlarged gap.
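To make the core idea concrete, below is a minimal, self-contained PyTorch sketch of the mechanism described above: two separate transformation heads, one for prototypes and one for image instances, trained on top of frozen backbone features with a CLIP-style symmetric contrastive loss over image-prototype similarities. All names here (CoPASketch, proto_head, image_head) are illustrative, and the loss is a generic symmetric contrastive loss; the actual SCE loss and transformation modules used in the paper live in copa_pa.py and copa_tsa.py.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CoPASketch(nn.Module):
    """Conceptual sketch only (not the repository's implementation):
    adapt two *different* transformations, one for prototypes and one
    for image instances, over frozen backbone features with a
    CLIP-style symmetric contrastive loss."""

    def __init__(self, feat_dim: int, tau: float = 2.0):
        super().__init__()
        # Separate heads for prototypes and image instances.
        self.proto_head = nn.Linear(feat_dim, feat_dim, bias=False)
        self.image_head = nn.Linear(feat_dim, feat_dim, bias=False)
        self.tau = tau  # temperature (cf. the --SCE.tau flag used below)

    def forward(self, feats: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # feats:  [N, D] frozen-backbone features of a support set
        # labels: [N]    class indices covering 0..C-1
        num_classes = int(labels.max()) + 1
        # Prototypes are per-class means of the frozen features.
        protos = torch.stack([feats[labels == c].mean(0) for c in range(num_classes)])
        # Apply the two different transformations, then L2-normalize.
        z_img = F.normalize(self.image_head(feats), dim=-1)   # [N, D]
        z_pro = F.normalize(self.proto_head(protos), dim=-1)  # [C, D]
        logits = z_img @ z_pro.t() / self.tau                 # [N, C]
        # Image -> prototype direction (standard cross-entropy).
        loss_i2p = F.cross_entropy(logits, labels)
        # Prototype -> image direction: average log-probability of the
        # images belonging to each prototype's class.
        log_p = F.log_softmax(logits.t(), dim=-1)              # [C, N]
        mask = F.one_hot(labels, num_classes).t().float()      # [C, N]
        loss_p2i = -((log_p * mask).sum(1) / mask.sum(1)).mean()
        return 0.5 * (loss_i2p + loss_p2i)


# Toy usage: a 5-way, 5-shot support set with random "frozen" features.
feats = torch.randn(25, 512)
labels = torch.arange(5).repeat_interleave(5)
model = CoPASketch(feat_dim=512, tau=2.0)
loss = model(feats, labels)
loss.backward()  # gradients flow only into the two heads; the backbone features stay frozen
```

During adaptation only the two heads are updated, which is why the gap between prototype and image representations can be explored rather than collapsed by a single shared transformation.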

Dependencies

In our experiments, the main dependencies are the following libraries (an example environment setup is sketched after the list):

  • Python 3.6 or greater (Ours: Python 3.8)
  • PyTorch 1.0 or greater (Ours: torch=1.7.1, torchvision=0.8.2)
  • TensorFlow 1.14 or greater (Ours: TensorFlow=2.10)
  • tqdm (Ours: 4.64.1)
  • tabulate (Ours: 0.8.10)
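
As one possible way to set up the environment (version pins taken from the list above; adjust the torch/torchvision wheels to your platform and CUDA version, e.g. via the official PyTorch instructions):

```bash
# Illustrative install; gdown is used later for downloading the pretrained backbones.
pip install torch==1.7.1 torchvision==0.8.2 tensorflow==2.10.0 \
            tqdm==4.64.1 tabulate==0.8.10 gdown
```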

Dataset

  • Follow the Meta-Dataset repository to prepare the ILSVRC_2012, Omniglot, Aircraft, CU_Birds, Textures (DTD), Quick Draw, Fungi, VGG_Flower, Traffic_Sign and MSCOCO datasets.

  • Follow the CNAPs repository to prepare the MNIST, CIFAR-10 and CIFAR-100 datasets.
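
Note that URL-style codebases, which this repository builds on, usually locate the converted Meta-Dataset TFRecords through environment variables. The exact variable names expected here should be checked against the scripts in ./scripts/; a typical (purely illustrative) setup looks like:

```bash
# Illustrative only -- verify the variable names against ./scripts/*.sh.
export META_DATASET_ROOT=/path/to/meta-dataset   # the cloned Meta-Dataset repository
export RECORDS=/path/to/records                  # directory holding the converted TFRecords
```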

Backbone Pretraining

In this paper, we follow URL and use ResNet-18 as the frozen backbone in all our experiments. For reproduction, two options are provided:

Train your own backbone. You can train the ResNet-18 backbone from scratch. The pretraining mainly consists of two phases: domain-specific pretraining and universal backbone distillation.

To train the single-domain learning backbones (on the 8 seen domains), run:

./scripts/train_resnet18_sdl.sh

Then, distill the model by running:

./scripts/train_resnet18_url.sh

Use the released backbones. The URL repository has released both the universal backbone and the single-domain backbones. For simplicity, you can directly use the released models.

The backbones can be downloaded with the links above. To download the pretrained models, you can use gdown (installed via pip install gdown) and execute the following commands in the root directory of this project:

gdown https://drive.google.com/uc?id=1MvUcvQ8OQtoOk1MIiJmK6_G8p4h8cbY9 && md5sum sdl.zip && unzip sdl.zip -d ./saved_results/ && rm sdl.zip  # Single domain backbones
gdown https://drive.google.com/uc?id=1Dv8TX6iQ-BE2NMpfd0sQmH2q4mShmo1A && md5sum url.zip && unzip url.zip -d ./saved_results/ && rm url.zip  # Universal backbone

In this way, the backbones are downloaded. Please make sure the ./saved_results directory exists and that the backbone weights are placed in it.

Evaluate CoPA

To evaluate CoPA, you can run:

./scripts/copa_pa.sh

Specifically, the running command is:

python copa_pa.py --model.name=url \
                  --model.dir ./url \
                  --test.type=standard \
                  --encoder.type=linear \
                  --SCE.tau=2.0 \
                  --seed=42 \
                  --exp_dir_name=linear_all \
                  --experiment.name=seed42

The hyperparameters can be modified for different experiments (an illustrative variant command is given after the list):

  • model.name ['sdl', 'url']: sdl means using a single-domain backbone; url means using the universal backbone.
  • model.dir: Path to the backbone weights.
  • test.type ['standard', '5shot', '1shot']: Different task modes. standard means vary-way vary-shot tasks; 5shot means vary-way 5-shot tasks; 1shot means 5-way 1-shot tasks.
  • encoder.type ['linear', 'vit']: Select different transformation modules to run CoPA.
  • SCE.tau: The temperature coefficient used in the SCE loss.
  • seed: The random seed. All our results are averaged over seeds 41-45.
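
For example, an illustrative variant that evaluates CoPA with the ViT transformation head on vary-way 5-shot tasks (all flags are those documented above; the experiment/directory names are arbitrary):

```bash
python copa_pa.py --model.name=url \
                  --model.dir ./url \
                  --test.type=5shot \
                  --encoder.type=vit \
                  --SCE.tau=2.0 \
                  --seed=41 \
                  --exp_dir_name=vit_5shot \
                  --experiment.name=seed41
```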

To evaluate CoPA+TSA, you can run:

./scripts/copa_tsa.sh

Specifically, the running command is:

python copa_tsa.py --model.name=url \
                  --model.dir ./url \
                  --test.type=standard \
                  --encoder.type=linear \
                  --SCE.tau=2.0 \
                  --seed=42 \
                  --exp_dir_name=linear_all \
                  --experiment.name=seed42

Evaluate Pre-classifier Alignment (PA)

To evaluate Pre-classifier Alignment (PA), which is the adaptation method used in URL, run:

./scripts/test_resnet18_pa.sh

To evaluate URL with task-specific adapters (TSA), which is a modified variant of URL, run:

./scripts/test_resnet18_tsa.sh

Acknowledgement

This repository is built mainly upon the following repositories:

[1] Li et al. Universal representation learning from multiple domains for few-shot classification, ICCV 2021.

[2] Triantafillou et al. Meta-dataset: A dataset of datasets for learning to learn from few examples, ICLR 2020.

Citation

@inproceedings{tian2024mind,
    title={Mind the gap between prototypes and images in cross-domain finetuning},
    author={Hongduan Tian and Feng Liu and Zhanke Zhou and Tongliang Liu and Chengqi Zhang and Bo Han},
    booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
    year={2024}
}