Minho Park, Sunghyun Park, Jooyeol Yun, and Jaegul Choo
Korea Advanced Institute of Science and Technology (KAIST)
TL;DR: To effectively fine-tune CLIP on generated datasets, robust regularization is essential: variance-covariance regularization during training and weight-space ensembling afterwards.
Figure: (a) the significant domain gap between real and generated images, and (b) the performance degradation caused by that gap.
```shell
# python=3.8, torch==2.0.0, etc.
conda env create --file environments.yaml
conda activate regft

cd 1_generate_datasets
python generate_datasets.py \
    --ckpt="stabilityai/stable-diffusion-2-1-base" \
    --dataset="imagenet" \
    --prompt_style_file="prompt_styles.json" \
    --output_dir="/path/to/save/generated_imagenet"
```
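The `--prompt_style_file` argument pairs each class name with a set of style templates to diversify the generated images. A minimal sketch of that prompt construction (the class names and templates below are hypothetical placeholders; the real ones come from the JSON files in the repo):

```python
from itertools import product

def build_prompts(class_names, style_templates):
    """Combine class names with style templates into text-to-image prompts.

    Each template contains a {} placeholder that is filled with the class
    name, mirroring the role of --prompt_style_file.
    """
    return [t.format(name) for name, t in product(class_names, style_templates)]

# Hypothetical examples; the actual classes/styles are read from JSON.
classes = ["goldfish", "tabby cat"]
styles = ["a photo of a {}", "a realistic photo of a {}, high resolution"]
prompts = build_prompts(classes, styles)
print(prompts[0])  # "a photo of a goldfish"
```

Each resulting prompt is then fed to the Stable Diffusion checkpoint given by `--ckpt` to synthesize training images.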
- Training-time regularization (Variance-Covariance Regularization)
- Post-training regularization (Weight-space ensemble)
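The training-time term can be sketched as a VICReg-style variance-covariance penalty on a batch of features. This is an illustrative NumPy version, not the repo's exact implementation; the two coefficients echo the `vc_reg1`/`vc_reg2` values used below:

```python
import numpy as np

def vc_regularization(z, var_coeff=0.16, cov_coeff=0.02):
    """Variance-covariance regularization on a feature batch z of shape (N, D).

    - Variance term: hinge loss pushing each dimension's std above 1,
      which discourages feature collapse.
    - Covariance term: penalizes off-diagonal covariance entries,
      decorrelating feature dimensions.
    """
    n, d = z.shape
    z = z - z.mean(axis=0)
    std = np.sqrt(z.var(axis=0) + 1e-4)
    var_loss = np.mean(np.maximum(0.0, 1.0 - std))
    cov = (z.T @ z) / (n - 1)
    off_diag = cov - np.diag(np.diag(cov))
    cov_loss = (off_diag ** 2).sum() / d
    return var_coeff * var_loss + cov_coeff * cov_loss
```

Well-spread, decorrelated features incur (near-)zero penalty, while collapsed features are penalized through the variance term.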
```shell
cd 2_finetune_classifier
export PYTHONPATH="$PYTHONPATH:${PWD}"

vc_reg1=0.16
vc_reg2=0.02
model="ViT-B/16"
eval_dataset="ImageNet"
train_dataset="ImageNetSD"
gpt_prompt_file="gpt_file/imagenet_prompt.json"
save_subdir="/path/to/save"

python train.py \
    --train-dataset=${train_dataset} \
    --model=${model} \
    --eval-datasets=${eval_dataset} \
    --alpha 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 \
    --gpt_prompt_file ${gpt_prompt_file} \
    --vc_reg ${vc_reg1} ${vc_reg2} \
    --save=${save_subdir}
```
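The `--alpha` grid controls the post-training weight-space ensemble in the style of WiSE-FT: each checkpoint is a per-parameter linear interpolation between the zero-shot and fine-tuned weights. A minimal sketch with NumPy arrays standing in for parameter tensors (the toy `theta0`/`theta1` values are hypothetical):

```python
import numpy as np

def wise_ft(zero_shot, fine_tuned, alpha):
    """Weight-space ensemble: interpolate each parameter tensor.

    alpha=0 recovers the zero-shot model, alpha=1 the fine-tuned one;
    the --alpha list above sweeps this grid.
    """
    return {k: (1 - alpha) * zero_shot[k] + alpha * fine_tuned[k]
            for k in zero_shot}

# Toy "state dicts" with two parameters each.
theta0 = {"w": np.array([0.0, 2.0]), "b": np.array([1.0])}
theta1 = {"w": np.array([2.0, 0.0]), "b": np.array([3.0])}
mixed = wise_ft(theta0, theta1, alpha=0.5)
print(mixed["w"])  # [1. 1.]
```

Intermediate alpha values typically trade off the zero-shot model's robustness against the fine-tuned model's in-distribution accuracy.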
- Replace the image encoder in the CaFo architecture with the fine-tuned version.
This repo builds on diffusers, CLIP, WiSE-FT, and CaFo. Thanks for their wonderful work.
```bibtex
@article{park2024regularized,
  title={Regularized Training with Generated Datasets for Name-Only Transfer of Vision-Language Models},
  author={Park, Minho and Park, Sunghyun and Yun, Jooyeol and Choo, Jaegul},
  journal={arXiv preprint arXiv:2406.05432},
  year={2024}
}
```