PyTorch implementation of the CVPR 2024 paper "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory".
It is built on top of VSE-infty, CLIP-ViL, CLIP4Clip, MDETR, LST, and Awesome_Pretraining_Transfering.
If you have any problems, please contact me at r1228240468@gmail.com (diaohw@mail.dlut.edu.cn is deprecated).
We propose an innovative strategy called UniPT for memory-efficient transfer learning. Specifically, we facilitate the transfer process via a lightweight and learnable parallel network, which consists of 1) a parallel interaction module that decouples the sequential connections and processes the detached intermediate activations of the pre-trained network, and 2) a confidence aggregation module that adaptively learns the optimal strategy for integrating cross-layer features.
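As a rough illustration only (not the repository's actual code), the sketch below captures the two modules in simplified PyTorch. The class names, the low-rank projection size, and the attention-style guidance are illustrative assumptions rather than the exact UniPT design.

```python
import torch
import torch.nn as nn

class ParallelInteraction(nn.Module):
    """Hypothetical, simplified stand-in for the parallel interaction module:
    it processes detached intermediate activations outside the backbone,
    guided by the (also detached) final-layer feature."""
    def __init__(self, dim, reduction=2):
        super().__init__()
        hidden = dim // reduction
        self.down = nn.Linear(dim, hidden)   # lightweight low-rank projection
        self.up = nn.Linear(hidden, dim)
        self.act = nn.ReLU(inplace=True)

    def forward(self, guide, feat):
        # guide / feat: (B, N, D), both detached from the backbone graph
        g = self.down(guide)                                   # (B, N, H)
        f = self.down(feat)                                    # (B, N, H)
        attn = torch.softmax(g @ f.transpose(1, 2) / f.size(-1) ** 0.5, dim=-1)
        return self.up(self.act(attn @ f))                     # guided aggregation of this layer

class ConfidenceAggregation(nn.Module):
    """Hypothetical confidence aggregation: predicts one confidence score per
    layer and fuses the cross-layer outputs with softmax-normalized weights."""
    def __init__(self, dim):
        super().__init__()
        self.scorer = nn.Linear(dim, 1)

    def forward(self, layer_feats):                            # list of (B, N, D)
        stacked = torch.stack(layer_feats, dim=1)              # (B, L, N, D)
        scores = self.scorer(stacked.mean(dim=2))              # (B, L, 1)
        weights = torch.softmax(scores, dim=1).unsqueeze(-1)   # (B, L, 1, 1)
        return (weights * stacked).sum(dim=1)                  # (B, N, D)
```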
The framework and applications of UniPT:
Image-Text Retrieval: VSE-infty with the strongest combination of a BERT-base model and a ResNeXt-101(32×8d) backbone pre-trained on Instagram (WSL).
Video-Text Retrieval: CLIP4Clip with the pre-trained CLIP network using Text Transformer and ViT-B/32 models.
Question Answering: CLIP-ViL, which uses the CLIP image backbone, encodes the text into a word embedding sequence, and fuses both with a cross-modal Transformer.
Visual Grounding: MDETR with a pre-trained ResNet-101 vision encoder, a RoBERTa-base text encoder, and a query-based encoder-decoder Transformer.
Please refer to their respective README.md files for detailed settings.
We summarize where UniPT is defined and invoked in each codebase as follows:
We hope these pointers help you quickly realize your own ideas beyond UniPT; a minimal sketch of the shared invocation pattern follows the list.
- CLIP-ViL: UniPT is defined and called in class LXRTEncoder(nn.Module) in CLIP-ViL/src/lxrt/modeling.py.
- CLIP4Clip: UniPT is defined in CLIP4Clip/modules/module_adapter.py and called at Lines 251-261 of CLIP4Clip/modules/modeling.py.
- VSE-infty: UniPT is defined in VSE-infty/lib/adapter_for_cnn.py and VSE-infty/lib/adapter_for_transformer.py, and called in VSE-infty/lib/encoders.py.
- MDETR: UniPT is defined and called in class Transformer(nn.Module) in MDETR/models/transformer.py.
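To illustrate the invocation pattern shared by the four codebases, here is a hypothetical wrapper that reuses the two illustrative modules from the sketch above. The class name and the layer interface are assumptions for demonstration, not the actual integration code of any of the repositories.

```python
import torch
import torch.nn as nn

class BackboneWithUniPT(nn.Module):
    """Sketch of the invocation pattern: the pre-trained encoder runs under
    torch.no_grad(), its intermediate activations are detached and handed to
    the lightweight parallel network, so back-propagation never enters the
    frozen backbone."""
    def __init__(self, backbone_layers, dim):
        super().__init__()
        self.layers = backbone_layers                 # frozen pre-trained blocks
        for p in self.layers.parameters():
            p.requires_grad_(False)
        self.interactions = nn.ModuleList(
            ParallelInteraction(dim) for _ in self.layers)    # from the sketch above
        self.aggregation = ConfidenceAggregation(dim)         # from the sketch above

    def forward(self, x):
        hiddens = []
        with torch.no_grad():                         # no activation storage in the backbone
            for layer in self.layers:
                x = layer(x)
                hiddens.append(x.detach())
        guide = hiddens[-1]                           # last-layer feature guides the interaction
        outs = [m(guide, h) for m, h in zip(self.interactions, hiddens)]
        return self.aggregation(outs)
```

In this pattern, only the parallel interaction and confidence aggregation modules receive gradients, which is where the parameter and memory savings come from.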
If UniPT is useful for your research, please cite the following paper:
@article{Diao2023UniPT,
  title={UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory},
  author={Diao, Haiwen and Wan, Bo and Zhang, Ying and Jia, Xu and Lu, Huchuan and Chen, Long},
  journal={arXiv preprint arXiv:2308.14316},
  year={2023}
}