VTG-GPT: Tuning-Free Zero-Shot Video Temporal Grounding with GPT

YoucanBaby/VTG-GPT

VTG-GPT

This repository contains our implementation of the paper VTG-GPT: Tuning-Free Zero-Shot Video Temporal Grounding with GPT.

VTG-GPT leverages frozen GPTs to enable zero-shot inference without training.

Preparation

  1. Install dependencies
conda create -n vtg-gpt python=3.10
conda activate vtg-gpt
pip install -r requirements.txt
  2. Unzip caption files
cd data/qvhighlights/caption/
unzip val.zip

Inference on QVHighlights val split

# inference
python infer_qvhighlights.py val

# evaluation
bash standalone_eval/eval.sh

Running the commands above should produce the following results:

| Metrics | R1@0.5 | R1@0.7 | mAP@0.5 | mAP@0.75 | mAP@avg |
| ------- | ------ | ------ | ------- | -------- | ------- |
| Values  | 59.03  | 38.90  | 56.11   | 35.44    | 35.57   |
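
For readers unfamiliar with the R1@0.5 and R1@0.7 columns, the following is a minimal sketch of how Recall@1 at a temporal IoU threshold is commonly computed. It is not the repository's evaluation code: the function names `temporal_iou` and `recall_at_1` are hypothetical, and the sketch assumes one ground-truth window per query, windows given as `(start, end)` in seconds, and a hit counted when IoU is greater than or equal to the threshold. The script in standalone_eval may differ in these details.

```python
from typing import List, Tuple

# A moment is a (start, end) pair in seconds (assumed representation).
Window = Tuple[float, float]


def temporal_iou(pred: Window, gt: Window) -> float:
    """Intersection-over-union of two 1-D time windows."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_1(top1_preds: List[Window], gts: List[Window],
                iou_threshold: float) -> float:
    """Percentage of queries whose top-1 prediction reaches the IoU threshold."""
    hits = sum(temporal_iou(p, g) >= iou_threshold
               for p, g in zip(top1_preds, gts))
    return 100.0 * hits / len(gts)


if __name__ == "__main__":
    # Toy data, not taken from QVHighlights.
    preds = [(10.0, 20.0), (0.0, 5.0)]
    gts = [(12.0, 22.0), (10.0, 20.0)]
    print(recall_at_1(preds, gts, 0.5))  # first query overlaps well, second does not
    print(recall_at_1(preds, gts, 0.7))
```

In the toy data, the first prediction has an IoU of 8/12 with its ground truth, so it counts as a hit at threshold 0.5 but not at 0.7; the second prediction does not overlap its ground truth at all.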

MiniGPT-v2 for Image captioning

cd minigpt
conda create --name minigptv python=3.9
conda activate minigptv
pip install -r requirements.txt
python run_v2.py

Baichuan2 for Query debiasing

cd Baichuan2
conda activate vtg-gpt
python rephrase_query.py

Acknowledgement

We thank Youyao Jia for helpful discussions.

This code is based on Moment-DETR and SeViLA. We also used resources from MiniGPT-4, Baichuan2, and LLaMa2. We thank the authors for their open-source contributions.

Citation

If you find this project useful for your research, please cite our paper.

@article{xu2024vtg,
  title={VTG-GPT: Tuning-Free Zero-Shot Video Temporal Grounding with GPT},
  author={Xu, Yifang and Sun, Yunzhuo and Xie, Zien and Zhai, Benxiang and Du, Sidan},
  journal={Applied Sciences},
  volume={14},
  number={5},
  pages={1894},
  year={2024},
  publisher={MDPI}
}
