This repository shares the features and other data for ad-hoc video search (AVS). It currently covers three datasets: IACC.3, V3C1, and V3C2. For each video clip, we provide frame-level features. Please refer to video2frames.txt to find the frames that correspond to each clip. For example, the clip shot35903_7 in the IACC.3 dataset has features extracted from six frames, recorded as 'shot35903_7': ['shot35903_7_0', 'shot35903_7_75', 'shot35903_7_150', 'shot35903_7_225', 'shot35903_7_300', 'shot35903_7_375']. Each frame corresponds to an npy file (e.g., shot35903_7_0.npy under the npy folder of the feature file).
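As a quick-start sketch, the snippet below loads all frame-level features of one clip and stacks them into a single array. It assumes the video2frames mapping has already been parsed into a Python dict (its on-disk format in video2frames.txt may differ) and that each frame id maps to a file named <frame_id>.npy inside the feature's npy folder, as in the shot35903_7_0.npy example above.

```python
import os
import numpy as np

def load_clip_frames(clip_id, video2frames, npy_dir):
    """Stack a clip's per-frame feature vectors into one array.

    video2frames: dict mapping clip id -> list of frame ids,
                  e.g. {'shot35903_7': ['shot35903_7_0', ...]}
    npy_dir:      folder containing one <frame_id>.npy per frame
    Returns an array of shape (num_frames, feature_dim).
    """
    frame_ids = video2frames[clip_id]
    feats = [np.load(os.path.join(npy_dir, fid + ".npy")) for fid in frame_ids]
    return np.stack(feats)
```

The same loop works for any of the released features (CLIP_ViT-B_32, BLIP-2, Improved_ITV), since they all follow the one-npy-file-per-frame layout.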
IACC.3
335,944 video clips (clip ids)
Query sets: tv16.avs.txt, tv17.avs.txt, tv18.avs.txt (query and ground truth files)
Available features:
CLIP_ViT-B_32 (query features, frame-level features)
BLIP-2 (query features, frame-level features)
Improved_ITV_features ([query features] [frame-level features])
V3C1
1,082,649 video clips (clip ids)
Query sets: tv19.avs.txt, tv20.avs.txt, tv21.avs.txt (query and ground truth files)
Available features:
CLIP_ViT-B_32 (query features, frame-level features)
BLIP-2 (query features, frame-level features)
Improved_ITV_features ([query features] [frame-level features])
V3C2
1,425,451 video clips (clip ids)
Query sets: tv22.avs.txt, tv23.avs.txt (and their narrative) (query and ground truth files)
Available features:
CLIP_ViT-B_32 (query features, frame-level features)
BLIP-2 (query features, frame-level features)
Improved_ITV_features ([query features] [frame-level features])
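Since each dataset pairs query features with frame-level features, a simple way to use them together is text-to-video retrieval: score each clip against a query feature and rank. The sketch below is illustrative only (not the released evaluation code); it mean-pools a clip's frame features and ranks clips by cosine similarity, and all names and shapes are assumptions.

```python
import numpy as np

def rank_clips(query_feat, clip_feats):
    """Rank clips for one query by cosine similarity.

    query_feat: 1-D query feature vector (e.g. a CLIP text embedding)
    clip_feats: dict mapping clip id -> (num_frames, dim) frame features
    Returns clip ids sorted best-first.
    """
    q = query_feat / np.linalg.norm(query_feat)
    scores = {}
    for cid, frames in clip_feats.items():
        pooled = frames.mean(axis=0)           # clip-level vector
        pooled = pooled / np.linalg.norm(pooled)
        scores[cid] = float(q @ pooled)        # cosine similarity
    return sorted(scores, key=scores.get, reverse=True)
```

Mean pooling is only one choice; taking the maximum frame score per clip is a common alternative when a query matches a single moment rather than the whole clip.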
If you find the data useful, please cite our paper as follows:
@inproceedings{wu2023VIREO_TRECVidAVS,
  title={VIREO @ TRECVid 2023 Ad-hoc Video Search},
  author={Jiaxin Wu and Zhixin Ma and Sheng-Hua Zhong and Chong-Wah Ngo},
  booktitle={TRECVID Workshop},
  year={2023},
}