Improved-ITV

The official implementation of our paper "Improving Interpretable Embeddings for Ad-hoc Video Search with Generative Captions and Multi-word Concept Bank", accepted at ICMR 2024.

Environment

We used Anaconda to set up a workspace with PyTorch 1.8. Run the following commands to install the required packages.

conda create -n IITV python==3.8 -y
conda activate IITV
git clone https://github.com/nikkiwoo-gh/Improved-ITV.git
cd Improved-ITV
pip install -r requirements.txt
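
To quickly check that the environment is usable (an optional sanity check, not part of the official setup), verify the installed PyTorch version and GPU visibility:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"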

Stanford CoreNLP server for concept bank construction

./do_install_StanfordCoreNLIP.sh
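
The concept bank construction relies on a local CoreNLP server. If you need to launch the server yourself, the commands below are a minimal sketch using CoreNLP's standard server entry point; the install directory, port, and memory size are illustrative and should match your own setup:

# run from the directory where the CoreNLP jars were unpacked (path is illustrative)
cd <corenlp_install_dir>
java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000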

Downloads

Pretraining Dataset

WebVid-genCap7M dataset

Concept bank for tgif-msrvtt10k-VATEX

concept_phrase.zip

Video-level concept annotation for tgif-msrvtt10k-VATEX

tgif-msrvtt10k-VATEX video-level concept annotation

Model Checkpoints

Improved_ITV model pretrained on WebVid-genCap7M

Improved_ITV model finetuned on tgif-msrvtt10k-VATEX

Features for Finetuning

For the training and validation sets, please refer to AVS_data.

For the testing sets, please refer to AVS_feature_data.

Usage

1. Build the bag-of-words vocabulary and concept bank

./do_get_vocab_and_concept.sh $collection

e.g.,

./do_get_vocab_and_concept.sh tgif-msrvtt10k-VATEX 

or download concept_phrase.zip and unzip it into the folder $rootpath/tgif-msrvtt10k-VATEX/TextData/ (see the sketch below).
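
For example, assuming concept_phrase.zip has been downloaded into the current directory and $rootpath points at your data root (both paths are placeholders for your own locations):

unzip concept_phrase.zip -d $rootpath/tgif-msrvtt10k-VATEX/TextData/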

2. Prepare the concept annotation

Build the video-level concept annotation (script to be released), or download it from here.

3. (Optional) Pretrain the Improved ITV model

./do_pretrain.sh

4. Train the Improved ITV model

4.1 Train from a pretrained checkpoint

./do_train_from_pretrain.sh

4.2 Train without pretraining

./do_train.sh

5. Inference on TRECVID datasets

./do_prediction_iacc.3.sh
./do_prediction_v3c1.sh
./do_prediction_v3c2.sh

6. Evaluation

Remember to set score_file to your own path (see the sketch after these commands).

cd tv-avs-eval/
do_eval_iacc.3.sh
do_eval_v3c1.sh
do_eval_v3c2.sh
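
As a minimal sketch, assuming score_file is a variable set inside the do_eval_*.sh scripts, point it at the prediction output produced in step 5 (the path below is a placeholder, not the repository's actual layout):

# illustrative only -- replace with the file written by the corresponding do_prediction_*.sh run
score_file=/path/to/your/prediction/output/score_file.txt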

Citation

@inproceedings{ICMR2024_WU_improvedITV,
author = {Wu, Jiaxin and Ngo, Chong-Wah and Chan, Wing-Kwong},
title = {Improving Interpretable Embeddings for Ad-hoc Video Search with Generative Captions and Multi-word Concept Bank},
year = {2024},
booktitle = {The Annual ACM International Conference on Multimedia Retrieval},
pages = {1-10},
}

Contact

jiaxin.wu@my.cityu.edu.hk
