Implementation of the active learning experiment on the UCF-101 video dataset proposed in:
Alireza Zaeemzadeh, Mohsen Joneidi (shared first authorship), Nazanin Rahnavard, Mubarak Shah: Iterative Projection and Matching: Finding Structure-preserving Representatives and Its Application to Computer Vision. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
Most of the training code is taken from here.
Tested on:
- Python 2.7
- CUDA 9.1 (2 GPUs)
- torch 0.4.1
- torchvision 0.2.1
- irlbpy 0.1.0 code
- FFmpeg, FFprobe
wget http://johnvansickle.com/ffmpeg/releases/ffmpeg-release-64bit-static.tar.xz
tar xvf ffmpeg-release-64bit-static.tar.xz
cd ./ffmpeg-3.3.3-64bit-static/; sudo cp ffmpeg ffprobe /usr/local/bin;
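Before converting videos, it can help to confirm that both binaries are visible on PATH. The following is a minimal sketch, not part of this repository: `find_tools` is a hypothetical helper name, and it assumes Python 3.3 or newer, since `shutil.which` is not available in the Python 2.7 environment listed above.

```python
import shutil


def find_tools(names=("ffmpeg", "ffprobe")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        # A missing tool is reported instead of raising, so both are always listed.
        print("{}: {}".format(name, path or "NOT FOUND"))
```

If either tool is reported as missing, repeat the copy step above or add the extracted directory to PATH.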
- Download videos and train/test splits here.
- Convert from avi to jpg files using utils/video_jpg_ucf101_hmdb51.py:
python utils/video_jpg_ucf101_hmdb51.py avi_video_directory jpg_video_directory
- Generate n_frames files using utils/n_frames_ucf101_hmdb51.py:
python utils/n_frames_ucf101_hmdb51.py jpg_video_directory
- Generate annotation file in json format similar to ActivityNet using utils/ucf101_json.py (annotation_dir_path includes classInd.txt, trainlist0{1, 2, 3}.txt, testlist0{1, 2, 3}.txt):
python utils/ucf101_json.py annotation_dir_path
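To illustrate the last step, here is a rough sketch of how the UCF-101 split files could be turned into an ActivityNet-style annotation dictionary. It is not the code in utils/ucf101_json.py: the function names are hypothetical, and the key names (`labels`, `database`, `subset`, `annotations`) and the subset values `training` and `validation` are assumptions modeled on the ActivityNet layout, so the script's real output may differ.

```python
import json


def parse_split(lines, subset):
    """Turn lines like 'ClassName/v_ClassName_g01_c01.avi 1' into database entries."""
    entries = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        path = line.split()[0]  # train lists carry a class index, test lists do not
        label, filename = path.split("/")
        video_id = filename.rsplit(".", 1)[0]  # strip the .avi extension
        entries[video_id] = {"subset": subset, "annotations": {"label": label}}
    return entries


def build_annotation(class_lines, train_lines, test_lines):
    """Assemble the (assumed) annotation structure from the three file types."""
    labels = [line.split()[1] for line in class_lines if line.strip()]
    database = {}
    database.update(parse_split(train_lines, "training"))
    database.update(parse_split(test_lines, "validation"))
    return {"labels": labels, "database": database}


example = build_annotation(
    ["1 ApplyEyeMakeup", "2 ApplyLipstick"],          # classInd.txt
    ["ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c01.avi 1"],  # trainlist01.txt
    ["ApplyLipstick/v_ApplyLipstick_g01_c01.avi"],      # testlist01.txt
)
print(json.dumps(example, indent=2, sort_keys=True))
```

The example feeds in-memory lines rather than reading files, so the structure can be inspected without downloading the dataset.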
Pre-trained models are available here.
Info on pretraining available here.
python main.py --root_path data/ --video_path frames/ --annotation_path ucfTrainTestlist/ucf101_01.json --result_path results/ --pretrain_path pretrained/resnet-18-kinetics.pth --model resnet --resnet_shortcut A --model_depth 18 --test --test_subset val
A 3DResNet18 model, pretrained on Kinetics, is fine-tuned at each active learning cycle and is used to select the most informative samples.
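The selection step can be sketched as follows, based on the description of IPM in the paper: at each iteration, find the first right singular vector of the data matrix, pick the sample most correlated with it, then project all samples onto the null space of the selected sample. This is a dependency-free illustration, not the repository's implementation: the function names are hypothetical, plain power iteration stands in for a truncated SVD routine such as the one irlbpy provides, and the input is assumed to be a small feature matrix given as a list of rows (one row per sample).

```python
import math


def top_right_singular_vector(rows, iters=200):
    """Approximate the first right singular vector by power iteration on A^T A."""
    dim = len(rows[0])
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iters):
        proj = [sum(a * b for a, b in zip(row, v)) for row in rows]  # A v
        w = [sum(p * row[j] for p, row in zip(proj, rows)) for j in range(dim)]  # A^T (A v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # all remaining rows are zero
            break
        v = [x / norm for x in w]
    return v


def ipm_select(data, k, eps=1e-12):
    """Return indices of up to k representative rows of `data`."""
    rows = [[float(x) for x in row] for row in data]
    selected = []
    for _ in range(k):
        v = top_right_singular_vector(rows)
        best, best_corr = None, -1.0
        for i, row in enumerate(rows):
            norm = math.sqrt(sum(x * x for x in row))
            if i in selected or norm < eps:
                continue  # skip chosen samples and fully projected-out rows
            corr = abs(sum(a * b for a, b in zip(row, v))) / norm
            if corr > best_corr:
                best, best_corr = i, corr
        if best is None:
            break
        selected.append(best)
        # Project every row onto the null space of the selected (normalized) row.
        norm = math.sqrt(sum(x * x for x in rows[best]))
        u = [x / norm for x in rows[best]]
        projected = []
        for row in rows:
            dot = sum(a * b for a, b in zip(row, u))
            projected.append([a - dot * b for a, b in zip(row, u)])
        rows = projected
    return selected


features = [[3, 0, 0], [0, 2, 0], [0, 1, 1], [0.1, 0, 0.1]]
print(ipm_select(features, 2))
```

In the active learning setting described above, the rows would be features extracted by the fine-tuned network for the unlabeled videos, and the returned indices would be the samples sent for labeling; that wiring is left out of this sketch.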
t-SNE visualization of two classes of UCF-101 dataset and their representatives selected by IPM. (left) Decision function learned by using all the data. The goal of selection is to preserve the structure with only a few representatives. (right) Decision function learned by using representatives selected by IPM.
If you use IPM in your research, please use the following BibTeX entry.
@inproceedings{zaeemzadeh2019ipm,
title = {{Iterative Projection and Matching: Finding Structure-preserving Representatives and Its Application to Computer Vision}},
year = {2019},
booktitle = {Computer Vision and Pattern Recognition, 2019. CVPR 2019. IEEE Conference on},
author = {Zaeemzadeh, Alireza and Joneidi, Mohsen and Rahnavard, Nazanin and Shah, Mubarak}
}
UCF Center for Research in Computer Vision (CRCV)