input is a text file listing the target videos, one file name per line. You can create it with:
cd /path/to/video
ls -R *.mp4 > input
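Note that the shell glob above only matches .mp4 files directly inside the current directory. If your videos sit in nested folders, a small Python helper can build the same list recursively. This is a minimal sketch, not part of the repository: the function name write_video_list is hypothetical, and it assumes the extractor resolves each line relative to --video_root.

```python
from pathlib import Path


def write_video_list(video_root, list_path):
    """Write the relative paths of all .mp4 files under video_root, one per line."""
    root = Path(video_root)
    # rglob walks subdirectories too; paths are stored relative to the root
    names = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.mp4"))
    Path(list_path).write_text("".join(name + "\n" for name in names))
    return names
```

For example, write_video_list("/path/to/video", "input") writes the list to a file named input in the current directory.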
We use video-classification-3d-cnn-pytorch to extract features from the videos:
python3 main.py --input ./input --video_root path/to/video --output ./output.json --model resnet-101-kinetics.pth --mode feature --model_name resnet --model_depth 101 --resnet_shortcut B --batch_size 16
# On our 8 GB RTX 2070 SUPER, the maximum batch size that fits in memory is 16.
Use c3djson_to_npy.py to convert the JSON output to .npy files.
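For orientation, here is a minimal sketch of what a JSON-to-.npy conversion might look like. It is not the code of c3djson_to_npy.py. It assumes the feature-mode JSON is a list of entries shaped like {"video": name, "clips": [{"features": [...]}, ...]}; check your own output.json before relying on that layout. The function name json_to_npy is hypothetical.

```python
import json
from pathlib import Path

import numpy as np


def json_to_npy(json_path, out_dir):
    """Save one <video stem>.npy per video and return the written paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with open(json_path) as f:
        videos = json.load(f)
    written = []
    for entry in videos:
        # Assumed layout: each clip carries a flat list of feature values
        feats = np.asarray(
            [clip["features"] for clip in entry["clips"]], dtype=np.float32
        )
        out_path = out_dir / (Path(entry["video"]).stem + ".npy")
        np.save(out_path, feats)  # resulting shape: (num_clips, feature_dim)
        written.append(out_path)
    return written
```

Each saved array has one row per clip, so videos of different lengths produce arrays with different first dimensions.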
Use mfcc_feats.py to extract the MFCC audio features.
Use openpose_feats.py to extract the OpenPose features. You need to have OpenPose installed.
prepro_feats.py, prepro_vocab.py