3VGC

A Tri-Modal Video Genre Classification Dataset

  0. Regroup for data loading (Friday)
  1. Audio: LSTM (on manually extracted features) and 2D CNN (features extracted by the CNN itself)
  2. Video: 3D CNN (already exists); tune hyperparameters, etc.
  3. Text (optional): train a CNN, Transformer, or LSTM.
  4. Speech to text
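
As a rough illustration of the two audio paths in item 1, the sketch below turns one waveform into (a) a sequence of hand-crafted per-frame features, the kind of input an LSTM would consume, and (b) a log-magnitude spectrogram, the 2D input a CNN would learn features from. Everything here is an assumption made for illustration: the README does not say which features, frame sizes, or libraries the project uses. RMS energy and zero-crossing rate are simple stand-ins for whatever manual features are chosen, the helper names are hypothetical, and the naive DFT is there only to keep the example dependency-free.

```python
import cmath
import math


def frame_signal(signal, frame_len, hop):
    """Split a list of samples into overlapping frames (ragged tail is dropped)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def manual_features(frame):
    """Two hand-crafted features per frame: RMS energy and zero-crossing rate."""
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return [rms, crossings / (len(frame) - 1)]


def log_magnitude_spectrum(frame):
    """Naive DFT of one frame; returns log(1 + |X[k]|) for the non-negative bins."""
    n = len(frame)
    return [
        math.log1p(abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                           for t, x in enumerate(frame))))
        for k in range(n // 2 + 1)
    ]


# Hypothetical input: 2000 samples of a 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
signal = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(2000)]

frames = frame_signal(signal, frame_len=64, hop=32)

# LSTM path: one short feature vector per time step, shape (time, 2).
lstm_input = [manual_features(f) for f in frames]

# 2D CNN path: a time-by-frequency "image", shape (time, frame_len // 2 + 1).
cnn_input = [log_magnitude_spectrum(f) for f in frames]

print(len(lstm_input), len(lstm_input[0]))
print(len(cnn_input), len(cnn_input[0]))
```

The point of the sketch is the difference in input shape: the LSTM sees a sequence of small, pre-computed feature vectors, while the 2D CNN sees the whole spectrogram and has to learn its own features from it.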