This repository contains the source code of my MSc thesis in Data Science, entitled "Multimodal summarization of user-generated videos from wearable cameras".
The proposed video summarization technique is based on audio and visual features extracted using pyAudioAnalysis and multimodal_movie_analysis, respectively.
For the purpose of my thesis, I also created a dataset, provided here, which contains the audio and visual features together with the ground truth annotation files. To construct the ground truth for the videos, user-created video summaries were collected using the video annotator tool, and the final labels were then built through an aggregation process.
To reproduce the experiments and train the model from scratch, you have to download the aforementioned dataset; otherwise, you can use a video collection of your own.
git clone https://github.com/theopsall/video-summarization.git
cd video-summarization
chmod +x install.sh
./install.sh
python3 video_summarization.py extractAndTrain -v /home/theo/VIDEOS -l /home/theo/LABELS -o /home/theo/videoSummary -d
-v
: The directory containing the video files.
-l
: The directory containing the annotation files.
-o
: The directory to store the final model.
(-d)
: Optional, in case you want to download and use the video files from the experiment.
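When using a video collection of your own, it can help to check that every video has a matching annotation file before starting extractAndTrain. The sketch below is not part of the repository: the function name `find_unlabeled_videos`, the file extensions, and the convention that a label file shares the video's file name stem are all assumptions made for illustration.

```python
from pathlib import Path


def find_unlabeled_videos(video_dir, label_dir,
                          video_exts=(".mp4", ".avi"), label_ext=".csv"):
    """Return the sorted names of videos that have no label file.

    Assumption: a label file is matched to a video by sharing its stem,
    e.g. VIDEOS/walk.mp4 <-> LABELS/walk.csv. Adjust to your own layout.
    """
    # Collect the stems of all label files, searching subdirectories too.
    label_stems = {p.stem for p in Path(label_dir).rglob(f"*{label_ext}")}
    # Keep only regular files whose extension looks like a video.
    videos = [
        p for p in Path(video_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in video_exts
    ]
    return sorted(p.name for p in videos if p.stem not in label_stems)
```

Calling `find_unlabeled_videos("/home/theo/VIDEOS", "/home/theo/LABELS")` would then list the videos that still need annotations; an empty list means the two directories line up under the assumed naming convention.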
python3 video_summarization.py train -v /home/theo/visual_features -a /home/theo/aural_features -l /home/theo/LABELS -o /home/theo/videoSummary
-v
: The directory with the visual features.
-a
: The directory with the aural features.
-l
: The directory containing the annotation files.
-o
: The directory to store the final model.
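Since training takes both visual and aural features, the two streams have to be combined at some point. One common option is early fusion, where the feature vectors of the same segment are concatenated. The sketch below illustrates that idea only; whether this repository fuses the features in this way is an assumption, and `fuse_features` is a hypothetical helper.

```python
def fuse_features(visual, aural):
    """Concatenate per-segment visual and aural vectors (early fusion).

    Each input is a list with one feature vector per segment. If the two
    streams differ in length, the shorter one decides how many segments
    are kept (an assumption made for this sketch).
    """
    n = min(len(visual), len(aural))
    return [list(visual[i]) + list(aural[i]) for i in range(n)]


visual = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # 3 segments, 2 visual features
aural = [[1.0], [2.0]]                          # 2 segments, 1 aural feature
fused = fuse_features(visual, aural)
print(fused)  # [[0.1, 0.2, 1.0], [0.3, 0.4, 2.0]]
```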
python3 video_summarization.py predict -v /home/theo/sample.mp4
-v
: The path of the video file.
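A summary is usually consumed as a set of time intervals to keep. Assuming the prediction step yields one binary decision per fixed-length segment (the output format of `predict` is not documented here, so this is an assumption), the decisions can be grouped into intervals as in the following sketch. The name `predictions_to_segments` and the `step` parameter are hypothetical.

```python
def predictions_to_segments(predictions, step=1.0):
    """Group consecutive positive predictions into (start, end) intervals.

    predictions: sequence of 0/1 values, one per segment of `step` seconds.
    Returns a list of (start_seconds, end_seconds) tuples.
    """
    segments = []
    start = None  # index where the current run of positives began
    for i, keep in enumerate(predictions):
        if keep and start is None:
            start = i
        elif not keep and start is not None:
            segments.append((start * step, i * step))
            start = None
    # Close a run that extends to the end of the video.
    if start is not None:
        segments.append((start * step, len(predictions) * step))
    return segments


print(predictions_to_segments([0, 1, 1, 0, 0, 1]))  # [(1.0, 3.0), (5.0, 6.0)]
```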
python3 video_summarization.py featureExtraction -v /home/theo/VIDEOS
-v
: The directory containing the video files.
Annotation contains the script that handles multiple annotations for the same video file, taking into account the agreement between the annotations during aggregation.
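To make the aggregation idea concrete, here is a minimal sketch of agreement-based voting over binary annotations. It is not the repository's script: the function name `aggregate_annotations`, the `min_agreement` threshold, and its default value are assumptions chosen for illustration.

```python
def aggregate_annotations(annotations, min_agreement=0.6):
    """Combine several binary annotations of one video into final labels.

    annotations: list of 0/1 lists, one per annotator, aligned per segment.
    A segment is labeled 1 when at least `min_agreement` of the annotators
    selected it. Annotations of unequal length are truncated to the
    shortest one by zip (an assumption of this sketch).
    """
    n = len(annotations)
    if n == 0:
        return []
    return [
        1 if sum(votes) / n >= min_agreement else 0
        for votes in zip(*annotations)
    ]


annotators = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
]
print(aggregate_annotations(annotators))  # [1, 1, 0, 0, 1]
```

Raising `min_agreement` keeps only the segments that most annotators agree on, which gives shorter but more consensual ground truth summaries.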
@article{psallidas2021multimodal,
title={Multimodal summarization of user-generated videos from wearable cameras},
author={Psallidas, Theodoros},
year={2021}
}
Enjoy the video summarization tool & feel free to bother me in case you need help. You can reach me at Theo Psallidas
DISCLAIMER
I have made some utility scripts available as command-line executables, in case you want to use individual tools outside the main pipeline.