This tool generates video summaries using four state-of-the-art summarization models:
- PGL-SUM
- CA-SUM
- DSNet anchor based
- DSNet anchor free
The models are pretrained on the TVSum and SumMe datasets.
- Set up a virtual environment:
python -m venv .summarization
- Activate the virtual environment:
source .summarization/bin/activate
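On Windows (Command Prompt), the activation command differs; assuming the same environment name as above:
.summarization\Scripts\activate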
- Install the required packages:
pip install -r requirements.txt
- Navigate to the source folder:
cd src
- Generate a summary for a single video:
python inference.py pglsum --source ../custom_data/videos/source_video_name.mp4 --save-path ./output/summary_video_name.mp4 --sample-rate 30 --final-frame-length 30
- Generate summaries for a folder of videos:
python inference.py pglsum --source ../custom_data/videos/source_video_folder --save-path ./output/summary_videos_folder --sample-rate 30 --final-frame-length 30
The first positional argument selects the model (an example with a different model follows the list):
- pglsum: PGL-SUM
- casum: CA-SUM
- dsnet_ab: DSNet anchor based
- dsnet_af: DSNet anchor free
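For example, to use the anchor-free DSNet model instead of PGL-SUM, swap the model argument in the single-video command shown earlier (the file paths are the same placeholders used above):
python inference.py dsnet_af --source ../custom_data/videos/source_video_name.mp4 --save-path ./output/summary_video_name.mp4 --sample-rate 30 --final-frame-length 30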
The remaining options control sampling and summary length (a combined example follows the list):
- --sample-rate 30: The model analyzes every 30th frame.
- --final-frame-length 30: The resulting video summary will contain around 30 frames, roughly equivalent to 27 seconds. This duration can vary between 23 and 31 seconds depending on the frames per second of the original video.
- --max-shot-length 8: A single shot in the summary won't exceed 8 frames.
- --min-penalty-shot-length 5: Shots that are 5 frames or shorter incur a length penalty, making them less likely to appear in the final summary.
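Assuming these options can be passed together in a single invocation, as their descriptions suggest, a sketch combining them with the placeholder paths from the single-video example above:
python inference.py pglsum --source ../custom_data/videos/source_video_name.mp4 --save-path ./output/summary_video_name.mp4 --sample-rate 30 --final-frame-length 30 --max-shot-length 8 --min-penalty-shot-length 5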