Using recurrent neural networks, optical flow, image segmentation and other machine learning methods, the trained model detects violence in a sequence of frames with an accuracy of around 88% (with an error of 3 percentage points).
Each module is described separately in its own module directory.
We managed to achieve:
- 88% accuracy with our solution
- a prediction every 1.5 seconds (at 30 FPS with 4 CPU threads)
- 3 percentage points of error
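
As a rough illustration of the timing figure above, the number of frames that arrive between two consecutive predictions follows from the frame rate and the prediction interval. The sketch below only does that arithmetic. The function name is hypothetical, and whether the model actually consumes every one of these frames is an assumption, not something stated here.

```python
def frames_per_prediction(fps: float, interval_s: float) -> int:
    """Return how many frames arrive during one prediction interval."""
    return round(fps * interval_s)


# Figures quoted above: 30 FPS and one prediction every 1.5 seconds.
frames = frames_per_prediction(fps=30, interval_s=1.5)
print(frames)  # 45
```
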
- Violence Recognition Network (VRN), which uses the VGG16 network as its base and an LSTM as one of the top layers
- Flow Gated Network (FGN) module, based on the Violence Detection project
- Dangerous Sound Detection Network (DSDN) module for gunshot detection, using VGG16 on audio converted to spectrograms
- Dangerous Item Detection Network (DIDN), trained with YOLOv3 and converted to TensorFlow with a translation tool
We used the RWF-2000 dataset to train our models.
- doc – documentation
  - VRS_-_Praca_Dyplomowa.pdf – system documentation (Polish)
- src – source code
  - Module.DIDN – Dangerous Item Detection Network, YOLOv3
  - Module.DSDN – Dangerous Sound Detection Network
  - Module.FGN – Flow Gated Network
  - Module.Main – Main module
  - Module.VRN – Violence Recognition Network
  - Module.Boilerplate – Abandoned API boilerplate
- tools – various tools used during implementation and training
- docker-compose.yml – configuration file
docker-compose.yml – startup file with the following configuration options:
- CAM_ADDRESS – video stream address
- FGN_ENABLED – enable/disable the FGN module
- VRN_ENABLED – enable/disable the VRN module
- DIDN_ENABLED – enable/disable the DIDN module
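
To show how a module might consume the options listed above, here is a minimal sketch that reads them from environment variables. Only the four variable names come from this README. The `Settings` class, the helper functions, the accepted boolean spellings and the defaults are all assumptions made for illustration, so the project's own parsing may differ.

```python
import os
from dataclasses import dataclass

# Assumption: these spellings count as "enabled"; the project may use others.
TRUE_VALUES = {"1", "true", "yes", "on"}


def env_flag(env, name, default=True):
    """Read a boolean switch from a mapping of environment variables."""
    raw = env.get(name)
    if raw is None:
        return default  # assumption: modules are enabled unless switched off
    return raw.strip().lower() in TRUE_VALUES


@dataclass
class Settings:  # hypothetical container, not part of the project
    cam_address: str
    fgn_enabled: bool
    vrn_enabled: bool
    didn_enabled: bool


def load_settings(env=None):
    """Build Settings from os.environ, or from a supplied mapping."""
    env = os.environ if env is None else env
    return Settings(
        cam_address=env.get("CAM_ADDRESS", ""),
        fgn_enabled=env_flag(env, "FGN_ENABLED"),
        vrn_enabled=env_flag(env, "VRN_ENABLED"),
        didn_enabled=env_flag(env, "DIDN_ENABLED"),
    )


if __name__ == "__main__":
    print(load_settings())
```

Passing a plain dictionary instead of `os.environ` keeps the parsing easy to check without starting the containers.
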
- docker-compose up --build – builds the images and starts the containers
- localhost:5341 – URL address for Seq