This project uses MoViNet models to detect violence in video streams. Through transfer learning and fine-tuning, it aims to produce a model that runs efficiently on edge devices with limited computational resources, such as a Raspberry Pi or other single-board computers (SBCs). The MoViNet-A3 model has shown promising results, reaching 85% accuracy, which suggests it is suitable for real-time applications where fast and reliable video analysis is critical.
- Model Training: Uses MoViNets, an efficient architecture from Google Research (Kondratyuk et al., 2021) designed for mobile and edge computing environments. Transfer learning from models pre-trained on human action recognition improves learning efficiency and reduces the computational resources needed for training. The code is available in 'movinet_training.ipynb'. The training and evaluation metrics are available in the folders above, along with a further analysis of those results.
- Real-time Operation: Optimized for real-time applications, providing fast and accurate violence detection. Inference can be run through 'movinet_inference.ipynb'.
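For a live stream, one common pattern is to keep a fixed-length buffer of recent frames and score it at regular intervals. The sketch below illustrates that pattern only; it is not necessarily how the notebook works. `classify_clip` is a hypothetical callable standing in for the model, and the window and stride values are illustrative, not settings from this project.

```python
from collections import deque
from typing import Callable, Iterable, Iterator, Sequence


def sliding_window_scores(
    frames: Iterable[object],
    classify_clip: Callable[[Sequence[object]], float],
    window: int = 16,   # illustrative clip length, not a project setting
    stride: int = 8,    # illustrative step between scored clips
) -> Iterator[tuple[int, float]]:
    """Yield (index_of_last_frame, score) for each scored window of frames."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    buffer: deque = deque(maxlen=window)  # oldest frames drop out automatically
    since_last = 0
    for index, frame in enumerate(frames):
        buffer.append(frame)
        since_last += 1
        # Score only when the buffer is full and enough new frames have arrived.
        if len(buffer) == window and since_last >= stride:
            yield index, classify_clip(list(buffer))
            since_last = 0
```

A smaller stride gives more frequent predictions at the cost of more model calls, which matters on a single-board computer.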
More examples are available in the 'example_videos' folder.
You can use the Colab Notebook directly here (RECOMMENDED)
Alternatively, you can run the Python script in a virtual environment:
- Python 3.10+
- TensorFlow 2.15+
- A Linux distribution
- Other dependencies listed in requirements.txt
- Clone the repository
git clone https://github.com/engares/MoViNets-for-Violence-Detection-in-Live-Video-Streaming.git
cd MoViNets-for-Violence-Detection-in-Live-Video-Streaming
- Install the required packages
pip install -r requeriments.txt
sudo apt update && sudo apt install -y ffmpeg
- Download the models (1.8 GB, this may take time)
git clone https://huggingface.co/engares/MoViNet4Violence-Detection
- Run 'movinet_inference.py', passing the path to the video and selecting one of the trained models by its hyperparameters. (The best model is chosen by default.)
python movinet_inference.py [/path/to/video.mp4] --model_id a3 --lr 0.001 --bs 64 --dr 0.3 --trly 0
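To process several videos, the command above can be wrapped in a small helper. This is a sketch, not part of the repository: `build_command` and `run_on_folder` are hypothetical names, the defaults simply mirror the values in the example command, and only the flags shown above are used.

```python
import subprocess
import sys
from pathlib import Path


def build_command(video, model_id="a3", lr=0.001, bs=64, dr=0.3, trly=0):
    """Build the argument list for one call to movinet_inference.py."""
    return [
        sys.executable, "movinet_inference.py", str(video),
        "--model_id", str(model_id),
        "--lr", str(lr),
        "--bs", str(bs),
        "--dr", str(dr),
        "--trly", str(trly),
    ]


def run_on_folder(folder, pattern="*.mp4", **options):
    """Run the inference script once per matching video, in sorted order."""
    for video in sorted(Path(folder).glob(pattern)):
        # check=True stops at the first video whose run fails.
        subprocess.run(build_command(video, **options), check=True)
```

Calling `run_on_folder("example_videos")` from the repository root would run the script on each `.mp4` file in that folder, assuming the script is in the current working directory.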
The full list of models with their performance metrics is available in this .csv
Note: This TensorFlow implementation does not work with TF Lite.
Tested on an Orange Pi 5
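A table like this can be used to choose which model to pass to the script. The sketch below assumes the file has a header row and a numeric metric column; the column name `accuracy` and the function names are assumptions, so adjust them to match the actual file.

```python
import csv
from typing import Mapping, Sequence


def load_rows(path: str) -> list[dict[str, str]]:
    """Read a CSV file with a header row into a list of dictionaries."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def best_row(rows: Sequence[Mapping[str, str]], metric: str = "accuracy"):
    """Return the row with the highest value in the given metric column."""
    if not rows:
        raise ValueError("no rows to compare")
    # CSV values are strings, so convert to float before comparing.
    return max(rows, key=lambda row: float(row[metric]))
```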
- Python 3.10+
- TensorFlow 2.15+
- Jupyter Notebook
- A single-board computer running a Linux distribution
- Other dependencies listed in requirements_tf_lite.txt
- Clone the repository
git clone https://github.com/engares/MoViNets-for-Violence-Detection-in-Live-Video-Streaming.git
cd MoViNets-for-Violence-Detection-in-Live-Video-Streaming
- Install the required packages
sudo apt-get install pkg-config libhdf5-dev
pip install -r ./MoViNets-for-Violence-Detection-in-Live-Video-Streaming/requeriments_tflite.txt
- Open 'movinet_tf_lite_inference.ipynb' and select the video file. The notebook is also available here on Colab, if you want to connect it to your local runtime.
Note: The default TF Lite model is the one available in this repo, corresponding to the best trained model. If you want to export your own TF Lite model, check the last section of this Colab Notebook.