# Pitch2Data

A deep learning framework leveraging YOLO to transform football videos into data insights.

View Demo · Report Bug · Request Feature

Authors: Giulio Fantuzzi & Valentinis Alessio

## Video Demo

## About the project

AI and data science are currently revolutionizing football, from transfer market strategies to real-time match analysis and advanced probabilistic metrics (I bet you have heard of Expected Goals!).

A key requirement for these tasks is spatial data, which is often hard to obtain, as it is mostly either private or available only on a paid, per-request basis. While professional teams may have access to GPS systems to track their players, what options are available for someone wanting to perform these analyses at home?

This project offers an accessible solution: a pipeline that extracts spatial data directly from raw football videos. Leveraging a deep learning framework based on the YOLO architecture and fine-tuned on football-specific datasets, this system detects players, the ball (with interpolation capabilities), and identifies key pitch points. Additionally, a TeamAssigner module automatically assigns detected players to their respective teams.
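The ball-interpolation step can be illustrated with a short sketch. This is not the project's actual code: the function name and the pandas-based approach are assumptions; the idea is simply to fill frames where the detector missed the ball by interpolating between neighboring detections.

```python
import pandas as pd

def interpolate_ball_positions(detections):
    """Fill in missing ball detections (None) by linear interpolation.

    `detections` is a per-frame list of (x, y) ball centers, with None
    on frames where the detector missed the ball.
    """
    df = pd.DataFrame(
        [d if d is not None else (None, None) for d in detections],
        columns=["x", "y"],
    )
    # Linearly interpolate the gaps, then back-fill a possible leading gap
    df = df.interpolate(method="linear").bfill()
    return list(df.itertuples(index=False, name=None))

# Example: the ball is missed on frames 1 and 2
positions = [(100.0, 50.0), None, None, (130.0, 62.0)]
print(interpolate_ball_positions(positions))
# → [(100.0, 50.0), (110.0, 54.0), (120.0, 58.0), (130.0, 62.0)]
```

In a real pipeline the interpolated positions would then be drawn back onto the annotated video alongside the tracked players.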

## About the dataset

Building a custom dataset for object detection requires a huge manual effort for bounding-box annotation. This is where Roboflow comes to the rescue: a platform that facilitates the whole dataset creation, preprocessing, and annotation pipeline.

Roboflow offers a semi-automatic annotation tool that streamlines the entire process, along with easy-to-use Python APIs for seamless integration. Additionally, it provides access to a large collection of pre-existing public datasets for various computer vision tasks. For this project, we relied on three datasets:

> [!WARNING]
> All the datasets were built on frames from the DFL - Bundesliga Data Shootout Kaggle competition. This choice impacts performance, particularly because all clips were captured with wide-angle cameras. Consequently, applying the model to standard highlight views may result in reduced performance due to the significant difference in perspective.
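When combining several YOLO-format exports into one training set, the class indices of the individual datasets must not collide. The sketch below is hypothetical (the helper names and example classes are made up); only the standard YOLO label layout, one `cls x y w h` line per box, is taken as given.

```python
def remap_label_line(line, class_offset):
    """Shift the class id of one YOLO label line ('cls x y w h')."""
    parts = line.split()
    parts[0] = str(int(parts[0]) + class_offset)
    return " ".join(parts)

def merge_label_file(lines, class_offset):
    """Remap all label lines of one file, skipping blank lines."""
    return [remap_label_line(l, class_offset) for l in lines if l.strip()]

# Say dataset A has classes 0=player, 1=ball, and dataset B has
# 0=pitch_keypoint: appending B after A means shifting B's ids by 2.
labels_b = ["0 0.5 0.5 0.02 0.03"]
print(merge_label_file(labels_b, class_offset=2))
# → ['2 0.5 0.5 0.02 0.03']
```

The same remapping must, of course, be reflected in the merged `data.yaml` class list so that ids and names stay aligned.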

> [!NOTE]
> With a bit of manual effort, the dataset can be augmented to enhance its representativeness. We are confident that retraining the model with this augmented dataset will significantly improve performance when applied to standard highlight video perspectives.
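As a starting point, simple geometric augmentations can also be applied programmatically: in normalized YOLO coordinates, a horizontal flip only changes the x-center (x' = 1 − x). This is a hypothetical stdlib-only sketch; covering highlight-style perspectives would still require annotating new frames.

```python
def hflip_yolo_labels(lines):
    """Mirror YOLO-format labels ('cls x y w h', normalized to [0, 1])
    for a horizontally flipped image: only the x-center changes."""
    flipped = []
    for line in lines:
        cls, x, y, w, h = line.split()
        flipped.append(f"{cls} {1.0 - float(x):.6g} {y} {w} {h}")
    return flipped

print(hflip_yolo_labels(["0 0.25 0.4 0.1 0.2"]))
# → ['0 0.75 0.4 0.1 0.2']
```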

## Getting Started

To get started, first clone the repository (you may also want to fork it):

```bash
git clone https://github.com/giuliofantuzzi/Pitch2Data.git
```

> [!WARNING]
> The command above will download the full repository, including all the pre-trained weights. If you prefer a faster download without the weights, clone the `light-setup` branch by executing:
>
> ```bash
> git clone https://github.com/giuliofantuzzi/Pitch2Data.git --branch light-setup --single-branch --depth 1
> ```

We recommend creating a virtual environment to avoid conflicts with your system's Python packages:

```bash
python -m virtualenv Pitch2Data_env
source Pitch2Data_env/bin/activate
```

Once the environment is activated, install the required dependencies with:

```bash
pip install -r requirements.txt
```

To run the demo, one last configuration step is required:

```bash
sh setup.sh
```

Finally, a few sample videos can be downloaded from the terminal with:

```bash
python download_videos.py
```

## Demo

The script `demo.py` is provided to showcase the model's capabilities. To run it, execute:

```bash
python demo.py <YOUR_VIDEO_NAME> --[options]
```

To learn more about the available options (strongly recommended), run:

```bash
python demo.py --help
```

> [!NOTE]
> If you have cloned the `light-setup` branch, you will need to fine-tune the model on your own. To do so, you can use the notebooks provided in the `tuning/` folder.