A repository for exploratory analysis of European football league data using SQL in Python.
- Jupyter Notebook
- SQLite
- pandas
- NumPy
- SciPy
- Matplotlib
- seaborn
I used the "European Soccer Database" from Kaggle. It covers fixtures from the 2008/09 season through 2015/16: over 25,000 matches and 11,000 players across 11 European top-flight leagues. It also includes information on each of the 296 teams, as well as player attributes sourced from the FIFA video game series.
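A quick way to get a feel for the database is to list its tables and row counts. The sketch below is a minimal example using only the standard-library `sqlite3` module. The helper name `table_row_counts`, the file path in the comment, and the two toy tables are assumptions for illustration, not part of this repository; the demo runs against a small in-memory stand-in rather than the real Kaggle file.

```python
import sqlite3


def table_row_counts(conn):
    """Return {table_name: row_count} for every user table in the database."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
    ]
    # Names come straight from sqlite_master, so double-quoting them is enough.
    return {
        name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
        for name in names
    }


# Stand-in for the real file: a tiny in-memory database with hypothetical tables.
# With the Kaggle download, connect to the file instead, e.g.
# sqlite3.connect("path/to/database.sqlite")  (path is an assumption).
demo = sqlite3.connect(":memory:")
demo.executescript(
    """
    CREATE TABLE "Team" (id INTEGER PRIMARY KEY, team_long_name TEXT);
    CREATE TABLE "Match" (id INTEGER PRIMARY KEY, season TEXT);
    INSERT INTO "Team" (team_long_name) VALUES ('Team A'), ('Team B');
    INSERT INTO "Match" (season)
        VALUES ('2008/2009'), ('2008/2009'), ('2009/2010');
    """
)
counts = table_row_counts(demo)
print(counts)
```

The same connection can be handed to `pandas.read_sql_query` to pull any table or query result into a DataFrame.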
Things to consider:
- What does the data look like?
- What is interesting about this data?
- What can we learn from the data?
Data cleaning, visualization, and analysis:
- Win/draw/loss percentage for each team, plus goals scored and conceded at home and away, per season
- Individual seasons with the highest (and lowest) margin of victory by country, and the average margin of victory across all 8 seasons for each country
- Addressing the "eye test": Are left-footed players really more creative/technically gifted?
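As an illustration of the first item, win/draw/loss percentages per team and season can be computed in a single SQL query run from Python. This is a sketch under stated assumptions, not the notebook's actual code: it assumes a `Match` table with `season`, `home_team_api_id`, `away_team_api_id`, `home_team_goal`, and `away_team_goal` columns, and it uses four made-up fixtures in an in-memory database so that it runs on its own.

```python
import sqlite3

# Each match is unpivoted into two rows (one per team) so that home and away
# results can be aggregated together. Comparisons evaluate to 0 or 1 in SQLite,
# so SUM(...) counts wins, draws, and losses.
WDL_QUERY = """
WITH results AS (
    SELECT season,
           home_team_api_id AS team_id,
           home_team_goal   AS goals_for,
           away_team_goal   AS goals_against
    FROM "Match"
    UNION ALL
    SELECT season,
           away_team_api_id,
           away_team_goal,
           home_team_goal
    FROM "Match"
)
SELECT season,
       team_id,
       COUNT(*) AS played,
       ROUND(100.0 * SUM(goals_for > goals_against) / COUNT(*), 1) AS win_pct,
       ROUND(100.0 * SUM(goals_for = goals_against) / COUNT(*), 1) AS draw_pct,
       ROUND(100.0 * SUM(goals_for < goals_against) / COUNT(*), 1) AS loss_pct
FROM results
GROUP BY season, team_id
ORDER BY season, team_id;
"""

# Toy data only: two hypothetical teams (ids 1 and 2) and four invented fixtures.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "Match" ('
    "season TEXT, home_team_api_id INTEGER, away_team_api_id INTEGER, "
    "home_team_goal INTEGER, away_team_goal INTEGER)"
)
conn.executemany(
    'INSERT INTO "Match" VALUES (?, ?, ?, ?, ?)',
    [
        ("2008/2009", 1, 2, 2, 0),  # team 1 wins at home
        ("2008/2009", 2, 1, 1, 1),  # draw
        ("2008/2009", 1, 2, 0, 3),  # team 2 wins away
        ("2008/2009", 2, 1, 0, 1),  # team 1 wins away
    ],
)
rows = conn.execute(WDL_QUERY).fetchall()
for row in rows:
    print(row)
```

Splitting the percentages into home and away figures would only require carrying a venue label through the `results` subquery and adding it to the `GROUP BY`.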