Data Wrangling:
- Handle missing data (drop or impute)
- Detect and treat outliers
- Remove duplicate rows
- Drop unnecessary columns
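
The wrangling steps above could be sketched with pandas roughly as follows. The `games` table, its column names, and the `wrangle` helper are all hypothetical, and the choices to drop rows with missing values and to filter outliers with the 1.5 * IQR rule are assumptions; imputation or a different outlier rule may suit the real data better.

```python
import numpy as np
import pandas as pd

# Hypothetical game log; every column name here is an illustrative assumption.
games = pd.DataFrame({
    "game_id": [1, 2, 2, 3, 4, 5],
    "runs": [4.0, 5.0, 5.0, np.nan, 3.0, 40.0],
    "hits": [8, 9, 9, 7, 6, 10],
    "notes": ["", "", "", "", "", ""],
})


def wrangle(df, drop_cols, outlier_col):
    """Drop unneeded columns, duplicates, missing values, and IQR outliers."""
    out = df.drop(columns=drop_cols)        # drop unnecessary columns
    out = out.drop_duplicates()             # remove exact duplicate rows
    out = out.dropna(subset=[outlier_col])  # missing data: dropped here (assumption)

    # Outliers: keep rows within 1.5 * IQR of the quartiles (assumed rule).
    q1, q3 = out[outlier_col].quantile([0.25, 0.75])
    iqr = q3 - q1
    keep = out[outlier_col].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return out[keep].reset_index(drop=True)


clean = wrangle(games, drop_cols=["notes"], outlier_col="runs")
print(clean)
```

The order matters a little here: duplicates and missing values are removed before the quartiles are computed so they do not distort the outlier bounds.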
Exploration:
- Scatterplots to check for relationships between numeric variables
- Bar charts for relationships between categorical and numeric variables
- Boxplots to summarize the distribution of each variable
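
A minimal sketch of the three plot types listed above, using matplotlib. The `team_games` table and its columns are made up for illustration, and plotting mean runs per team in the bar chart is an assumed choice of aggregate.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical per-game data; column names are illustrative assumptions.
team_games = pd.DataFrame({
    "team": ["A", "A", "B", "B", "C", "C"],
    "hits": [8, 10, 6, 7, 11, 9],
    "runs": [4, 6, 2, 3, 7, 5],
})

fig, axes = plt.subplots(1, 3, figsize=(12, 4))

# Scatterplot: numeric vs. numeric.
axes[0].scatter(team_games["hits"], team_games["runs"])
axes[0].set_xlabel("hits")
axes[0].set_ylabel("runs")

# Bar chart: categorical vs. numeric (mean runs per team, an assumed aggregate).
mean_runs = team_games.groupby("team")["runs"].mean()
axes[1].bar(list(mean_runs.index), mean_runs.to_numpy())
axes[1].set_xlabel("team")
axes[1].set_ylabel("mean runs")

# Boxplots: distribution of each numeric variable.
axes[2].boxplot([team_games["hits"].to_numpy(), team_games["runs"].to_numpy()])
axes[2].set_xticks([1, 2])
axes[2].set_xticklabels(["hits", "runs"])

fig.tight_layout()
```

Calling `fig.savefig(...)` with a path of your choice writes the figure to disk; in a notebook the figure can be displayed directly instead.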
Feature Engineering:
- Stack home and away data
- Engineer OPS (on-base plus slugging) for each team as the target (y) variable
- Generate a boolean variable indicating whether a team is home or away
- Create the feature matrix (X) and target variable (y)
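
The feature engineering steps above might look something like the sketch below. It assumes a wide table with one row per game and `home_`/`away_` prefixed columns; the table, the column names, and the helpers `stack_home_away` and `add_ops` are hypothetical. OPS is computed with the standard definitions OBP = (H + BB + HBP) / (AB + BB + HBP + SF) and SLG = TB / AB. Which columns belong in the feature matrix is not stated in the outline, so `feature_cols` is only a placeholder.

```python
import pandas as pd

# Hypothetical wide table: one row per game, home_* and away_* batting totals.
games = pd.DataFrame({
    "game_id": [1, 2],
    "home_team": ["A", "B"], "away_team": ["B", "A"],
    "home_H": [10, 8], "away_H": [7, 9],
    "home_BB": [3, 2], "away_BB": [2, 4],
    "home_HBP": [1, 0], "away_HBP": [0, 1],
    "home_SF": [4, 1], "away_SF": [1, 0],
    "home_AB": [32, 33], "away_AB": [31, 34],
    "home_TB": [16, 12], "away_TB": [10, 15],
})


def stack_home_away(df):
    """Return one row per team per game, with a boolean is_home flag."""
    frames = []
    for side in ("home", "away"):
        prefix = side + "_"
        # Map e.g. "home_H" -> "H" so both sides share the same column names.
        rename = {c: c[len(prefix):] for c in df.columns if c.startswith(prefix)}
        part = df[["game_id"] + list(rename)].rename(columns=rename)
        part["is_home"] = side == "home"
        frames.append(part)
    return pd.concat(frames, ignore_index=True)


def add_ops(df):
    """Add OPS = OBP + SLG using the standard formulas."""
    out = df.copy()
    obp = (out["H"] + out["BB"] + out["HBP"]) / (
        out["AB"] + out["BB"] + out["HBP"] + out["SF"]
    )
    slg = out["TB"] / out["AB"]
    out["OPS"] = obp + slg
    return out


stacked = add_ops(stack_home_away(games))

# Placeholder feature list (assumption); avoid using the OPS components
# themselves as predictors, since that would leak the target.
feature_cols = ["is_home", "AB"]
X = stacked[feature_cols]
y = stacked["OPS"]
```

Stacking first and computing OPS afterwards means the formula is written once against unprefixed columns rather than separately for the home and away sides.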