This repository has been archived by the owner on Feb 22, 2024. It is now read-only.

notes.md

File metadata and controls

21 lines (15 loc) · 481 Bytes

Workflow

Data Wrangling:

  • Handle missing data
  • Detect and treat outliers
  • Remove duplicate rows
  • Drop unnecessary columns
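
A minimal pandas sketch of the four wrangling steps above. The toy data, the column names (`game_id`, `runs`, `hits`, `notes`), and the helper `wrangle` are hypothetical, and the policies chosen here (dropping rows with missing values, a 1.5 × IQR outlier rule) are assumptions for illustration rather than the project's actual choices.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names are illustrative, not from the project.
raw = pd.DataFrame({
    "game_id": [1, 2, 2, 3, 4, 5],
    "runs": [4.0, 5.0, 5.0, np.nan, 3.0, 60.0],
    "hits": [8, 9, 9, 7, 6, 10],
    "notes": ["", "", "", "", "", ""],
})

def wrangle(df, drop_cols, outlier_cols):
    """Apply the four wrangling steps in a fixed order (assumed, not prescribed)."""
    out = df.drop(columns=drop_cols)   # drop unnecessary columns
    out = out.drop_duplicates()        # remove exact duplicate rows
    out = out.dropna()                 # missing data: simplest policy, drop the row
    # Outliers: keep rows that lie within 1.5 * IQR on every listed column.
    num = out[outlier_cols]
    q1, q3 = num.quantile(0.25), num.quantile(0.75)
    iqr = q3 - q1
    within = ((num >= q1 - 1.5 * iqr) & (num <= q3 + 1.5 * iqr)).all(axis=1)
    return out[within].reset_index(drop=True)

clean = wrangle(raw, drop_cols=["notes"], outlier_cols=["runs", "hits"])
print(clean)
```

Imputing missing values or capping outliers instead of dropping rows would be equally reasonable; which is better depends on how much data is lost.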

Exploration:

  • Scatterplots to check for relationships between numeric variables
  • Bar charts for relationships between categorical and numeric variables
  • Boxplots to summarize the distribution of each variable
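
The three plot types above could be sketched with pandas and matplotlib as follows. The data frame `games` and its columns (`venue`, `hits`, `runs`) are made up for illustration, and the non-interactive `Agg` backend is an assumption so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy data; column names are illustrative only.
games = pd.DataFrame({
    "venue": ["home", "away", "home", "away", "home", "away"],
    "hits": [8, 6, 10, 7, 9, 5],
    "runs": [4, 2, 6, 3, 5, 1],
})

fig, axes = plt.subplots(1, 3, figsize=(12, 4))

# Scatterplot: relationship between two numeric variables.
axes[0].scatter(games["hits"], games["runs"])
axes[0].set_xlabel("hits")
axes[0].set_ylabel("runs")

# Bar chart: a numeric summary (mean runs) per category (venue).
mean_runs = games.groupby("venue")["runs"].mean()
axes[1].bar(mean_runs.index, mean_runs.values)
axes[1].set_ylabel("mean runs")

# Boxplots: distribution summary for each numeric variable.
games[["hits", "runs"]].plot.box(ax=axes[2])

fig.tight_layout()
# fig.savefig("exploration.png")  # uncomment to write the figure to disk
```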

Feature Engineering:

  1. Stack home and away data
  2. Engineer OPS for each team (y variable)
  3. Generate a boolean variable indicating whether each team is home or away
  4. Create the feature matrix and target variable
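
One way the four steps above might look in pandas. This sketch assumes a wide table with one row per game and `home_`/`away_` prefixed columns, and it assumes OPS means on-base plus slugging (OBP + SLG); the notes confirm neither. The column names, the helper `stack_side`, and the choice of `hits` as a feature are all hypothetical.

```python
import pandas as pd

# Hypothetical wide table: one row per game, home_* and away_* columns.
games = pd.DataFrame({
    "game_id": [1, 2],
    "home_team": ["A", "B"],
    "away_team": ["B", "A"],
    "home_obp": [0.350, 0.300],
    "home_slg": [0.450, 0.400],
    "away_obp": [0.310, 0.320],
    "away_slg": [0.380, 0.410],
    "home_hits": [9, 7],
    "away_hits": [6, 8],
})

def stack_side(df, side):
    """Select one side's columns, strip the prefix, and flag the side (step 3)."""
    prefix = side + "_"
    cols = [c for c in df.columns if c.startswith(prefix)]
    out = df[["game_id"] + cols].rename(
        columns=lambda c: c[len(prefix):] if c.startswith(prefix) else c
    )
    out["is_home"] = side == "home"  # boolean home/away indicator
    return out

# Step 1: stack home and away rows into one long table (one row per team-game).
stacked = pd.concat(
    [stack_side(games, "home"), stack_side(games, "away")],
    ignore_index=True,
)

# Step 2: target variable, assuming OPS = OBP + SLG.
stacked["ops"] = stacked["obp"] + stacked["slg"]

# Step 4: feature matrix X and target y. OBP and SLG are left out of X
# because they would determine the target exactly.
X = stacked[["is_home", "hits"]]
y = stacked["ops"]
print(X.join(y))
```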