NBA Hall of Fame Prediction Model

Authors: Chaz Frazer & Ryan Lewis

Overview

This project applies machine learning classification to predict whether eligible NBA players will be voted into the Naismith Memorial Basketball Hall of Fame. For our purposes, we focus on predictive rather than inferential modeling. Precision is the key metric for our final model: we want to maximize positive predictive value, so that when the model predicts a player is a Hall of Famer (HOFer), that player actually is one.
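To make the precision goal concrete, here is a minimal sketch of how precision (positive predictive value) is computed from prediction counts. The counts are hypothetical and chosen only for illustration; they are not results from this project.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Share of predicted HOFers who really are HOFers."""
    predicted_positives = true_positives + false_positives
    if predicted_positives == 0:
        # No positive predictions at all: report 0.0 rather than dividing by zero.
        return 0.0
    return true_positives / predicted_positives


# Hypothetical counts, for illustration only (not the project's results).
tp = 18  # players predicted HOF who are in the Hall of Fame
fp = 2   # players predicted HOF who are not

example_precision = precision(tp, fp)
print(f"Precision: {example_precision:.1%}")
```

A false positive lowers precision directly, which is why a precision-focused model is conservative about labeling a player an HOFer.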

Business Problem

Of the 4,374 eligible NBA players, only 177 have been voted into the Hall of Fame. Given this imbalance, can we accurately predict whether an eligible NBA player will be elected into the Naismith Memorial Basketball Hall of Fame based on career statistics and accolades?
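The two counts above imply a heavily imbalanced target. The short calculation below works out the Hall of Fame rate (about 4.05%) and the accuracy of a trivial model that never predicts HOF (about 95.95%), which is the baseline any accuracy figure should be read against.

```python
# Counts taken from the text above.
eligible_players = 4374
hall_of_famers = 177

# Share of eligible players who are in the Hall of Fame.
positive_rate = hall_of_famers / eligible_players

# Accuracy of a trivial model that predicts "not HOF" for every player.
baseline_accuracy = (eligible_players - hall_of_famers) / eligible_players

print(f"HOF rate: {positive_rate:.2%}")
print(f"Always-negative baseline accuracy: {baseline_accuracy:.2%}")
```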

Data

The data used in this project was originally pulled from Kaggle (via basketball-reference.com) and consists of NBA player seasonal statistics from 1950 to 2017. The data set includes box score statistics such as points and rebounds, games played, and more advanced statistics such as Win Share. We aggregated these seasonal statistics into career statistics for each player. Additionally, we sourced data on all-star selections, MVP awards, and NBA Finals appearances and merged it into the aggregated data set. Finally, we subset the data to only those players eligible to be voted into the Hall of Fame (retired before 2014, since players must be retired for 3 years in order to be eligible).
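The aggregate, merge, and filter steps described above can be sketched with pandas as follows. This is an illustration on toy rows, not the notebook's actual code: the column names (Player, Year, G, PTS, TRB, WS, all_star, mvp, finals) are assumptions, and a player's last recorded season is used as a stand-in for retirement year.

```python
import pandas as pd

# Toy seasonal rows; column names are assumed for illustration.
seasons = pd.DataFrame({
    "Player": ["A", "A", "B", "B", "C"],
    "Year":   [2001, 2002, 2012, 2016, 1999],
    "G":      [80, 82, 70, 60, 50],
    "PTS":    [1600, 1800, 700, 500, 300],
    "TRB":    [800, 820, 200, 150, 100],
    "WS":     [9.5, 11.0, 3.0, 2.0, 1.0],
})

# Aggregate seasonal rows into one career row per player.
career = (
    seasons.groupby("Player")
    .agg(
        G=("G", "sum"),
        PTS=("PTS", "sum"),
        TRB=("TRB", "sum"),
        WS=("WS", "sum"),
        last_season=("Year", "max"),
    )
    .reset_index()
)

# Hypothetical award table; players without awards are simply absent.
awards = pd.DataFrame({"Player": ["A"], "all_star": [5], "mvp": [1], "finals": [2]})

# Left merge keeps every player; missing award counts become 0.
career = career.merge(awards, on="Player", how="left")
award_cols = ["all_star", "mvp", "finals"]
career[award_cols] = career[award_cols].fillna(0).astype(int)

# Eligibility proxy (assumption): last recorded season before 2014.
eligible = career[career["last_season"] < 2014]
print(eligible)
```

A left merge is used so that players with no awards stay in the data set with zero counts instead of being dropped.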

Exploratory Data Analysis

Once our data was cleaned and formatted, we explored it to investigate which features correlate most strongly with our target variable, 'HOF'. It became evident that all-star appearances correlate strongly with the target and should be considered for our models.
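One simple way to rank features against the target is to compute each feature's correlation with 'HOF' and sort the result. The sketch below does this on a toy data frame; the values and the column names other than 'HOF' are made up for illustration, and the notebook may have used a different approach.

```python
import pandas as pd

# Toy career-level data; values are illustrative only.
df = pd.DataFrame({
    "HOF":      [0, 0, 0, 1, 1, 0],
    "all_star": [0, 0, 1, 8, 10, 0],
    "PTS":      [1000, 5000, 12000, 20000, 15000, 9000],
})

# Pearson correlation of every feature with the target, strongest first.
correlations = (
    df.corr()["HOF"]
    .drop("HOF")
    .sort_values(ascending=False)
)
print(correlations)
```

With a binary target, this Pearson coefficient is the point-biserial correlation, so it is a quick screening tool rather than evidence of predictive power on its own.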

Analyzing the data further, we can see the relationship that both Win Share and Points have with identifying HOFers. In the scatterplot below, you can see that as both metrics increase, more HOF players appear on the plot.

Feature Engineering

During EDA we created several custom features to better identify NBA HOFers. Below are the ones used in our final model:

'Title_Star' -- NBA players who have won a championship and have an all-star appearance
'20/10_B' -- NBA players who average over 20 points and 10 rebounds in their career
'Leader_Pts' -- NBA players who have scored more than 20,000 points in their career
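The three features above could be derived along these lines. This is a sketch under stated assumptions: the input column names (G, PTS, TRB, all_star, championships) are hypothetical, and '20/10_B' is interpreted as career per-game averages, which the description implies but does not state outright.

```python
import pandas as pd

# Toy career totals; column names are assumed for illustration.
players = pd.DataFrame({
    "Player":        ["A", "B", "C"],
    "G":             [1000, 800, 500],
    "PTS":           [25000, 17000, 4000],
    "TRB":           [11000, 3000, 2000],
    "all_star":      [10, 2, 0],
    "championships": [2, 0, 1],
})

# Title_Star: at least one championship and at least one all-star appearance.
players["Title_Star"] = (
    (players["championships"] > 0) & (players["all_star"] > 0)
).astype(int)

# 20/10_B: career averages above 20 points and 10 rebounds per game (assumed reading).
points_per_game = players["PTS"] / players["G"]
rebounds_per_game = players["TRB"] / players["G"]
players["20/10_B"] = (
    (points_per_game > 20) & (rebounds_per_game > 10)
).astype(int)

# Leader_Pts: more than 20,000 career points.
players["Leader_Pts"] = (players["PTS"] > 20000).astype(int)

print(players[["Player", "Title_Star", "20/10_B", "Leader_Pts"]])
```

Each feature is stored as a 0/1 flag so it can be passed to a classifier alongside the numeric career statistics.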

Modeling & Results

After EDA and feature engineering, we were ready to begin modeling. Below is a summary of results from several classification models.
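A comparison of this kind can be sketched with scikit-learn's cross-validation utilities. The example below is illustrative only: it uses a synthetic imbalanced data set in place of the real player data, and the three classifiers and their settings are assumptions, since the models compared in the notebook are not listed here.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, precision_score
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the player data: roughly 4% positives.
X, y = make_classification(
    n_samples=1000, n_features=8, n_informative=5,
    weights=[0.96, 0.04], random_state=0,
)

# Candidate models (assumed; not necessarily those in the notebook).
models = {
    "logistic_regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

scoring = {
    "accuracy": "accuracy",
    # zero_division=0 scores a fold with no positive predictions as 0.
    "precision": make_scorer(precision_score, zero_division=0),
    "f1": "f1",
}

# Stratified folds keep the rare positive class present in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for name, model in models.items():
    scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
    results[name] = {metric: scores[f"test_{metric}"].mean() for metric in scoring}

for name, metrics in results.items():
    print(name, {metric: round(value, 3) for metric, value in metrics.items()})
```

Reporting precision and F1 next to accuracy makes it visible when a model scores well mainly by predicting the majority class.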

Final Model - KNN

Most of our models above show high accuracy, but because of the class imbalance in our dataset, we cannot rely solely on accuracy to grade our models. As mentioned earlier, we focused on a high precision score in order to reduce our model's false positive predictions. This led us to choose KNN as our final model: it has the highest precision score at 95.2% and the second-strongest F1 score, showing that it performs well overall.
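For reference, a precision-oriented KNN workflow might look like the sketch below. It is not the project's final model: the data is synthetic, and the feature scaling, the grid of neighbor counts, and the train/test split are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score, make_scorer, precision_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic imbalanced data standing in for the career statistics.
X, y = make_classification(
    n_samples=1000, n_features=8, n_informative=5,
    weights=[0.96, 0.04], random_state=0,
)

# Stratify so the rare positive class appears in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# KNN is distance-based, so features are scaled first (an assumption here).
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# Tune the number of neighbors for precision rather than accuracy.
search = GridSearchCV(
    pipeline,
    param_grid={"knn__n_neighbors": [3, 5, 7, 9]},
    scoring=make_scorer(precision_score, zero_division=0),
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
)
search.fit(X_train, y_train)

# Evaluate the refit best model on the held-out test set.
y_pred = search.predict(X_test)
test_precision = precision_score(y_test, y_pred, zero_division=0)
test_f1 = f1_score(y_test, y_pred, zero_division=0)

print("Best n_neighbors:", search.best_params_["knn__n_neighbors"])
print(f"Test precision: {test_precision:.3f}")
print(f"Test F1: {test_f1:.3f}")
```

Checking F1 alongside precision guards against a model that reaches high precision only by making very few positive predictions.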

Next Steps

Update the data set to include data from the 2018-2020 seasons
Incorporate additional advanced statistics and foreign-born status
Use our final KNN model to predict Hall of Fame outcomes for current and not-yet-eligible NBA players

References

Kaggle Data Set -- https://www.kaggle.com/drgilermo/nba-players-stats
Additional Award Data -- https://www.basketball-reference.com/

Repository Structure

├── data
├── img
├── trials
│   └── nba_hof_workbook.ipynb
├── .gitignore
├── README.md
├── final_nba_hof.ipynb
└── nba_hof_predictions.pdf
