This project predicts whether a diabetic patient will be readmitted to the hospital after discharge using machine learning models.
Hospital readmissions are costly and often preventable.
The goal of this project is to identify patients at high risk of readmission
so hospitals can improve discharge planning and follow-up care.
- Diabetes Readmission Dataset (UCI / Kaggle)
- ~100,000 patient encounters
- Target: Any readmission (Yes / No)

Dataset available from UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/diabetes+130-us+hospitals+for+years+1999-2008
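
The binary target described above can be derived from the raw readmission label. The sketch below is illustrative only and is not taken from the project code. It assumes the public UCI CSV stores the label in a `readmitted` column with the values `NO`, `<30`, and `>30`. The inline sample rows and the helper names `to_binary_target` and `load_targets` are hypothetical.

```python
import csv
import io

# Hypothetical three-row sample. The column name and label values are
# assumed to follow the layout of the public UCI CSV.
SAMPLE_CSV = """encounter_id,readmitted
1,NO
2,<30
3,>30
"""


def to_binary_target(label: str) -> int:
    """Map a raw label to 1 (any readmission) or 0 (no readmission)."""
    return 0 if label.strip().upper() == "NO" else 1


def load_targets(handle) -> list[int]:
    """Read the assumed `readmitted` column and binarize every row."""
    return [to_binary_target(row["readmitted"]) for row in csv.DictReader(handle)]


targets = load_targets(io.StringIO(SAMPLE_CSV))
print(targets)  # [0, 1, 1]
```

Collapsing `<30` and `>30` into a single positive class matches the "any readmission" target. A 30-day-only target would instead treat `>30` as negative.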
- Data cleaning and preprocessing
- Feature engineering from clinical records
- Training and comparing multiple ML models
- Model evaluation using ROC-AUC and Recall
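
To make the two evaluation metrics concrete, here is a minimal, dependency-free sketch of how recall and ROC-AUC can be computed. It is not the project's evaluation code. The function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions chosen for illustration.

```python
def recall(y_true, y_pred):
    """Fraction of truly readmitted patients (label 1) that were flagged."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def roc_auc(y_true, scores):
    """Probability that a random positive outranks a random negative.

    Ties count as half a win (the pairwise, Mann-Whitney form of ROC-AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data: 1 = readmitted, 0 = not readmitted (hypothetical values).
y_true = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(f"recall  = {recall(y_true, y_pred):.3f}")   # 0.667
print(f"roc_auc = {roc_auc(y_true, scores):.3f}")  # 0.833
```

Recall depends on the chosen threshold, while ROC-AUC summarizes ranking quality across all thresholds, so the two metrics answer different questions about the same model.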
- Logistic Regression
- Random Forest
- XGBoost
- LightGBM
- Support Vector Machine
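
A comparison across models might be structured along the following lines. This is a hedged sketch, not the contents of `model_experiments.py`. It assumes scikit-learn is installed, substitutes synthetic data from `make_classification` for the cleaned patient features, and covers only two of the five models to keep dependencies small. All hyperparameters shown are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the preprocessed feature matrix (assumption).
X, y = make_classification(
    n_samples=2000, n_features=20, weights=[0.6, 0.4], random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Two of the listed model families; the others would slot into this dict.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    proba = model.predict_proba(X_test)[:, 1]  # score for the positive class
    preds = model.predict(X_test)              # default decision threshold
    results[name] = {
        "roc_auc": roc_auc_score(y_test, proba),
        "recall": recall_score(y_test, preds),
    }

for name, metrics in results.items():
    print(f"{name}: ROC-AUC={metrics['roc_auc']:.3f}, recall={metrics['recall']:.3f}")
```

Keeping the models in a dictionary means every candidate is trained and scored on the same split, which keeps the metric comparison fair.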
Among the models compared, XGBoost achieved the best overall performance, with strong recall on patients who were readmitted.
- model_experiments.py – Main training and evaluation pipeline
- interactive_analysis.py – Interactive and exploratory analysis utilities
- Complete Final Code.pdf – Final project report
- Clone the repository
- Install required libraries
- Run the Python script 'interactive_analysis.py'
- Gowtham
- Hye Eunkg