End-to-end heart disease risk prediction system using ML and Streamlit

Ankush-22/Heart-Disease-Risk-Prediction

Heart Disease Risk Prediction System

This project is an end-to-end machine learning system that predicts the risk of heart disease using clinical features. It is built using the UCI Cleveland Heart Disease dataset and deployed as an interactive Streamlit web application.

⚠️ This project is a research prototype and not a medical diagnostic tool.


Problem Statement

Early detection of heart disease is critical for preventive healthcare. This project aims to estimate the probability of heart disease based on patient clinical attributes using machine learning.


Dataset

  • Source: UCI Cleveland Heart Disease Dataset
  • Samples: 304 patients
  • Target:
    • 0 → No heart disease
    • 1 → Heart disease present

Only the Cleveland subset was used, because the other variants of the dataset contain corrupted labels and carry a risk of data leakage.
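The Cleveland-only filter and the 0/1 target described above could be sketched as follows. This is a minimal illustration, not the project's actual code: the column names `dataset` and `num` and the toy rows are assumptions, and the original 0 to 4 label scale is the one used by the UCI files.

```python
import pandas as pd

# Toy rows standing in for a combined multi-site file.
# Column names ("dataset", "num") are assumed for illustration.
raw = pd.DataFrame({
    "dataset": ["Cleveland", "Hungary", "Cleveland", "Switzerland", "Cleveland"],
    "age": [63, 41, 57, 50, 45],
    "num": [0, 1, 2, 3, 0],  # original UCI label: 0 = no disease, 1-4 = disease
})

# Keep only the Cleveland rows.
cleveland = raw[raw["dataset"] == "Cleveland"].copy()

# Collapse the 0-4 label into a binary target: 0 stays 0, anything above becomes 1.
cleveland["target"] = (cleveland["num"] > 0).astype(int)
cleveland = cleveland.drop(columns=["dataset", "num"]).reset_index(drop=True)

print(cleveland)
```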


Machine Learning Pipeline

  1. Data filtering (Cleveland-only)
  2. Target binarization
  3. Feature selection & cleanup
  4. Encoding:
    • Binary: sex, fbs, exang
    • One-hot: chest pain (cp), restecg
  5. Train/test split (stratified)
  6. Models:
    • Logistic Regression (baseline)
    • XGBoost (final model)
  7. Probability calibration (Isotonic Regression)
  8. Model explainability using SHAP
  9. Deployment using Streamlit

Model Performance (Test Set)

| Model                | Accuracy | F1 Score | ROC-AUC |
|----------------------|----------|----------|---------|
| Logistic Regression  | 0.87     | 0.85     | 0.92    |
| XGBoost (Calibrated) | 0.89     | 0.88     | 0.92    |

Isotonic calibration improved the reliability of the predicted probabilities (Brier score: 0.094; lower is better).
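For reference, the Brier score is the mean squared difference between each predicted probability and the 0/1 outcome, so 0 is perfect and a constant prediction of 0.5 scores 0.25. The sketch below computes it by hand on made-up values; the helper name `brier_score` is hypothetical, and scikit-learn offers the same metric as `sklearn.metrics.brier_score_loss`.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Made-up labels and predictions, for illustration only.
y_true = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.8, 0.2]      # close to the true outcomes
uninformative = [0.5, 0.5, 0.5, 0.5]  # always predicts 50%

print(brier_score(y_true, confident))
print(brier_score(y_true, uninformative))
```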


Key Insights (SHAP)

Top predictive features:

  • Sex
  • Oldpeak (ST depression)
  • Maximum heart rate (thalach)
  • Age
  • Chest pain type
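A ranking like this is commonly obtained by averaging the absolute SHAP value of each feature over all patients. For a tree model such as XGBoost, `shap.TreeExplainer` would typically supply those values. The dependency-free sketch below swaps in a simpler case instead: a linear model with independent features, where the SHAP value of feature j reduces to weight_j * (x_j - mean(x_j)). The weights, the rows, and the helper names are invented for illustration and are not results from this project.

```python
def linear_shap_values(weights, rows):
    """SHAP values for a linear model, assuming independent features."""
    n = len(rows)
    means = [sum(row[j] for row in rows) / n for j in range(len(weights))]
    return [
        [w * (x - m) for w, x, m in zip(weights, row, means)]
        for row in rows
    ]


def rank_features(names, shap_rows):
    """Sort features by mean absolute SHAP value, largest first."""
    n = len(shap_rows)
    importance = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(names)
    }
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)


# Hypothetical weights and patients (columns: oldpeak, thalach, age).
names = ["oldpeak", "thalach", "age"]
weights = [0.9, -0.03, 0.04]
rows = [
    [0.0, 170, 45],
    [1.5, 150, 55],
    [3.0, 120, 65],
    [0.5, 160, 50],
]

shap_rows = linear_shap_values(weights, rows)
ranking = rank_features(names, shap_rows)
for name, value in ranking:
    print(f"{name}: {value:.3f}")
```

Each row of SHAP values sums to that patient's score minus the average score, which is the additivity property that makes the per-feature contributions comparable.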

Tech Stack

  • Python
  • scikit-learn
  • XGBoost
  • SHAP
  • Streamlit
  • pandas, numpy
  • joblib

How to Run Locally

pip install -r requirements.txt
streamlit run app/app.py
