This repository contains the code and documentation for a project focused on predicting heart disease using machine learning techniques. The project was developed as part of the Data Science and Machine Learning course by Zero to Mastery (ZTM).
The goal of this project was to build a machine learning model that can predict whether or not a patient has heart disease based on their medical attributes. The project involved several key steps:
- Problem Definition: Can we predict heart disease based on clinical parameters?
- Data Collection: The dataset comes from the Cleveland database in the UCI Machine Learning Repository; a preformatted version was obtained from Kaggle.
- Exploratory Data Analysis (EDA): Explored the dataset to understand its features and discover meaningful patterns.
- Modeling: Three models were tested: Logistic Regression, K-Nearest Neighbors, and Random Forest.
- Evaluation: The models were evaluated using accuracy, precision, recall, F1-score, and AUC.
- Hyperparameter Tuning: Tuned the models' hyperparameters to improve performance.
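The modeling and evaluation steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook code: the data is a synthetic stand-in generated with `make_classification` (an assumption, used only so the snippet runs on its own), and the model settings are defaults apart from `max_iter` and `random_state`.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in data (assumption): a synthetic binary-classification set.
# Replace X and y with the features and target of the real dataset.
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=42
)

# Hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# The three model families named in the steps above.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "K-Nearest Neighbors": KNeighborsClassifier(),
    "Random Forest": RandomForestClassifier(random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # Probability of the positive class, needed for AUC.
    y_prob = model.predict_proba(X_test)[:, 1]
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
        "auc": roc_auc_score(y_test, y_prob),
    }

for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

Scores printed by this sketch describe the synthetic data only and say nothing about the results on the heart disease dataset.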
The Random Forest model achieved the highest accuracy, though all three models were compared on their strengths and weaknesses.
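As an illustration of the tuning step, the sketch below searches a small Random Forest hyperparameter grid and then re-scores the best model on several metrics with cross-validation. The search method (`RandomizedSearchCV`), the parameter ranges, and the synthetic stand-in data are all assumptions made for the example; the README does not state which search strategy or ranges the project used.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, cross_val_score

# Stand-in data (assumption): replace with the real features and target.
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=42
)

# Hypothetical search space, chosen only for illustration.
param_distributions = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 3, 5, 10],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 4],
}

# Try 5 random combinations, each scored with 5-fold cross-validation.
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions=param_distributions,
    n_iter=5,
    cv=5,
    scoring="accuracy",
    random_state=42,
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best cross-validated accuracy:", round(search.best_score_, 3))

# Re-score the best configuration on several metrics.
cv_scores = {}
for metric in ["accuracy", "precision", "recall", "f1", "roc_auc"]:
    scores = cross_val_score(search.best_estimator_, X, y, cv=5, scoring=metric)
    cv_scores[metric] = scores.mean()
    print(metric, round(cv_scores[metric], 3))
```

Because the same data is used for both the search and the re-scoring, these numbers are optimistic; a held-out test set gives a fairer estimate.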
Technologies Used:
- Python
- Pandas
- NumPy
- Matplotlib
- Seaborn
- Scikit-learn
This project was developed as a milestone project under the guidance of Zero to Mastery (ZTM). Special thanks to the ZTM community for their support and feedback.