BIOF509 Final Project

Can heart attacks be predicted? This project attempts to answer that using machine learning algorithms? This project first implements 3 unsupervised learning algorithms (K-Means, density based clustering, and hierarchical clustering) to group the studied population into observable clusters. Then 3 supervised learning algorithms (SVM, Decision Tree and Bayesian) are used to predict at risk patients. The implemented algorithms efficiency are compared.

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.

Prerequisites

This tool uses numpy, pandas, seaborn, umap, matplotlib and sklearn. If you do not have all the needed libraries to run the project, please install using pip install.

pip install missing_library

example to install seaborn
pip install seaborn

Getting latest version of the tool on your local environment

Step by step instruction on how to get a development env running

Clone project on your local machine.

git clone https://github.com/halekpetigo/BIOF509

Open FAES-BIOF-509-Hale_Kpetigo-FinalProject.ipynb with Jupyter Lab.

There are two classes in the tool. BIOF509Unsupervised and BIOF509Supervised.

BIOF509Unsupervised is used to analyze and classify the two example datasets provided. BIOF509Supervised is used to analyze and classify the two example datasets provided.

The two datasets provided are heart.csv and heart 2.csv. heart.csv is the complete Heart Disease Dataset from the University of California, Irvine. It contains 1025 records of patients from Cleveland, Hungarian, Switzerland and Long Beach VA. heart 2.csv contains 303 records which are the dataset from Cleveland.

BIOF509Unsupervised has four main methods.

data_analysis: Used to provide general analysis of the dataset
BIOF509Kmeans: Used to classify data using K-Means
BIOF509Density: Used to classify data using Density clustering
BIOF509Hierarchical: Used to classify data using Hierarchical Clustering

BIOF509Supervised has three main methods.

BIOF509SVM: Used to predict using SVM
BIOFDecisionTree: Used to predict using Decision Tree
BIOFBayesian: Used to predict using Gaussian Naive Bayes

BIOF509Unsupervised

Running BIOF509Unsupervised data_analysis. Play Cell 2 and Cell 3. In Cell 4 run the code below

# Load Cleveland only dataset
FinalProject = BIOF509Unsupervised("heart 2.csv")
# Second dataset used for comparision. Uncomment to use whole 1025 dataset
# FinalProject = BIOF509Unsupervised("heart.csv")
FinalProject.data_analysis()

Running BIOF509Unsupervised BIOF509Kmeans. Play Cell 2 and Cell 3. In Cell 4 run the code below

# Load Cleveland only dataset
FinalProject = BIOF509Unsupervised("heart 2.csv")
# Second dataset used for comparision. Uncomment to use whole 1025 dataset
# FinalProject = BIOF509Unsupervised("heart.csv")

FinalProject.BIOF509Kmeans()

Running BIOF509Unsupervised BIOF509Density. Play Cell 2 and Cell 3. In Cell 4 run the code below

# Load Cleveland only dataset
FinalProject = BIOF509Unsupervised("heart 2.csv")
# Second dataset used for comparision. Uncomment to use whole 1025 dataset
# FinalProject = BIOF509Unsupervised("heart.csv")

FinalProject.BIOF509Density()

Running BIOF509Unsupervised BIOF509Hierarchical. Play Cell 2 and Cell 3. In Cell 4 run the code below

# Load Cleveland only dataset
FinalProject = BIOF509Unsupervised("heart 2.csv")
# Second dataset used for comparision. Uncomment to use whole 1025 dataset
# FinalProject = BIOF509Unsupervised("heart.csv")

FinalProject.BIOF509Hierarchical()

BIOF509Supervised

Running BIOF509Supervised BIOF509SVM. Play Cell 2 and Cell 5. In Cell 20 run the code below

# Load Cleveland only dataset
FinalSupProject = BIOF509Supervised("heart 2.csv")
# Second dataset used for comparision. Uncomment to use whole 1025 dataset
#FinalSupProject = BIOF509Supervised("heart.csv")

FinalSupProject.BIOF509SVM()

Running BIOF509Supervised BIOFDecisionTree. Play Cell 2 and Cell 5. In Cell 20 run the code below

# Load Cleveland only dataset
FinalSupProject = BIOF509Supervised("heart 2.csv")
# Second dataset used for comparision. Uncomment to use whole 1025 dataset
#FinalSupProject = BIOF509Supervised("heart.csv")

FinalSupProject.BIOFDecisionTree()

Running BIOF509Supervised BIOFBayesian. Play Cell 2 and Cell 5. In Cell 20 run the code below

# Load Cleveland only dataset
FinalSupProject = BIOF509Supervised("heart 2.csv")
# Second dataset used for comparision. Uncomment to use whole 1025 dataset
#FinalSupProject = BIOF509Supervised("heart.csv")

FinalSupProject.BIOFBayesian()

Built With

Jupyter Lab - The data science IED used
Anaconda - Python Environment Manager

Datasets

Authors

Hale Kpetigo - Github Repo

License

This project is licensed under the MIT License - see the LICENSE.md file for details

Acknowledgments

Special thanks to James Anibal and Christina

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
Demo - Presentation.pdf		Demo - Presentation.pdf
Demo - Presentation.pptx		Demo - Presentation.pptx
FAES-BIOF-509-Hale_Kpetigo-FinalProject.ipynb		FAES-BIOF-509-Hale_Kpetigo-FinalProject.ipynb
FAES-BIOF-509-Hale_Kpetigo-FinalProjectReport.docx		FAES-BIOF-509-Hale_Kpetigo-FinalProjectReport.docx
FAES-BIOF-509-Hale_Kpetigo-FinalProjectReport.pdf		FAES-BIOF-509-Hale_Kpetigo-FinalProjectReport.pdf
README.md		README.md
heart 2.csv		heart 2.csv
heart.csv		heart.csv
pairplot.png		pairplot.png

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

BIOF509 Final Project

Getting Started

Prerequisites

Getting latest version of the tool on your local environment

Built With

Datasets

Authors

License

Acknowledgments

About

Uh oh!

Releases

Packages

Languages

halekpetigo/BIOF509

Folders and files

Latest commit

History

Repository files navigation

BIOF509 Final Project

Getting Started

Prerequisites

Getting latest version of the tool on your local environment

Built With

Datasets

Authors

License

Acknowledgments

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages