Stroke Prediction using Machine Learning

About:

This project aims to predict whether a person of any age is likely to have a stroke, based on eleven distinct features such as gender, age, and existing medical conditions. It trains several binary classification models using different methods, in Jupyter Notebook.

Installation Instructions:

Setup Conda Environment

To clone this project onto your machine and create its Conda environment, run the following commands:

conda update conda
git clone https://github.com/JARVIS843/Stroke_Prediction_ML_Project.git
cd Stroke_Prediction_ML_Project
conda env create -f environment.yml
conda activate StrokePredictionML

Add the environment (StrokePredictionML) to Jupyter:

python -m ipykernel install --user --name=StrokePredictionML --display-name "Python (StrokePredictionML)"

(Optional): Confirm the kernel was added successfully:

jupyter kernelspec list

You then need to select the kernel manually in the Jupyter interface.
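The optional confirmation step can also be scripted. The following is a minimal sketch, assuming the `--json` flag of `jupyter kernelspec list` and its usual output shape (a top-level `kernelspecs` object keyed by kernel name); `kernel_is_registered` and `list_kernels_json` are hypothetical helper names, not part of this project.

```python
import json
import subprocess


def kernel_is_registered(listing_json: str, name: str) -> bool:
    """Return True if `name` appears among the kernelspecs in the JSON listing."""
    specs = json.loads(listing_json).get("kernelspecs", {})
    # Compare case-insensitively: Jupyter may store kernel names in lowercase.
    return name.lower() in {key.lower() for key in specs}


def list_kernels_json() -> str:
    """Run `jupyter kernelspec list --json` and return its raw output."""
    result = subprocess.run(
        ["jupyter", "kernelspec", "list", "--json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

With the environment activated, `kernel_is_registered(list_kernels_json(), "StrokePredictionML")` should return True once the kernel has been added.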

Setup Tensorflow with CUDA (Only for Neural Network Models)

It took me 2 hours to set everything up correctly, so I decided to put the instructions here.

Because this project relies on TensorFlow 2.18.0, GPU support WILL NOT WORK on native Windows!!! (TensorFlow dropped native Windows GPU support starting with 2.11.) However, WSL2 still works.

Before anything else, make sure you have installed the newest driver. If you are using an Nvidia graphics card that supports CUDA 12.5, your driver version must be at least 555.42.02 on Linux (NOT WSL) or 555.85 on Windows. You can download a specific driver manually from Nvidia's website, but I recommend using Nvidia GeForce Experience. If you are using WSL2, install the newest driver on the Windows side only, and ABSOLUTELY DO NOT INSTALL IT INSIDE WSL LINUX, as it will mess up everything.

Then install CUDA 12.5, following Nvidia's installation guide. Note: if you are using WSL2 Ubuntu, choose WSL-Ubuntu in the Distribution section, because the regular Ubuntu version includes a driver that may conflict with the one installed on Windows. After installation, verify it with:

nvcc --version

If the command is not found, add the CUDA bin directory to your PATH with:

export PATH=/usr/local/cuda-12.5/bin${PATH:+:${PATH}}
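Note that `export` only affects the current shell session, so the line has to be added to a startup file such as `~/.bashrc` to persist. The same lookup can be checked from Python. This is a minimal sketch, assuming the default install location `/usr/local/cuda-12.5/bin` shown above; `find_nvcc` is a hypothetical helper name.

```python
import os
import shutil
from typing import Optional


def find_nvcc(cuda_bin: str = "/usr/local/cuda-12.5/bin") -> Optional[str]:
    """Return the path to nvcc, or None if it cannot be found."""
    # First look on the current PATH, as the shell would.
    found = shutil.which("nvcc")
    if found:
        return found
    # Fall back to the assumed CUDA 12.5 bin directory.
    candidate = os.path.join(cuda_bin, "nvcc")
    if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
        return candidate
    return None


print(find_nvcc() or "nvcc not found: check the CUDA install and your PATH")
```

If the function finds nvcc only through the fallback directory, the toolkit is installed but PATH has not been updated for the current session.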

Finally, install cuDNN 9.3, following the instructions on its webpage.

To verify that everything is set up correctly, run the following code (not a shell command) in your Jupyter notebook, using the previously created environment and kernel (StrokePredictionML). If the setup is correct, the number printed should not be zero (unless your GPU does not support CUDA 12.5 to begin with).

import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))

You should now be done. If you run into any problems, please consult the following links, which helped me a lot when I was setting this up myself:

https://www.tensorflow.org/install/source?hl=en#gpu
https://www.tensorflow.org/install/pip#linux
https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/ (Windows, but again, Windows Native doesn't work)
https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html (Linux)
https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html

Project Structure:

The project is divided into two parts: non-neural-network models and neural network models.

Non-Neural-Network Models:

This part of the project is in ML_project.ipynb. It is responsible for data cleanup, data analysis, data preprocessing, and training a variety of models (Extra Trees Classifier, Gradient Boosting, Random Forest, and XGBoost). The accuracy of each model is displayed with a visualized confusion matrix.
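To show how a confusion matrix relates to accuracy for a binary classifier, here is a minimal pure-Python sketch. The labels are made up for illustration and are not project results, the function names are hypothetical, and the `[[TN, FP], [FN, TP]]` layout is an assumption about presentation rather than a description of the notebook's code.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for binary labels (0 = no stroke, 1 = stroke)."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(0, 0)], counts[(0, 1)]],
            [counts[(1, 0)], counts[(1, 1)]]]


def accuracy(matrix):
    """Fraction of correct predictions: (TN + TP) / total."""
    (tn, fp), (fn, tp) = matrix
    return (tn + tp) / (tn + fp + fn + tp)


def recall(matrix):
    """Fraction of actual positives that were predicted positive."""
    (_, _), (fn, tp) = matrix
    return tp / (tp + fn) if (tp + fn) else 0.0


# Made-up labels for illustration only (not project results).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]

matrix = confusion_matrix(y_true, y_pred)
print(matrix)            # [[7, 1], [1, 1]]
print(accuracy(matrix))  # 0.8
print(recall(matrix))    # 0.5
```

In this toy example the accuracy is 0.8 even though only half of the positive cases are caught, which is why the full matrix is more informative than accuracy alone when one class is rare.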

Neural Network Models:

This is the other part of the project, found in Neural_Network_Model.ipynb. For consistency, it uses the same data cleanup and preprocessing procedures. It employs a six-layer TensorFlow neural network (5 hidden layers and 1 output layer; detailed specs can be found in Models) and cross-tests its accuracy with various optimizers (RMSprop, Nadam, and Adam), learning rates (0.01, 0.001, and 0.0001), batch sizes (32 and 64), and numbers of epochs (50 and 100). Adaptive learning rate strategies are also attempted.
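The hyperparameter values listed above can be enumerated as a grid. This is a minimal sketch, assuming every combination is tried (a full Cartesian product); whether the notebook covers all of them is not stated here, and the dictionary keys are illustrative names.

```python
from itertools import product

# Values taken from the description above.
optimizers = ["RMSprop", "Nadam", "Adam"]
learning_rates = [0.01, 0.001, 0.0001]
batch_sizes = [32, 64]
epoch_counts = [50, 100]

# One dictionary per configuration (full grid is an assumption).
grid = [
    {"optimizer": opt, "learning_rate": lr, "batch_size": bs, "epochs": ep}
    for opt, lr, bs, ep in product(optimizers, learning_rates, batch_sizes, epoch_counts)
]

print(len(grid))  # 36
print(grid[0])    # {'optimizer': 'RMSprop', 'learning_rate': 0.01, 'batch_size': 32, 'epochs': 50}
```

Each dictionary could then be passed to a training function, so a full sweep means 3 x 3 x 2 x 2 = 36 training runs.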

Models:

If you would like to use our pre-trained models or see how they perform, they can all be found in the Models folder.

*Note: The models are serialized and exported using Pickle
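Loading a pickled model follows the standard `pickle` round trip. This is a minimal sketch; `save_model` and `load_model` are hypothetical helper names, the file name is made up, and a plain dictionary stands in for a trained model. The actual file names in the Models folder are not listed here.

```python
import pickle
import tempfile
from pathlib import Path


def save_model(model, path):
    """Serialize `model` to `path` with pickle."""
    with Path(path).open("wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Deserialize a model from `path`."""
    # Only unpickle files you trust: pickle can execute arbitrary code on load.
    with Path(path).open("rb") as f:
        return pickle.load(f)


# Round trip with a stand-in object (a real model would be used instead).
stand_in = {"name": "example_model", "threshold": 0.5}
with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "example_model.pkl"  # hypothetical file name
    save_model(stand_in, model_path)
    restored = load_model(model_path)

print(restored == stand_in)  # True
```

Pickled models generally need the same library versions that created them to be installed when loading, which is another reason to use the provided Conda environment.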

Dataset Used:

All of the models for this project are trained using the Kaggle Stroke Prediction Dataset.

A full pre-downloaded copy of the dataset (downloaded on 12/13/2024) is included in the Dataset folder to save you from re-downloading it from Kaggle. The beginning of ML_project.ipynb automatically downloads the newest dataset from Kaggle, whereas Neural_Network_Model.ipynb relies on the pre-downloaded copy.

Authors & Background:

This project is co-developed by: Jarvis Yang (responsible for Neural Network Model), and Jegyeong An (responsible for Non-Neural-Network Models).

The project was the final project for the Introduction to Machine Learning course, taught by Professor Sundeep Rangan.

License (MIT):

See License File

