COGS118A Final Project

Binary Classification: Diabetes Prediction

Comparing logistic regression, decision trees, k-nearest neighbors, and SVMs

Group members:

Alex Yu
Kevin (Jungwoo) Park
Connor McManigal
Shivani Suthar

Abstract:

This project addresses the difficulty of accurately diagnosing diabetes in patients. An incorrect diagnosis can have dire consequences, including additional health complications or even death. Our goal is to train machine learning models that accurately predict whether a patient has diabetes. Our data contains eight features: age, gender, body mass index (BMI), hypertension, heart disease, smoking history, HbA1c level, and blood glucose level, along with each patient's diabetes status (positive or negative). These electronic health records are collected by healthcare providers in hospitals or clinics through surveys, medical records, and laboratory tests. With this data, we will train multiple binary classification algorithms and select the one that provides the highest sensitivity. We will compare the performance of logistic regression, decision trees, k-nearest neighbors, and support vector machines to see which algorithm best suits our needs. We will measure performance using sensitivity (recall), precision, specificity, ROC-AUC, and precision-recall curves. We place heavy emphasis on recall because a missed positive case (a false negative) delays treatment, so it is important to detect as many of the positive diabetes cases as possible.
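The selection rule described in the abstract can be illustrated with a minimal sketch. It computes sensitivity, specificity, and precision from confusion-matrix counts and picks the model with the highest sensitivity. The labels, predictions, and names below (`model_a`, `model_b`, `pick_most_sensitive`) are hypothetical and are not taken from the project notebooks; the actual analysis may compute these metrics differently.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int = 0  # diabetic patients predicted positive
    fp: int = 0  # non-diabetic patients predicted positive
    tn: int = 0  # non-diabetic patients predicted negative
    fn: int = 0  # diabetic patients predicted negative (missed cases)


def confusion_counts(y_true, y_pred):
    """Tally the four confusion-matrix cells for binary labels (1 = positive)."""
    counts = ConfusionCounts()
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts.tp += 1
        elif truth == 0 and pred == 1:
            counts.fp += 1
        elif truth == 0 and pred == 0:
            counts.tn += 1
        else:
            counts.fn += 1
    return counts


def safe_ratio(numerator, denominator):
    """Return 0.0 instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def sensitivity(c):
    """Recall: share of actual positives that were detected."""
    return safe_ratio(c.tp, c.tp + c.fn)


def specificity(c):
    """Share of actual negatives that were correctly cleared."""
    return safe_ratio(c.tn, c.tn + c.fp)


def precision(c):
    """Share of positive predictions that were correct."""
    return safe_ratio(c.tp, c.tp + c.fp)


def pick_most_sensitive(y_true, predictions_by_model):
    """Return the name of the model whose predictions have the highest sensitivity."""
    return max(
        predictions_by_model,
        key=lambda name: sensitivity(confusion_counts(y_true, predictions_by_model[name])),
    )


# Hypothetical toy data: 3 positive and 5 negative patients.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
predictions_by_model = {
    "model_a": [1, 1, 0, 0, 0, 0, 0, 0],  # misses one positive case
    "model_b": [1, 1, 1, 1, 0, 0, 0, 0],  # catches all positives, one false alarm
}

best_model = pick_most_sensitive(y_true, predictions_by_model)
counts_b = confusion_counts(y_true, predictions_by_model["model_b"])
```

In this toy data, `model_b` is selected even though its precision is lower, which reflects the trade-off the abstract accepts: a false alarm is treated as less costly than a missed diagnosis.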

Repository files:

.gitignore
Checkpoint_Group002-SP23.ipynb
DecisionTree.joblib
FinalProject_Group002-SP23.ipynb
KNN.joblib
LogisticClassification.joblib
Proposal_Group002-SP23.ipynb
README.md
RandomForest.joblib
SVC_with_BigC.joblib
SVC_with_smallC.joblib
diabetes_prediction_dataset.csv
