The project was prepared and submitted within the Brazilian "Bootcamp Data Science na prática" by Neuron.
See also here
The dataset, originally from Kaggle, was provided by the Bootcamp administration. The data description is uploaded here as a TXT file.
In order to perform the typical classification procedure for a binary output, the following metrics are required to select the optimal model: misclassification rate (MR), F1 score, Generalized R2 (Nagelkerke, or Cragg and Uhler, R2), and Mean Abs Dev (the average of the absolute differences between the actual and predicted outputs). The project is supported by figures found in the Issues section. Information about R2 for classification is here
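The four metrics above can be sketched in plain Python as follows (a minimal illustration under standard definitions, not the project's actual implementation; the Generalized R2 follows the usual Nagelkerke likelihood-ratio formula computed from predicted probabilities):

```python
import numpy as np

def misclassification_rate(y_true, y_pred):
    """Share of observations whose predicted class differs from the actual one."""
    return float(np.mean(np.asarray(y_true) != np.asarray(y_pred)))

def f1(y_true, y_pred):
    """F1 score for the positive class (label 1)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return 2 * tp / (2 * tp + fp + fn)

def mean_abs_dev(y_true, p_pred):
    """Average absolute difference between actual labels and predicted probabilities."""
    return float(np.mean(np.abs(np.asarray(y_true, float) - np.asarray(p_pred, float))))

def nagelkerke_r2(y_true, p_pred):
    """Generalized R2 (Nagelkerke / Cragg and Uhler) from predicted probabilities."""
    y = np.asarray(y_true, float)
    p = np.clip(np.asarray(p_pred, float), 1e-12, 1 - 1e-12)
    n = len(y)
    ll_model = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    p0 = y.mean()  # intercept-only (null) model probability
    ll_null = n * (p0 * np.log(p0) + (1 - p0) * np.log(1 - p0))
    cox_snell = 1 - np.exp(2 * (ll_null - ll_model) / n)
    return cox_snell / (1 - np.exp(2 * ll_null / n))
```

Nagelkerke's R2 rescales the Cox-Snell R2 by its maximum attainable value, so a perfect probabilistic model scores close to 1.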
The data were stratified and split: 75% for training, 25% for validation.
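A stratified 75/25 split of this kind can be sketched with scikit-learn (the DataFrame and its column names here are hypothetical stand-ins for the actual Kaggle data):

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the Kaggle dataset; column names are hypothetical.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "feature_1": rng.normal(size=200),
    "feature_2": rng.normal(size=200),
    "target": rng.integers(0, 2, size=200),
})

X = df.drop(columns=["target"])
y = df["target"]

# stratify=y keeps the class ratio (nearly) identical in both subsets.
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)
```

Stratification matters for binary targets because an imbalanced random split would bias both the fitted models and the validation metrics.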
The machine learning models: Naive Bayes; K-Nearest Neighbors (KNN; up to 155 neighbors, estimated with Euclidean distances and uniform point weights); Multiple Logistic Regression; Generalized Regression techniques (Lasso, Elastic Net, Ridge, Double Lasso); SVM (linear kernel function, cost = 1); Classification Tree; Boosted Tree; and Bootstrap Forest (10 trees, 3 terms sampled per split, learning rate 0.1).
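Rough open-source counterparts of this model set can be sketched with scikit-learn (hyperparameters follow the text where stated and defaults elsewhere; the data are synthetic, and Double Lasso, a JMP-specific procedure, has no direct sklearn analogue):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

# Synthetic binary-classification data stands in for the project's dataset.
X, y = make_classification(n_samples=300, n_features=5, random_state=0)
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Approximate counterparts of the listed models; Double Lasso is omitted.
models = {
    "Naive Bayes": GaussianNB(),
    "KNN": KNeighborsClassifier(n_neighbors=92, metric="euclidean", weights="uniform"),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Lasso (L1) Logistic": LogisticRegression(penalty="l1", solver="liblinear"),
    "Ridge (L2) Logistic": LogisticRegression(penalty="l2", max_iter=1000),
    "Elastic Net Logistic": LogisticRegression(
        penalty="elasticnet", solver="saga", l1_ratio=0.5, max_iter=5000
    ),
    "SVM (linear, C=1)": SVC(kernel="linear", C=1.0),
    "Classification Tree": DecisionTreeClassifier(random_state=0),
    "Boosted Tree": GradientBoostingClassifier(learning_rate=0.1, random_state=0),
    "Bootstrap Forest": RandomForestClassifier(
        n_estimators=10, max_features=3, random_state=0
    ),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = 1 - model.score(X_valid, y_valid)  # validation MR
```

Fitting every candidate on the same training split and scoring it on the same validation split keeps the MR comparison fair across models.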
Among the classification models, the most effective trained model was Bootstrap Forest. On validation, the majority of models gave MR = 7.69% at an equal F1 (0.92), and their Generalized R2 values were also identical. Among those models, logistic regression had the minimal Mean Abs Dev. Its fit data, ROC curve, and confusion matrix are given here.
As for the non-parametric KNN, MR and F1 improved on validation at K = 92 (the number of neighbors giving the minimal MR on validation): MR = 5.77% and F1 = 0.939. See the graphical summary here; the table of all 155 neighbor counts vs. MR is in the uploaded file "KNN-iterations summary by MR.xlsx".
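The K selection described above can be sketched as a sweep over K = 1..155, keeping the K with minimal validation MR (synthetic data stands in for the project's dataset, so the best K found here will differ from the project's K = 92):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data in place of the project's dataset.
X, y = make_classification(n_samples=400, n_features=5, random_state=1)
X_tr, X_va, y_tr, y_va = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=1
)

# Sweep K = 1..155 and record the validation misclassification rate for each,
# mirroring the iteration table in "KNN-iterations summary by MR.xlsx".
mr_by_k = {}
for k in range(1, 156):
    knn = KNeighborsClassifier(n_neighbors=k, metric="euclidean", weights="uniform")
    knn.fit(X_tr, y_tr)
    mr_by_k[k] = 1 - knn.score(X_va, y_va)

# Choose the neighbor count that minimizes validation MR.
best_k = min(mr_by_k, key=mr_by_k.get)
```

Note that selecting K on the validation set makes the reported validation MR slightly optimistic; a held-out test set or cross-validation would give an unbiased estimate.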
An overall comparison of actual vs. predicted target values for the set of models on validation is here, and in general is here