Classification techniques (Logistic Regression & K-Nearest Neighbors)
This project performs classification analysis on a dataset that combines 5 popular heart disease datasets, which were previously available independently but had not been combined. The 5 datasets are merged over 11 common features, making the result the largest heart disease dataset available so far for research purposes.
The five datasets used for its curation are: Cleveland, Hungarian, Switzerland, Long Beach VA, and Statlog (Heart) Data Set.
Cross-validation is used to estimate the test accuracy of predictions from the logistic regression and K-nearest neighbors models.
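The evaluation described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes scikit-learn, uses a synthetic 11-feature dataset from make_classification as a stand-in for the real heart disease data, and the choices of 5 folds, n_neighbors=5, and feature scaling are assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in (assumption): 11 numeric features and a binary target.
# Replace X and y with the real feature matrix and heart disease labels.
X, y = make_classification(
    n_samples=500, n_features=11, n_informative=6, random_state=0
)

# Scaling lives inside each pipeline so it is refit on every training fold,
# which keeps information from the held-out fold out of the preprocessing.
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "knn": make_pipeline(
        StandardScaler(), KNeighborsClassifier(n_neighbors=5)
    ),
}

# 5-fold cross-validation (fold count is an assumption); report mean accuracy.
cv_accuracy = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    cv_accuracy[name] = scores.mean()
    print(f"{name}: mean accuracy {scores.mean():.3f} (std {scores.std():.3f})")
```

Scaling matters mostly for K-nearest neighbors, whose distance computations are otherwise dominated by the features with the largest numeric range. On the real data, categorical features would also need to be encoded before fitting either model.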