This repository contains the work I did for a group project during my Master's degree. We compared various ensemble decision-tree-based methods to predict heart disease using R and Weka.
The Data subdirectory contains the original data files, the cleaned data files, and the cleaned data files split into training and test sets. The cleaned datasets and the training/test sets are available in both *.csv format and the Weka format (*.arff). Because we compared models built in both R and Weka, we created shared training/test sets so that both tools use the same data.
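As an illustration of the idea (not the procedure we actually used), here is a minimal Python sketch of making one seeded split and writing the same rows out in both CSV and ARFF form. The column names, toy values, 30% test fraction, seed, and helper names are all hypothetical and are not taken from the files in this repository.

```python
import csv
import io
import random

# Hypothetical toy data; the real datasets have different columns and values.
header = ["age", "chol", "target"]
rows = [(40 + i, 180 + 5 * i, "yes" if i % 2 else "no") for i in range(10)]


def split_rows(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of the rows with a fixed seed and return (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def to_csv(header, rows):
    """Render the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


def to_arff(relation, header, rows, nominal):
    """Render the rows as ARFF text.

    `nominal` maps a column name to its allowed values;
    every other column is declared numeric.
    """
    lines = [f"@relation {relation}", ""]
    for name in header:
        if name in nominal:
            lines.append(f"@attribute {name} {{{','.join(nominal[name])}}}")
        else:
            lines.append(f"@attribute {name} numeric")
    lines += ["", "@data"]
    lines += [",".join(str(v) for v in row) for row in rows]
    return "\n".join(lines) + "\n"


# Split once, then serialize the same rows for each tool.
train, test = split_rows(rows)
train_csv = to_csv(header, train)
train_arff = to_arff("heart_train", header, train, {"target": ["no", "yes"]})
```

Because the split is made once and then serialized twice, a model trained from the CSV and one trained from the ARFF see identical rows, so differences in results come from the models rather than from the data.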
If you find this work useful, please cite our conference paper (a copy may be found in the PACISE subdirectory):
I. P. Steinke, C. Chesley, J. Jean, and R. Seetan, “Comparison of Ensemble Tree-Based Techniques for Predicting Heart Disease,” 36th Annual Spring Conference of the Pennsylvania Computer and Information Science Educators (PACISE), April 9–10, 2021, pp. 57–63.