This paper compares four machine learning algorithms, K-Nearest Neighbors (KNN), Decision Tree, Gradient Boosting, and Random Forest, on the task of predicting the onset of liver disease. The dataset contains 1,700 patient records. Each record combines demographic details (age, gender), lifestyle factors (BMI, alcohol intake, smoking habits, exercise level), family history of disease, and a medical profile covering diabetes status, hypertension, and liver function test results. Each algorithm is evaluated on this dataset to identify the most accurate predictor of liver disease, and Gradient Boosting is identified as the most effective model.
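The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes scikit-learn, uses a synthetic stand-in generated with `make_classification` instead of the real patient records, and picks the 80/20 split, the feature count, and all hyperparameters arbitrarily. The scores it prints say nothing about the results reported here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in with 1,700 rows (an assumption; NOT the real patient data).
X, y = make_classification(
    n_samples=1700, n_features=10, n_informative=6, random_state=42
)

# Hold out 20% of the rows for testing (split ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# The four model families named in the text, with illustrative settings.
# KNN is distance-based, so its features are standardized first.
models = {
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Gradient Boosting": GradientBoostingClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

# Fit each model and record its accuracy on the held-out rows.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))

# Rank the models from highest to lowest test accuracy.
ranking = sorted(accuracies.items(), key=lambda item: item[1], reverse=True)
for name, acc in ranking:
    print(f"{name}: {acc:.3f}")
```

To try this on the real data, replace the `make_classification` call with code that loads the dataset into a feature matrix `X` and a label vector `y`.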
KNN:
DECISION TREE:
GRADIENT BOOSTING:
RANDOM FOREST:
Accuracy ranked:
The code runs on Jupyter as well as Colab, but Colab is recommended so that the confusion matrix displays correctly.
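If the plotted confusion matrix does not render well in a given environment, a plain-text version is one possible fallback. The sketch below uses only the Python standard library. The helper names `confusion_matrix_2x2` and `accuracy` are hypothetical, and the labels are toy values chosen for illustration, not results from the dataset.

```python
def confusion_matrix_2x2(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for binary labels 0 (no disease) and 1 (disease)."""
    matrix = [[0, 0], [0, 0]]
    for actual, predicted in zip(y_true, y_pred):
        # Row index = actual class, column index = predicted class.
        matrix[actual][predicted] += 1
    return matrix


def accuracy(matrix):
    """Share of predictions on the diagonal (correct) of the confusion matrix."""
    correct = matrix[0][0] + matrix[1][1]
    total = sum(sum(row) for row in matrix)
    return correct / total


# Toy labels for illustration only (hypothetical, not taken from the dataset).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

cm = confusion_matrix_2x2(y_true, y_pred)

# Print the matrix as a small text table that looks the same in any notebook.
print("            pred 0  pred 1")
print(f"actual 0    {cm[0][0]:>6}  {cm[0][1]:>6}")
print(f"actual 1    {cm[1][0]:>6}  {cm[1][1]:>6}")
print(f"accuracy: {accuracy(cm):.2f}")
```

Because the output is plain text, it looks the same in Jupyter, Colab, or a terminal.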