- Introduction
- Dataset Overview
- Data Preprocessing
- Feature Engineering
- Modeling
- Model Evaluation
- Results
- Conclusion
- License
This project involves applying ensemble learning techniques to predict medical test results, categorized as Normal, Abnormal, and Inconclusive. The dataset, which simulates healthcare records, is used to predict the outcomes of medical tests based on patient information, such as age, gender, medical conditions, and billing amounts.
The machine learning models used for classification include Bagging Classifier, Random Forest, AdaBoost, and Stacking Classifier. The goal is to understand which machine learning technique provides the best predictive accuracy for classifying test results.
The dataset consists of 10,000 observations with 15 features, including patient demographics, medical conditions, healthcare billing information, and the target variable (test results). The key features are:
- Name: Name of the patient
- Age: Age of the patient
- Gender: Gender of the patient
- Medical Condition: Diagnosis (e.g., Diabetes, Asthma)
- Billing Amount: Total healthcare bill for the patient
- Medication: Medication prescribed
- Test Results: Outcome of the medical tests (Normal, Abnormal, Inconclusive)
- etc.
The dataset is structured to simulate patient records, providing insight into various aspects of healthcare diagnostics.
Data preprocessing is crucial to prepare the dataset for modeling:
- Handling Missing Values: Any missing or anomalous values were removed or imputed, particularly in critical columns like
Billing Amount
andTest Results
. - Feature Encoding: Non-numeric columns like
Gender
,Medical Condition
, andMedication
were encoded using One-Hot Encoding and Label Encoding, making the data usable for machine learning algorithms. - Scaling: Numerical features like
Age
andBilling Amount
were normalized using Min-Max scaling, ensuring that all features are on the same scale and contributing equally to model performance.
Feature engineering was performed to derive meaningful information from the dataset:
- Duration of Stay: Calculated from the Admission Date and Discharge Date, indicating the duration of hospital stay.
- Categorical Features: Features such as
Gender
,Medical Condition
, andMedication
were converted into binary or one-hot encoded format for compatibility with machine learning models.
The features were then organized and split into training and test datasets, which were used for model training and evaluation.
For this project, several ensemble learning models were trained and compared:
- Bagging Classifier: Base estimators include Decision Trees, SVC, and KNN. It helps in reducing variance and improving stability by training multiple models.
- Random Forest: An ensemble of decision trees, trained using bootstrapping and random feature selection.
- AdaBoost: A boosting algorithm that combines weak learners to create a strong classifier.
- Stacking Classifier: A model that combines different classifiers (like Random Forest and SVC) in a way that each classifier’s strengths complement one another.
The models are trained using the training dataset and evaluated on the test dataset.
Each model's performance was evaluated using key metrics:
- Confusion Matrix: To visualize how well the model performs across the different categories of test results.
- F1-Score: To measure the model’s ability to correctly classify instances across the classes.
- Precision and Recall: To assess the trade-off between false positives and false negatives.
Here is a table summarizing the performance of each model and the parameters used:
- Bagging Classifier: The best performance was achieved with Decision Tree as the base estimator, with an F1-Score of 0.35.
- Random Forest: Max depth set to 10 with Gini criterion, achieving an F1-Score of 0.34.
- AdaBoost: Performed reasonably well with an F1-Score of 0.33, but lagged behind the other models.
- Stacking Classifier: Combining Random Forest and SVC, this model achieved the best F1-Score of 0.35.
The ensemble learning models demonstrated decent performance in classifying medical test results:
- Bagging Classifier and Stacking Classifier performed the best with an F1-Score of 0.35.
- Random Forest and AdaBoost performed slightly worse, though still providing valuable insights into the classification task.
Based on the results, Stacking Classifier and Bagging Classifier are the top-performing models, yielding the highest F1-Score of 0.35. These models were able to effectively classify the medical test results into Normal, Abnormal, and Inconclusive categories.
This project demonstrates the use of ensemble learning methods to classify healthcare data. While the models performed well, further improvements can be made by tuning hyperparameters, adding more relevant features, or exploring other advanced techniques like deep learning.
- Bagging and Stacking classifiers provide the best predictive performance for medical test classification.
- AdaBoost and Random Forest performed moderately well but can benefit from further tuning.
Future work will focus on improving the model’s accuracy by incorporating more features, fine-tuning model parameters, and exploring alternative classification techniques.
This project is licensed under the MIT License - see the LICENSE file for details.