Anemia Detection with Machine Learning

Overview

This repository contains Python machine learning code for detecting anemia, using data from Kaggle provided by user Biswa Ranjan Rao. The dataset is available at https://www.kaggle.com/datasets/biswaranjanrao/anemia-dataset.

The dataset consists of 1421 samples with six attributes: gender, hemoglobin, mean corpuscular hemoglobin (MCH), mean corpuscular hemoglobin concentration (MCHC), mean corpuscular volume (MCV), and result.

The result attribute (the class label) is binary, with 0 for non-anemic and 1 for anemic, and was selected as the response variable. Gender is also binary; all other attributes are continuous. The dataset occupies about 66.7 KB in memory, consistent with 1421 rows of six 8-byte numeric values.

Introduction to Anemia

Anemia is a medical condition characterized by a deficiency of healthy red blood cells in the body or a reduction in the amount of hemoglobin in the blood. Hemoglobin is the protein in red blood cells responsible for carrying oxygen throughout the body. Anemia can occur due to various reasons such as a lack of iron or other essential nutrients, chronic diseases, genetic conditions, blood loss, or a malfunction in the bone marrow.

Symptoms of anemia include fatigue, weakness, shortness of breath, dizziness, pale skin, irregular heartbeat, and headaches. Treatment for anemia depends on the underlying cause, but it may involve dietary changes, supplements, medication, or, in severe cases, blood transfusions.

It is important to identify and treat anemia promptly, as it can lead to complications such as heart problems, impaired cognitive function, and delayed growth and development in children.

What's Inside

Exploratory Data Analysis
Statistical tests: t-test, odds ratio, and chi-square test for association
Feature Selection
- Correlation
- SelectKBest
- Extra Trees Classifier
Feature scaling
- Log transformation
- Standardization
- Normalization
Class imbalance handling
- Random Undersampling
- Random Oversampling
- SMOTE
- ADASYN
Data Leakage handling
Algorithms employed
- Decision Tree (DT)
- Random Forest (RF)
- Logistic Regression (LG)
- K-Nearest Neighbors (KNN)
- Support Vector Machine (SVM)
- Gaussian Naive Bayes (NB)
Performance measured
- Accuracy
- Area Under the Curve
- Precision
- Recall
- F1 Score
- Cohen's kappa statistic
Hyperparameter tuning with GridSearchCV
5-fold cross-validation
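
To make the performance measures listed above concrete, here is a minimal pure-Python sketch of how accuracy, precision, recall, F1 score, and Cohen's kappa follow from the confusion matrix of a binary classifier, using the dataset's coding of 1 for anemic and 0 for non-anemic. The function name `binary_metrics` and the example labels are made up for illustration and are not taken from the notebooks; in practice scikit-learn's `accuracy_score`, `precision_score`, `recall_score`, `f1_score`, and `cohen_kappa_score` compute the same quantities. Area under the curve is left out because it needs predicted scores rather than hard labels.

```python
def binary_metrics(y_true, y_pred):
    """Compute metrics for binary labels, treating 1 (anemic) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # anemic, predicted anemic
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-anemic, predicted non-anemic
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-anemic, predicted anemic
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # anemic, predicted non-anemic
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Cohen's kappa: observed agreement corrected for the agreement
    # expected by chance given the marginal label frequencies.
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_expected) / (1 - p_expected) if p_expected != 1 else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
    }


# Hypothetical labels for ten samples (illustrative only, not real results).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Kappa is worth reporting next to accuracy when the classes are imbalanced, since a model that always predicts the majority class can score high accuracy while its kappa stays at zero.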

Usage

Run Live App

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.


maladeep/anemia-detection-with-machine-learning
