A machine learning and deep learning approach to solving a classification problem.
- Table of Contents
- About The Project
- Built With
- Getting Started
- Accuracy Screenshots
- Roadmap
- Contributing
- License
- Contact
The aim of this project is to select appropriate features for training a model and to find out which classification algorithm works best on the given dataset. The dataset of cancer cell characteristics was created by the University of Wisconsin. Its features describe characteristics of the cell nuclei and are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. The dataset has 569 instances and 32 attributes, of which 5 are chosen for the model:
- Mean Radius
- Mean Perimeter
- Mean Area
- Mean Concavity
- Mean Concave Points
The objective is to train a classifier to predict whether a cell is malignant or benign.
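As a rough illustration of the feature selection described above, the sketch below pulls the five columns out of the copy of the Wisconsin dataset bundled with scikit-learn. This is an assumption: the repository may load the data from its own file instead, and the helper name `load_selected_features` is hypothetical. The bundled copy exposes 30 numeric features because the ID and diagnosis columns are not included as features.

```python
from sklearn.datasets import load_breast_cancer

# The five features named above, spelled as in scikit-learn's bundled copy.
SELECTED_FEATURES = [
    "mean radius",
    "mean perimeter",
    "mean area",
    "mean concavity",
    "mean concave points",
]


def load_selected_features():
    """Return (X, y) restricted to the five selected columns (hypothetical helper)."""
    data = load_breast_cancer()
    names = list(data.feature_names)
    columns = [names.index(feature) for feature in SELECTED_FEATURES]
    X = data.data[:, columns]
    y = data.target  # scikit-learn's encoding: 0 = malignant, 1 = benign
    return X, y


X, y = load_selected_features()
print(X.shape)  # (569, 5)
```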
The following machine learning models were used:
- Logistic Regression
- K-Nearest Neighbors (KNN)
- Naive Bayes
Results obtained were:
- Logistic Regression: Accuracy: 92.98%, Cross-validation score: 90.87% (+/- 5.91%)
- K-Nearest Neighbors (KNN): Accuracy: 92.11%, Cross-validation score: 88.23% (+/- 7.06%)
- Naive Bayes: Accuracy: 94.74%, Cross-validation score: 90.87% (+/- 5.91%)
Of the three, Naive Bayes achieved the highest test accuracy on this dataset.
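A comparison like the one above could be sketched as follows. Several details are assumptions and not taken from the repository: the 20% hold-out split, the random seed, 10-fold cross-validation, feature scaling for Logistic Regression and KNN, the Gaussian variant of Naive Bayes, and reporting the spread as twice the standard deviation of the fold scores. The numbers it prints will therefore not necessarily match the results listed above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

FEATURES = ["mean radius", "mean perimeter", "mean area",
            "mean concavity", "mean concave points"]

data = load_breast_cancer()
names = list(data.feature_names)
X = data.data[:, [names.index(f) for f in FEATURES]]
y = data.target

# Assumed split: 20% held out for testing, stratified by class.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

models = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "K-Nearest Neighbors": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Naive Bayes": GaussianNB(),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_accuracy = model.score(X_test, y_test)
    # cross_val_score clones the estimator, so the fit above is not reused.
    cv_scores = cross_val_score(model, X_train, y_train, cv=10)
    results[name] = (test_accuracy, cv_scores.mean(), 2 * cv_scores.std())
    print(f"{name}: Accuracy: {test_accuracy:.2%}, "
          f"Cross-validation score: {cv_scores.mean():.2%} "
          f"(+/- {2 * cv_scores.std():.2%})")
```

Wrapping the scaler and the classifier in one pipeline keeps the scaling statistics inside each cross-validation fold, so no information from the validation fold leaks into training.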
When the same task was implemented with a simple neural network, the results were:
Accuracy: 87.72% Loss: 0.4984
To improve on this, a dense neural network with some regularization techniques was used, and the results improved to:
Accuracy: 90.35% Loss: 0.2877
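The README does not say which regularization techniques or layer sizes were used, so the following is only a minimal sketch of a dense network with regularization in TensorFlow's Keras API. The layer widths, L2 penalty, dropout rate, epoch count, input standardization and data split are all assumptions chosen for illustration, and the output will differ from the figures quoted above.

```python
import tensorflow as tf
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

FEATURES = ["mean radius", "mean perimeter", "mean area",
            "mean concavity", "mean concave points"]

data = load_breast_cancer()
names = list(data.feature_names)
X = data.data[:, [names.index(f) for f in FEATURES]]
y = data.target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Standardize inputs using statistics from the training split only.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

tf.keras.utils.set_random_seed(0)  # for repeatable weight initialization

# Assumed architecture: two hidden layers with L2 weight decay and dropout.
l2 = tf.keras.regularizers.l2(1e-3)
model = tf.keras.Sequential([
    tf.keras.Input(shape=(len(FEATURES),)),
    tf.keras.layers.Dense(32, activation="relu", kernel_regularizer=l2),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(16, activation="relu", kernel_regularizer=l2),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # probability of class 1
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

model.fit(X_train_s, y_train, epochs=50, batch_size=32,
          validation_split=0.1, verbose=0)

loss, accuracy = model.evaluate(X_test_s, y_test, verbose=0)
print(f"Accuracy: {accuracy:.2%} Loss: {loss:.4f}")
```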
To get a local copy up and running, follow these steps.
You will need:
- Python
- TensorFlow
- scikit-learn
- Make sure you have Python 3 set up on your system
- Clone the repo
git clone https://github.com/ctrl-gaurav/Breast-Cancer-Cell-Type-Classification.git
- Install requirements
pip install -r requirements.txt
- Run script.py
python script.py
Accuracy: With Single Neural Network Architecture
Accuracy: With Multiple Hidden Layers Architecture
Accuracy: With Multiple Hidden Layers and with Regularization Techniques
See the open issues for a list of proposed features (and known issues).
To contribute to this project, follow these steps:
- Fork the Project
- Create your improvements Branch (git checkout -b improvements/myimprovements)
- Commit your Changes (git commit -m 'Done some Improvements')
- Push to the Branch (git push origin improvements/myimprovements)
- Open a Pull Request
Distributed under the MIT License. See LICENSE for more information.
- Gaurav
- Project Link: https://github.com/ctrl-gaurav/Breast-Cancer-Cell-Type-Classification