Case study of machine learning and data mining techniques

Using the Breast Cancer dataset


This project was done for the Introduction to Artificial Intelligence course at Politechnika Warszawska (Warsaw University of Technology).

Main task

The task of this project is to carry out a case study of different classifiers. To do that, we use a Breast Cancer dataset.

Description of the dataset

The Breast Cancer database contains features computed from a digitized image of a breast mass. They describe characteristics of the cell nuclei present in the image. The dataset has 10 attributes, and each instance belongs to one of two possible classes: benign or malignant.

attribute	value
Sample code number	id number
Clump Thickness	1 - 10
Uniformity of Cell Size	1 - 10
Uniformity of Cell Shape	1 - 10
Marginal Adhesion	1 - 10
Single Epithelial Cell Size	1 - 10
Bare Nuclei	1 - 10
Bland Chromatin	1 - 10
Normal Nucleoli	1 - 10
Mitoses	1 - 10
Class	B or M

The class distribution in this dataset is:

Benign: 458 (65.5%)
Malignant: 241 (34.5%)
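
The percentages above follow directly from the class counts. As a minimal sketch, the snippet below rebuilds a label list from the reported counts and recomputes the distribution. The `class_distribution` helper and the `"B"`/`"M"` label encoding are assumptions made for illustration, not code from this project.

```python
from collections import Counter

# Labels rebuilt from the counts reported above (458 benign, 241 malignant).
labels = ["B"] * 458 + ["M"] * 241


def class_distribution(labels):
    """Return {class: (count, percentage)}, percentage rounded to one decimal."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: (n, round(100 * n / total, 1)) for cls, n in counts.items()}


distribution = class_distribution(labels)
for cls, (n, pct) in sorted(distribution.items()):
    print(f"{cls}: {n} ({pct}%)")
```

The imbalance (roughly two benign instances for every malignant one) is the reason the split described later has to be checked per class.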

Training set and test set

Splitting the data is one of the most important steps and concepts in machine learning. We need to split the dataset into a training set and a testing set.

TRAINING SET: used to build and train the model.
TESTING SET: used to check the model previously built from the training set. We will generate ROC curves to measure the performance of the different classifiers.
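
To make the ROC measurement concrete, here is a minimal pure-Python sketch of how an ROC curve and its area can be computed from classifier scores on the testing set. The function names `roc_curve` and `auc`, the 0/1 label encoding (1 = malignant) and the six example scores are all hypothetical. This is not necessarily how the project produces its curves.

```python
def roc_curve(y_true, scores):
    """Return (fpr, tpr) points of the ROC curve; y_true uses 1 for the positive class."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    # Visit instances from the highest to the lowest score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last instance of a tied score.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve, computed with the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical scores for six test instances (1 = malignant, 0 = benign).
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
points = roc_curve(y_true, scores)
area = auc(points)
print(points)
print(round(area, 4))
```

Note that the rates divide by the number of positive and negative instances, so the computation only works when the testing set contains both classes.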

For this case study we split the data into a 75% training set and a 25% testing set. We also need to make sure that both sets contain instances of both classes.
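
One way to guarantee that both sets contain both classes is a stratified split, where the 75/25 cut is applied to each class separately. The sketch below is an illustration under that assumption. The `stratified_split` helper, the fixed seed and the `"B"`/`"M"` labels are hypothetical, and the project itself may split the data differently.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.75, seed=0):
    """Split instance indices into train/test, applying the fraction per class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        # Keep at least one instance of each class on both sides of the split
        # (this requires every class to have at least two instances).
        cut = max(1, min(len(indices) - 1, cut))
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(test_idx)


# Labels rebuilt from the class counts of the dataset (458 benign, 241 malignant).
labels = ["B"] * 458 + ["M"] * 241
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
print({labels[i] for i in train_idx}, {labels[i] for i in test_idx})
```

Because the cut is made inside each class, the training and testing sets keep approximately the same benign/malignant ratio as the full dataset.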

Conclusion

Model	Accuracy (%)
ctrees	96.28
rpart trees	96.27
Random forest	94.68
SVM	96.28
Naive Bayes	92.02
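
For reference, accuracy is the share of testing-set instances whose predicted class matches the true class. The sketch below shows that calculation on a small made-up example. The `accuracy` helper and the label lists are hypothetical and do not reproduce the figures in the table.

```python
def accuracy(y_true, y_pred):
    """Percentage of instances whose predicted class matches the true class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100 * correct / len(y_true)


# Hypothetical labels for eight test instances, not data from this project.
y_true = ["B", "B", "M", "B", "M", "M", "B", "B"]
y_pred = ["B", "B", "M", "M", "M", "B", "B", "B"]
score = accuracy(y_true, y_pred)
print(f"{score:.2f}")
```

On an imbalanced dataset like this one, accuracy alone can hide errors on the smaller malignant class, which is why the ROC curves are also used.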
