Skip to content

RohanChatt98/CancerPrediction

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

  1. Importing Libraries: The code begins by importing necessary libraries such as pandas for data manipulation, numpy for numerical operations, matplotlib.pyplot for plotting graphs, and various modules from scikit-learn for machine learning tasks.

  2. Loading Data: The code loads the cancer data from a CSV file named "cancerdata.csv" into a pandas DataFrame named cancerdata.

  3. Data Preprocessing & EDA:

Converting the diagnosis column from 'B' to 'Benign' and from 'M' to 'Malignant'. Dropping the 'id' column. Printing information about the DataFrame using info() method. Describing the DataFrame using describe() method. Splitting Data: Splitting the DataFrame into features (cancerdata_X) and target (cancerdata_y).

4.Pipeline Creation:

Defining numeric and categorical features. Creating pipelines for numerical and categorical preprocessing. Combining the pipelines using ColumnTransformer to handle different types of features. Fitting and transforming the data using fit() and transform() methods. Scaling Features: Creating a pipeline to scale the features using MinMaxScaler().

Splitting Data into Train and Test Sets: Using train_test_split() function from scikit-learn to split the data into training and testing sets.

  1. Model Training:

Iterating over different numbers of neighbors for KNN classifier. Training KNN classifiers for each number of neighbors. Calculating training and testing accuracies and storing them in a list. Plotting Accuracy: Plotting the training and testing accuracies against the number of neighbors.

  1. Model Evaluation:

Training a KNN classifier with a chosen number of neighbors. Predicting the target labels for the test data. Calculating accuracy using accuracy_score() function. Creating a confusion matrix using crosstab() function from pandas.

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages