GitHub - Shashank911/Task-7-SVM-Breast-Cancer-Classification: Support Vector Machine (SVM) models on Kaggle Breast Cancer dataset. Includes data cleaning, linear vs RBF comparison, hyperparameter tuning with GridSearchCV, ROC-AUC, and confusion matrix.

Breast Cancer Classification using Support Vector Machines (SVM)

Overview

This project implements Support Vector Machine (SVM) models to classify breast cancer tumors as benign or malignant using the Breast Cancer Dataset from Kaggle. The goal is to:

Compare Linear and RBF kernel SVM models.
Optimize hyperparameters for improved performance.
Evaluate and rank models based on multiple metrics.
Generate visualizations for interpretability.

About the Dataset

Source: Kaggle Breast Cancer Dataset [https://www.kaggle.com/datasets/yasserh/breast-cancer-dataset]

Description:

Contains measurements from breast mass cell nuclei obtained via digitized images.
Features: 30 numerical attributes describing tumor cell properties (e.g., radius, texture, smoothness).
Target variable:
- 0 → Malignant (cancerous)
- 1 → Benign (non-cancerous)
No missing values; ready for preprocessing.
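The label encoding listed above can be sketched with pandas. This is a minimal illustration, not the notebook's actual code: the column names `id` and `diagnosis`, the raw label codes `M`/`B`, and the helper name `prepare_frame` are assumptions about the CSV layout and should be checked against the downloaded file.

```python
import pandas as pd


def prepare_frame(df, label_col="diagnosis"):
    """Split a raw frame into features X and labels y (M -> 0, B -> 1).

    Hypothetical helper: `id`, `diagnosis`, and the M/B codes are assumed,
    not confirmed by the project.
    """
    # Drop an identifier column if present; it carries no predictive signal.
    df = df.drop(columns=[c for c in ("id",) if c in df.columns])
    # Drop columns that are completely empty (some CSV exports add one).
    df = df.dropna(axis=1, how="all")
    y = df[label_col].map({"M": 0, "B": 1})
    if y.isna().any():
        raise ValueError("unexpected value in label column")
    X = df.drop(columns=[label_col])
    return X, y.astype(int)


# Tiny in-memory stand-in; with the real file this would be
# pd.read_csv("breast_cancer.csv").
sample = pd.DataFrame({
    "id": [1, 2, 3],
    "diagnosis": ["M", "B", "B"],
    "radius_mean": [17.99, 13.54, 12.45],
    "texture_mean": [10.38, 14.36, 15.70],
})
X, y = prepare_frame(sample)
print(X.shape)                      # data shape
print(int(X.isna().sum().sum()))    # total missing values
print(y.value_counts().to_dict())   # class distribution
```

Raising on an unmapped label makes a changed or misspelled code in the CSV fail early instead of silently producing missing targets.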

Workflow

Data Loading & Exploration
- Load dataset into Pandas DataFrame.
- Inspect data shape, missing values, and class distribution.
Data Preprocessing
- Encode categorical labels (Malignant/Benign → 0/1).
- Feature scaling using StandardScaler.
Model Building
- Train baseline SVM models (Linear and RBF kernels).
- Perform hyperparameter tuning using GridSearchCV.
Model Evaluation
- Calculate Accuracy, Precision, Recall, F1-score, and ROC-AUC.
- Plot Confusion Matrix and ROC Curves.
- Save evaluation plots in outputs/ directory.
Model Selection
- Rank models primarily by F1-score and ROC-AUC.
- Select best model for predictions.
Results & Insights
- Compare tuned vs baseline performance.
- Discuss findings and practical applications.
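The preprocessing, model-building, and tuning steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the notebook's code: scikit-learn's bundled `load_breast_cancer` data stands in for the Kaggle CSV (it uses the same 0 = malignant, 1 = benign encoding), and the split ratio, random seed, parameter grid, and 5-fold cross-validation are illustrative choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data (assumption): 30 numeric features, target 0/1.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Baselines: scaling lives inside the pipeline, so the scaler is fitted
# on training data only and never sees the test set.
baselines = {}
for kernel in ("linear", "rbf"):
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    model.fit(X_train, y_train)
    baselines[kernel] = model.score(X_test, y_test)  # test accuracy

# Tuning: make_pipeline names the SVC step "svc", hence the "svc__" prefix.
param_grid = {
    "svc__C": [0.1, 1, 10, 100],
    "svc__gamma": ["scale", 0.001, 0.01, 0.1],
}
search = GridSearchCV(
    make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    param_grid,
    scoring="f1",  # F1 for label 1, which is benign under this encoding
    cv=5,
)
search.fit(X_train, y_train)

print(baselines)
print(search.best_params_, round(search.best_score_, 3))
```

Wrapping the scaler and classifier in one pipeline also means each cross-validation fold is rescaled on its own training portion, which avoids leaking validation statistics into the search.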

Tools & Libraries Used

Python 3.x
Jupyter Notebook — interactive coding environment.
pandas — data manipulation and analysis.
numpy — numerical computations.
scikit-learn — SVM models, preprocessing, metrics, and hyperparameter tuning.
matplotlib — data visualization.
seaborn — advanced plotting and styling.

Project Structure

breast-cancer-svm/
│
├── SVM_BreastCancer.ipynb   # Main Jupyter Notebook
├── outputs/                 # Saved plots and results
├── breast_cancer.csv        # Dataset (download from Kaggle)
└── README.txt               # Project documentation

Usage

Place breast_cancer.csv in the project directory.
Open SVM_BreastCancer.ipynb in Jupyter Notebook.
Run cells sequentially to:
- Load and preprocess data.
- Train and evaluate SVM models.
- View plots and metrics in the outputs/ folder.

Results

Example performance (may vary depending on train/test split):

Model	Accuracy	Precision	Recall	F1-score	ROC-AUC
Linear SVM	96.4%	95%	97%	96%	0.98
RBF SVM (Tuned)	97.1%	96%	98%	97%	0.99
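The columns of the table above correspond to standard scikit-learn metrics. The sketch below shows one way to compute them; it is an illustration rather than the notebook's code, it uses scikit-learn's bundled `load_breast_cancer` data as a stand-in for the Kaggle CSV, and its numbers are not expected to reproduce the table.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)  # stand-in data (assumption)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
# Signed distance to the separating boundary; larger means "more class 1".
margin = model.decision_function(X_test)

report = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),  # positive class = 1 (benign)
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    "roc_auc": roc_auc_score(y_test, margin),
    # Recall for the malignant class, i.e. the share of cancers detected.
    "recall_malignant": recall_score(y_test, y_pred, pos_label=0),
}
cm = confusion_matrix(y_test, y_pred)  # rows: true class, columns: predicted

for name, value in report.items():
    print(f"{name:>16}: {value:.3f}")
print(cm)
```

Because the encoding maps benign to 1, the default precision, recall, and F1 describe the benign class; passing `pos_label=0` reports them for malignant tumors instead, which is usually the figure of clinical interest.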

Conclusion

When properly tuned, SVM models classified this dataset accurately: in the example run above, the tuned RBF SVM reached 97.1% accuracy and a ROC-AUC of 0.99, slightly ahead of the linear SVM (96.4% and 0.98). This work highlights the potential of machine learning to assist healthcare professionals in early and reliable diagnosis.

Acknowledgment - This project is part of my AI/ML Internship to gain hands-on experience with classification algorithms using real-world datasets.

Author

Shashank Chauhan
Email: cshashank899@gmail.com
GitHub: https://github.com/Shashank911
