Machine Learning Classification and Clustering Project

A comprehensive collection of machine learning algorithms implemented using scikit-learn, covering both supervised and unsupervised learning techniques.

Overview

This repository demonstrates machine learning algorithms for classification, regression, and clustering tasks. Each algorithm is implemented with detailed examples, error handling, and output analysis.

Project Structure

ML-Lab/
├── supervised_learning/
│   ├── classification/
│   │   ├── svm_comparison.py          # SVM vs Random Forest comparison
│   │   ├── logistic_regression.py     # Logistic regression on Iris dataset
│   │   └── classification_metrics.py  # Comprehensive metrics evaluation
│   └── regression/
│       └── linear_regression.py       # Linear regression for classification
├── unsupervised_learning/
│   └── clustering/
│       ├── k_means.py                 # K-means clustering analysis
│       └── agglomerativ.py            # Agglomerative clustering methods
├── pyproject.toml                     # Project dependencies
├── README.md                          # This file
└── uv.lock                            # Dependency lock file

Installation

This project uses uv for dependency management. Follow these steps to set up:

# Clone the repository
git clone https://github.com/AyanQuadri/ML-Lab.git
cd ML-Lab

# Create virtual environment
uv venv

# Activate virtual environment (optional, uv run handles this)
source .venv/bin/activate  # Linux/Mac
# or
.venv\Scripts\activate     # Windows

# Install dependencies
uv sync

Algorithms Implemented

Supervised Learning

Classification Algorithms

1. SVM vs Random Forest Comparison

Purpose: Compare Support Vector Machine and Random Forest performance
Dataset: Iris dataset
Features:
- Linear SVM implementation
- Random Forest with 100 estimators
- Accuracy comparison
Run: uv run python supervised_learning/classification/svm_comparison.py
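
As a rough illustration of the comparison described above, the following is a minimal sketch and not the contents of svm_comparison.py. The function name, the 80/20 stratified split, and the random seed are assumptions made for this example; only the linear SVM, the 100-estimator Random Forest, and the Iris dataset come from the description.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def compare_svm_and_random_forest(test_size=0.2, random_state=42):
    """Train both models on the same Iris split and return their test accuracies.

    The split size and seed are assumptions for this sketch.
    """
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state, stratify=y
    )

    models = {
        "Linear SVM": SVC(kernel="linear"),
        "Random Forest": RandomForestClassifier(
            n_estimators=100, random_state=random_state
        ),
    }

    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


if __name__ == "__main__":
    for name, accuracy in compare_svm_and_random_forest().items():
        print(f"{name}: {accuracy:.3f}")
```

Evaluating both models on the same held-out split keeps the accuracy figures directly comparable.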

2. Logistic Regression

Purpose: Multi-class classification using logistic regression
Dataset: Iris dataset (3 classes)
Features:
- Complete classification report
- Dataset information display
- Convergence optimization
Run: uv run python supervised_learning/classification/logistic_regression.py
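
The classification report mentioned above can be produced along these lines. This is a sketch under assumptions, not the script itself: the function name, the split, and the choice of max_iter=1000 (one common way to avoid convergence warnings) are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split


def iris_logistic_report(max_iter=1000, random_state=42):
    """Fit logistic regression on Iris and return a per-species text report.

    max_iter is raised above the scikit-learn default so the solver has
    room to converge; the exact value is an assumption for this sketch.
    """
    iris = load_iris()
    X_train, X_test, y_train, y_test = train_test_split(
        iris.data,
        iris.target,
        test_size=0.2,
        random_state=random_state,
        stratify=iris.target,
    )

    model = LogisticRegression(max_iter=max_iter)
    model.fit(X_train, y_train)

    # target_names labels each row of the report with the species name.
    return classification_report(
        y_test, model.predict(X_test), target_names=iris.target_names
    )


if __name__ == "__main__":
    print(iris_logistic_report())
```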

3. Classification Metrics Analysis

Purpose: Comprehensive evaluation of classification algorithms
Algorithms: Logistic Regression, Decision Tree
Metrics Calculated:
- Accuracy, Precision, Recall, F1-Score
- Confusion Matrix
- Per-class TP, FP, TN, FN values
Run: uv run python supervised_learning/classification/classification_metrics.py
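
Per-class TP, FP, TN, and FN values can be read off a multi-class confusion matrix. The sketch below shows one way to do that; the helper name per_class_counts and the toy labels are hypothetical and are not taken from classification_metrics.py.

```python
import numpy as np
from sklearn.metrics import confusion_matrix


def per_class_counts(y_true, y_pred, labels):
    """Return TP, FP, FN, TN for each class in a one-vs-rest sense."""
    # Rows are true classes, columns are predicted classes.
    cm = confusion_matrix(y_true, y_pred, labels=labels)
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp  # predicted as the class, but actually another
    fn = cm.sum(axis=1) - tp  # actually the class, but predicted as another
    tn = cm.sum() - (tp + fp + fn)
    return {
        label: {
            "TP": int(tp[i]),
            "FP": int(fp[i]),
            "FN": int(fn[i]),
            "TN": int(tn[i]),
        }
        for i, label in enumerate(labels)
    }


if __name__ == "__main__":
    # Toy labels chosen only to illustrate the arithmetic.
    y_true = [0, 0, 1, 1, 2, 2]
    y_pred = [0, 1, 1, 1, 2, 0]
    for label, counts in per_class_counts(y_true, y_pred, [0, 1, 2]).items():
        print(label, counts)
```

For each class, the four counts sum to the total number of samples, which is a quick sanity check on the result.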

Regression Algorithms

1. Linear Regression for Classification

Purpose: Demonstrate linear regression for binary classification
Features:
- Simple binary classification example
- Model parameter extraction
- Threshold-based classification
Run: uv run python supervised_learning/regression/linear_regression.py
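
Threshold-based classification with a linear model might look like the sketch below. The toy data, the function name, and the 0.5 threshold are assumptions for illustration; the actual script may use different data and values. Note that the regression output is an unbounded score, so it is only loosely comparable to a probability.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical toy data: one feature, binary labels.
X_TRAIN = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])
Y_TRAIN = np.array([0, 0, 0, 1, 1, 1])


def classify_with_threshold(model, X_new, threshold=0.5):
    """Predict a continuous score and convert it to a 0/1 label."""
    scores = model.predict(X_new)
    labels = (scores >= threshold).astype(int)
    return scores, labels


if __name__ == "__main__":
    model = LinearRegression().fit(X_TRAIN, Y_TRAIN)
    print("slope:", model.coef_[0])
    print("intercept:", model.intercept_)

    scores, labels = classify_with_threshold(model, np.array([[2.0], [5.0]]))
    for score, label in zip(scores, labels):
        print(f"score={score:.3f} -> class {label}")
```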

Unsupervised Learning

Clustering Algorithms

1. K-Means Clustering

Purpose: Cluster analysis using K-means algorithm
Dataset: Iris dataset
Features:
- 3-cluster analysis
- Inertia calculation
- Species distribution per cluster
- Clustering accuracy assessment
Run: uv run python unsupervised_learning/clustering/k_means.py
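
One plausible way to compute the inertia, the species distribution per cluster, and a clustering accuracy is sketched below. The function name, n_init, and the seed are assumptions, and so is the accuracy definition used here (each cluster is assigned its majority species); k_means.py may define it differently.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris


def kmeans_iris_summary(n_clusters=3, random_state=42):
    """Cluster Iris with K-means and summarise the result against true species."""
    iris = load_iris()
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    cluster_ids = kmeans.fit_predict(iris.data)

    species = iris.target_names.tolist()
    distribution = {}
    matched = 0
    for cluster in range(n_clusters):
        # Count how many samples of each species fall into this cluster.
        counts = np.bincount(
            iris.target[cluster_ids == cluster], minlength=len(species)
        )
        distribution[cluster] = dict(zip(species, counts.tolist()))
        # Majority-vote mapping: the cluster is credited with its largest group.
        matched += int(counts.max())

    accuracy = matched / len(iris.target)
    return kmeans.inertia_, distribution, accuracy


if __name__ == "__main__":
    inertia, distribution, accuracy = kmeans_iris_summary()
    print(f"Inertia: {inertia:.2f}")
    for cluster, counts in distribution.items():
        print(f"Cluster {cluster}: {counts}")
    print(f"Clustering accuracy: {accuracy:.2%}")
```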

2. Agglomerative Clustering

Purpose: Hierarchical clustering analysis
Datasets: Synthetic blob data + Iris dataset
Features:
- Multiple linkage methods (ward, complete, average, single)
- Synthetic and real data comparison
- Adjusted Rand Index calculation
Run: uv run python unsupervised_learning/clustering/agglomerativ.py
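
The linkage comparison can be sketched as follows. This is an illustrative example rather than the contents of agglomerativ.py: the helper name compare_linkages and the make_blobs parameters are assumptions.

```python
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import load_iris, make_blobs
from sklearn.metrics import adjusted_rand_score

LINKAGES = ("ward", "complete", "average", "single")


def compare_linkages(X, y_true, n_clusters=3):
    """Return the Adjusted Rand Index for each linkage method."""
    scores = {}
    for linkage in LINKAGES:
        model = AgglomerativeClustering(n_clusters=n_clusters, linkage=linkage)
        predicted = model.fit_predict(X)
        # ARI is invariant to how cluster ids are numbered.
        scores[linkage] = adjusted_rand_score(y_true, predicted)
    return scores


if __name__ == "__main__":
    # Synthetic data: blob parameters are assumptions for this sketch.
    X_blobs, y_blobs = make_blobs(
        n_samples=150, centers=3, cluster_std=0.6, random_state=42
    )
    X_iris, y_iris = load_iris(return_X_y=True)

    for name, (X, y) in {
        "blobs": (X_blobs, y_blobs),
        "iris": (X_iris, y_iris),
    }.items():
        print(name)
        for linkage, ari in compare_linkages(X, y).items():
            print(f"  {linkage:>8}: ARI = {ari:.3f}")
```

The Adjusted Rand Index is used here because it compares groupings directly, so no mapping from cluster ids to species is required.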

Usage

Run Individual Algorithms

# Classification algorithms
uv run python supervised_learning/classification/svm_comparison.py
uv run python supervised_learning/classification/logistic_regression.py
uv run python supervised_learning/classification/classification_metrics.py

# Regression algorithms
uv run python supervised_learning/regression/linear_regression.py

# Clustering algorithms
uv run python unsupervised_learning/clustering/k_means.py
uv run python unsupervised_learning/clustering/agglomerativ.py

Expected Outputs

Classification Algorithms

SVM vs Random Forest: Accuracy scores comparison (typically both achieve 1.0 on Iris)
Logistic Regression: Detailed classification report with precision/recall for each species
Classification Metrics: Comprehensive confusion matrix and per-class metrics

Regression Algorithms

Linear Regression: Predicted value (interpreted as a probability, although linear regression output is not bounded to [0, 1]), class assignment, and model parameters

Clustering Algorithms

K-Means: Cluster centers, species distribution, clustering accuracy (~89%)
Agglomerative: Multiple linkage method results, cluster purity analysis

Note: All algorithms use the Iris dataset for consistency and comparison purposes, except where synthetic data provides better demonstration of specific algorithmic properties.
