Introduction to ML basics: set of 3 projects (evolutionary algorithm, clustering and basic neural network design)

TL;DR

I completed 6 hands-on implementations across 3 projects during an Introduction to Machine Learning course.

  1. An evolutionary algorithm that evolved programs to collect treasures on a grid in as few steps as possible.
  2. Clustering
    • k-means clustering using centroids to group 40,000+ spatially biased points into tight clusters
    • k-means clustering using medoids (more robust to outliers),
    • divisive hierarchical clustering, also using centroids.
  3. Neural networks
    • Built a multilayer perceptron in PyTorch to classify handwritten digits with over 97% accuracy, testing different optimizers (SGD, momentum, Adam).
    • Implemented backpropagation from scratch: fully functional backprop in NumPy, used to train a network to solve XOR with modular layers and manual gradient updates.

📝 Project description

🧬 Project 1: Evolutionary Algorithm - Treasure Hunt

This project was the first part of a three-part school course introducing machine learning principles. It focused on evolutionary algorithms applied to a simple, gamified problem: finding the best set of instructions for a player to collect all treasures on a 7x7 grid map using as few steps as possible. The map contained a starting point, five hidden treasures, and empty spaces. The player could move in four directions (up, down, left, right), and every time they landed on a treasure, it was added to their score. The challenge was to evolve a set of "programs" (sequences of moves) that would maximize collected treasures while minimizing steps.

The solution used a classic genetic algorithm, including: random population initialization, fitness evaluation based on number of treasures and steps, selection of top individuals, tournament-style crossover, and basic mutation.
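As a rough illustration, a minimal sketch of that genetic-algorithm loop might look like the following. The map layout, the 'U'/'D'/'L'/'R' move labels, the fitness weighting, and all hyperparameters below are assumptions made for the example, not the repository's actual values.

import random

# Hypothetical 7x7 map: treasure positions chosen purely for illustration
GRID = 7
TREASURES = {(1, 4), (2, 2), (3, 6), (4, 1), (6, 3)}
START = (0, 0)
MOVES = {'U': (-1, 0), 'D': (1, 0), 'L': (0, -1), 'R': (0, 1)}

def fitness(program):
    """Reward collected treasures, lightly penalise the number of steps taken."""
    r, c = START
    found, steps = set(), 0
    for move in program:
        dr, dc = MOVES[move]
        nr, nc = r + dr, c + dc
        if not (0 <= nr < GRID and 0 <= nc < GRID):
            continue  # ignore moves that would leave the map
        r, c, steps = nr, nc, steps + 1
        if (r, c) in TREASURES:
            found.add((r, c))
        if len(found) == len(TREASURES):
            break
    return len(found) + 1.0 / (steps + 1), len(found), steps

def evolve(pop_size=100, prog_len=30, generations=500, mutation_rate=0.05):
    # random population initialization
    population = [[random.choice('UDLR') for _ in range(prog_len)] for _ in range(pop_size)]
    # (the actual project stopped once fitness plateaued; a fixed count keeps the sketch short)
    for _ in range(generations):
        scored = sorted(population, key=lambda p: fitness(p)[0], reverse=True)
        parents = scored[:pop_size // 2]              # selection of top individuals
        children = []
        while len(children) < pop_size:
            a, b = random.sample(parents, 2)          # pairing of selected parents
            cut = random.randrange(prog_len)
            child = a[:cut] + b[cut:]                 # single-point crossover
            child = [random.choice('UDLR') if random.random() < mutation_rate else g
                     for g in child]                  # basic mutation
            children.append(child)
        population = children
    return max(population, key=lambda p: fitness(p)[0])

best = evolve()
print(fitness(best))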

Once no improvement was observed after a fixed number of generations, the algorithm stopped and printed the best-performing solution:

===========================
The best program ended with fitness value of 5.064
It took 18 steps
And found 5 treasures
The map:
['o', 'o', 'o', 'o', 'o', 'o', 'o']
['o', 'o', 'o', 'o', 'x', 'o', 'o']
['o', 'o', 'x', 'x', 'x', 'x', 'x']
['o', 'o', 'x', 'o', 'o', 'o', 'x']
['o', 'x', 'x', 'o', 'o', 'o', 'o']
['o', 'x', 'x', 'x', 'x', 'o', 'o']
['o', 'o', 'x', 'x', 'o', 'o', 'o']
The moves:
['L', 'H', 'P', 'P', 'L', 'L', 'L', 'H', 'P', 'H', 'H', 'P', 'P', 'H', 'D', 'P', 'P', 'D', 'D', 'L', 'D', 'P']
===========================

📊 Project 2: Clustering

This was the second part of our school course on machine learning, focusing on unsupervised learning - specifically, clustering algorithms in a simulated 2D space. We started by generating a 2D plane of size [-5000, 5000] in both axes, populated with 20 initial seed points, placed at random but unique coordinates, and 40,000 additional points that were generated with a bias. The core task was to analyze this large 2D space and implement multiple clustering algorithms to detect the hidden groupings:

  • k-means clustering using centroids,
  • k-means clustering using medoids (more robust to outliers),
  • divisive hierarchical clustering, also using centroids.

The main challenge was balancing algorithm efficiency with clustering quality, especially on a dataset this large. Each method was evaluated based on the average intra-cluster distance, and only clusters with average distance under 500 were considered successful. I implemented all three clustering methods from scratch and added visualizations to plot the clustered data points, color-coded and optionally labeled. This helped verify not just correctness, but also provided intuition for how each algorithm behaves.
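For intuition, a minimal sketch of the centroid-based k-means variant and the intra-cluster distance check could look roughly like this; the biased point generation (offset range), k, and iteration count are illustrative assumptions rather than the project's actual parameters.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative data generation: 20 random seed points on a [-5000, 5000] plane,
# plus 40,000 points biased to lie near already existing points
points = list(rng.uniform(-5000, 5000, size=(20, 2)))
for _ in range(40_000):
    base = points[rng.integers(len(points))]
    points.append(base + rng.uniform(-100, 100, size=2))
X = np.array(points)

def k_means(X, k=20, iterations=50):
    """Plain centroid-based k-means (Lloyd's algorithm)."""
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iterations):
        # assign each point to its nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # move each centroid to the mean of its assigned points
        for i in range(k):
            members = X[labels == i]
            if len(members):
                centroids[i] = members.mean(axis=0)
    return centroids, labels

def avg_intra_cluster_distances(X, centroids, labels, k):
    """Evaluation criterion: average distance to the cluster centre, per cluster."""
    return [np.linalg.norm(X[labels == i] - centroids[i], axis=1).mean()
            for i in range(k) if np.any(labels == i)]

centroids, labels = k_means(X)
print(all(d < 500 for d in avg_intra_cluster_distances(X, centroids, labels, 20)))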

(Visualizations of the clustering results for each of the three methods)

🧠 Project 3: Neural networks

MNIST digit recognition

The final stage of the course brought everything together with a hands-on classification task: recognizing handwritten digits from the MNIST dataset using a feedforward neural network (MLP). The network was trained on 60,000 grayscale 28×28 pixel images and evaluated on 10,000 test samples. The architecture was built in PyTorch, and I explored multiple optimization strategies: SGD (Stochastic Gradient Descent), SGD with momentum, and the Adam optimizer. The project emphasized:

  • dataset preprocessing and normalization,
  • tuning hyperparameters like learning rate, batch size, and hidden layer sizes,
  • visualizing training progress and confusion matrices,
  • evaluating generalization performance across optimizers.

This was my first real classification task using a deep learning framework, and it helped me understand both the practical training process and model evaluation in a controlled setting.
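A condensed sketch of how such a PyTorch MLP pipeline can be wired up is shown below; the hidden-layer sizes, batch size, learning rate, and epoch count are illustrative assumptions, not the project's exact settings.

import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Normalisation constants are the commonly used MNIST mean/std
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.1307,), (0.3081,))])
train_set = datasets.MNIST("data", train=True, download=True, transform=transform)
test_set = datasets.MNIST("data", train=False, download=True, transform=transform)
train_loader = DataLoader(train_set, batch_size=64, shuffle=True)
test_loader = DataLoader(test_set, batch_size=1000)

model = nn.Sequential(                     # simple MLP: 784 -> 128 -> 64 -> 10
    nn.Flatten(),
    nn.Linear(28 * 28, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 10),
)
loss_fn = nn.CrossEntropyLoss()
# swap in torch.optim.SGD(model.parameters(), lr=0.01) or SGD with momentum=0.9
# to compare the optimizers mentioned above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(5):
    model.train()
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()

model.eval()
correct = 0
with torch.no_grad():
    for images, labels in test_loader:
        correct += (model(images).argmax(dim=1) == labels).sum().item()
print(f"test accuracy: {correct / len(test_set):.4f}")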

XOR problem

As a side challenge to reinforce theoretical understanding, we were asked to manually implement backpropagation using only NumPy, with no automatic differentiation libraries allowed. I created a modular architecture with components such as a linear (fully connected) layer, activation functions (ReLU, sigmoid, tanh), and an MSE loss. Each module supported both forward() and backward() passes, and parameter updates were performed either with vanilla SGD or with momentum. This project helped demystify how gradients actually flow through a network, how layers interact, and how parameter updates gradually reduce error, all without relying on high-level libraries. It gave me a deeper appreciation of what libraries like PyTorch do under the hood.

================================================
Epoch 100, loss: 0.16732414845108892 # not very confident
0 0 | [0.12886275] → ([0.])
0 1 | [0.55949593] → ([1.])
1 0 | [0.65761626] → ([1.])
1 1 | [0.58431201] → ([1.])
================================================
Epoch 500, loss: 0.0017811014684536426 # absolutely confident
0 0 | [0.00359476] → ([0.])
0 1 | [0.94166895] → ([1.])
1 0 | [0.93956583] → ([1.])
1 1 | [0.00752886] → ([0.])
================================================
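To illustrate the modular forward/backward pattern, a minimal NumPy sketch along these lines could look as follows; the hidden size, weight initialization, learning rate, and epoch count are assumptions made for the example.

import numpy as np

rng = np.random.default_rng(0)

class Linear:
    """Fully connected layer with manual forward/backward and plain SGD update."""
    def __init__(self, n_in, n_out):
        self.W = rng.normal(0, 1, (n_in, n_out))
        self.b = np.zeros(n_out)
    def forward(self, x):
        self.x = x
        return x @ self.W + self.b
    def backward(self, grad, lr):
        dW = self.x.T @ grad            # gradient w.r.t. weights
        db = grad.sum(axis=0)           # gradient w.r.t. bias
        dx = grad @ self.W.T            # gradient passed to the previous layer
        self.W -= lr * dW
        self.b -= lr * db
        return dx

class Sigmoid:
    def forward(self, x):
        self.out = 1 / (1 + np.exp(-x))
        return self.out
    def backward(self, grad, lr):
        return grad * self.out * (1 - self.out)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)
layers = [Linear(2, 4), Sigmoid(), Linear(4, 1), Sigmoid()]

for epoch in range(5000):
    out = X
    for layer in layers:                # forward pass through the whole stack
        out = layer.forward(out)
    loss = ((out - y) ** 2).mean()      # MSE loss
    grad = 2 * (out - y) / len(y)       # dLoss/dOutput
    for layer in reversed(layers):      # backward pass, updating as we go
        grad = layer.backward(grad, lr=0.5)

for inputs, pred in zip(X, out):
    print(inputs, pred)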

🌱 Skills gained & problems overcome

  • Core Machine Learning Concepts
  • Manual Implementation of ML Algorithms
  • Optimization Techniques
  • Neural Network Fundamentals
  • Data Preprocessing and Representation
  • Visualization and Evaluation

⚙️ How to run

1. Genetic algorithm

cd evolutionary-algorithm/
python main.py

2. Clustering

  1. Centroid k-means clustering
cd centroid-k-means/
python main.py
  2. Divisive clustering
cd divisive-clustering/
python main.py
  3. Medoid k-means clustering
cd medoid-k-means/
python main.py

3a. MNIST

cd NN-basics/number-recognition/
python main.py

3b. XOR problem

cd NN-basics/xor-problem/
python main.py
