This project implements a Convolutional Neural Network (CNN) from scratch using PyTorch to classify handwritten digits from the famous MNIST dataset.
It is designed with a modular software engineering approach, separating data loading, model architecture, and training logic, making it more robust and scalable than typical notebook-based implementations.
- `src/data_loader.py`: Handles downloading MNIST, normalization ($\mu=0.1307$, $\sigma=0.3081$), and creating train/validation/test DataLoaders.
- `src/model.py`: Defines the CNN architecture and implements custom manual weight initialization (Gaussian distribution).
- `src/trainer.py`: Contains the training loop, validation logic, and evaluation metrics.
- `main.py`: The entry point that orchestrates device selection (CPU/GPU), hyperparameter configuration, training, and plotting.
The network follows a specific architecture designed for high accuracy:
- Input: 28x28x1 grayscale image.
- Conv Layer 1: 25 filters, $12 \times 12$ kernel, stride 2, no padding.
- Conv Layer 2: 64 filters, $5 \times 5$ kernel, stride 1, padding 'same'.
- Max Pooling: $2 \times 2$ kernel.
- Flatten: converts the 3D feature maps to a 1D vector.
- Fully Connected 1: 1024 units + ReLU + Dropout ($p=0.2$).
- Output Layer: 10 units (softmax via CrossEntropyLoss).
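The layer list above can be sketched in PyTorch as follows. This is a minimal reconstruction, not the repo's `src/model.py`: the class name is illustrative, and the ReLU activations after the conv layers are an assumption (only the FC layer's ReLU is stated above). With these settings the spatial sizes work out to 9x9 after Conv 1, 9x9 after the 'same'-padded Conv 2, and 4x4 after pooling, so the flattened vector has 64 * 4 * 4 = 1024 elements:

```python
import torch
import torch.nn as nn

class MnistCNN(nn.Module):
    """Hypothetical sketch of the architecture described above."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 25, kernel_size=12, stride=2),  # 28x28x1 -> 9x9x25
            nn.ReLU(),                                   # assumed activation
            nn.Conv2d(25, 64, kernel_size=5, stride=1, padding="same"),  # -> 9x9x64
            nn.ReLU(),                                   # assumed activation
            nn.MaxPool2d(2),                             # -> 4x4x64
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),                 # 64 * 4 * 4 = 1024
            nn.Linear(64 * 4 * 4, 1024),
            nn.ReLU(),
            nn.Dropout(p=0.2),
            nn.Linear(1024, 10),          # raw logits; softmax via CrossEntropyLoss
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```

Note that the output layer emits raw logits: `nn.CrossEntropyLoss` applies log-softmax internally, so no explicit softmax layer is needed during training.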
Unlike default PyTorch initialization, this model manually initializes weights to demonstrate understanding of neural network internals:
- Weights: Gaussian distribution ($\mu=0$, $\sigma^2=0.0025$).
- Biases: constant value ($0.1$).
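Since $\sigma^2 = 0.0025$ implies $\sigma = 0.05$, the scheme above can be sketched with `nn.init` as below. Applying it via `Module.apply` is an assumed convention, not necessarily how the repo wires it up:

```python
import torch
import torch.nn as nn

def init_weights(module):
    """Hypothetical sketch: weights ~ N(0, 0.05^2), biases = 0.1."""
    if isinstance(module, (nn.Conv2d, nn.Linear)):
        nn.init.normal_(module.weight, mean=0.0, std=0.05)  # sigma = sqrt(0.0025)
        nn.init.constant_(module.bias, 0.1)

# Usage: model.apply(init_weights) recurses over every submodule.
```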
- Create a virtual environment:

  ```bash
  python3 -m venv venv
  source venv/bin/activate
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

  (Requires `torch`, `torchvision`, `matplotlib`, `torchsummary`.)
Run the complete training and evaluation pipeline:

```bash
python main.py
```

The script will automatically:
- Detect if CUDA (GPU) is available.
- Download the dataset.
- Train the model for the defined number of epochs.
- Save training accuracy/loss plots to `output/`.
- Save the trained model weights to `models/mnist_cnn.pth`.
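The device-selection and checkpointing steps above boil down to the pattern below. This is a minimal sketch of what `main.py` plausibly does, with a placeholder model standing in for the CNN from `src/model.py`:

```python
import os
import torch
import torch.nn as nn

# Pick the GPU when CUDA is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(28 * 28, 10).to(device)  # placeholder for the real CNN

# ... training loop omitted ...

# Persist the trained weights to the path listed above.
os.makedirs("models", exist_ok=True)
torch.save(model.state_dict(), "models/mnist_cnn.pth")
```

Loading the checkpoint later mirrors the save: `model.load_state_dict(torch.load("models/mnist_cnn.pth"))`.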
Typical performance metrics after 5 epochs (~6000 iterations):
- Training Accuracy: > 98%
- Validation Accuracy: > 98%
- Test Set Accuracy: ~99%
Check the `output/` folder for the loss and accuracy curves generated during training.