A multi-class classification neural network coded from scratch for the Fashion-MNIST dataset.
-
Clone the GitHub repository by running the command:
git clone https://github.com/Nikshay-Jain/DA6401-Assign-1.git
-
Select the main directory as your working directory, which has the following contents:

A1_MM21B044
├── src/                     # source folder for all codes
│   ├── dataset.py           # prepares dataset
│   ├── model_arch.txt       # contains code for the model
│   ├── supporting_funcs.py  # contains functions necessary to be used
│   ├── train.py             # trains and tests the model
│   └── wandb_setup.py       # sets up wandb.ai
├── wandb/                   # contains the log files for wandb
│   ├── <folders1>/          # folders for logs
│   └── <folders1>/          # folders for logs
├── .gitignore               # ignores the venv files
├── fashion-mnist.npz        # fashion-mnist dataset
├── requirements.txt         # dependencies for the project
└── Readme.md                # this file
-
Requirements:
Python version: 3.9+
pip install -r requirements.txt
It installs the following libraries:
- tqdm - to display progress bars and estimate the time taken
- wandb - to keep track of the experiments
- keras - to get the dataset
- tensorflow - as a supporting library for keras
- numpy - for matrix operations
- matplotlib - to plot figures
- argparse - to parse arguments
-
Setup wandb.ai:
wandb login
This prompts you for your API key; paste it into the command line to connect to the W&B server.
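If you prefer a non-interactive login (for example, inside a script), the key can also be passed directly on the command line, with the placeholder below standing in for your own key:
wandb login <your-api-key>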
-
Starting execution:
python .\src\dataset.py
python .\src\train.py
Extra arguments can be passed to train.py on the command line itself, as tabulated in the assignment sheet.
Here is the table mentioned:
| Argument | Default Value | Description |
|---|---|---|
| -wp, --wandb_project | DL-Assign-1 | Project name used to track experiments in Weights & Biases dashboard. |
| -we, --wandb_entity | mm21b044 | W&B Entity used to track experiments in the Weights & Biases dashboard. |
| -d, --dataset | fashion_mnist | Dataset to use. Choices: ["mnist", "fashion_mnist"] |
| -e, --epochs | 10 | Number of epochs to train the neural network. |
| -b, --batch_size | 32 | Batch size used to train the neural network. |
| -l, --loss | cross_entropy | Loss function to use. Choices: ["mean_squared_error", "cross_entropy"] |
| -o, --optimizer | sgd | Optimizer to use. Choices: ["sgd", "momentum", "nag", "rmsprop", "adam", "nadam"] |
| -lr, --learning_rate | 1e-4 | Learning rate used to optimize model parameters. |
| -m, --momentum | 0.9 | Momentum used by Momentum and NAG optimizers. |
| -beta, --beta | 0.9 | Beta used by RMSprop optimizer. |
| -beta1, --beta1 | 0.9 | Beta1 used by Adam and Nadam optimizers. |
| -beta2, --beta2 | 0.999 | Beta2 used by Adam and Nadam optimizers. |
| -eps, --epsilon | 1e-10 | Epsilon used by optimizers. |
| -w_d, --weight_decay | 0.0 | Weight decay used by optimizers. |
| -w_i, --weight_init | Xavier | Weight initialization method. Choices: ["random", "Xavier"] |
| -nhl, --num_layers | 3 | Number of hidden layers used in the feedforward neural network. |
| -sz, --hidden_size | 512 | Number of hidden neurons in a feedforward layer. |
| -a, --activation | relu | Activation function to use. Choices: ["identity", "sigmoid", "tanh", "ReLU"] |
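For example, a run that overrides a few of these defaults might look like this (the flag values are chosen purely for illustration):
python .\src\train.py -d fashion_mnist -o adam -lr 1e-3 -nhl 4 -sz 256 -a ReLU -e 10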
-
src/dataset.py:
Contains the code to download and store the dataset as a .npz file for seamless future use. It also makes a wandb entry for the same.
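A minimal sketch of what such a download step could look like (the exact code in dataset.py may differ; the archive keys below are assumptions, while the keras source and the output filename come from the repository layout):

```python
import numpy as np
from keras.datasets import fashion_mnist

# Download the Fashion-MNIST dataset via keras
(X_train, y_train), (X_test, y_test) = fashion_mnist.load_data()

# Store all four arrays in a single .npz archive for seamless future use
np.savez("fashion-mnist.npz",
         X_train=X_train, y_train=y_train,
         X_test=X_test, y_test=y_test)
```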
-
src/supporting_funcs.py:
This module contains essential functions for data preprocessing, activation functions, loss functions, and their derivatives to support neural-network training.
-
Dataset Loading & Preprocessing:
- load_dataset(): Loads the Fashion-MNIST dataset.
- one_hot(inp): Converts labels to one-hot encoded format.
- Preprocess(X, y): Normalizes input data and applies one-hot encoding to labels.
- train_val_split(X, y, splits=0.1): Splits data into training and validation sets.
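As a rough illustration of what these helpers might look like internally (a sketch only; the actual implementations live in supporting_funcs.py):

```python
import numpy as np

def one_hot(inp, num_classes=10):
    # Convert integer labels to one-hot encoded rows
    out = np.zeros((inp.shape[0], num_classes))
    out[np.arange(inp.shape[0]), inp] = 1
    return out

def Preprocess(X, y):
    # Flatten images, scale pixels to [0, 1], one-hot encode labels
    X = X.reshape(X.shape[0], -1) / 255.0
    return X, one_hot(y)

def train_val_split(X, y, splits=0.1):
    # Hold out the last `splits` fraction of the data for validation
    n_val = int(X.shape[0] * splits)
    return X[:-n_val], y[:-n_val], X[-n_val:], y[-n_val:]
```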
-
Activation Functions & Their Derivatives:
- get_activation(activation): Returns the activation function (sigmoid, softmax, ReLU, etc.).
- diff_activation(activation): Returns the derivative of the activation function.
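A plausible shape for this dispatch pattern, assuming the activations are returned as callables (the function names match the list above; the bodies are a sketch):

```python
import numpy as np

def get_activation(activation):
    # Map an activation name to its forward function
    if activation == "sigmoid":
        return lambda x: 1.0 / (1.0 + np.exp(-x))
    if activation == "ReLU":
        return lambda x: np.maximum(0, x)
    if activation == "tanh":
        return np.tanh
    if activation == "softmax":
        def softmax(x):
            # Shift by the max for numerical stability
            e = np.exp(x - x.max(axis=-1, keepdims=True))
            return e / e.sum(axis=-1, keepdims=True)
        return softmax
    return lambda x: x  # identity

def diff_activation(activation):
    # Derivative expressed in terms of the pre-activation input x
    if activation == "sigmoid":
        def d_sigmoid(x):
            s = 1.0 / (1.0 + np.exp(-x))
            return s * (1 - s)
        return d_sigmoid
    if activation == "ReLU":
        return lambda x: (x > 0).astype(float)
    if activation == "tanh":
        return lambda x: 1 - np.tanh(x) ** 2
    return lambda x: np.ones_like(x)  # identity
```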
-
Loss Functions & Their Derivatives:
- get_loss(loss): Computes the loss function (cross_entropy, mean_squared_error).
- get_diff_loss(loss): Computes the gradient of the loss function.
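A sketch of how the two supported losses and their gradients could be dispatched (illustrative only; y is the one-hot target and p the prediction, and the gradients are stated up to a constant factor):

```python
import numpy as np

def get_loss(loss):
    # Map a loss name to a function of (y_true, y_pred)
    if loss == "cross_entropy":
        # Clip predictions to avoid log(0)
        return lambda y, p: -np.mean(
            np.sum(y * np.log(np.clip(p, 1e-10, 1.0)), axis=1))
    return lambda y, p: np.mean((y - p) ** 2)  # mean_squared_error

def get_diff_loss(loss):
    # Gradient of the loss with respect to the predictions
    if loss == "cross_entropy":
        return lambda y, p: -y / np.clip(p, 1e-10, 1.0)
    return lambda y, p: 2 * (p - y)
```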
This module is designed for efficient implementation of neural networks with custom activation and loss functions.
-
src/model_arch.txt:
Contains the entire architecture for the neural network. Its structure includes (a minimal Layer sketch follows this list):
- Layer class - each layer can be initialized with different sizes, activations, and initializations (He/Xavier/Random).
- Model class - the complete model.
  - Contains the list of layers, the loss metric, derivatives of the loss metric, the regularization parameter lambda, and the batch size.
- Optimizer class - contains code for the optimizers.
  - wandb_logger: logs to wandb.
  - iterate: iterates over epochs, performing the forward and backward passes.
    - Calls the updator and loss_calc methods.
  - Holds the model object, implements early stopping, and returns the best model through Optimizer.model.
  - Uses a single layer size, activation, and initialization for each layer in Optimizer.model.
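A minimal sketch of how such a Layer class might be organized (illustrative only; the shapes, attribute names, and initialization formulas here are assumptions, not the repository's exact code):

```python
import numpy as np

class Layer:
    """One fully connected layer; a sketch, not the repository's exact code."""
    def __init__(self, in_size, out_size, init="Xavier"):
        if init == "Xavier":
            # Xavier/Glorot uniform initialization scaled by fan-in + fan-out
            limit = np.sqrt(6.0 / (in_size + out_size))
            self.W = np.random.uniform(-limit, limit, (in_size, out_size))
        elif init == "He":
            # He initialization, suited to ReLU activations
            self.W = np.random.randn(in_size, out_size) * np.sqrt(2.0 / in_size)
        else:  # "random"
            self.W = np.random.randn(in_size, out_size) * 0.01
        self.b = np.zeros(out_size)

    def forward(self, X, activation=np.tanh):
        # Affine transform followed by an activation passed in as a callable
        self.z = X @ self.W + self.b
        return activation(self.z)
```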
-
src/train.py:
Features
- Supports multiple optimizers: SGD, Momentum, NAG, RMSprop, Adam, Nadam
- Dataset loading & preprocessing: Supports Fashion-MNIST and MNIST
- Hyperparameter tuning: Uses WandB sweeps for optimization
- Training & validation modes: Supports both full training and train-validation split
- Metrics tracking: Logs loss, accuracy, confusion matrix, and visualizations in WandB
Workflow
-
Argument Parsing
- The script accepts various hyperparameters via command-line arguments:
--wandb_project <project_name>
-
WandB Setup
- Initializes Weights & Biases (WandB) for experiment logging and tracking.
-
Dataset Loading & Preprocessing
- Loads Fashion-MNIST or MNIST dataset, applies normalization, and prepares training/testing sets.
-
Optimizer Selection
- Selects an optimizer dynamically from the supported types:
- SGD
- Momentum
- NAG (Nesterov Accelerated Gradient)
- RMSprop
- Adam
- Nadam
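As an illustration of what one of these update rules involves, here is a generic Adam step in standard textbook form (the variable names and function structure are mine, not necessarily the repository's; the defaults match the argument table above):

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-10):
    # First and second moment estimates (exponential moving averages)
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias correction for the zero-initialized moments
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Parameter update
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```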
-
Training & Validation
- The dataset is split into train:val = 90:10, while the test set is kept separate.
- After completing all epochs, the model is run on the unseen test set to log the metrics.
-
Training Process
- Applies the selected optimizer iteratively.
- Logs loss and accuracy at each step using WandB.
-
Hyperparameter Tuning with WandB Sweeps
- Supports hyperparameter tuning via grid search using WandB sweeps.
- Optimizes for minimum loss by exploring different configurations.
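A hedged example of what a WandB grid-search sweep configuration could look like (the parameter names and candidate values here are illustrative, drawn from the argument table above, and `train` stands in for the training entry point):

```python
import wandb

sweep_config = {
    "method": "grid",  # grid search over all combinations
    "metric": {"name": "loss", "goal": "minimize"},
    "parameters": {
        "learning_rate": {"values": [1e-3, 1e-4]},
        "num_layers":    {"values": [3, 4, 5]},
        "hidden_size":   {"values": [128, 256, 512]},
        "optimizer":     {"values": ["sgd", "adam", "nadam"]},
    },
}

sweep_id = wandb.sweep(sweep_config, project="DL-Assign-1")
# wandb.agent(sweep_id, function=train)  # `train` is the training entry point
```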
-
Evaluation & Metrics Logging
- After training, logs test accuracy and additional metrics.
- Generates confusion matrix and loss comparison plots.
-
src/wandb_setup.py:
This module provides functions to integrate Weights & Biases (W&B) for experiment tracking, logging metrics, and visualizing model performance.
-
W&B Setup:
setup_wandb(project_name, run_name, args): Initializes W&B for logging experiments.
-
Logging Metrics:
log_metrics(epoch, loss, accuracy): Logs training loss and accuracy for each epoch.
-
Model Evaluation Logging:
log_evaluation(model, X_test, Y_test): Logs test accuracy, confusion matrix, and loss comparison plots.
-
Finishing the W&B Session:
finish_wandb(): Ends the W&B logging session.
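A minimal sketch of how these wrappers might map onto the W&B API (only the signatures above come from the module; the bodies, and the assumption that model.predict() returns integer class ids, are mine):

```python
import wandb

def setup_wandb(project_name, run_name, args):
    # Start a W&B run and record the hyperparameters from argparse
    wandb.init(project=project_name, name=run_name, config=vars(args))

def log_metrics(epoch, loss, accuracy):
    # Log per-epoch training metrics
    wandb.log({"epoch": epoch, "loss": loss, "accuracy": accuracy})

def log_evaluation(model, X_test, Y_test):
    # Log test accuracy and a confusion matrix plot;
    # assumes predict() returns class ids and Y_test holds integer labels
    preds = model.predict(X_test)
    acc = (preds == Y_test).mean()
    wandb.log({
        "test_accuracy": acc,
        "confusion_matrix": wandb.plot.confusion_matrix(
            y_true=Y_test, preds=preds),
    })

def finish_wandb():
    # Close the active W&B run
    wandb.finish()
```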
This module simplifies tracking and evaluating deep learning experiments using W&B.
The WandB report for the assignment can be accessed through: https://wandb.ai/mm21b044-indian-institute-of-technology-madras/DL-Assign-1/reports/DA6401-Assignment-1--VmlldzoxMTgzMDg0OQ?accessToken=z54kzkplm6ggnn7dn6rx71m2w9g0ce2v6fmtpcai4iaab0sns7ty7yhacusndfzt