This project aims to implement and compare two deep learning models—Convolutional Neural Networks (CNN) and Multi-Layer Perceptrons (MLP)—for image classification on the MNIST and Fashion MNIST datasets. I experimented with different architectures, tuned hyperparameters, and analyzed model performance through visualizations and confusion matrices.
Problem Statement: The goal of this project is to:
- Implement a CNN and MLP architecture for image classification.
- Train the models on MNIST and Fashion MNIST datasets.
- Explore different hyperparameters and configurations to improve model performance.
- Compare the models in terms of accuracy, training time, and common misclassifications.
I loaded the MNIST and Fashion MNIST datasets using TensorFlow/Keras and visualized samples from both datasets to understand their image characteristics.
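Below is a minimal sketch of this loading-and-preview step using `tf.keras.datasets` (the exact notebook code may differ):

```python
import matplotlib.pyplot as plt
from tensorflow.keras.datasets import fashion_mnist  # mnist loads the same way

# Load the data and scale pixel values to [0, 1].
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Preview the first five training images with their labels.
fig, axes = plt.subplots(1, 5, figsize=(10, 2))
for ax, img, label in zip(axes, x_train, y_train):
    ax.imshow(img, cmap="gray")
    ax.set_title(int(label))
    ax.axis("off")
plt.show()
```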
I built a CNN architecture with two convolutional layers, max-pooling layers, and fully connected layers, and trained it on both datasets (a Keras sketch follows the hyperparameter list below):
- Convolutional Layer with 32 filters (3x3) and ReLU activation.
- MaxPooling Layer (2x2).
- Convolutional Layer with 32 filters (3x3) and ReLU activation.
- MaxPooling Layer (2x2).
- Flatten Layer.
- Dense Layer with 128 neurons and ReLU activation.
- Output Layer with 10 neurons (softmax activation for classification).

Training hyperparameters:
- Batch size: 64
- Epochs: 10
- Learning rate: 0.001
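A minimal Keras sketch of this baseline CNN, reusing the `x_train`/`y_train` arrays from the loading snippet above (the notebook code may differ in detail):

```python
from tensorflow import keras
from tensorflow.keras import layers

cnn = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(10, activation="softmax"),
])

cnn.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
            loss="sparse_categorical_crossentropy",
            metrics=["accuracy"])

# Conv2D expects a channel axis: (N, 28, 28) -> (N, 28, 28, 1).
history = cnn.fit(x_train[..., None], y_train, batch_size=64, epochs=10,
                  validation_data=(x_test[..., None], y_test))
```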
To further improve model performance, I experimented with the following hyperparameters (a sketch of the tuned model follows this list):
- Increased the number of filters to 64.
- Adjusted kernel size to (5x5) for better feature extraction.
- Added dropout layers (0.5 rate) to reduce overfitting.
- Lowered the learning rate to 0.0001 for more stable training.
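A sketch of the tuned variant with these changes applied, continuing from the imports above (placing the dropout after the dense layer is my assumption; the notebook may position it differently):

```python
tuned_cnn = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(64, (5, 5), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (5, 5), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),  # reduces overfitting in the dense head
    layers.Dense(10, activation="softmax"),
])
tuned_cnn.compile(optimizer=keras.optimizers.Adam(learning_rate=0.0001),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
```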
The tuned CNN model improved performance, particularly on the Fashion MNIST dataset.
I implemented a Multi-Layer Perceptron (MLP) with the following architecture (a Keras sketch follows the hyperparameter list below):
- Flatten Layer to transform the image data.
- Dense Layer with 128 neurons and ReLU activation.
- Dropout Layer (0.5) for regularization.
- Dense Layer with 64 neurons and ReLU activation.
- Output Layer with 10 neurons (softmax activation for classification).

Training hyperparameters:
- Batch size: 64
- Epochs: 10
- Learning rate: 0.001
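A corresponding Keras sketch of the baseline MLP; the Flatten layer lets it consume the raw 28x28 images directly:

```python
mlp = keras.Sequential([
    layers.Input(shape=(28, 28)),
    layers.Flatten(),                       # 28x28 -> 784 features
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),
])
mlp.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
            loss="sparse_categorical_crossentropy",
            metrics=["accuracy"])
mlp.fit(x_train, y_train, batch_size=64, epochs=10,
        validation_data=(x_test, y_test))
```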
To improve the MLP's performance, I adjusted the following hyperparameters (see the sketch after this list):
- Increased the number of neurons in the fully connected layers (256, 128).
- Reduced dropout rate to 0.3.
- Lowered learning rate to 0.0001.
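A sketch of the tuned MLP with those adjustments:

```python
tuned_mlp = keras.Sequential([
    layers.Input(shape=(28, 28)),
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.3),                    # lighter regularization
    layers.Dense(128, activation="relu"),
    layers.Dense(10, activation="softmax"),
])
tuned_mlp.compile(optimizer=keras.optimizers.Adam(learning_rate=0.0001),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
```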
- I trained both models (CNN and MLP) on the MNIST and Fashion MNIST datasets.
- Generated training/validation accuracy and loss curves to visualize model performance over epochs.
- Generated confusion matrices to analyze common misclassifications in both models (a plotting sketch follows this list).
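One way to produce these plots, using the Keras `History` object and scikit-learn's confusion-matrix helper (the notebook's plotting code may differ):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay

# Accuracy curves over epochs, from model.fit()'s return value.
plt.plot(history.history["accuracy"], label="train")
plt.plot(history.history["val_accuracy"], label="validation")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.legend()
plt.show()

# Confusion matrix from predicted class labels on the test set.
y_pred = np.argmax(cnn.predict(x_test[..., None]), axis=1)
ConfusionMatrixDisplay.from_predictions(y_test, y_pred)
plt.show()
```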
- CNN performed better on both datasets, achieving 99% accuracy on MNIST and 88% accuracy on Fashion MNIST.
- MLP performed well but was slightly behind the CNN, with 97% accuracy on MNIST and 87% accuracy on Fashion MNIST.
The CNN model outperformed the MLP model, particularly on the Fashion MNIST dataset, which contains more complex image patterns. Hyperparameter tuning further improved both models' performance, and the confusion matrices highlighted areas for improvement, such as reducing misclassifications between visually similar classes.
├── README.md
├── Image_Classification.ipynb
├── Problem Statement
├── Images
└── LICENSE
I. README.md: Contains the project overview and explanation.
II. Image_Classification.ipynb: Contains the implementation of and experiments with the CNN and MLP models, along with the code for dataset loading and visualization.
III. Problem Statement: Contains the details on how to approach the project.
IV. Images: Folder containing visualization outputs (accuracy/loss curves, confusion matrices).
V. LICENSE: A short and simple permissive license with conditions only requiring preservation of copyright and license notices.
This project was part of my Machine Learning internship at Skolar, which I successfully completed from April 2024 to June 2024. 🎉
During this internship, I had the amazing opportunity to work on an Image Classification project using CNN and MLP on the MNIST and Fashion MNIST datasets. This project helped me enhance my skills in deep learning, hyperparameter tuning, and model analysis. The experience was both challenging and incredibly rewarding.
💡 Key Takeaways:
- Mastered the application of CNNs and MLPs in image classification.
- Gained a deeper understanding of hyperparameter tuning and model optimization.
- Improved my ability to analyze and interpret model results using visualizations and confusion matrices.
📢 P.S. I have provided the internship certificate in the repository files.