Welcome to the Digit Recognition with Object Detection project! It combines classification and localization: the model identifies digits (0-9) and also locates them in images with bounding boxes. Built with TensorFlow and the MNIST dataset, it visualizes predicted and true bounding boxes side by side and reports Intersection over Union (IoU) to evaluate localization.
Digit Classification: Identifies digits (0-9) using a convolutional neural network.
Bounding Box Localization: Predicts the coordinates of each digit's location in the image.
Visualizations: Displays digits with red (predicted) and green (true) bounding boxes, highlighting classification accuracy and IoU scores.
Data Preprocessing: Loads and preprocesses the MNIST dataset with TensorFlow Datasets, padding images to 75x75 and normalizing coordinates.
Metrics: Evaluates model performance with classification accuracy and IoU for localization.
Get started in just a few steps!
Prerequisites
Python 3.11 or later
A passion for machine learning!
Steps
Clone the Repository:
git clone https://github.com/mdowais-39/Digit-Recognition
cd Digit-Recognition
Set Up a Virtual Environment (recommended):
Install Dependencies:
pip install -r requirements.txt
Dataset Download:
The MNIST dataset is automatically downloaded via TensorFlow Datasets when you run the notebook.
Dive into the project with these simple steps:
Launch the Jupyter Notebook:
jupyter notebook Object_Detection.ipynb
Explore the Notebook:
Section 1: Imports libraries like TensorFlow, NumPy, Matplotlib, and PIL.
Section 2: Defines visualization utilities for drawing bounding boxes and displaying digits.
Section 3: Loads and preprocesses the MNIST dataset, padding images and generating normalized bounding box coordinates.
Visualization: Shows predicted vs. true labels with bounding boxes and IoU metrics.
The notebook uses a pre-trained model (model.predict). To train your own, add model definition and training code (see Future Improvements).
Run the notebook to see predictions and visualizations.
Visualizations of digits with red (predicted) and green (true) bounding boxes.
Incorrect predictions are highlighted in red.
IoU scores indicate localization accuracy (red if below 0.6).
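For readers who want to see how an IoU score like the ones above is obtained, here is a minimal sketch in plain Python. It assumes boxes are given as (xmin, ymin, xmax, ymax) in normalized coordinates; that layout, the helper name `iou`, and the small `eps` term are illustrative assumptions and not taken from the notebook. Only the 0.6 threshold comes from this README.

```python
def iou(box_a, box_b, eps=1e-10):
    """Intersection over Union of two boxes.

    Each box is (xmin, ymin, xmax, ymax); this layout is an assumption.
    """
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b

    # Overlap rectangle; width and height clamp to 0 when the boxes are disjoint.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h

    area_a = (ax1 - ax0) * (ay1 - ay0)
    area_b = (bx1 - bx0) * (by1 - by0)
    union = area_a + area_b - inter

    # eps guards against division by zero for degenerate (zero-area) boxes.
    return inter / (union + eps)


IOU_THRESHOLD = 0.6  # scores below this are shown in red, per this README

true_box = (0.20, 0.20, 0.60, 0.60)
pred_box = (0.25, 0.20, 0.65, 0.60)
score = iou(pred_box, true_box)
print(f"IoU = {score:.2f}, flagged as low: {score < IOU_THRESHOLD}")
```

A perfect prediction gives a score close to 1, and boxes that do not overlap give 0.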
The project uses the MNIST dataset via TensorFlow Datasets. Images are padded to 75x75 pixels, and bounding box coordinates are normalized for localization tasks.
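The preprocessing idea can be sketched as follows. This is a rough illustration in plain Python, not the notebook's TensorFlow code: it assumes each 28x28 digit is pasted at a random offset on a blank 75x75 canvas and that the box is stored as (xmin, ymin, xmax, ymax) divided by the canvas size. The random placement, the box layout, and the helper name `place_digit` are all assumptions.

```python
import random

CANVAS = 75  # padded image size stated in this README
DIGIT = 28   # native MNIST digit size


def place_digit(digit, x_off, y_off, canvas=CANVAS):
    """Paste a square digit onto a blank canvas and return (image, box).

    box is (xmin, ymin, xmax, ymax), each divided by the canvas size
    so that every coordinate lies in [0, 1].
    """
    size = len(digit)
    if not (0 <= x_off <= canvas - size and 0 <= y_off <= canvas - size):
        raise ValueError("digit would fall outside the canvas")

    # Start from an all-zero (black) canvas and copy the digit row by row.
    image = [[0.0] * canvas for _ in range(canvas)]
    for r, row in enumerate(digit):
        image[y_off + r][x_off:x_off + size] = row

    box = (
        x_off / canvas,
        y_off / canvas,
        (x_off + size) / canvas,
        (y_off + size) / canvas,
    )
    return image, box


# Stand-in for a real MNIST digit: a solid 28x28 block of ones.
digit = [[1.0] * DIGIT for _ in range(DIGIT)]
x_off = random.randint(0, CANVAS - DIGIT)
y_off = random.randint(0, CANVAS - DIGIT)
image, box = place_digit(digit, x_off, y_off)
print("normalized box:", tuple(round(v, 3) for v in box))
```

Normalizing by the canvas size keeps the regression targets in the same range regardless of image resolution.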
The display_digits_with_boxes function draws each result as follows:
Digits: Displayed with predicted (red) and true (green) bounding boxes.
Labels: Predicted labels turn red if incorrect.
IoU Scores: Localization accuracy is shown, with low scores (<0.6) in red.
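The coloring rules above boil down to two comparisons. The sketch below is only an illustration: `annotation_colors` is a hypothetical helper, not a function from the notebook, and "black" as the color for correct labels and acceptable IoU scores is an assumption, since this README only states when text turns red.

```python
def annotation_colors(pred_label, true_label, iou_score, iou_threshold=0.6):
    """Return (label_color, iou_color) for one displayed digit.

    Hypothetical helper; "black" as the default color is an assumption.
    """
    # A wrong class prediction turns the label red.
    label_color = "black" if pred_label == true_label else "red"
    # An IoU below the threshold turns the IoU text red.
    iou_color = "black" if iou_score >= iou_threshold else "red"
    return label_color, iou_color


print(annotation_colors(pred_label=3, true_label=8, iou_score=0.42))
```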
Future Improvements
Add model definition and training code for a self-contained notebook.
Implement data augmentation for improved robustness.
Experiment with advanced architectures such as YOLO-style convolutional neural networks for better localization.
Support multiple digits per image for real-world applications.
Follow these steps to contribute:
Fork the repository
Create a feature branch (git checkout -b feature-branch)
Commit your changes (git commit -m "Add cool feature")
Push to the branch (git push origin feature-branch)
Open a pull request
TensorFlow for the awesome deep learning framework.
MNIST Dataset for the classic digit dataset.
TensorFlow Datasets for seamless data loading.
Star this repo if you found it helpful! Let's make digit recognition even better together!