This project demonstrates how to fine-tune a Faster R-CNN model, pre-trained on the COCO dataset, for object detection on the Pascal VOC 2012 dataset. The detection head is resized to the 20 object classes in VOC (plus background), replacing the 80-object-class COCO head of the pre-trained model.
This project aims to provide a practical example of fine-tuning a pre-trained Faster R-CNN model for a specific object detection task. By leveraging a pre-trained model and adapting it to a new dataset, we can achieve effective object detection with limited computational resources.
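For reference, the core of this adaptation can be sketched in a few lines of torchvision (a minimal sketch, not the project's exact code; the parameter values shown are illustrative):

```python
# Minimal sketch of the head swap (not the project's exact code): load a
# COCO-pretrained Faster R-CNN from torchvision and replace its box
# predictor with one sized for VOC. Parameter values are illustrative.
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

NUM_CLASSES = 21  # 20 VOC object classes + 1 background class

model = torchvision.models.detection.fasterrcnn_mobilenet_v3_large_fpn(
    weights="DEFAULT",            # COCO-pretrained weights
    trainable_backbone_layers=3,  # how many backbone stages to fine-tune
)
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, NUM_CLASSES)
```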
- 🏋️‍♂️ Faster R-CNN Backbones: Experiments with MobileNet v3 (lightweight) and ResNet50 (larger) architectures.
- 📦 Pascal VOC 2012 Dataset: Custom loader for VOC dataset with 20 object categories.
- 📊 Training & Evaluation: Custom training script to evaluate the model and store the checkpoint with the highest mAP.
┌───────────────────────┐
│      Pascal VOC       │
│  (20 Object Classes)  │
└───────────┬───────────┘
            │
            ▼
┌───────────────────────┐
│  Pre-trained Faster   │
│      R-CNN Model      │
│ (MobileNet v3/ResNet) │
└───────────┬───────────┘
            │
            ▼
┌───────────────────────┐
│  Fine-tune Specific   │
│ Layers & Output Heads │
└───────────┬───────────┘
            │
            ▼
┌───────────────────────┐
│ Training & Evaluation │
│ (Save Checkpoint with │
│       Best mAP)       │
└───────────────────────┘
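The final stage of this pipeline, checkpointing on the best mAP, might look like the following outline (an illustrative sketch rather than the project's `training.py`; it assumes `torchmetrics` for the mAP computation and data loaders that yield lists of images and target dicts, as torchvision detection models expect):

```python
# Illustrative outline of the training/evaluation stage above, not the
# project's exact training.py. Assumes torchmetrics for mAP and loaders
# yielding (list_of_images, list_of_target_dicts) batches.
import torch
from torchmetrics.detection import MeanAveragePrecision

def train_and_checkpoint(model, train_loader, val_loader, device, epochs=10):
    params = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.SGD(params, lr=0.005, momentum=0.9,
                                weight_decay=0.0005)
    best_map = 0.0
    for epoch in range(epochs):
        model.train()
        for images, targets in train_loader:
            images = [img.to(device) for img in images]
            targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
            loss_dict = model(images, targets)  # dict of detection losses
            loss = sum(loss_dict.values())
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        # Evaluate mAP on the validation set.
        model.eval()
        metric = MeanAveragePrecision()
        with torch.no_grad():
            for images, targets in val_loader:
                preds = model([img.to(device) for img in images])
                preds = [{k: v.cpu() for k, v in p.items()} for p in preds]
                metric.update(preds, targets)
        epoch_map = metric.compute()["map"].item()

        # Keep only the checkpoint with the highest mAP so far.
        if epoch_map > best_map:
            best_map = epoch_map
            torch.save(model.state_dict(), "best_model.pth")
```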
- 🔧 Fine-tuning: Adapts a pre-trained Faster R-CNN model for the Pascal VOC 2012 dataset.
- 🏗️ Multiple Backbones: Explores performance with MobileNet v3 and ResNet50 backbones.
- ⚡ Resource-Aware Training: Demonstrates training both on Google Colab (limited resources) and on Kaggle (more generous resources).
- 📈 Evaluation: Provides scripts for testing the model on new images and reports mAP scores.
The project consists of two main implementations:
1. 🖥️ Google Colab Implementation:
   - 📄 `FRCNN_main.ipynb`: Main Jupyter Notebook that runs the training and testing process.
   - 🧪 `test_new_data.py`: Script to test the trained model on new images.
   - 🛠️ `training.py`: Contains the model definition and training loop, and saves the best model checkpoint based on mAP.
   - 📦 `voc_dataset_format.py`: Handles loading and preprocessing the Pascal VOC dataset (see the sketch after this list).
2. 📊 Kaggle Implementation:
   - 📄 `frcnn.ipynb`: Single Jupyter Notebook integrating training and evaluation with MobileNet v3 and ResNet50 backbones.
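A VOC loader along these lines can be built on torchvision's built-in `VOCDetection` (a hedged sketch of the general approach, not the project's `voc_dataset_format.py`):

```python
# Hedged sketch of VOC loading, not the project's voc_dataset_format.py:
# wrap torchvision's VOCDetection and convert its parsed XML annotations
# into the {"boxes", "labels"} target dicts that Faster R-CNN expects.
import torch
from torchvision.datasets import VOCDetection
from torchvision.transforms import functional as F

VOC_CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat",
    "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]

class VOCDataset(VOCDetection):
    def __getitem__(self, idx):
        image, ann = super().__getitem__(idx)
        boxes, labels = [], []
        for obj in ann["annotation"]["object"]:
            b = obj["bndbox"]
            boxes.append([float(b["xmin"]), float(b["ymin"]),
                          float(b["xmax"]), float(b["ymax"])])
            labels.append(VOC_CLASSES.index(obj["name"]) + 1)  # 0 = background
        target = {
            "boxes": torch.as_tensor(boxes, dtype=torch.float32),
            "labels": torch.as_tensor(labels, dtype=torch.int64),
        }
        return F.to_tensor(image), target

# Usage: dataset = VOCDataset(root="data", year="2012", image_set="train")
```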
Google Colab:
- Upload the project files to your Google Colab environment.
- Open `FRCNN_main.ipynb` and run all cells.

Kaggle:
- Create a new Kaggle Notebook and upload the project files.
- Run the `frcnn.ipynb` notebook.
The achieved mAP (mean Average Precision) scores for different configurations are:
| Platform | Backbone | Trainable Backbone Layers | Epochs | mAP | Notes |
|----------|----------|---------------------------|--------|--------|-------|
| Colab | MobileNet v3 | 0 | 2 | 0.0299 | Limited by resources |
| Kaggle | MobileNet v3 | 3 | 10 | 0.1366 | |
| Kaggle | MobileNet v3 | 6 | 10 | 0.0957 | |
| Kaggle | ResNet50 | 2 | 5 | 0.0084 | Requires more epochs for better results |
Q: Why is the mAP score so low? A: Due to the limited computational resources on Colab, the model is trained for only a few epochs, resulting in lower mAP scores. On Kaggle, more epochs are used, and larger backbones are tested.
Q: How can I improve the performance? A: You can try increasing the number of epochs, using a more powerful backbone, or fine-tuning more layers in the backbone.
Q: How do I load my custom dataset? A: On Colab, upload your images to Google Drive, mount the Drive, and insert the path into the test component in `FRCNN_main.ipynb`. On Kaggle, create your own dataset, upload your images, and paste the image path into the test component.
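For orientation, the test step amounts to loading the saved checkpoint and running inference on one image; a rough sketch follows (the exact interface of `test_new_data.py` may differ, and `best_model.pth` is an assumed checkpoint filename):

```python
# Hypothetical test sketch (the real test_new_data.py may be organized
# differently): load a saved checkpoint and detect objects in a new image.
# "best_model.pth" is an assumed checkpoint filename.
import torch
from PIL import Image
from torchvision.transforms import functional as F

def detect(model, image_path, device="cpu", score_threshold=0.5):
    model.load_state_dict(torch.load("best_model.pth", map_location=device))
    model.to(device).eval()
    image = F.to_tensor(Image.open(image_path).convert("RGB")).to(device)
    with torch.no_grad():
        pred = model([image])[0]  # dict with "boxes", "labels", "scores"
    keep = pred["scores"] > score_threshold  # drop low-confidence detections
    return pred["boxes"][keep], pred["labels"][keep], pred["scores"][keep]
```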
See the demo video in the Colab folder and the prediction output of my model in the Kaggle folder.
- 🔍 Experiment with different backbone architectures (e.g., EfficientNet, Inception).
- ⚙️ Optimize training parameters (learning rate, batch size) for improved performance.
- 🎨 Implement data augmentation techniques to enhance model robustness.
- 📚 Explore transfer learning from other object detection datasets.
Thank you for taking the time to explore this project! Your interest and feedback are greatly appreciated. If you have any questions or suggestions, feel free to reach out!