This repository contains my work on detection and segmentation using YOLOv8. The project focuses on experimenting with the YOLOv8 architecture for object detection and segmentation, specifically aimed at detecting guns in images. Throughout this project, I explored how model fine-tuning, segmentation, and detection work, while analyzing the impact of modifications on model performance.
- Base Model: YOLOv8n.pt
- Dataset: The dataset consists of images with guns as the primary object class, with detection based on a single class: 'gun'.
- Experimented with various hyperparameters.
- Added C2f layers at the end of the model.
- Removed C2f and Conv layers in pairs from both the initial and final stages.
- Impact: All modifications affected Precision and metrics, mostly leading to a decrease in performance.
- Best Results: Detection Best Results
- Metrics:
- Precision: 0.8706
- Recall: 0.8000
- mAP@50: 0.8273
- mAP@50-95: 0.4708
- Base Model: YOLOv8n-seg.pt
- Dataset: The same dataset was used, but it was annotated by creating masks for each image using RoboFlow.
- Experimented with different hyperparameters similar to detection.
- Evaluated the model's ability to segment the 'gun' class effectively.
- Best Results: Segmentation Best Results
- Metrics:
- Precision: 0.9937
- Recall: 0.7
- mAP@50: 0.8432
- mAP@50-95: 0.5908
- Box Precision: 0.8679
- Box Recall: 0.8
- Mask Precision: 0.9177
- Mask Recall: 0.3290
-
Input Image through Conv and C2f Layers
- Conv: Extracts key features.
- C2f: Refines these features and improves gradients.
-
SPPF for Multi-scale Feature Extraction
- Scales the feature map at different levels.
- Produces a concatenated feature map (feature pyramid) from different scales.
-
Detection Module
- Predicts bounding boxes at each level/scale.
- Uses NMS to retain the highest confidence bounding box for each object.
-
Proto Layer for Pixel-wise Object Scores
- Assigns pixel-wise object scores for the entire image, producing segmentation masks on the feature map.
-
Segmentation Module
- Extracts the segmentation mask corresponding to the predicted bounding box.
- Converts the segmentation mask into a binary mask.
- Upscales the mask to match the original image dimensions before outputting the final result.
This project demonstrates the versatility of YOLOv8 in both detection and segmentation tasks. Through various modifications to the model architecture and hyperparameters, I aimed to optimize performance for detecting and segmenting firearms.