Detect retail products via the YOLOv8 object recognition engine
Demo: https://www.youtube.com/watch?v=yIRT5nHoH78
Go to the testing directory and run one of the following commands:
python3 video_object_detection.py
for video
python3 image_object_detection.py
for image
python3 webcam_object_detection.py
for webcam
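For orientation, a single-image detection pass can be sketched with the ultralytics Python package. This is only an illustration and not necessarily what the scripts above do internally. The weights file best.pt, the image name shelf.jpg, and both helper names are assumptions made for the example.

```python
def format_detection(box, score):
    """Render one detection as text; box is (x1, y1, x2, y2) in pixels."""
    x1, y1, x2, y2 = (round(v) for v in box)
    return f"product at ({x1}, {y1})-({x2}, {y2}) conf={score:.2f}"


def detect_products(image_path, weights="best.pt", conf=0.25):
    """Run a YOLOv8 model on one image and return (box, score) pairs."""
    # Imported lazily so format_detection works without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)  # hypothetical path to trained weights
    results = model.predict(source=image_path, conf=conf)
    boxes = results[0].boxes  # detections for the first (and only) image
    return list(zip(boxes.xyxy.tolist(), boxes.conf.tolist()))


# Example usage (needs ultralytics, a weights file and an image):
# for box, score in detect_products("shelf.jpg"):
#     print(format_detection(box, score))
```

Raising the conf threshold trades missed products for fewer false detections, which can matter on densely packed shelves.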
Correct version of PyTorch (Windows 10/11):
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116
As we are on Windows, we also have to download the correct CUDA-compatible versions of torch and torchvision.
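A quick way to confirm that the installed torch build can actually see the GPU is a small check like the one below. It is a minimal sketch; the function name describe_torch_cuda is made up for this example.

```python
def describe_torch_cuda():
    """Return a one-line summary of the local torch/CUDA setup."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    if not torch.cuda.is_available():
        # Typical symptom of a CPU-only wheel or a driver mismatch.
        return f"torch {torch.__version__} found, but CUDA is not available"
    device = torch.cuda.get_device_name(0)
    return f"torch {torch.__version__} with CUDA {torch.version.cuda} on {device}"


print(describe_torch_cuda())
```

If the summary reports that CUDA is not available, training will most likely fall back to the CPU and be far slower.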
Heavily inspired by this article and this Kaggle, but applied to YOLOv8 instead of YOLOv5 (GitHub and model of YOLOv5 trained on the same data).
Training data is taken from the SKU110k dataset (download from Kaggle), which holds several gigabytes of pre-labeled images of retail shelves.
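YOLO training expects one label line per object in the form "class x_center y_center width height", with all four box values normalized to the 0-1 range. If the downloaded annotations come as pixel corner coordinates plus the image size (an assumption about the download format, so check your copy), the conversion could look roughly like this. The function name and the single class id 0 are also assumptions.

```python
def to_yolo_line(x1, y1, x2, y2, img_w, img_h, class_id=0):
    """Convert a pixel corner box to a normalized YOLO label line."""
    x_center = (x1 + x2) / 2 / img_w
    y_center = (y1 + y2) / 2 / img_h
    width = (x2 - x1) / img_w
    height = (y2 - y1) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A 100x50 px box in the top-left corner of a 200x100 px image.
print(to_yolo_line(0, 0, 100, 50, 200, 100))
# prints: 0 0.250000 0.250000 0.500000 0.500000
```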
After installing CUDA correctly, run the following command to begin training:
yolo task=detect mode=train model=yolov8n.pt data=custom.yaml epochs=300 imgsz=320 workers=4 batch=8
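The same run can also be started from Python through the ultralytics package, which may be handier when experimenting with settings. The sketch below mirrors the command above. The train wrapper is a made-up helper, and the contents of DATASET_YAML (folder layout and class name) are only a guess at what custom.yaml might contain, not the file actually used here.

```python
# Hypothetical custom.yaml contents; adjust paths and names to your setup.
DATASET_YAML = """\
path: datasets/SKU110K  # assumed dataset root
train: images/train     # assumed folder, relative to path
val: images/val         # assumed folder, relative to path
names:
  0: object             # assumed single class
"""

# Same settings as the CLI command.
TRAIN_SETTINGS = {
    "data": "custom.yaml",
    "epochs": 300,
    "imgsz": 320,
    "workers": 4,
    "batch": 8,
}


def train(weights="yolov8n.pt", **overrides):
    """Start a detection training run; keyword overrides replace defaults."""
    # Imported lazily so the settings can be inspected without ultralytics.
    from ultralytics import YOLO

    settings = {**TRAIN_SETTINGS, **overrides}
    model = YOLO(weights)
    return model.train(**settings)


# train()          # same run as the command above
# train(epochs=1)  # short smoke test before committing to 300 epochs
```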
Models with exceptional performance, used in the field. Versions 0.2.0-0.2.1 used YOLOv8m; versions 0.2.2 onwards use YOLOv8l.
Example predictions (mislabeled) from a 0.2.1 run:
Model(s) used to test the capabilities of the models in some example scenarios. Used YOLOv8s as the base model.
Example predictions (mislabeled) from a 0.1.3 run:
Model(s) used to test whether it was actually possible to train on this dataset. Used YOLOv8n as the base model.
Our findings were somewhat unsatisfactory when it came to the actual results of the training. However, they did produce some models that were not completely useless, so we went on to invest more resources into training better models.
Example predictions from a 0.0.1 run:
Model that detects the empty shelf spaces where retail items should be, instead of the retail items themselves.