automated pipeline for training custom YOLOv5 object detection models using the YouTube Bounding Box dataset
- simple notebook interface
- supports 23 different object classes (person, cat, dog, vehicles, etc.)
- processes YouTube videos into training-ready datasets
- supports cloud or local GPU resources for training models
- set parameters in yolobook.ipynb -> run cells sequentially -> get trained model + performance metrics
- upload a photo or video to test model inference
data processing (src/utility/process-data.py):
- downloads YouTube videos (added multithreading, ~3x faster now)
- extracts frames efficiently (a single ffmpeg call per video)
- generates YOLO-format bounding box labels
- remaps class IDs to zero-indexed format
- splits data into train/validation/test sets
- handles cleanup and error cases
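
The label generation, class remapping, and splitting steps above can be sketched roughly as follows. This is a minimal illustration, not the code in process-data.py: it assumes each annotation carries normalized xmin/xmax/ymin/ymax corners in [0, 1], and the function names, split fractions, and seed are hypothetical.

```python
import random


def to_yolo_line(class_id, xmin, xmax, ymin, ymax):
    """Convert normalized corner coordinates to one YOLO label line.

    YOLO labels are "class x_center y_center width height", all normalized.
    """
    x_center = (xmin + xmax) / 2
    y_center = (ymin + ymax) / 2
    width = xmax - xmin
    height = ymax - ymin
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def build_class_map(original_ids):
    """Map arbitrary class IDs to contiguous zero-based indices."""
    return {orig: new for new, orig in enumerate(sorted(set(original_ids)))}


def split_items(items, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle deterministically, then cut into train/validation/test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


# Hypothetical usage: remap a source class ID, then emit its label line.
class_map = build_class_map([1, 5, 23, 5])
label = to_yolo_line(class_map[5], 0.25, 0.75, 0.5, 1.0)
```

A fixed seed keeps the split reproducible between runs, which makes metrics from different training runs easier to compare.
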
YOLOv5 integration:
- pre-configured for the dataset formats YOLOv5 accepts
- automated YAML config generation
- ready-to-run training pipeline
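
As a rough sketch of what automated YAML config generation could look like, the snippet below builds a YOLOv5 dataset config as plain text. The `path`, `train`, `val`, `test`, `nc`, and `names` keys follow the YOLOv5 dataset YAML layout; the function name, directory names, and class list are assumptions for illustration and may differ from what this project writes.

```python
import json


def dataset_yaml(root, class_names,
                 train="images/train", val="images/val", test="images/test"):
    """Return the text of a YOLOv5-style dataset config.

    The default sub-directory names are hypothetical.
    """
    lines = [
        f"path: {root}",
        f"train: {train}",
        f"val: {val}",
        f"test: {test}",
        f"nc: {len(class_names)}",
        # A JSON list is also a valid YAML flow sequence.
        f"names: {json.dumps(class_names)}",
    ]
    return "\n".join(lines) + "\n"


config_text = dataset_yaml("datasets/ytbb", ["person", "cat"])
```

A file with this content is what YOLOv5's train.py takes through its `--data` argument.
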
utilities:
- visual-check.py - randomized dataset inspection with bounding box overlays
- datapaths-to-txt.py - generates image path files for training
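
For orientation, an image path file is a text file with one image path per line, which YOLOv5 can read in place of an image directory. The sketch below shows one way to produce such a file; it is an illustration under assumed names (`write_image_list`, the suffix set) and not the actual contents of datapaths-to-txt.py.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match the extracted frames.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def write_image_list(image_dir, out_file):
    """Write one absolute image path per line and return how many were written."""
    paths = sorted(
        p.resolve()
        for p in Path(image_dir).iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES
    )
    Path(out_file).write_text("".join(f"{p}\n" for p in paths))
    return len(paths)
```

Calling `write_image_list("images/train", "train.txt")` (hypothetical paths) would list every image in that directory, sorted by path.
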
