dataset-tools

A modular utility toolkit for managing datasets in machine learning and AI applications


Overview

dataset-tools is a modular utility toolkit for managing datasets in machine learning and AI applications. It simplifies common tasks such as:

  • Converting labels between annotation formats (e.g., YOLO TXT).
  • Visualizing bounding boxes on images from annotation files.
  • Renaming and organizing dataset files.
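
To make the label-handling tasks above concrete, here is a minimal sketch of reading one YOLO TXT annotation line and converting it to pixel coordinates. It assumes the common YOLO convention of `class_id x_center y_center width height` with values normalized to the image size. The function name `yolo_to_pixel_box` is hypothetical and is not taken from this repository.

```python
from typing import Tuple


def yolo_to_pixel_box(line: str, img_w: int, img_h: int) -> Tuple[int, int, int, int, int]:
    """Convert one YOLO TXT line to (class_id, x1, y1, x2, y2) in pixels.

    Assumes the line holds: class_id x_center y_center width height,
    with the four box values normalized to the range [0, 1].
    """
    class_id, xc, yc, w, h = line.split()

    # Scale the normalized values to pixel units.
    xc_px = float(xc) * img_w
    yc_px = float(yc) * img_h
    w_px = float(w) * img_w
    h_px = float(h) * img_h

    # Convert from center/size to top-left and bottom-right corners.
    x1 = round(xc_px - w_px / 2)
    y1 = round(yc_px - h_px / 2)
    x2 = round(xc_px + w_px / 2)
    y2 = round(yc_px + h_px / 2)
    return int(class_id), x1, y1, x2, y2


if __name__ == "__main__":
    print(yolo_to_pixel_box("0 0.5 0.5 0.2 0.4", img_w=100, img_h=200))
```

The two corner points can then be passed to a drawing call such as OpenCV's `cv2.rectangle` to render the box on the image.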

Tech used in this repository:

  • isort and black keep imports sorted and code formatting consistent.
  • OpenCV handles computer vision tasks such as reading images and drawing bounding boxes.
  • python-dotenv loads project settings from the .env file.

Installation Guide

Follow these steps to set up the project locally:

  1. Clone the repository:

    git clone https://github.com/Dhaboav/dataset-tools.git
  2. Install Python dependencies:

    Install the required Python packages using pip:

    pip install -r requirements.txt
  3. Create .env file:

    Copy .env.example to .env from the Windows Command Prompt, then edit the values in .env (on macOS or Linux, use cp instead of copy):

    copy .env.example .env

Examples

The examples folder contains scripts that show how to use specific functions:

  • draw_bboxes_demo.py: Demonstrates how to draw bounding boxes from YOLO TXT format annotations onto images using OpenCV and save the resulting images.
  • rename_demo.py: Demonstrates how to rename files.
  • split_dataset_demo.py: Demonstrates how to split datasets into training, validation, and test sets with proportions of 70%, 20%, and 10%, respectively.
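
As a rough illustration of the 70/20/10 split described above, the following sketch shuffles a list of file names and partitions it into three subsets. It is not the code from split_dataset_demo.py. The function name `split_dataset`, the `seed` parameter, and the dictionary keys are assumptions made for this example.

```python
import random
from typing import Dict, List, Sequence


def split_dataset(
    items: Sequence[str],
    ratios: Sequence[float] = (0.7, 0.2, 0.1),
    seed: int = 42,
) -> Dict[str, List[str]]:
    """Shuffle items and split them into train/val/test subsets.

    The test subset receives whatever remains after train and val,
    so every item lands in exactly one subset.
    """
    shuffled = list(items)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)

    n = len(shuffled)
    n_train = round(n * ratios[0])
    n_val = round(n * ratios[1])

    return {
        "train": shuffled[:n_train],
        "val": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],
    }


if __name__ == "__main__":
    files = [f"img_{i:03d}.jpg" for i in range(10)]
    for name, subset in split_dataset(files).items():
        print(name, len(subset))
```

Shuffling before slicing avoids a biased split when the files are ordered by class or capture time. Fixing the seed makes the split repeatable.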

Troubleshooting

Resolving Module Import Errors in VSCode

To fix module import errors in the examples folder, set PYTHONPATH in your VSCode settings so Python recognizes the project's root directory.

  1. Press Ctrl+Shift+P in VSCode and run Preferences: Open User Settings (JSON).

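  2. Add this configuration (it applies to the integrated terminal on Windows; on other platforms use terminal.integrated.env.linux or terminal.integrated.env.osx):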

    {
        "terminal.integrated.env.windows": {
            "PYTHONPATH": "${workspaceFolder}"
        }
    }
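
If you run the examples outside VSCode, a possible alternative is to add the project root to `sys.path` at the top of the script. The sketch below assumes the script sits one folder below the project root (for example, in the examples folder). The helper name `add_project_root` is hypothetical and is not part of this repository.

```python
import sys
from pathlib import Path


def add_project_root(script_path: str, levels_up: int = 1) -> Path:
    """Put the project root on sys.path so top-level packages can be imported.

    parents[0] is the folder containing the script, so levels_up=1
    selects the folder above it (assumed here to be the project root).
    """
    root = Path(script_path).resolve().parents[levels_up]
    root_str = str(root)
    if root_str not in sys.path:
        sys.path.insert(0, root_str)
    return root


if __name__ == "__main__":
    # In a real example script, call this before importing project modules.
    print(add_project_root(__file__))
```

Unlike the VSCode setting, this only affects the script that calls it, so each example would need the same call before its project imports.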
