Dataset-Tools is a desktop application designed to help users browse and manage their image and text datasets, particularly those used with AI art generation tools like Stable Diffusion. Developed using PyQt6, it provides a simple and intuitive graphical interface for browsing images, viewing metadata, and examining associated text prompts. It has recently been extended to read metadata from LoRA safetensors files, as well as metadata from images hosted on sites such as Civitai. This project is inspired by tools within the AI art community (☮️receyuki🤍) and aims to empower users in improving their dataset curation workflow. If you're interested in getting involved, feel free to fork and contribute!
To run the program, you will need the following software:
- Python, from Python.org, or try uv (optional).
- Python 3.10 or newer is required; older versions may not work with the installation commands.
- Certain Ubuntu systems may not install the required packages correctly. If you run into problems with this, please follow the guide below, or let us know in the issues section above!
- uv is available on Linux, Windows, and macOS. It's extremely fast and written in Rust! It is also optional.
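To confirm that an interpreter meets the Python 3.10 minimum before installing, a small check like the one below can help. This is only an illustrative sketch: the function name `python_is_supported` and the constant `MIN_PYTHON` are hypothetical and are not part of Dataset-Tools.

```python
import sys

# Minimum version stated in the requirements above.
MIN_PYTHON = (3, 10)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    if python_is_supported():
        print(f"Python {found} is new enough.")
    else:
        print(f"Python {found} is too old; 3.10 or newer is required.")
```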
git clone https://github.com/Ktiseos-Nyx/Dataset-Tools.git
cd Dataset-Tools
pip install .
Note for uv users:
cd Dataset-Tools
uv pip install .
After installing, launch the application with:
dataset-tools
The application window has the following main components:
- Current Folder: Displays the path of the currently loaded folder.
- Open Folder: A button to select a folder containing images, text files, and safetensors files.
- Image List: Displays a list of images and text files found in the selected folder.
- Image Preview: An area to display a selected image.
- Metadata Box: A text area to display the extracted metadata from the selected image or safetensors file (including Stable Diffusion prompt, settings, etc.).
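For readers curious where safetensors metadata lives, the sketch below reads it using only the standard library. It relies on the published safetensors layout: an 8-byte little-endian header length, followed by a JSON header that may contain an optional `__metadata__` map of strings. The function name `read_safetensors_metadata` is hypothetical, and this is not necessarily how Dataset-Tools itself reads these files.

```python
import json
import struct


def read_safetensors_metadata(path):
    """Return the optional string metadata from a safetensors header."""
    with open(path, "rb") as handle:
        # The file starts with an unsigned 64-bit little-endian header length.
        raw_length = handle.read(8)
        if len(raw_length) != 8:
            raise ValueError("file is too short to be a safetensors file")
        (header_length,) = struct.unpack("<Q", raw_length)
        header_bytes = handle.read(header_length)
        if len(header_bytes) != header_length:
            raise ValueError("header is truncated")
    header = json.loads(header_bytes.decode("utf-8"))
    # "__metadata__" is optional; the per-tensor entries are ignored here.
    return header.get("__metadata__", {})
```

Because only the header is read, this stays fast even for large files; the tensor data that follows the header is never loaded.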
- Selecting Images: Click on an image or text file in the list to display its preview, metadata, and associated text content.
- Viewing Metadata: Metadata associated with the selected image is displayed in the text area, such as steps, samplers, seeds, and more.
- Viewing Text: The content of any text file associated with the selected image is displayed in the text box.
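One way to picture how an image and its associated text file can be matched is sketched below. It assumes the common dataset convention that a caption shares the image's filename stem (for example `cat.png` and `cat.txt`); that convention, the extension sets, and the function name `pair_images_with_text` are assumptions for illustration, not confirmed behavior of Dataset-Tools.

```python
from pathlib import Path

# Hypothetical extension sets; the application may recognize others.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}
TEXT_EXTENSION = ".txt"


def pair_images_with_text(folder):
    """Map each image name in `folder` to a same-stem .txt name, or None."""
    folder = Path(folder)
    pairs = {}
    for entry in sorted(folder.iterdir()):
        if entry.is_file() and entry.suffix.lower() in IMAGE_EXTENSIONS:
            caption = entry.with_suffix(TEXT_EXTENSION)
            pairs[entry.name] = caption.name if caption.is_file() else None
    return pairs
```

Text files without a matching image are simply left out of the result, and images without a caption map to `None`.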
- Graphical User Interface (GUI): Built with PyQt6 for a modern and cross-platform experience.
- Resizable Interface: Easily stretch the interface like goo!
- Image Previews: Quickly view images in a dedicated preview area.
- Metadata Extraction: Extract and display relevant metadata from PNG image files, especially those generated from Stable Diffusion.
- Safetensors files are now supported. Please note that this currently covers LoRA files and NOT embeddings.
- This currently also includes support beyond SDXL base models, such as Flux, Aura, and SD3, with more to come!
- We've also recently added support for images from Civitai, supporting their EXIF formats!
- Text Viewing: Display the content of text files.
- Clear Layout: A simple and intuitive layout, with list view on the left, and preview on the right.
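As background for the metadata extraction feature: Stable Diffusion front ends commonly store generation settings in PNG text chunks (AUTOMATIC1111's web UI, for example, uses the keyword `parameters`). The sketch below walks the PNG chunk structure with the standard library and collects uncompressed `tEXt` chunks. It is a minimal illustration under those assumptions, not the application's actual parser: the function name `read_png_text_chunks` is hypothetical, and `iTXt` and `zTXt` chunks are not handled.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(path):
    """Return {keyword: text} for every tEXt chunk in a PNG file."""
    chunks = {}
    with open(path, "rb") as handle:
        if handle.read(8) != PNG_SIGNATURE:
            raise ValueError("not a PNG file")
        while True:
            head = handle.read(8)
            if len(head) < 8:
                break  # ran out of data before reaching IEND
            # Each chunk: 4-byte big-endian length, then a 4-byte type.
            length, chunk_type = struct.unpack(">I4s", head)
            data = handle.read(length)
            handle.read(4)  # skip the CRC; this sketch does not verify it
            if chunk_type == b"tEXt":
                # tEXt payload: Latin-1 keyword, a NUL byte, Latin-1 text.
                keyword, _, text = data.partition(b"\x00")
                chunks[keyword.decode("latin-1")] = text.decode("latin-1")
            elif chunk_type == b"IEND":
                break
    return chunks
```

Calling `read_png_text_chunks("image.png").get("parameters")` would then return the raw settings string when that chunk is present, or `None` otherwise (`image.png` is a placeholder path).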
- Filtering/Sorting: Options to filter and sort files.
- Thumbnail Generation: Implement thumbnails for faster browsing.
- Themes: Introduce customizable themes for appearance.
- Better User Experience: Test on different operating systems and screen resolutions to optimize user experience.
- Video Tutorials: Create video tutorials to show users how to use the program.
- Text Tutorials: Create detailed tutorials with text and images to show users how to use the program.
is a creator collective consisting of
- The use of Gemini, ChatGPT, Claude (Anthropic), Llama, and other tools that helped structure K/N's base for this tool.
- Support of our peers, and the community at large.
- Inspired by receyuki/stable-diffusion-prompt-reader
- The ever-growing taunts & support by Anzhc
- Civitai for giving us the space to learn and grow in the open source community!
...and more to come!