Thank you for your interest in this project.
This document explains one way to support Hinode-AI: contributing via pre-trained models.
Before proceeding, ensure you have the following:
- Spare time
- A willingness to pay for the electricity used
- A basic understanding of Python
- A relatively powerful machine
- Familiarity with Hugging Face (for publishing)
This approach requires a machine with at least 16GB of GPU memory (VRAM).
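If you are unsure how much VRAM your machine has, you can check from Python. The sketch below assumes PyTorch is installed (an assumption, not a documented dependency of this guide); note that on Apple Silicon the GPU shares system RAM, so the figure to check is your machine's total RAM.

```python
def gpu_memory_gb():
    """Return total dedicated GPU memory in GB, or None if it cannot be determined."""
    try:
        import torch  # assumption: PyTorch is available
    except ImportError:
        return None
    if torch.cuda.is_available():
        # Dedicated GPU (e.g. NVIDIA): report its VRAM.
        return torch.cuda.get_device_properties(0).total_memory / 1e9
    # Apple Silicon (MPS) and CPU-only machines share system RAM,
    # so there is no separate VRAM figure to report here.
    return None

mem = gpu_memory_gb()
if mem is not None and mem < 16:
    print(f"Only {mem:.1f} GB of VRAM detected; training may be slow or fail.")
```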
The weaker your GPU, the longer the process will take.
For reference, training using the Azuki-2n dataset on an M3 chip with 16GB of shared RAM takes at least 30 minutes and can take up to 2 hours.
If you use the same machine for other tasks during training, it may take even longer.
Thus, using a secondary PC is recommended.
If Python is not installed, the easiest method is to use Anaconda or Miniconda.
For advanced users, pyenv is another option on Linux or macOS.
We recommend Python versions 3.10 or 3.11 for stability.
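You can confirm that your interpreter matches this recommendation with a quick check. This is a minimal sketch; the version set simply mirrors the 3.10/3.11 recommendation above.

```python
import sys

# Versions recommended for stability.
RECOMMENDED = {(3, 10), (3, 11)}

def version_ok(info=None):
    """Return True if the interpreter's (major, minor) is a recommended version."""
    major, minor = (info or sys.version_info)[:2]
    return (major, minor) in RECOMMENDED

if not version_ok():
    print(f"Python {sys.version.split()[0]} detected; 3.10 or 3.11 is recommended.")
```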
Clone this repository using the Git command, GitHub CLI, or a GUI tool.
The most time-consuming step is training the model.
To make this easier, a `training.py` script is provided in the root directory.
After launching `training.py`, you will be prompted to make a selection. The main options are:
- `empty`: Train from scratch.
- `v2`: Train based on GPT-2 (standard size). This is faster for datasets like OpenO1.
- `v2-medium`: Train based on a larger GPT-2 model. The Azuki-2n dataset is optimized for this size.
Next, you will be asked to provide the path to the dataset. Hinode-AI uses a custom JSON format.
We recommend selecting one of the provided templates.
Hinode-AI includes the following datasets as of this writing:
- `data_templates/OpenO1-SFT.json`
- `data_templates/azuki-2n.json`
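The exact schema of the custom JSON format is defined by the templates themselves. As a hypothetical illustration only, the snippet below loads a dataset file and sanity-checks its shape; the `prompt`/`response` field names are assumptions, not the documented format, so consult the files in `data_templates/` for the real structure.

```python
import json
import tempfile

def load_dataset(path):
    """Load a JSON dataset file and do a minimal sanity check on its shape."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, list) or not data:
        raise ValueError(f"{path}: expected a non-empty JSON list")
    return data

# Hypothetical example record; the actual field names used by Hinode-AI
# are those found in the templates under data_templates/.
sample = [{"prompt": "Hello", "response": "Hi! How can I help?"}]

with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(sample, f)
    tmp_path = f.name

records = load_dataset(tmp_path)
print(f"Loaded {len(records)} record(s); first keys: {sorted(records[0])}")
```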
Here is a table summarizing the characteristics of the included datasets:
| Dataset Name | Characteristics | Recommended Base Model | Estimated Training Time | Notes | Dataset Path |
|---|---|---|---|---|---|
| OpenO1 | Published by the OpenO1 team on Hugging Face. Includes reasoning processes for better answers. High information density. | `empty` or `v2` | Long (~12 hours) | May yield high-quality answers. Training time tends to be long due to the high data volume. | `data_templates/OpenO1-SFT.json` |
| Azuki 2n | Created for version 2n of the Azuki.ai project. Lower answer quality than Hinode-AI. | `v2-medium` | Moderate (~30 min–2 hrs) | May lack information density, so `v2-medium` or higher is recommended. | `data_templates/azuki-2n.json` |
Intermediate results will be saved in the `results` folder.
The final model will be saved in the `trained_model` folder.
Finally, upload the contents of the `trained_model` folder to your own Hugging Face account as a model.
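One way to publish the folder is with the `huggingface_hub` Python library. This is a sketch, not the project's documented workflow: it assumes you have authenticated (e.g. via `huggingface-cli login`), and the repo id shown is a placeholder you must replace with your own.

```python
def valid_repo_id(repo_id):
    """Repo ids on the Hub look like '<username>/<model-name>'."""
    parts = repo_id.split("/")
    return len(parts) == 2 and all(parts)

def upload_trained_model(repo_id, folder="trained_model"):
    """Upload the trained_model folder to the Hugging Face Hub as a model repo."""
    # Imported lazily so the helpers above work without the library installed.
    from huggingface_hub import HfApi

    if not valid_repo_id(repo_id):
        raise ValueError(f"{repo_id!r} is not a valid '<username>/<model-name>' id")
    api = HfApi()
    api.create_repo(repo_id, repo_type="model", exist_ok=True)
    api.upload_folder(folder_path=folder, repo_id=repo_id, repo_type="model")
    return f"https://huggingface.co/{repo_id}"

# Example (requires authentication; the repo id is a placeholder):
# upload_trained_model("your-username/your-model-name")
```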
Include the following details in the description:
- Information about Hinode-AI
- The base model used
- The name of the dataset used