This repository implements the Flux2 model from scratch, specifically focusing on training the Flux2 Transformer. To simplify the process, I'm leveraging the existing AutoEncoder and Text Encoder.
The base implementation is taken from the official black-forest-labs/flux2 repository.
Note: I'll explain the entire implementation in detail on my blog once the project is complete.
The following datasets will be used for this project:
- `Fredtt3/Flux2-Image`: Base dataset with the images and prompts
- `Fredtt3/Flux2-Image-Processed`: Dataset already processed for transformer training (not yet available)
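The datasets above live on the Hugging Face Hub, so one way to pull them is the `datasets` library. This is a sketch of my assumed workflow, not something the repo prescribes; the `load_images` helper name is mine, and streaming is just one option to avoid a full download.

```python
# Dataset IDs from this README.
DATASETS = {
    "base": "Fredtt3/Flux2-Image",                 # images + prompts
    "processed": "Fredtt3/Flux2-Image-Processed",  # not yet available
}

def load_images(streaming: bool = True):
    """Load the base image/prompt dataset; streaming avoids a full download.

    Hypothetical helper: assumes `pip install datasets` and a standard
    `train` split on the Hub.
    """
    from datasets import load_dataset  # third-party, imported lazily
    return load_dataset(DATASETS["base"], split="train", streaming=streaming)
```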
```shell
uv pip install torch==2.9 transformers==4.57.6 https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/download/v0.7.0/flash_attn-2.8.3+cu128torch2.9-cp312-cp312-linux_x86_64.whl flashinfer-python https://github.com/FredyRivera-dev/Flux2-from-scratch.git
```

Note: I'm providing a pre-compiled Flash Attention wheel for PyTorch 2.9, so you don't have to wait for it to compile from scratch.
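After installing, a quick sanity check can confirm the key packages resolved in your environment. This is a generic sketch using only the standard library; the package list mirrors the install command above (module names are my assumption of what those distributions expose).

```python
import importlib.util

def available(pkg: str) -> bool:
    """Return True if `pkg` is importable in the current environment."""
    return importlib.util.find_spec(pkg) is not None

# Module names assumed from the install command above.
for pkg in ("torch", "transformers", "flash_attn", "flashinfer"):
    print(f"{pkg}: {'ok' if available(pkg) else 'missing'}")
```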
