# FLUX-Fill-LoRa-Training

This repository provides a fork of the 🤗 Diffusers library with an example script for LoRA training on the new FLUX.1-Fill models. The script is not optimized and has only been tested on an NVIDIA A100 GPU. If anyone has a similar script for frameworks like SimpleTuner or sd-scripts that runs on consumer hardware, I would be more than happy to hear about it!

## Overview

The provided script implements a specific masking strategy; in my case, a mask is applied to the right half of the image. If your use case requires a different masking approach, you'll need to adapt the `random_mask` function accordingly, as sketched below.
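For illustration, a fixed right-half mask could look roughly like the following sketch. Note that `right_half_mask` is a hypothetical name chosen here; the script's actual function is `random_mask`, and its exact signature may differ:

```python
import numpy as np
from PIL import Image

def right_half_mask(image: Image.Image) -> Image.Image:
    """Binary mask (white = region to fill) covering the right half of the image.

    Hypothetical sketch of the masking strategy; adapt random_mask in the
    training script if your use case needs a different mask.
    """
    mask = np.zeros((image.height, image.width), dtype=np.uint8)
    mask[:, image.width // 2:] = 255  # white marks the area the model should inpaint
    return Image.fromarray(mask)
```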

**Note:** Validation images and masks are currently hardcoded in the script. You will need to modify these to suit your dataset. See the lines:

```python
val_image = load_image("https://huggingface.co/datasets/sebastianzok/validationImageAndMask/resolve/main/image.png")
val_mask = load_image("https://huggingface.co/datasets/sebastianzok/validationImageAndMask/resolve/main/mask.png")
```

**Known issue:** Validation only works at the start and end of training; during intermediate validation steps, only black images are produced (see this open issue). Luckily the LoRA was able to learn my concept with just 300 steps, so I did not really depend on the validation images.

## Installation

```bash
git clone https://github.com/huggingface/diffusers
cd diffusers
pip install -e .
```

Then `cd` into the `examples/research_projects/dreambooth_inpaint` folder and run:

```bash
pip install -r requirements_flux.txt
```

And initialize an 🤗 Accelerate environment with:

```bash
accelerate config
```

Or, for a default Accelerate configuration without answering questions about your environment:

```bash
accelerate config default
```

Or, if your environment doesn't support an interactive shell (e.g., a notebook):

```python
from accelerate.utils import write_basic_config
write_basic_config()
```

When running `accelerate config`, setting torch compile mode to True can yield dramatic speedups. Note also that we use the PEFT library as the backend for LoRA training, so make sure to have `peft>=0.6.0` installed in your environment.

## Load your Dataset

In my case the dataset consisted of plain images without captions. Since I trained the LoRA on a specific task, I used the `instance_prompt` parameter for all generations. This is much more convenient than the in-context LoRA approach that I used to learn concepts with the regular FLUX.1-dev model. There are also no mask images, since the mask is hardcoded for my use case (see `random_mask`).
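As a rough sketch of that setup (the paths and variable names here are illustrative, not the script's actual data-loading code), pairing every image in the instance directory with one shared prompt might look like:

```python
from pathlib import Path
from PIL import Image

instance_dir = Path("dog")  # matches --instance_data_dir in the command below
instance_prompt = "A character turnaround 45-degreed to the left"

# No captions: every training image is paired with the same instance prompt.
images = [Image.open(p).convert("RGB") for p in sorted(instance_dir.glob("*.png"))]
pairs = [(image, instance_prompt) for image in images]
```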

## Train

Now, we can launch training using:

```bash
export MODEL_NAME="black-forest-labs/FLUX.1-Fill-dev"
export INSTANCE_DIR="dog"
export OUTPUT_DIR="trained-flux"

accelerate launch train_dreambooth_inpaint_lora_flux.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --mixed_precision="bf16" \
  --instance_prompt="A character turnaround 45-degreed to the left" \
  --resolution=1024 \
  --train_batch_size=1 \
  --guidance_scale=1 \
  --gradient_accumulation_steps=4 \
  --optimizer="prodigy" \
  --learning_rate=1. \
  --report_to="wandb" \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=500 \
  --validation_prompt="A character turnaround 45-degreed to the left" \
  --validation_epochs=25 \
  --seed="0" \
  --push_to_hub
```
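After training, you can try the LoRA at inference time. The following is an untested sketch that assumes the standard Diffusers `FluxFillPipeline` API and reuses the hardcoded validation image and mask from above; adjust paths and parameters for your own setup:

```python
import torch
from diffusers import FluxFillPipeline
from diffusers.utils import load_image

# Load the base Fill model and the LoRA weights saved to --output_dir.
pipe = FluxFillPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Fill-dev", torch_dtype=torch.bfloat16
).to("cuda")
pipe.load_lora_weights("trained-flux")

image = load_image("https://huggingface.co/datasets/sebastianzok/validationImageAndMask/resolve/main/image.png")
mask = load_image("https://huggingface.co/datasets/sebastianzok/validationImageAndMask/resolve/main/mask.png")

# White areas of the mask are regenerated; parameter values are illustrative.
result = pipe(
    prompt="A character turnaround 45-degreed to the left",
    image=image,
    mask_image=mask,
    height=1024,
    width=1024,
    guidance_scale=30.0,
    num_inference_steps=50,
).images[0]
result.save("flux_fill_lora_output.png")
```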

## Contributions and Feedback

As you might have noticed, there is a lot of room for improvement 🙃. Feel free to open issues or submit pull requests to improve this project. If you have insights on adapting this script for other frameworks like SimpleTuner, please share your experiences!