This repository was archived by the owner on May 14, 2024. It is now read-only.

Commit ae2149d

Merge pull request #9 from Linaqruf/experimental
Publish Kohya Trainer V8

2 parents b422fdb + c897f87

23 files changed: +6027 −2218 lines

README.md

Lines changed: 36 additions & 2 deletions
@@ -1,4 +1,4 @@
-# Kohya Trainer V6 - VRAM 12GB
+# Kohya Trainer V8 - VRAM 12GB
 ### The Best Way for People Without Good GPUs to Fine-Tune the Stable Diffusion Model
 This notebook has been adapted for use in Google Colab based on the [Kohya Guide](https://note.com/kohya_ss/n/nbf7ce8d80f29#c9d7ee61-5779-4436-b4e6-9053741c46bb). </br>
@@ -13,7 +13,33 @@ You can find the latest update to the notebook [here](https://github.com/Linaqru
 - By default, does not train Text Encoder for fine tuning of the entire model, but option to train Text Encoder is available.
 - Ability to make learning even more flexible than with DreamBooth by preparing a certain number of images (several hundred or more seems to be desirable).
+
+## Run locally
+Please refer to [bmaltais's repo](https://github.com/bmaltais) if you want to run it locally in your terminal:
+- bmaltais's [kohya_ss](https://github.com/bmaltais/kohya_ss) (dreambooth)
+- bmaltais's [kohya_diffusers_fine_tuning](https://github.com/bmaltais/kohya_diffusers_fine_tuning)
+- bmaltais's [kohya_diffusion](https://github.com/bmaltais/kohya_diffusion) (gen_img_diffusers)
+
+## Original post for each dedicated script:
+- [gen_img_diffusers](https://note.com/kohya_ss/n/n2693183a798e)
+- [merge_vae](https://note.com/kohya_ss/n/nf5893a2e719c)
+- [convert_diffusers20_original_sd](https://note.com/kohya_ss/n/n374f316fe4ad)
+- [detect_face_rotate](https://note.com/kohya_ss/n/nad3bce9a3622)
+- [diffusers_fine_tuning](https://note.com/kohya_ss/n/nbf7ce8d80f29)
+- [train_db_fixed](https://note.com/kohya_ss/n/nee3ed1649fb6)
+- [merge_block_weighted](https://note.com/kohya_ss/n/n9a485a066d5b)
+
 ## Change Logs:
+##### v8 (13/12):
+- Added support for training with fp16 gradients (experimental feature). This allows training with 8GB VRAM on SD1.x. See "Training with fp16 gradients (experimental feature)" for details.
+- Updated the WD14Tagger script to download its weights automatically.
+
+##### v7 (7/12):
+- Requires Diffusers 0.10.2 (0.10.0 or later will work, but there are reported issues with 0.10.0, so we recommend 0.10.2). To update, run `pip install -U diffusers[torch]==0.10.2` in your virtual environment.
+- Added support for Diffusers 0.10 (uses code in Diffusers for `v-parameterization` training and also supports `safetensors`).
+- Added support for accelerate 0.15.0.
+- Added support for multiple teacher data folders. For caption and tag preprocessing, use the `--full_path` option. The arguments for the cleaning script have also changed; see "Caption and Tag Preprocessing" for details.
+
 ##### v6 (6/12):
 - Temporary fix for an error when saving in the .safetensors format with some models. If you experienced this error with v5, please try v6.

@@ -44,6 +70,14 @@ You can find the latest update to the notebook [here](https://github.com/Linaqru
 - Fixed a bug that caused data to be shuffled twice.
 - Corrected spelling mistakes in the options for each script.
+
+## Conclusion
+> While Stable Diffusion fine tuning is typically based on CompVis, using Diffusers as a base allows for efficient and fast fine tuning with less memory usage. We have also added support for the features proposed by Novel AI, so we hope this article will be useful for those who want to fine tune their models.
+
+— kohya_ss
+
 ## Credit
-[Kohya](https://twitter.com/kohya_ss) | Just for my part
+[Kohya](https://twitter.com/kohya_ss) | [Lopho](https://github.com/lopho/stable-diffusion-prune) for prune script | Just for my part

Lines changed: 93 additions & 0 deletions
@@ -0,0 +1,93 @@
# convert Diffusers v1.x/v2.0 model to original Stable Diffusion
# v1: initial version
# v2: support safetensors
# v3: fix to support another format
# v4: support safetensors in Diffusers

import argparse
import os
import torch
from diffusers import StableDiffusionPipeline

import model_util


def convert(args):
    # check the arguments
    load_dtype = torch.float16 if args.fp16 else None

    save_dtype = None
    if args.fp16:
        save_dtype = torch.float16
    elif args.bf16:
        save_dtype = torch.bfloat16
    elif args.float:
        save_dtype = torch.float

    # a file path with an extension means "checkpoint", a directory means "Diffusers"
    is_load_ckpt = os.path.isfile(args.model_to_load)
    is_save_ckpt = len(os.path.splitext(args.model_to_save)[1]) > 0

    assert not is_load_ckpt or args.v1 != args.v2, "v1 or v2 is required to load checkpoint / checkpointの読み込みにはv1/v2指定が必要です"
    assert is_save_ckpt or args.reference_model is not None, "reference model is required to save as Diffusers / Diffusers形式での保存には参照モデルが必要です"

    # load the model
    msg = "checkpoint" if is_load_ckpt else ("Diffusers" + (" as fp16" if args.fp16 else ""))
    print(f"loading {msg}: {args.model_to_load}")

    if is_load_ckpt:
        v2_model = args.v2
        text_encoder, vae, unet = model_util.load_models_from_stable_diffusion_checkpoint(v2_model, args.model_to_load)
    else:
        pipe = StableDiffusionPipeline.from_pretrained(args.model_to_load, torch_dtype=load_dtype, tokenizer=None, safety_checker=None)
        text_encoder = pipe.text_encoder
        vae = pipe.vae
        unet = pipe.unet

        if args.v1 == args.v2:
            # auto-detect the model version
            v2_model = unet.config.cross_attention_dim == 1024
            print("checking model version: model is " + ('v2' if v2_model else 'v1'))
        else:
            v2_model = args.v2

    # convert and save
    msg = ("checkpoint" + ("" if save_dtype is None else f" in {save_dtype}")) if is_save_ckpt else "Diffusers"
    print(f"converting and saving as {msg}: {args.model_to_save}")

    if is_save_ckpt:
        original_model = args.model_to_load if is_load_ckpt else None
        key_count = model_util.save_stable_diffusion_checkpoint(v2_model, args.model_to_save, text_encoder, unet,
                                                                original_model, args.epoch, args.global_step, save_dtype, vae)
        print(f"model saved. total converted state_dict keys: {key_count}")
    else:
        print(f"copy scheduler/tokenizer config from: {args.reference_model}")
        model_util.save_diffusers_checkpoint(v2_model, args.model_to_save, text_encoder, unet, args.reference_model, vae, args.use_safetensors)
        print("model saved.")


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument("--v1", action='store_true',
                        help='load v1.x model (v1 or v2 is required to load checkpoint) / 1.xのモデルを読み込む')
    parser.add_argument("--v2", action='store_true',
                        help='load v2.0 model (v1 or v2 is required to load checkpoint) / 2.0のモデルを読み込む')
    parser.add_argument("--fp16", action='store_true',
                        help='load as fp16 (Diffusers only) and save as fp16 (checkpoint only) / fp16形式で読み込み(Diffusers形式のみ対応)、保存する(checkpointのみ対応)')
    parser.add_argument("--bf16", action='store_true', help='save as bf16 (checkpoint only) / bf16形式で保存する(checkpointのみ対応)')
    parser.add_argument("--float", action='store_true',
                        help='save as float (checkpoint only) / float(float32)形式で保存する(checkpointのみ対応)')
    parser.add_argument("--epoch", type=int, default=0, help='epoch to write to checkpoint / checkpointに記録するepoch数の値')
    parser.add_argument("--global_step", type=int, default=0,
                        help='global_step to write to checkpoint / checkpointに記録するglobal_stepの値')
    parser.add_argument("--reference_model", type=str, default=None,
                        help="reference model for scheduler/tokenizer, required in saving Diffusers, copy scheduler/tokenizer from this / scheduler/tokenizerのコピー元のDiffusersモデル、Diffusers形式で保存するときに必要")
    parser.add_argument("--use_safetensors", action='store_true',
                        help="use safetensors format to save Diffusers model (checkpoint depends on the file extension) / Diffusersモデルをsafetensors形式で保存する(checkpointは拡張子で自動判定)")

    parser.add_argument("model_to_load", type=str, default=None,
                        help="model to load: checkpoint file or Diffusers model's directory / 読み込むモデル、checkpointかDiffusers形式モデルのディレクトリ")
    parser.add_argument("model_to_save", type=str, default=None,
                        help="model to save: checkpoint (with extension) or Diffusers model's directory (without extension) / 変換後のモデル、拡張子がある場合はcheckpoint、ない場合はDiffusersモデルとして保存")

    args = parser.parse_args()
    convert(args)
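For illustration, the script's two dispatch rules can be reduced to small standalone functions (hypothetical helper names, not part of the commit): the save format is inferred from whether the output path has a file extension, and the save dtype follows the precedence of the `--fp16` / `--bf16` / `--float` flags. A minimal sketch:

```python
import os

def infer_save_format(path):
    # An extension (e.g. ".ckpt", ".safetensors") means a single
    # checkpoint file; no extension means a Diffusers model directory.
    # Mirrors: is_save_ckpt = len(os.path.splitext(args.model_to_save)[1]) > 0
    return "checkpoint" if os.path.splitext(path)[1] else "diffusers"

def pick_save_dtype(fp16, bf16, use_float):
    # Mirrors the flag precedence in convert(): --fp16 wins over --bf16,
    # which wins over --float; with no flag, the loaded dtype is kept.
    if fp16:
        return "float16"
    if bf16:
        return "bfloat16"
    if use_float:
        return "float32"
    return None

print(infer_save_format("model.safetensors"))  # checkpoint
print(infer_save_format("converted-model"))    # diffusers
```

In other words, passing an output path without an extension selects Diffusers output, which is exactly the case where the script's assertion makes `--reference_model` mandatory.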
