Commit 8deb6a6

Merge pull request #3275 from vladmandic/dev
merge dev to master
2 parents a3ffd47 + 168e104 commit 8deb6a6

File tree

108 files changed

+1313
-2115
lines changed

CHANGELOG.md

+91
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,96 @@
11
# Change Log for SD.Next
22

3+
## Update for 2024-06-23
4+
5+
### Highlights for 2024-06-23
6+
7+
Following the zero-day **SD3** release, 10 days later here's a refresh with 10+ improvements
8+
including full prompt attention, support for compressed weights, additional text-encoder quantization modes.
9+
10+
But there's more than SD3:
11+
- support for quantized **T5** text encoder *FP16/FP8/FP4/INT8* in all models that use T5: SD3, PixArt-Σ, etc.
12+
- support for **PixArt-Sigma** in small/medium/large variants
13+
- support for **HunyuanDiT 1.1**
14+
- additional **NNCF weights compression** support: SD3, PixArt, ControlNet, Lora
15+
- integration of **MS Florence** VLM/VQA *Base* and *Large* models
16+
- (finally) new release of **Torch-DirectML**
17+
- additional efficiencies for users with low VRAM GPUs
18+
- over 20 overall fixes
19+
20+
### Model Improvements
21+
22+
- **SD3**: enable tiny-VAE (TAESD) preview and non-full quality mode
23+
- SD3: enable base LoRA support
24+
- SD3: add support for FP4 quantized T5 text encoder
25+
simply select in *settings -> model -> text encoder*
26+
*note* for SD3 with T5, set SD.Next to use FP16 precision, not BF16 precision
27+
- SD3: add support for INT8 quantized T5 text encoder, thanks @Disty0!
28+
- SD3: enable cpu-offloading for T5 text encoder, thanks @Disty0!
29+
- SD3: simplified loading of model in single-file safetensors format
30+
model load can now be performed fully offline
31+
- SD3: full support for prompt parsing and attention, thanks @AI-Casanova!
32+
- SD3: ability to target different prompts to each of text-encoders, thanks @AI-Casanova!
33+
example: `dog TE2: cat TE3: bird`
34+
- SD3: add support for sampler shift for Euler FlowMatch
35+
see *settings -> samplers*, also available as param in xyz grid
36+
higher shift means model will spend more time on structure and less on details
37+
- SD3: add support for selecting T5 text encoder variant in XYZ grid
38+
- **Pixart-Σ**: Add *small* (512px) and *large* (2k) variations, in addition to existing *medium* (1k)
39+
- Pixart-Σ: Add support for 4/8bit quantized t5 text encoder
40+
*note* by default pixart-Σ uses full fp16 t5 encoder with large memory footprint
41+
simply select in *settings -> model -> text encoder* before or after model load
42+
- **HunyuanDiT**: support for model version 1.1
43+
- **MS Florence**: integration of Microsoft Florence VLM/VQA Base and Large models
44+
simply select in *process -> visual query*!
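The sampler-shift note in the list above ("higher shift means model will spend more time on structure and less on details") can be illustrated numerically. This is a minimal sketch, assuming the shift is applied as `sigma' = shift * sigma / (1 + (shift - 1) * sigma)`, which is how the `diffusers` flow-match Euler scheduler rescales its noise levels. The function name `shift_sigmas` and the sample sigma values are hypothetical and are not SD.Next code.

```python
def shift_sigmas(sigmas, shift):
    """Rescale noise levels in [0, 1]; shift=1.0 leaves them unchanged.

    Assumed formula: sigma' = shift * sigma / (1 + (shift - 1) * sigma)
    """
    return [shift * s / (1 + (shift - 1) * s) for s in sigmas]


# Illustrative schedule only; real schedules have many more steps.
sigmas = [1.0, 0.75, 0.5, 0.25]
for shift in (1.0, 3.0):
    shifted = [round(s, 3) for s in shift_sigmas(sigmas, shift)]
    print(f'shift={shift} sigmas={shifted}')
```

With a shift above 1, every intermediate sigma moves toward 1.0, so more of the steps run at high noise levels, where overall composition is decided.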
45+
46+
### General Improvements
47+
48+
- support FP4 quantized T5 text encoder, in addition to existing FP8 and FP16
49+
- support for T5 text-encoder loader in **all** models that use T5
50+
*example*: load FP4 or FP8 quantized T5 text-encoder into PixArt Sigma!
51+
- support for `torch-directml` **0.2.2**, thanks @lshqqytiger!
52+
*note*: new directml is finally based on modern `torch` 2.3.1!
53+
- xyz grid: add support for LoRA selector
54+
- vae load: store original vae so it can be restored when set to none
55+
- extra networks: info display now contains a link to the model's source url if it is known
56+
works for civitai and huggingface models
57+
- force gc for lowvram users and improve gc logging
58+
- improved google.colab support
59+
- css tweaks for standardui
60+
- css tweaks for modernui
61+
- additional torch gc checks, thanks @Disty0!
62+
63+
**Improvements: NNCF**, thanks @Disty0!
64+
- SD3 and PixArt support
65+
- moved the first compression step to CPU
66+
- sequential cpu offload (lowvram) support
67+
- Lora support without reloading the model
68+
- ControlNet compression support
69+
70+
### Fixes
71+
72+
- fix unsaturated outputs, force apply vae config on model load
73+
- fix hidiffusion handling of non-square aspect ratios, thanks @ShenZhang-Shin!
74+
- fix control second pass resize
75+
- fix hunyuandit set attention processor
76+
- fix civitai download without name
77+
- fix compatibility with latest adetailer
78+
- fix invalid sampler warning
79+
- fix starting from non git repo
80+
- fix control api negative prompt handling
81+
- fix saving style without name provided
82+
- fix t2i-color adapter
83+
- fix sdxl "has been incorrectly initialized"
84+
- fix api face-hires
85+
- fix api ip-adapter
86+
- fix memory exceptions with ROCm, thanks @Disty0!
87+
- fix face-hires with lowvram, thanks @Disty0!
88+
- fix pag incorrectly resetting pipeline
89+
- cleanup image metadata
90+
- restructure api examples: `cli/api-*`
91+
- handle theme fallback when invalid theme is specified
92+
- remove obsolete training code leftovers
93+
394
## Update for 2024-06-13
495

596
### Highlights for 2024-06-13

TODO.md

+1
@@ -11,6 +11,7 @@ Main ToDo list can be found at [GitHub projects](https://github.com/users/vladma
1111
- diffusers public callbacks
1212
- include reference styles
1313
- lora: sc lora, dora, etc
14+
- sd3 controlnet: <https://github.com/huggingface/diffusers/pull/8566>
1415

1516
## Experimental
1617

cli/simple-control.py cli/api-control.py

+1-1
@@ -132,7 +132,7 @@ def get_image(encoded, output):
132132

133133

134134
if __name__ == "__main__":
135-
parser = argparse.ArgumentParser(description = 'simple-img2img')
135+
parser = argparse.ArgumentParser(description = 'api-img2img')
136136
parser.add_argument('--init', required=False, default=None, help='init image')
137137
parser.add_argument('--input', required=False, default=None, help='input image')
138138
parser.add_argument('--mask', required=False, help='mask image')

cli/api-faceid.py

+116
@@ -0,0 +1,116 @@
#!/usr/bin/env python
import os
import io
import time
import base64
import logging
import argparse
import requests
import urllib3
from PIL import Image

# server address and optional basic-auth credentials come from the environment
sd_url = os.environ.get('SDAPI_URL', "http://127.0.0.1:7860")
sd_username = os.environ.get('SDAPI_USR', None)
sd_password = os.environ.get('SDAPI_PWD', None)

logging.basicConfig(level = logging.INFO, format = '%(asctime)s %(levelname)s: %(message)s')
log = logging.getLogger(__name__)
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

options = {
    "save_images": False,
    "send_images": True,
}


def auth():
    if sd_username is not None and sd_password is not None:
        return requests.auth.HTTPBasicAuth(sd_username, sd_password)
    return None


def post(endpoint: str, dct: dict = None):
    req = requests.post(f'{sd_url}{endpoint}', json = dct, timeout=300, verify=False, auth=auth())
    if req.status_code != 200:
        return { 'error': req.status_code, 'reason': req.reason, 'url': req.url }
    else:
        return req.json()


def encode(f):
    # load image, drop alpha channel and return it as base64-encoded jpeg
    image = Image.open(f)
    if image.mode == 'RGBA':
        image = image.convert('RGB')
    with io.BytesIO() as stream:
        image.save(stream, 'JPEG')
        image.close()
        values = stream.getvalue()
        encoded = base64.b64encode(values).decode()
        return encoded


def generate(args): # pylint: disable=redefined-outer-name
    t0 = time.time()
    if args.model is not None:
        post('/sdapi/v1/options', { 'sd_model_checkpoint': args.model })
        post('/sdapi/v1/reload-checkpoint') # needed if running in api-only to trigger new model load
    options['prompt'] = args.prompt
    options['negative_prompt'] = args.negative
    options['steps'] = int(args.steps)
    options['seed'] = int(args.seed)
    options['sampler_name'] = args.sampler
    options['width'] = args.width
    options['height'] = args.height
    options['face'] = {
        'mode': 'FaceID',
        'ip_model': 'FaceID Base',
        'source_images': [encode(args.face)],
    }
    data = post('/sdapi/v1/txt2img', options)
    t1 = time.time()
    if 'images' in data:
        for i in range(len(data['images'])):
            b64 = data['images'][i].split(',',1)[0]
            info = data['info']
            image = Image.open(io.BytesIO(base64.b64decode(b64)))
            log.info(f'received image: size={image.size} time={t1-t0:.2f} info="{info}"')
            if args.output:
                image.save(args.output)
                log.info(f'image saved: size={image.size} filename={args.output}')

    else:
        log.warning(f'no images received: {data}')


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description = 'api-faceid')
    parser.add_argument('--width', required=False, default=512, help='image width')
    parser.add_argument('--height', required=False, default=512, help='image height')
    parser.add_argument('--face', required=False, help='face image')
    parser.add_argument('--prompt', required=False, default='', help='prompt text')
    parser.add_argument('--negative', required=False, default='', help='negative prompt text')
    parser.add_argument('--steps', required=False, default=20, help='number of steps')
    parser.add_argument('--seed', required=False, default=-1, help='initial seed')
    parser.add_argument('--sampler', required=False, default='Euler a', help='sampler name')
    parser.add_argument('--output', required=False, default=None, help='output image file')
    parser.add_argument('--model', required=False, help='model name')
    args = parser.parse_args()
    log.info(f'api-faceid: {args}')
    generate(args)

"""
request.face.mode,
request.face.source_images,
request.face.ip_model,
request.face.ip_override_sampler,
request.face.ip_cache_model,
request.face.ip_strength,
request.face.ip_structure,
request.face.id_strength,
request.face.id_conditioning,
request.face.id_cache,
request.face.pm_trigger,
request.face.pm_strength,
request.face.pm_start,
request.face.fs_cache
"""
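The script sets only three of the `request.face.*` fields listed in its trailing docstring. A minimal sketch of how the remaining fields might be passed through is shown below. The helper `build_faceid_request` is hypothetical, the field names are copied from that docstring, and the value types and ranges are assumptions because the script does not document them.

```python
import base64

# Field names copied from the docstring at the end of api-faceid.py.
# Their types and valid ranges are not documented there, so nothing
# beyond the field name is validated here.
FACE_FIELDS = {
    'mode', 'source_images', 'ip_model', 'ip_override_sampler',
    'ip_cache_model', 'ip_strength', 'ip_structure', 'id_strength',
    'id_conditioning', 'id_cache', 'pm_trigger', 'pm_strength',
    'pm_start', 'fs_cache',
}


def build_faceid_request(prompt, face_image_bytes, **face_options):
    """Build a txt2img request body with a nested 'face' section (hypothetical helper)."""
    unknown = set(face_options) - FACE_FIELDS
    if unknown:
        raise ValueError(f'unknown face options: {sorted(unknown)}')
    face = {
        'mode': 'FaceID',
        'ip_model': 'FaceID Base',
        'source_images': [base64.b64encode(face_image_bytes).decode()],
    }
    face.update(face_options)  # caller-supplied fields override the defaults
    return {'prompt': prompt, 'save_images': False, 'send_images': True, 'face': face}


# The 0.8 below is an arbitrary illustrative value, not a recommended setting.
body = build_faceid_request('portrait photo', b'\xff\xd8 placeholder bytes', ip_strength=0.8)
print(sorted(body['face']))
```

Rejecting unknown keys up front catches typos in field names on the client side, without assuming anything about how the server treats unexpected fields.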

cli/simple-img2img.py cli/api-img2img.py

+1-1
@@ -83,7 +83,7 @@ def generate(args): # pylint: disable=redefined-outer-name
8383

8484

8585
if __name__ == "__main__":
86-
parser = argparse.ArgumentParser(description = 'simple-img2img')
86+
parser = argparse.ArgumentParser(description = 'api-img2img')
8787
parser.add_argument('--init', required=True, help='init image')
8888
parser.add_argument('--mask', required=False, help='mask image')
8989
parser.add_argument('--prompt', required=False, default='', help='prompt text')

cli/simple-info.py cli/api-info.py

+1-1
@@ -50,7 +50,7 @@ def info(args): # pylint: disable=redefined-outer-name
5050

5151

5252
if __name__ == "__main__":
53-
parser = argparse.ArgumentParser(description = 'simple-info')
53+
parser = argparse.ArgumentParser(description = 'api-info')
5454
parser.add_argument('--input', required=True, help='input image')
5555
args = parser.parse_args()
5656
log.info(f'info: {args}')

cli/api-json.py

+52
@@ -0,0 +1,52 @@
#!/usr/bin/env python

# curl -vX POST http://localhost:7860/sdapi/v1/txt2img --header "Content-Type: application/json" -d @3261.json
import os
import json
import logging
import argparse
import requests
import urllib3


sd_url = os.environ.get('SDAPI_URL', "http://127.0.0.1:7860")
sd_username = os.environ.get('SDAPI_USR', None)
sd_password = os.environ.get('SDAPI_PWD', None)
options = {
    "save_images": True,
    "send_images": True,
}

logging.basicConfig(level = logging.INFO, format = '%(asctime)s %(levelname)s: %(message)s')
log = logging.getLogger(__name__)
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)


def auth():
    if sd_username is not None and sd_password is not None:
        return requests.auth.HTTPBasicAuth(sd_username, sd_password)
    return None


def post(endpoint: str, payload: dict = None):
    # accept short endpoint name, api path or full url
    if 'sdapi' not in endpoint:
        endpoint = f'sdapi/v1/{endpoint}'
    if 'http' not in endpoint:
        endpoint = f'{sd_url}/{endpoint}'
    req = requests.post(endpoint, json = payload, timeout=300, verify=False, auth=auth())
    return { 'error': req.status_code, 'reason': req.reason, 'url': req.url } if req.status_code != 200 else req.json()


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description = 'api-json')
    parser.add_argument('endpoint', nargs=1, help='endpoint')
    parser.add_argument('json', nargs=1, help='json data or file')
    args = parser.parse_args()
    log.info(f'api-json: {args}')
    # second argument is either a path to a json file or an inline json string
    if os.path.isfile(args.json[0]):
        with open(args.json[0], 'r', encoding='ascii') as f:
            dct = json.load(f) # TODO fails with b64 encoded images inside json due to string encoding
    else:
        dct = json.loads(args.json[0])
    res = post(endpoint=args.endpoint[0], payload=dct)
    print(res)
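The endpoint and payload handling in `api-json.py` can be tried without a running server. The sketch below mirrors the script's logic in two pure functions, `resolve_endpoint` and `load_payload`, both hypothetical names. It departs from the script in three labeled ways: it strips a leading `/` so that `/sdapi/v1/txt2img` does not produce a double slash, it tests for a URL with `startswith('http')` rather than a substring match, and it reads files as UTF-8 instead of ASCII.

```python
import json
import os


def resolve_endpoint(endpoint, base_url='http://127.0.0.1:7860'):
    """Expand a short name such as 'txt2img' into a full API URL."""
    endpoint = endpoint.lstrip('/')           # deviation: avoid 'host//sdapi/...'
    if 'sdapi' not in endpoint:
        endpoint = f'sdapi/v1/{endpoint}'
    if not endpoint.startswith('http'):       # deviation: prefix check, not substring
        endpoint = f'{base_url}/{endpoint}'
    return endpoint


def load_payload(arg):
    """Treat the argument as a JSON file if it exists on disk, else as inline JSON."""
    if os.path.isfile(arg):
        with open(arg, 'r', encoding='utf-8') as f:  # deviation: script uses ascii
            return json.load(f)
    return json.loads(arg)


print(resolve_endpoint('txt2img'))
print(resolve_endpoint('/sdapi/v1/options'))
print(load_payload('{"prompt": "dog", "steps": 20}'))
```

Given the script's two positional arguments, an invocation would look like `python cli/api-json.py txt2img '{"prompt": "dog"}'`, or the same command with a path to a JSON file as the second argument.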

cli/simple-mask.py cli/api-mask.py

+1-1
@@ -73,7 +73,7 @@ def info(args): # pylint: disable=redefined-outer-name
7373

7474

7575
if __name__ == "__main__":
76-
parser = argparse.ArgumentParser(description = 'simple-info')
76+
parser = argparse.ArgumentParser(description = 'api-mask')
7777
parser.add_argument('--input', required=True, help='input image')
7878
parser.add_argument('--mask', required=False, help='input mask')
7979
parser.add_argument('--type', required=False, help='output mask type')

cli/simple-preprocess.py cli/api-preprocess.py

+1-1
@@ -67,7 +67,7 @@ def info(args): # pylint: disable=redefined-outer-name
6767

6868

6969
if __name__ == "__main__":
70-
parser = argparse.ArgumentParser(description = 'simple-info')
70+
parser = argparse.ArgumentParser(description = 'api-preprocess')
7171
parser.add_argument('--input', required=True, help='input image')
7272
parser.add_argument('--model', required=True, help='preprocessing model')
7373
parser.add_argument('--output', required=False, help='output image')

cli/idle.py cli/api-progress.py

File renamed without changes.
File renamed without changes.

cli/simple-txt2img.py cli/api-txt2img.py

+5-2
@@ -48,7 +48,10 @@ def generate(args): # pylint: disable=redefined-outer-name
4848
options['sampler_name'] = args.sampler
4949
options['width'] = int(args.width)
5050
options['height'] = int(args.height)
51-
options['restore_faces'] = args.faces
51+
if args.faces:
52+
options['restore_faces'] = args.faces
53+
options['denoising_strength'] = 0.5
54+
options['hr_sampler_name'] = args.sampler
5255
data = post('/sdapi/v1/txt2img', options)
5356
t1 = time.time()
5457
if 'images' in data:
@@ -65,7 +68,7 @@ def generate(args): # pylint: disable=redefined-outer-name
6568

6669

6770
if __name__ == "__main__":
68-
parser = argparse.ArgumentParser(description = 'simple-txt2img')
71+
parser = argparse.ArgumentParser(description = 'api-txt2img')
6972
parser.add_argument('--prompt', required=False, default='', help='prompt text')
7073
parser.add_argument('--negative', required=False, default='', help='negative prompt text')
7174
parser.add_argument('--width', required=False, default=512, help='image width')

cli/simple-upscale.py cli/api-upscale.py

+1-1
@@ -80,7 +80,7 @@ def upscale(args): # pylint: disable=redefined-outer-name
8080

8181

8282
if __name__ == "__main__":
83-
parser = argparse.ArgumentParser(description = 'simple-upscale')
83+
parser = argparse.ArgumentParser(description = 'api-upscale')
8484
parser.add_argument('--input', required=True, help='input image')
8585
parser.add_argument('--output', required=True, help='output image')
8686
parser.add_argument('--upscaler', required=False, default='Nearest', help='upscaler name')

cli/simple-vqa.py cli/api-vqa.py

+1-1
@@ -55,7 +55,7 @@ def info(args): # pylint: disable=redefined-outer-name
5555

5656

5757
if __name__ == "__main__":
58-
parser = argparse.ArgumentParser(description = 'simple-info')
58+
parser = argparse.ArgumentParser(description = 'api-vqa')
5959
parser.add_argument('--input', required=True, help='input image')
6060
parser.add_argument('--model', required=False, help='vqa model')
6161
parser.add_argument('--question', required=False, help='question')

cli/image-encode.py

+32
@@ -0,0 +1,32 @@
#!/usr/bin/env python
import io
import os
import sys
import base64
from PIL import Image
from rich import print # pylint: disable=redefined-builtin


def encode(file: str):
    image = Image.open(file) if os.path.exists(file) else None
    print(f'Input: file={file} image={image}')
    if image is None:
        return None
    if image.mode != 'RGB':
        image = image.convert('RGB')
    with io.BytesIO() as stream:
        image.save(stream, 'JPEG')
        image.close()
        values = stream.getvalue()
        encoded = base64.b64encode(values).decode()
        return encoded


if __name__ == "__main__":
    sys.argv.pop(0)
    fn = sys.argv[0] if len(sys.argv) > 0 else ''
    b64 = encode(fn)
    print('=== BEGIN ===')
    print(f'{b64}')
    print('=== END ===')
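The string printed between `=== BEGIN ===` and `=== END ===` is plain base64 of JPEG bytes, so it can be checked with the standard library alone. The sketch below goes in the reverse direction. The helpers `decode_image` and `looks_like_jpeg` are hypothetical, and tolerating a `data:...;base64,` prefix is an assumption about inputs from other tools, since `image-encode.py` itself never emits one.

```python
import base64


def decode_image(b64):
    """Return raw image bytes from a base64 string, with or without a data-URI prefix."""
    payload = b64.split(',', 1)[-1]  # base64 text never contains a comma
    return base64.b64decode(payload)


def looks_like_jpeg(data):
    """JPEG files start with the two-byte SOI marker 0xFF 0xD8."""
    return data[:2] == b'\xff\xd8'


# Stand-in bytes for the demo; a real run would use the script's printed output.
fake_jpeg = b'\xff\xd8\xff\xe0' + bytes(8)
encoded = base64.b64encode(fake_jpeg).decode()
print(looks_like_jpeg(decode_image(encoded)))
print(decode_image('data:image/jpeg;base64,' + encoded) == fake_jpeg)
```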
