Advice on style training #10787
-
Can you post an image generated with your LoRA? It's probably doable with any model, but it depends on your definition of 'isn't great'. For example, there's a LoRA for IKEA instructions trained on the base SDXL model which is fine as long as you don't consider the text.

About your decision to use Flux: I haven't tried to train a LoRA on it yet, but from what I've seen and tested, Flux loses a lot of quality when you fine-tune it. The LoRAs I tested work fine, but they degrade everything else a lot, and the model also isn't that great for lines (for example, anime) and is really overtrained on professional photos with bokeh. For example, take this LoRA for SDXL and this one for Flux, which are both for 3D wireframes: if you ask me, based on the demo images, the SDXL one is better.

Also, how big is your dataset? Depending on the size, you will probably need to fine-tune a model that's already really overtrained on lines, like the anime models. This probably needs a bigger dataset than a normal character LoRA.

I really can't help with the hyperparameters for Flux, but one good starting point would be to train the whole model and then extract LoRAs at different ranks to test which one works best for your use case (see the sketch below). Also, as a good default, use the same LR the base model was trained with.
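For the extraction step, a minimal sketch of what rank-r LoRA extraction from a full fine-tune can look like: take the weight delta between the fine-tuned and base checkpoints and SVD-truncate it. The function and key names here are illustrative, not a specific trainer's format, and only plain 2-D linear weights are handled:

```python
# Minimal sketch of rank-r LoRA extraction from a full fine-tune.
# Assumes matching base/fine-tuned state dicts are already in memory;
# the key naming below is illustrative, not a specific library's format.
import torch

def extract_lora(base_sd, tuned_sd, rank):
    """SVD-truncate each 2-D weight delta into LoRA A/B factors."""
    lora_sd = {}
    for name, w_base in base_sd.items():
        w_tuned = tuned_sd[name]
        if w_base.ndim != 2:  # skip convs, norms, biases in this sketch
            continue
        delta = (w_tuned - w_base).float()
        U, S, Vh = torch.linalg.svd(delta, full_matrices=False)
        U, S, Vh = U[:, :rank], S[:rank], Vh[:rank, :]
        # split the singular values across both factors so B @ A == delta_r
        lora_sd[f"{name}.lora_B"] = (U * S.sqrt()).contiguous()           # (out, r)
        lora_sd[f"{name}.lora_A"] = (S.sqrt()[:, None] * Vh).contiguous() # (r, in)
    return lora_sd

# extract at several ranks and A/B-test the results, as suggested above:
# for r in (4, 16, 64):
#     torch.save(extract_lora(base_sd, tuned_sd, r), f"lora_r{r}.pt")
```

Once the full fine-tune exists, extracting at a few ranks this way is cheap, so comparing them on your own prompts is the fastest way to pick one.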
-
hey @asomoza, I wanted to update you in case you're interested: I've come up with a way to remove one constraint in my training (the orthographic representation). So now I am trying to teach the model the style shown in the image in my first message above. I have a balanced mix of all views in my dataset (diverse objects, shown in front, side, or top view).

(I will then load these LoRA weights and use a ControlNet to get this style AND follow the structure of a depth map, but that's another story; see the sketch below.)

I couldn't make Flux learn it, so I'm now trying with SDXL, using this script with default hyperparameters for now. As you can see, the training is happening to some degree: validation inferences show a drawing style and dark, thick edges, but their style is far from the wireframe, minimalist drawing style of the dataset.
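For the LoRA-plus-depth-ControlNet combination mentioned above, the diffusers wiring would look roughly like this (a sketch; the LoRA path, depth map path, and prompt are placeholders, and the depth checkpoint is one common choice, not necessarily the one used here):

```python
import torch
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline
from diffusers.utils import load_image

# depth ControlNet for SDXL; swap in whichever depth checkpoint you use
controlnet = ControlNetModel.from_pretrained(
    "diffusers/controlnet-depth-sdxl-1.0", torch_dtype=torch.float16
)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

pipe.load_lora_weights("path/to/wireframe_lora")  # the style LoRA trained above

depth_map = load_image("path/to/depth_map.png")  # placeholder conditioning image
image = pipe(
    "black and white wireframe drawing of a desk, top view",
    image=depth_map,
    controlnet_conditioning_scale=0.7,
    num_inference_steps=30,
).images[0]
image.save("styled_and_depth_guided.png")
```

Lowering `controlnet_conditioning_scale` gives the style LoRA more room; raising it makes the output stick closer to the depth map.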
-
hey @asomoza, a few months later, this issue is still current :) I see several possible reasons for the model's inability to generate accurate top views.

See the example images below of a washing machine: the front and side views are fine, but the top view is incorrect. It's like the model desperately wants to show it's a washing machine, when a mere rectangle would be fine instead.

[images: washing machine generated in front, side, and top views]

When the prompted object is well known from the top view (well known to us and to the training sets), like, say, a pool table, the model does fine. But when it's an object infrequently depicted from the top (e.g. a fridge), the model struggles. Sometimes it even generates a front view instead of a top view, as if it were completely giving up on respecting the prompt.

In light of the past months of research, do you see any way forward to tackle this issue? I was thinking of fine-tuning Flux Kontext, wdyt? Thank you for your answer!
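One way to make this failure measurable before committing to another fine-tune is a small sweep over objects and views with a fixed seed, so the only thing changing is the prompt. A sketch, assuming an SDXL pipeline with the style LoRA loaded (the checkpoint, LoRA path, and object list are placeholders):

```python
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipe.load_lora_weights("path/to/wireframe_lora")  # placeholder path

# objects commonly vs. rarely depicted from above in typical training data
objects = ["pool table", "desk", "washing machine", "fridge"]
views = ["front view", "side view", "top view"]

for obj in objects:
    for view in views:
        prompt = f"black and white wireframe drawing of a {obj}, {view}, orthographic"
        # same seed for every image so only the prompt changes
        generator = torch.Generator("cuda").manual_seed(0)
        image = pipe(prompt, num_inference_steps=30, generator=generator).images[0]
        image.save(f"{obj.replace(' ', '_')}_{view.replace(' ', '_')}.png")
```

Counting how often the top-view prompt silently falls back to a front view, per object, would give a concrete baseline to compare any Flux Kontext fine-tune against.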
-
Hi all,
I'm looking for advice on how to best train a LoRA on the following task: the LoRA should learn to represent black and white wireframe drawings seen from the top (orthographic view, i.e. objects seen from above, with no perspective).
I attach an example image from my dataset, depicting a desk in the above-mentioned style.
So far I've mostly tried fine-tuning Flux dev, varying the learning rate, rank, etc. to see what I can get; the LoRA partially learns the style but definitely isn't great.
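For reference, the kind of setup being varied here looks roughly like the following peft configuration on the Flux transformer (a sketch; the rank, alpha, and target modules are illustrative defaults, not the exact config used):

```python
import torch
from diffusers import FluxPipeline
from peft import LoraConfig

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)

# LoRA on the attention projections of the Flux transformer;
# rank / alpha / target modules are the knobs being swept
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    init_lora_weights="gaussian",
    target_modules=["to_q", "to_k", "to_v", "to_out.0"],
)
pipe.transformer.add_adapter(lora_config)
# ...training loop over the wireframe dataset goes here...
```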
Do you have any recommendations for model/hyperparameters (if this task is feasible at all)?
Thanks vm!

Happy to provide more details if needed
(maybe @asomoza ?)