psychedelicious (Collaborator) commented Oct 1, 2025

Summary

The model manager's probing/identification code has been in a state of partial migration for many months now. Most models used the old API, which was difficult to extend, and the mixing of old and new APIs made it hard to understand what was happening.

This PR refines the new API and ports all supported models to use it.

Design goals

  • Make it easy to understand what happens when we identify a model. There should be as little indirection as possible.
  • Make it easy to add support for a new model, be it a totally new architecture or auxiliary model for an existing architecture.
  • Use consistent, clear patterns throughout. Document when it is not immediately obvious what is happening.
  • Model configs should only contain information relevant to themselves. No optional attrs that are only used for a subset of models, no "god classes" that contain bazillions of fields for every possible case.
  • Model configs should not hard-code information that can be trivially derived at runtime.
  • Reduce touch points required to add support for new models by colocating model identification and loading logic.

Specifics

Single method to identify models

The new API required model config classes to implement two methods:

  • matches() -> does the candidate model look like this kind of model?
  • parse() -> build the config fields from the candidate

The intention was for matches() to be a quick, lightweight check, and for parse() to do any heavy lifting (like loading the model to inspect it). We find the first match, then parse it.

In practice, however, we need to do almost all of the heavy lifting in matches() anyways. parse() ended up doing very little, and sometimes even duplicated work done in matches().

This API is needlessly complex and doesn't add any practical benefits.
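For reference, the old two-method protocol looked roughly like this (a minimal sketch; ModelOnDisk is the real type from the codebase, while the base class and loop below are illustrative):

from typing import Any

# ModelOnDisk comes from the model manager codebase; everything else
# here is a sketch of the old protocol, not the literal implementation.
class LegacyConfigBase:
    @classmethod
    def matches(cls, mod: ModelOnDisk) -> bool:
        """Quick check: could the candidate be this kind of model?"""
        raise NotImplementedError

    @classmethod
    def parse(cls, mod: ModelOnDisk) -> dict[str, Any]:
        """Heavy lifting: derive config fields from the candidate."""
        raise NotImplementedError

def identify(mod: ModelOnDisk, candidates: list[type[LegacyConfigBase]]):
    for config_cls in candidates:
        if config_cls.matches(mod):  # stop at the first match...
            return config_cls(**config_cls.parse(mod))  # ...then parse it
    raise RuntimeError(f"Could not identify model: {mod}")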

I've consolidated these responsibilities into a from_model_on_disk() method which each config must implement. The method gets a reference to the candidate model files and attempts to construct a config from it. If it fails, a NotAMatch exception is raised, which includes a reason.

Here's an example from one of the IP Adapter config classes:

@classmethod
def from_model_on_disk(cls, mod: ModelOnDisk, fields: dict[str, Any]) -> Self:
    # Each validator either returns or raises NotAMatch with a reason.
    _validate_is_dir(cls, mod)
    _validate_override_fields(cls, fields)
    cls._validate_has_weights_file(mod)
    cls._validate_has_image_encoder_metadata_file(mod)
    cls._validate_base(mod)

    return cls(**fields)

Each of the validate functions either returns or raises.
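For illustration, here's a minimal sketch of the exception and one validator (NotAMatch is the real exception name from this PR; its constructor and the helper body are assumptions):

class NotAMatch(Exception):
    """The candidate files cannot be identified as this config class."""

    def __init__(self, config_cls: type, reason: str) -> None:
        # Carry the reason so the probe can report why each class rejected
        # the candidate.
        super().__init__(f"{config_cls.__name__}: {reason}")
        self.reason = reason

def _validate_is_dir(config_cls: type, mod: ModelOnDisk) -> None:
    # Diffusers-style models are directories; a single file cannot match.
    if not mod.path.is_dir():
        raise NotAMatch(config_cls, "model path is not a directory")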

Narrow config classes

Previously, we had some "god" classes like MainModelConfig. This class accommodated every possible main/pipeline model and was worse off for it. Config classes have been split up to be rather narrow, and the class names reflect this.

Each config must represent a single combination of model type, base and format. As before, the base of "any" serves as a fallback for models without an associated pipeline architecture, like text encoders.

Here are all main/pipeline classes:

# Main (Pipeline) - diffusers format
- Main_Diffusers_SD1_Config
- Main_Diffusers_SD2_Config
- Main_Diffusers_SDXL_Config
- Main_Diffusers_SDXLRefiner_Config
- Main_Diffusers_SD3_Config
- Main_Diffusers_CogView4_Config
# Main (Pipeline) - checkpoint format
- Main_Checkpoint_SD1_Config
- Main_Checkpoint_SD2_Config
- Main_Checkpoint_SDXL_Config
- Main_Checkpoint_SDXLRefiner_Config
- Main_Checkpoint_FLUX_Config
# Main (Pipeline) - quantized formats
- Main_BnBNF4_FLUX_Config
- Main_GGUF_FLUX_Config

All classes are named with the format <type>_<format>_<base>. It's immediately obvious which kind of model you are working with.

Because the classes are narrow, identification logic can also be more focused. Adding a new model architecture doesn't require you to figure out where to slot in the checks amongst checks for all the existing architectures - you make a new class and implement logic to positively identify this specific kind of model.
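To make this concrete, a narrow config class might look something like this (a sketch: the Literal-pinned discriminators and enum names mirror the existing taxonomy, but the validators here are hypothetical):

from typing import Any, Literal, Self

# ModelConfigBase, ModelType, ModelFormat and BaseModelType are the
# existing model manager types; the validator helpers are illustrative.
class Main_Checkpoint_SD1_Config(ModelConfigBase):
    # Exactly one combination of type, format and base per class.
    type: Literal[ModelType.Main] = ModelType.Main
    format: Literal[ModelFormat.Checkpoint] = ModelFormat.Checkpoint
    base: Literal[BaseModelType.StableDiffusion1] = BaseModelType.StableDiffusion1

    @classmethod
    def from_model_on_disk(cls, mod: ModelOnDisk, fields: dict[str, Any]) -> Self:
        _validate_is_file(cls, mod)  # checkpoints are single files
        cls._validate_looks_like_sd1(mod)  # e.g. state dict keys unique to SD1
        return cls(**fields)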

Organize the identification codebase (planned)

Currently we have two big honking files that have all the model identification code in a big jumble. Two main reasons for this:

  • We had "god classes" that were responsible for many permutations of models.
  • The structure of classes would result in circular import issues if we were to split the files up.

With the revised structure for config classes, we can more easily split the code by, say, model type. I haven't gotten to this yet, but the goal is to organize things to make it hard to get lost, even for a new contributor.

Configs are the loaders (planned)

As I worked through this, I found the split between model loaders and identifiers to feel overly complicated. There's a separate model loader registry that lets a model loader declare which specific combinations of type/format/base it can load.

We can only load known models, and all known models are represented by a config class. The loaders essentially map directly to config classes, but instead of living on the config classes, we have to manually associate them. Then, every loader has defensive code that checks whether the config it is trying to load is the right class before loading. The loaders also duplicate some amount of the model identification code (like finding the weights file).

I plan to refactor this so that the config classes are also responsible for loading models. With the other changes in this PR and #8577, we will have reduced the touch points required to add support for new models substantially:

Old
  • Complicated new model identification API tangled with even more complicated old API
  • Loader and registration
  • Inference code
  • Nodes
  • Crapload of frontend boilerplate
New
  • Colocated and simplified model identification and loading (i.e. everything that deals with model files)
  • Inference code
  • Nodes
  • Minimal (if any) frontend boilerplate
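A sketch of the planned direction (hypothetical API, nothing here is merged yet):

class Main_GGUF_FLUX_Config(ModelConfigBase):
    @classmethod
    def from_model_on_disk(cls, mod: ModelOnDisk, fields: dict[str, Any]) -> Self:
        # Identification as in this PR. The config can record the weights
        # file it found so loading doesn't have to re-discover it.
        ...

    def load(self) -> AnyModel:
        # Loading lives on the config class itself: no separate loader
        # registry entry, and no defensive "is this the right config
        # class?" check before loading.
        ...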

Remove model hashing (planned)

While I still believe model hashing was a good idea in isolation, in practice, it is pure downside.

Models take longer to install, sometimes taking very long on certain filesystems or storage setups. Because we let users opt out of hashing and change the hashing algorithm, the hashes end up not serving the intended purpose - a cross-platform, stable model identifier.

Even if we required hashing and used the same algo everywhere, we don't actually use the stored hashes anywhere.

I plan to remove model hashing from the model manager as part of this PR.

Related Issues / Discussions

Offline discussions

QA Instructions

TBD

Merge Plan

TBD

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • ❗Changes to a redux slice have a corresponding migration
  • Documentation added / updated (if applicable)
  • Updated What's New copy (if doing a release after this PR)

- Add concept of match certainty to new probe
- Port CLIP Embed models to new API
- Fiddle with stuff
Previously, we had a multi-phase strategy to identify models from their
files on disk:
1. Run each model config class's `matches()` method on the files. It
checks whether the model could possibly be identified as that config's
model type. This was intended to be a quick check. Break on the first
match.
2. If we have a match, run the config class's `parse()` method. It
derives some additional model config attrs from the model files. This
was intended to encapsulate heavier operations that may require loading
the model into memory.
3. Derive the common model config attrs: name, description, hash, etc.
Some of these are also heavier operations.

This strategy has some issues:
- It is not clear how the pieces fit together. There is some
back-and-forth between different methods and the config base class. It
is hard to trace the flow of logic until you fully wrap your head around
the system and therefore difficult to add a model architecture to the
probe.
- The assumption that we could do quick, lightweight checks before
heavier checks is incorrect. We often _must_ load the model state dict
in the `matches()` method. So there is no practical perf benefit to
splitting up the responsibility of `matches()` and `parse()`.
- Sometimes we need to do the same checks in `matches()` and `parse()`.
In these cases, splitting the logic has a negative perf impact because
we are doing the same work twice.
- As we introduce the concept of an "unknown" model config (i.e. a model
that we cannot identify, but still record in the db; see #8582), we will
_always_ run _all_ the checks for every model. Therefore we need not try
to defer heavier checks or resource-intensive ops like hashing. We are
going to do them anyways.
- There are situations where a model may match multiple configs. One
known case is SD pipeline models with merged LoRAs. In the old probe
API, we relied on the implicit order of checks to know that if a model
matched for pipeline _and_ LoRA, we prefer the pipeline match. But, in
the new API, we do not have this implicit ordering of checks. To resolve
this in a resilient way, we need to get all matches up front, then use
tie-breaker logic to figure out which should win (or add "differential
diagnosis" logic to the matchers).
- Field overrides weren't handled well by this strategy. They were only
applied at the very end, if a model matched successfully. This means we
cannot tell the system "Hey, this model is type X with base Y. Trust me
bro." We cannot override the match logic. As we move towards letting
users correct mis-identified models (see #8582), this is a requirement.

We can simplify the process significantly and better support "unknown"
models.

Firstly, model config classes now have a single `from_model_on_disk()`
method that attempts to construct an instance of the class from the
model files. This replaces the `matches()` and `parse()` methods.

If we fail to create the config instance, a special exception is raised
that indicates why we think the files cannot be identified as the given
model config class.

Next, the flow for model identification is a bit simpler:
- Derive all the common fields up-front (name, desc, hash, etc).
- Merge in overrides.
- Call `from_model_on_disk()` for every config class, passing in the
fields. Overrides are handled in this method.
- Record the results for each config class and choose the best one.

The identification logic is a bit more verbose, with the special
exceptions and handling of overrides, but it is very clear what is
happening.
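
A sketch of that flow in code (`from_model_on_disk()` and NotAMatch are
real; the helper names and result-collection details are illustrative):

def identify_model(mod: ModelOnDisk, overrides: dict[str, Any]) -> ModelConfigBase:
    fields = derive_common_fields(mod)  # name, description, hash, etc.
    fields.update(overrides)  # user-supplied overrides win

    matches: list[ModelConfigBase] = []
    for config_cls in ALL_CONFIG_CLASSES:  # try every class; no implicit ordering
        try:
            matches.append(config_cls.from_model_on_disk(mod, fields))
        except NotAMatch:
            continue  # the real code records the reason for diagnostics

    if not matches:
        return UnknownModelConfig(**fields)  # see #8582
    return pick_best_match(matches)  # tie-breaker when several configs match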

The one downside I can think of for this strategy is we do need to check
every model type, instead of stopping at the first match. It's a bit
less efficient. In practice, however, this isn't a hot code path, and
the improved clarity is worth far more than perf optimizations that the
end user will likely never notice.
psychedelicious (Collaborator, Author) commented:
As of this writing:

  • All legacy probe code is still in the PR for reference as I work on the refactor.
  • All starter models probe correctly from disk (i.e. without any field overrides/hints), except these:
    • /home/bat/models/any/clip_vision/clip-vit-large-patch14
    • /home/bat/models/sdxl/ip_adapter/Standard Reference (IP Adapter ViT-H).safetensors
    • /home/bat/models/any/t5_encoder/t5_base_encoder/
    • /home/bat/models/any/siglip/SigLIP - google/siglip-so400m-patch14-384
    • /home/bat/models/any/t5_encoder/t5_bnb_int8_quantized_encoder/
