refactor(mm): migrate all models to new and improved probe API #8600
Draft
psychedelicious wants to merge 44 commits into psyche/feat/mm/normalized-model-storage from psyche/refactor/mm/probe-to-classify
Conversation
- Add concept of match certainty to new probe
- Port CLIP Embed models to new API
- Fiddle with stuff
Previously, we had a multi-phase strategy to identify models from their files on disk:

1. Run each model config class's `matches()` method on the files. It checks whether the files could possibly be identified as the candidate model type. This was intended to be a quick check. Break on the first match.
2. If we have a match, run the config class's `parse()` method. It derives some additional model config attrs from the model files. This was intended to encapsulate heavier operations that may require loading the model into memory.
3. Derive the common model config attrs, like name and description, calculate the hash, etc. Some of these are also heavier operations.

This strategy has some issues:

- It is not clear how the pieces fit together. There is some back-and-forth between different methods and the config base class. It is hard to trace the flow of logic until you fully wrap your head around the system, which makes it difficult to add a model architecture to the probe.
- The assumption that we could do quick, lightweight checks before heavier checks is incorrect. We often _must_ load the model state dict in the `matches()` method, so there is no practical perf benefit to splitting the responsibilities of `matches()` and `parse()`.
- Sometimes we need to do the same checks in `matches()` and `parse()`. In these cases, splitting the logic has a negative perf impact because we are doing the same work twice.
- As we introduce the concept of an "unknown" model config (i.e. a model that we cannot identify, but still record in the db; see #8582), we will _always_ run _all_ the checks for every model. Therefore we need not try to defer heavier checks or resource-intensive ops like hashing. We are going to do them anyway.
- There are situations where a model may match multiple configs. One known case is SD pipeline models with merged LoRAs.
In the old probe API, we relied on the implicit order of checks to know that if a model matched for pipeline _and_ LoRA, we prefer the pipeline match. In the new API, we do not have this implicit ordering of checks. To resolve this in a resilient way, we need to get all matches up front, then use tie-breaker logic to figure out which should win (or add "differential diagnosis" logic to the matchers).

- Field overrides weren't handled well by this strategy. They were only applied at the very end, and only if a model matched successfully. This means we cannot tell the system "Hey, this model is type X with base Y. Trust me bro." We cannot override the match logic. As we move towards letting users correct mis-identified models (see #8582), this is a requirement.

We can simplify the process significantly and better support "unknown" models.

Firstly, model config classes now have a single `from_model_on_disk()` method that attempts to construct an instance of the class from the model files. This replaces the `matches()` and `parse()` methods. If we fail to create the config instance, a special exception is raised that indicates why we think the files cannot be identified as the given model config class.

Next, the flow for model identification is a bit simpler:

- Derive all the common fields up front (name, desc, hash, etc.).
- Merge in overrides.
- Call `from_model_on_disk()` for every config class, passing in the fields. Overrides are handled in this method.
- Record the results for each config class and choose the best one.

The identification logic is a bit more verbose, with the special exceptions and handling of overrides, but it is very clear what is happening. The one downside I can think of for this strategy is that we do need to check every model type, instead of stopping at the first match. It's a bit less efficient.
In practice, however, this isn't a hot code path, and the improved clarity is worth far more than perf optimizations that the end user will likely never notice.
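The flow described above can be sketched roughly as follows. This is a minimal illustration, not the actual InvokeAI implementation: `NotAMatch`, the stub config classes, and the `match_certainty` tie-breaker are all assumed names for the purpose of the example.

```python
from pathlib import Path

class NotAMatch(Exception):
    """Raised when files cannot be identified as a given config class (illustrative)."""

class LoRAConfig:
    match_certainty = 0.6  # weaker signal: merged-LoRA pipelines also look like LoRAs
    def __init__(self, fields): self.fields = fields
    @classmethod
    def from_model_on_disk(cls, path: Path, fields: dict) -> "LoRAConfig":
        if "lora" not in path.name:  # stand-in for real state-dict key checks
            raise NotAMatch("no LoRA keys found")
        return cls(fields)

class PipelineConfig:
    match_certainty = 0.9
    def __init__(self, fields): self.fields = fields
    @classmethod
    def from_model_on_disk(cls, path: Path, fields: dict) -> "PipelineConfig":
        if "pipeline" not in path.name and "lora" not in path.name:
            raise NotAMatch("no pipeline structure")
        return cls(fields)

def identify_model(path: Path, config_classes: list, overrides: dict):
    fields = {"name": path.stem}   # 1. derive common fields up front
    fields.update(overrides)       # 2. merge in overrides; they always win
    matches = []
    for cls in config_classes:     # 3. run *every* config class, record results
        try:
            matches.append(cls.from_model_on_disk(path, fields))
        except NotAMatch:
            continue
    if not matches:
        return None                # or fall back to an "unknown" config (see #8582)
    # 4. tie-breaker: the match with the highest certainty wins
    return max(matches, key=lambda m: m.match_certainty)

# A pipeline model with a merged LoRA matches both configs; the pipeline wins.
best = identify_model(Path("sd15-with-lora.safetensors"), [LoRAConfig, PipelineConfig], {})
```

Note how the merged-LoRA case from above is resolved by explicit certainty ranking rather than by the order the checks happen to run in.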
The result: simpler identification logic, less complexity when adding new models, fewer useless attrs that do not relate to the model arch, etc.
As of this writing:
Labels
api
backend
PRs that change backend files
frontend
PRs that change frontend files
invocations
PRs that change invocations
python
PRs that change python files
python-tests
PRs that change python tests
services
PRs that change app services
Summary
The model manager's probing/identification code has been in a state of partial migration for many months now. Most models used the old API, which was difficult to extend, and the mixing of old and new APIs made it hard to understand what was happening.
This PR refines the new API and ports all supported models to use it.
Design goals
Specifics
Single method to identify models
The new API required model config classes to implement two methods:

- `matches()` -> does the candidate model look like this kind of model?
- `parse()` -> build the config fields from the candidate

The intention was for `matches()` to be a quick, lightweight check, and then `parse()` to do any heavy lifting (like loading the model to inspect it). We find the first match, then parse it.

In practice, however, we need to do almost all of the heavy lifting in `matches()` anyway. `parse()` ended up doing very little, and sometimes even duplicated work done in `matches()`. This API is needlessly complex and doesn't add any practical benefits.

I've consolidated these responsibilities into a `from_model_on_disk()` method which each config must implement. The method gets a reference to the candidate model files and attempts to construct a config from it. If it fails, a `NotAMatch` exception is raised, which includes a reason.

Here's an example from one of the IP Adapter config classes:
Each of the validate functions either returns or raises.
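The snippet itself isn't reproduced here, so below is a minimal sketch of the shape described: validator helpers that either return silently or raise `NotAMatch` with a reason. All names are illustrative assumptions, not the real InvokeAI API.

```python
from pathlib import Path

class NotAMatch(Exception):
    """Identification failure carrying a human-readable reason (illustrative)."""

def _validate_is_directory(path: Path) -> None:
    # Validators either return or raise - there is no boolean plumbing.
    if not path.is_dir():
        raise NotAMatch("expected a directory for this IP Adapter format")

def _validate_has_weights(path: Path) -> None:
    # Hypothetical weights filename, used only for illustration.
    if not (path / "ip_adapter.bin").exists():
        raise NotAMatch("missing ip_adapter.bin")

class IPAdapterConfigSketch:
    def __init__(self, **fields):
        self.fields = fields

    @classmethod
    def from_model_on_disk(cls, path: Path, common_fields: dict) -> "IPAdapterConfigSketch":
        _validate_is_directory(path)
        _validate_has_weights(path)
        # All checks passed: build the config from the pre-derived common fields.
        return cls(**common_fields)
```

Because failure is an exception rather than a `False`, the caller gets the *reason* a candidate was rejected for free, which feeds directly into the "unknown model" reporting.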
Narrow config classes
Previously, we had some "god" classes like `MainModelConfig`. This class accommodated every possible main/pipeline model and was worse off for it. Config classes have been split up to be rather narrow, and the class names reflect this.

Each config must represent a single combination of model type, base and format. As before, the base of "any" serves as a fallback for models without an associated pipeline architecture, like text encoders.
Here are all main/pipeline classes:
All classes are named with the format `<type>_<format>_<base>`. It's immediately obvious which kind of model you are working with.

Because the classes are narrow, identification logic can also be more focused. Adding a new model architecture doesn't require you to figure out where to slot in the checks amongst checks for all the existing architectures - you make a new class and implement logic to positively identify this specific kind of model.
Organize the identification codebase (planned)
Currently we have two big honking files that have all the model identification code in a big jumble. Two main reasons for this:
With the revised structure for config classes, we can more easily split the code by, say, model type. I haven't gotten to this yet, but the goal is to organize things to make it hard to get lost, even for a new contributor.
Configs are the loaders (planned)
As I worked through this, I found the split between model loaders and identifiers to feel overly complicated. There's a separate model loader registry that lets a model loader declare which specific combinations of type/format/base it can load.
We can only load known models, and all known models are represented by a config class. The loaders essentially map directly to config classes, but instead of living on the config classes, they have to be manually associated with them. Then, every loader has defensive code that checks whether the config it is trying to load is the right class before loading. The loaders also duplicate some amount of the model identification code (like finding the weights file).
I plan to refactor this so that the config classes are also responsible for loading models. With the other changes in this PR and #8577, we will have reduced the touch points required to add support for new models substantially:
Old
New
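A rough sketch of the planned direction, in which the config class that identified a model is also responsible for loading it, replacing the separate loader registry. All names here are illustrative; this refactor is still planned, not implemented.

```python
class ExampleModelConfig:
    """Illustrative config that both identifies and loads its model."""

    def __init__(self, path: str):
        self.path = path

    @classmethod
    def from_model_on_disk(cls, path: str, fields: dict) -> "ExampleModelConfig":
        # Identification logic, as described elsewhere in this PR.
        return cls(path)

    def load(self) -> dict:
        # The config already located its weights during identification, so the
        # loader neither repeats "find the weights file" logic nor defensively
        # re-checks type/format/base - the association is implicit in the class.
        return {"weights": f"{self.path}/weights.safetensors"}  # placeholder load

config = ExampleModelConfig.from_model_on_disk("models/example", {})
model = config.load()
```

This collapses the loader registry, the defensive type checks, and the duplicated weights-finding code into the one class that already knows all three facts.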
Remove model hashing (planned)
While I still believe model hashing was a good idea in isolation, in practice, it is pure downside.
Models take longer to install, sometimes taking very long on certain filesystems or storage setups. Because we let users opt out of hashing and change the hashing algorithm, the hashes end up not serving the intended purpose - a cross-platform, stable model identifier.
Even if we required hashing and used the same algo everywhere, we don't actually use the stored hashes anywhere.
I plan to remove model hashing from the model manager as part of this PR.
Related Issues / Discussions
Offline discussions
QA Instructions
TBD
Merge Plan
TBD
Checklist
What's New
copy (if doing a release after this PR)