-
-
Notifications
You must be signed in to change notification settings - Fork 5k
Consistent license handling throughout timm #2585
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Consistent license handling throughout timm #2585
Conversation
- Skipping the cspnet.py file with formatting due to large diff
@alexanderdann can we keep the PR to functional changes only and not formatting? I don't do black indentation of args... |
This reverts commit ed00d06. Reducing diff to keep pull request only for functional change.
@rwightman Yes, of course. Reverted the commit related to formatting. Had a deeper look at the code and are there reasons from your side not just implement it in the I just draftet the idea, tested and committed it. It also has the benefit that in case you want to overwrite the field manually (i.e. |
Probably the most elegant part would have been to just put it here, but def generate_default_cfgs(cfgs: Dict[str, Union[Dict[str, Any], PretrainedCfg]]):
out = defaultdict(DefaultCfg)
default_set = set() # no tag and tags ending with * are prioritized as default
for k, v in cfgs.items():
if isinstance(v, dict):
v = PretrainedCfg(**v)
# just a one liner here to set license
has_weights = v.has_weights |
timm predates the HF hub and some users expect to be able to use the models with just the relevant .pth / .safetensors file and no extra metdata. So all of the needed metadata for pretrained_cfg is registered in the code. Cannot expect to be able to call home and do http transfers any time a model is used. The pretrained_cfg on the HF hub is only sourced when hf-hub: or local: w/ a config downloaded from the Hub is used. So it's a bit confusing, but for it to make sense all of this needs to be synchronized but it's not feasible to grab the license from the Hub on the fly. Rather the processes to sync everything needs to do this and then updated the code semi-automatically (or with an AI agent) to deal with inevitable edge cases |
more precisely there are 3 sources of license truth
3 is the most correct but it was there primarly for HuggingFace Hub UI. The library itself sources the metadata from So to synchronize everythign the ideas was to write helpers to pull all of the license info from the yaml metadata, and then use some scripts and/or coding agents to help sync it to the code and the configs ... That's a bit of an undertaking :) |
Thinking a bit more on this, if synchronizing all license meta-data is beyond what what you were thinking, just making '_get_license_from_hf_hub' public, allowing a user to pass a full model name w/ tag, created model, or pretrained_cfg ... locate the hub location, extract and resolve the model card license meta-data would work for users who want to query the license... I do not want such a call l to be made in model creation, registration, etc flow though. |
This reverts commit 166a524.
Really appreciate the detailed feedback and elaboration on the constraints. Was a misunderstanding on my end. I already iterate on your comments and using Claude Code as you suggested works well. I will review the changes and commit them step by step. Thanks once again! |
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
@alexanderdann taking a quick peek... overall looking good but a few things For 'other', while that's a thing for the HF Hub for reasons beyond my control, I'd rather be more specific, so for the 'other', I think the best approach would be to sub in the 'license-name' field if in the model card metadata license='other', and license-name exists, should defer to license name... e.g. for dino, would be 'dinov3-license' or even 'dinov3' is better than 'other'.... other is more of a UI thing for licenses that the UI devs don't want to add an official drop down entry for...
Also curious where the license=None entries come from, what logic leads to that... |
@rwightman this is a good idea, will definitely adapt it. When looking at potential discrepencies yesterday I found that we need to decide on how to go with the following cases:
Some examples:
Some examples
|
Those appl_mclip ones as an example, a HF hub fetch should pick that up, if you get the hf_hub_id for the mci0 as example, it's Same for florence2, https://huggingface.co/microsoft/Florence-2-base ... Those legacy seresnet are ported from caffe originals I think, so https://github.com/hujie-frank/SENet and apache-2.0 Vitamin, also need to follow the hf_hub_id in the pretrained_cfg entry -> https://huggingface.co/jienengchen/ViTamin-L-384px .. |
Model weight and code licenses aren't necessarily the same, so model weight license should be used when that is set and I only fallback to code license if the weights were released in a repo licensed say 'apache-2.0' with no mention of a different license for the weights. It's also quite debateable whether copyright applies to weights at all, and all software licenses rest on copyright so might be all around pointless :) |
@rwightman Integrated your feedback und filled the remaining |
Summary
Implement
default_cfg
population from_cfg()
function as discussed in #2581Given your feedback on #2581, I went ahead and integrated your proposed changes.
Appreciate any feedback coming from your side!
Changes Made
A new function
_get_license_from_hf_hub()
was added totimm/models/_hub.py
to automatically populate thedefault_cfg
from the_cfg()
function. Formatting was applied based on guidelines inCONTRIBUTING.md
. To keep the diff manageable for this initial implementation, formatting was only applied to thetimm/models/_hub.py
file. I will probably skip the model files since the formatting changes alone fortimm/models/cspnet.py
were large.The rationale behind
'license': _get_license_from_hf_hub(kwargs.pop('model_id', None), kwargs.get('hf_hub_id')),
is thatmodel_id
needs to be popped to avoid interfering with subsequent parts of the code such as thePretrainedCfg
, which do not expect such an argument. One could also easily rename this if this is the wish. Since we can have also configs without pretraining, we also allowNone
which leads to early return withNone
as license. I opt for explicit population withNone
to always have the field in the config.Testing
Tests were run using
pytest -k "csp" tests/
from the project root to verify the changes work correctly:Scope
This change has been implemented for
timm/models/cspnet.py
as a proof of concept. If this approach meets your expectations, I'm ready to extend the implementation to the remaining model files.Note
Since I could not find a pull request template I mocked something on my own, feel free to point me to your template if there is one.
Also, there are some configs such as
'davit_base_fl.msft_florence2
which can be found in/timm/models/davit.py
, lines 825onwards which cannot be found on the hub, any preference on how to handle such cases? Maybe this can be used as an additional check if all model infos are on the hub as well.
Checklist