Consistent license handling throughout timm #2585

alexanderdann · 2025-09-20T09:50:29Z

Summary

Implement default_cfg population from _cfg() function as discussed in #2581

Given your feedback on #2581, I went ahead and integrated your proposed changes.

Appreciate any feedback coming from your side!

Changes Made

A new function _get_license_from_hf_hub() was added to timm/models/_hub.py to automatically populate the default_cfg from the _cfg() function. Formatting was applied based on guidelines in CONTRIBUTING.md. To keep the diff manageable for this initial implementation, formatting was only applied to the timm/models/_hub.py file. I will probably skip the model files since the formatting changes alone for timm/models/cspnet.py were large.

The rationale behind 'license': _get_license_from_hf_hub(kwargs.pop('model_id', None), kwargs.get('hf_hub_id')), is that model_id needs to be popped so it does not interfere with later parts of the code such as PretrainedCfg, which do not expect such an argument. The argument can easily be renamed if you prefer. Since configs without pretrained weights also exist, None is allowed as well and leads to an early return with None as the license. I opted for explicit population with None so that the field is always present in the config.

Testing

Tests were run using pytest -k "csp" tests/ from the project root to verify the changes work correctly.

Scope

This change has been implemented for timm/models/cspnet.py as a proof of concept. If this approach meets your expectations, I'm ready to extend the implementation to the remaining model files.

Note

Since I could not find a pull request template, I mocked one up myself. Feel free to point me to your template if there is one.

Also, there are some configs such as 'davit_base_fl.msft_florence2', which can be found in /timm/models/davit.py, lines 825 onwards, that cannot be found on the hub. Any preference on how to handle such cases? Maybe this can be used as an additional check whether all model infos are on the hub as well.

Checklist

Referenced original issue [FEATURE] Consistent License Handling #2581
Followed CONTRIBUTING.md formatting guidelines
Added function to timm/models/_hub.py
Tested changes with pytest
Limited scope to single file for initial review
Work on feedback by reviewer
Formatting in case of additional changes or the explicit recommendation
Extension for remaining files

- Skipping the cspnet.py file with formatting due to large diff

rwightman · 2025-09-20T15:12:42Z

@alexanderdann can we keep the PR to functional changes only and not formatting? I don't do black indentation of args...

alexanderdann · 2025-09-22T20:06:00Z

@rwightman Yes, of course. Reverted the commit related to formatting.

I had a deeper look at the code: are there reasons on your side not to just implement it in PretrainedCfg? The issue with changing it in each file is that it would lead to fetching model_info for every model when importing timm.

I just drafted the idea, tested and committed it. It also has the benefit that if you want to override the field manually (e.g. model_info returns other as the license for https://huggingface.co/timm/vit_huge_plus_patch16_dinov3.lvd1689m), you can still set it in _cfg and the function will not overwrite it, since it has already been set (as you have done by just stating dinov3 in timm/models/eva.py).
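The "fill only if unset" behaviour can be sketched as follows. This is an illustration under assumptions, not the committed code: `CfgSketch` is a hypothetical, trimmed-down stand-in for PretrainedCfg, and `fake_fetch` replaces the model_info request so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CfgSketch:
    # Hypothetical stand-in for PretrainedCfg with only the fields used here.
    hf_hub_id: Optional[str] = None
    license: Optional[str] = None


def fill_license(cfg: CfgSketch, fetch: Callable[[str], Optional[str]]) -> CfgSketch:
    # Only consult the fetcher when nothing was set by hand and a hub id exists.
    if cfg.license is None and cfg.hf_hub_id:
        cfg.license = fetch(cfg.hf_hub_id)
    return cfg


def fake_fetch(repo_id: str) -> Optional[str]:
    # Stand-in for the Hub lookup; always answers 'other', as reported for the dinov3 repo.
    return 'other'


repo = 'timm/vit_huge_plus_patch16_dinov3.lvd1689m'
manual = fill_license(CfgSketch(hf_hub_id=repo, license='dinov3'), fake_fetch)
unset = fill_license(CfgSketch(hf_hub_id=repo), fake_fetch)
print(manual.license, unset.license)  # dinov3 other
```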

alexanderdann · 2025-09-22T20:09:50Z

Probably the most elegant option would have been to just put it here, but then import timm would take too long.

def generate_default_cfgs(cfgs: Dict[str, Union[Dict[str, Any], PretrainedCfg]]):
    out = defaultdict(DefaultCfg)
    default_set = set()  # no tag and tags ending with * are prioritized as default

    for k, v in cfgs.items():
        if isinstance(v, dict):
            v = PretrainedCfg(**v)
            # just a one liner here to set license
        has_weights = v.has_weights

rwightman · 2025-09-22T20:18:53Z

timm predates the HF hub and some users expect to be able to use the models with just the relevant .pth / .safetensors file and no extra metadata. So all of the needed metadata for pretrained_cfg is registered in the code. Cannot expect to be able to call home and do http transfers any time a model is used. The pretrained_cfg on the HF hub is only sourced when hf-hub: or local: w/ a config downloaded from the Hub is used.

So it's a bit confusing, but for it to make sense all of this needs to be synchronized, and it's not feasible to grab the license from the Hub on the fly. Rather, the process to sync everything needs to do this and then update the code semi-automatically (or with an AI agent) to deal with inevitable edge cases

rwightman · 2025-09-22T20:24:43Z

more precisely there are 3 sources of license truth

1. The code based registration of pretrained_cfgs via the generate_default_cfgs in every model file
2. The pretrained_cfg fields present in the config.json for timm models on the HF Hub
3. The model card (yaml block) metadata on the HF Hub

3 is the most correct but it was there primarily for the HuggingFace Hub UI. The library itself sources the metadata from config.json when hf-hub:timm/model_name or local-dir:/blah/model_name is used. And it's sourced from what's registered in code when that's not used (only the weights are downloaded if not cached already).

So to synchronize everything the idea was to write helpers to pull all of the license info from the yaml metadata, and then use some scripts and/or coding agents to help sync it to the code and the configs ...

That's a bit of an undertaking :)

rwightman · 2025-09-23T04:23:14Z

Thinking a bit more on this, if synchronizing all license meta-data is beyond what you were thinking, just making '_get_license_from_hf_hub' public, allowing a user to pass a full model name w/ tag, created model, or pretrained_cfg ... locate the hub location, extract and resolve the model card license meta-data would work for users who want to query the license...

I do not want such a call to be made in model creation, registration, etc flow though.
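A rough sketch of what such an explicit, user-invoked query could look like. `query_license` and `_fetch_card_metadata` are hypothetical names, and only the pretrained_cfg-style input is handled; resolving a model name with tag or a created model to its hub id is left out. The default fetcher uses `ModelCard.load` from huggingface_hub and is imported lazily, so nothing touches the network unless the user calls the helper.

```python
from typing import Any, Callable, Dict, Mapping, Optional


def _fetch_card_metadata(repo_id: str) -> Dict[str, Any]:
    # Network call; imported lazily so that importing this module stays offline.
    from huggingface_hub import ModelCard
    return ModelCard.load(repo_id).data.to_dict()


def query_license(
        pretrained_cfg: Mapping[str, Any],
        fetch: Callable[[str], Mapping[str, Any]] = _fetch_card_metadata,
) -> Optional[str]:
    # Hypothetical helper: meant to be called explicitly by a user,
    # never from the model creation or registration flow.
    repo_id = pretrained_cfg.get('hf_hub_id')
    if not repo_id:
        return None  # nothing to look up without a hub location
    return fetch(repo_id).get('license')
```

Passing a custom `fetch` keeps the helper testable offline, for example `query_license({'hf_hub_id': 'some/repo'}, fetch=lambda repo_id: {'license': 'mit'})`.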

alexanderdann · 2025-09-23T17:07:13Z

Really appreciate the detailed feedback and elaboration on the constraints. It was a misunderstanding on my end. I am already iterating on your comments, and using Claude Code as you suggested works well. I will review the changes and commit them step by step.

Thanks once again!

HuggingFaceDocBuilderDev · 2025-09-25T04:04:11Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

rwightman · 2025-09-25T04:06:09Z

@alexanderdann taking a quick peek... overall looking good but a few things

For 'other', while that's a thing for the HF Hub for reasons beyond my control, I'd rather be more specific. I think the best approach would be to sub in the 'license_name' field: if the model card metadata has license='other' and license_name exists, should defer to the license name... e.g. for dino, 'dinov3-license' or even 'dinov3' is better than 'other'.... other is more of a UI thing for licenses that the UI devs don't want to add an official drop down entry for...

license: other
license_name: dinov3-license
license_link: https://ai.meta.com/resources/models-and-libraries/dinov3-license
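As a sketch of that substitution rule, operating on the model card metadata as a plain dict (`resolve_license` is a hypothetical name, and the PR may implement this differently):

```python
from typing import Any, Mapping, Optional


def resolve_license(card_meta: Mapping[str, Any]) -> Optional[str]:
    # Return the license id, deferring to license_name when the id is 'other'.
    license_id = card_meta.get('license')
    if isinstance(license_id, str) and license_id.lower() == 'other':
        # Fall back to 'other' only when no license_name is given.
        return card_meta.get('license_name') or license_id
    return license_id


dinov3_meta = {
    'license': 'other',
    'license_name': 'dinov3-license',
    'license_link': 'https://ai.meta.com/resources/models-and-libraries/dinov3-license',
}
print(resolve_license(dinov3_meta))  # dinov3-license
```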

Also curious where the license=None entries come from, what logic leads to that...

alexanderdann · 2025-09-25T06:43:16Z

@rwightman this is a good idea, will definitely adopt it. When looking at potential discrepancies yesterday, I found that we need to decide how to handle the following cases:

HF Hub ↔ timm differences
There are some models which already have a different license set in the code than on the hub. Currently 18 models fall into that category. Since I am not a lawyer and do not know the potential reasoning behind that, I left these unchanged and would rather consult with you.

Some examples:

# structure: model_id: (hf_hub_license, timm_code_license)
'vit_base_patch32_clip_quickgelu_224.metaclip_400m': ('apache-2.0', 'cc-by-nc-4.0'),
'resnet50_clip_gap.yfcc15m': ('apache-2.0', 'mit'),

Missing Licenses on the HF Hub
This is exactly the None case you mentioned above. I set the default parameter of the license in _cfg, but for some models there is no license I can grab from the hub, and no license was set before my changes either. Hence, rather than guessing, I took the safe route and set it explicitly to None to avoid a potentially wrong license. Right now, there are 49 pretrained models I can find via timm.list_pretrained and timm.get_pretrained_cfg which are not on the remote. 29 of these have a license manually set by you in the code.

Some examples:

'davit_base_fl.msft_florence2': (None, 'apache-2.0'),
'davit_huge_fl.msft_florence2': (None, 'apache-2.0'),
'fastvit_mci0.apple_mclip': (None, None),
'fastvit_mci1.apple_mclip': (None, None),
'legacy_seresnet50.in1k': (None, None),
'vitamin_large_384.datacomp1b_clip': (None, None),
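The two categories above can be separated mechanically. The following is a small sketch, assuming the (hf_hub_license, timm_code_license) pairs have already been collected into a dict; the `classify` helper and its category labels are hypothetical, and the entries are taken from the examples in this comment.

```python
from typing import Dict, Optional, Tuple

# (hf_hub_license, timm_code_license)
LicensePair = Tuple[Optional[str], Optional[str]]


def classify(pair: LicensePair) -> str:
    # Label one model by how its hub license and code license relate.
    hub, code = pair
    if hub is None and code is None:
        return 'no_license'
    if hub is None:
        return 'code_only'
    if code is None:
        return 'hub_only'
    return 'in_sync' if hub == code else 'mismatch'


pairs: Dict[str, LicensePair] = {
    'vit_base_patch32_clip_quickgelu_224.metaclip_400m': ('apache-2.0', 'cc-by-nc-4.0'),
    'resnet50_clip_gap.yfcc15m': ('apache-2.0', 'mit'),
    'davit_base_fl.msft_florence2': (None, 'apache-2.0'),
    'fastvit_mci0.apple_mclip': (None, None),
}
report = {model_id: classify(pair) for model_id, pair in pairs.items()}
```

Here the first two entries come out as 'mismatch', the davit entry as 'code_only', and the fastvit entry as 'no_license'.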

rwightman · 2025-09-25T14:56:38Z

Those apple_mclip ones as an example, a HF hub fetch should pick that up: if you get the hf_hub_id for the mci0 as an example, it's apple/mobileclip_s0_timm so -> https://huggingface.co/apple/mobileclip_s0_timm and that's apple-amlr

Same for florence2, https://huggingface.co/microsoft/Florence-2-base ... mit

Those legacy seresnet are ported from caffe originals I think, so https://github.com/hujie-frank/SENet and apache-2.0

Vitamin, also need to follow the hf_hub_id in the pretrained_cfg entry -> https://huggingface.co/jienengchen/ViTamin-L-384px .. mit

rwightman · 2025-09-25T14:59:26Z

Model weight and code licenses aren't necessarily the same, so the model weight license should be used when that is set, and I only fall back to the code license if the weights were released in a repo licensed say 'apache-2.0' with no mention of a different license for the weights. It's also quite debatable whether copyright applies to weights at all, and all software licenses rest on copyright so might be all around pointless :)

alexanderdann · 2025-09-26T18:03:00Z

@rwightman Integrated your feedback and filled the remaining other with <model>-license since these were project specific licenses by Apple.

alexanderdann added 2 commits September 20, 2025 11:07

Adding licensing information to cspnet.py

ae9bb38

Running formatting with command from CONTRIBUTING.md

ed00d06

- Skipping the cspnet.py file with formatting due to large diff

alexanderdann mentioned this pull request Sep 20, 2025

[FEATURE] Consistent License Handling #2581

Open

alexanderdann added 3 commits September 22, 2025 20:05

Revert "Running formatting with command from CONTRIBUTING.md"

09a13d8

This reverts commit ed00d06. Reducing diff to keep pull request only for functional change.

Undoing changes to cspnet.py

a955d2c

Adding new approach

166a524

Revert "Adding new approach"

24b67f8

This reverts commit 166a524.

Updating licenses

60f2837

alexanderdann added 2 commits September 26, 2025 19:43

Updating remaining licenses

b6be046

Reusing license field

c09d5fe
