[FEATURE] Load model config AND weights from local folder (support locally cloned HF hub models) #2338

rwightman · 2024-11-19T21:45:23Z

Is your feature request related to a problem? Please describe.

Right now to load pretrained models w/ associated pretained config (preprocess + head details) pretrained=True must be used with

the model name consisting of a string specifing builtin model + pretrained tag (one of timm.list_pretrained())
OR
consisting of a string specifying a model repo on the HuggingFace (HF) hub hf-hub:repo_name

For option 1 above, the pretrained_cfg is loaded from builtins in the library. The builtin config can specify a weight location at a url or a specific HF hub repo + filename.

For option 2, the pretrained_cfg is loaded from the specified repo in the config.json file, and the weight is loaded from that same repo.

In any case, if it is desired to using an existing pretrained_cfg by either mechanism above, the create_model factory will accept a pretrained_cfg_overlay argument, it should be a dict, and each key-value of that dict will override the values in the originally sourced config. This allows one to load weights locally by passing a file= key that will override any url or hf_hub entry, but cannot change how/where the config is sourced from.

Example:

timm.create_model(
  'swinv2_large_window12to16_192to256',
  pretrained=True,
  pretrained_cfg_overlay=dict(file='path/to/checkpoint'),
)

Describe the solution you'd like
It should be possible to call timm.create_model('...', pretrained=True) and specify a local folder from which both the config and weights will be sourced, as with transformers.AutoModel.from_pretrained()

The hardest part of adding this is figuring out how to integrate passing a folder into the create_model API. The pretrained_cfg_overlay was always a bit clunky/kludgy ... ideally this would be cleaner.

Based on existing use the two options that initially stand out to me

if model name isn't found, check for folder at the specified string timm.create_model('/blah/blah/my-resnet50')
prefix like hf-hub with something like local: , 'folder: ... timm.create_model('local:/blah/blah/my-resnet50')

I'm liking 2 because it's a bit safer and more explicit, parallels hf-hub use. May be some considerations re timm wrapper in transformers

Additional context
This is useful generally but would be particularly useful with new timm model wrapper in transformers to maintain full API compability. Users expect to be able to clone a model repo from the hub and point to a local folder with those files when calling from_pretrained().

The text was updated successfully, but these errors were encountered:

rwightman · 2024-11-19T21:49:02Z

CC @qubvel @LysandreJik re transformers integration

rwightman · 2024-11-19T21:57:16Z

Some related discussion / issue ... some were at least partially addressed by pretrained_cfg_overlay:
#1323
#2061
#2000
#2133
#1941
#2069

turicas · 2024-12-04T01:55:16Z

This will be very useful! However, I think requiring users to manually download the model in advance (e.g., using HF's CLI) adds unnecessary complexity, such as introducing an extra dependency (HF CLI) and adding another step to the workflow (running the command with the exact same model name that will be in the code, leading to possible duplication).

This could be avoided if timm.create_model allowed users to specify both the model name and a custom cache directory (not the full path to the safetensors file). If the model already exists in the specified cache dir, timm could load it; otherwise, it could download and save it to that location automatically.

This approach would offer flexibility without requiring a "pre-download" step. For users who prefer to pre-download, the HF CLI could remain an optional tool to populate the cache directory, accommodating both workflows.

Regarding pretrained_cfg_overlay: I find it harder to work with options inside a dict compared to explicit function parameters. Since this feature would make timm manage the downloading/loading of the model, I think a dedicated parameter (e.g., model_dir) would make the interface more intuitive.

rwightman · 2024-12-04T06:33:20Z

@turicas the huggingface-cli is installed in the python env (and will be in the path) when you pip install huggingface_hub, which is required by timm to access the weights via the API, so dep is there, but understand the extra step/lookup isn't desirable.

So, I could punch a cache_dir arg via create_model all the way to the hf hub dl (and the url downloader) which would override defaults. However, it needs to remain consistent with env vars that override HF cache, would be the same as HF_HUB_CACHE as it's still being treated as a HF cache folder and needs to match the naming, to be useable via the API.

pretrained_cfg_overlay was not really supposed to be a user fancing 'thing', it was supposed to be more of a developer testing/power user thing that could be used to override and key-value in the pretrained-cfg for various purposes, it happened to be useful here and filled a need for some to punt this feature here down the road (needed some more thought). I use it a lot in my model staging process, scripts to help with publishing, etc.

rwightman · 2024-12-05T15:37:16Z

@turicas I impl a first pass of the cache_dir as an argument idea in #2356 ... it lets you specify the cache dir per create call, but it still behaves as a cache dir.

turicas · 2024-12-06T02:17:09Z

@turicas the huggingface-cli is installed in the python env (and will be in the path) when you pip install huggingface_hub, which is required by timm to access the weights via the API, so dep is there, but understand the extra step/lookup isn't desirable.

Oh, my bad - I thought it would require an extra library, sorry.

@turicas I impl a first pass of the cache_dir as an argument idea in #2356 ... it lets you specify the cache dir per create call, but it still behaves as a cache dir.

Nice! I tested and it worked perfectly. Thanks for this. I've also created a little PR with an example of cache_dir usage.

rwightman · 2024-12-06T04:52:50Z

@turicas thanks for confirming, I'll merge that soon, and your example. Will probably be closer to the end of the month before it makes it into another pypi release though.

rwightman added enhancement New feature or request help wanted Extra attention is needed labels Nov 19, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[FEATURE] Load model config AND weights from local folder (support locally cloned HF hub models) #2338

[FEATURE] Load model config AND weights from local folder (support locally cloned HF hub models) #2338

rwightman commented Nov 19, 2024 •

edited

Loading

rwightman commented Nov 19, 2024

rwightman commented Nov 19, 2024

turicas commented Dec 4, 2024

rwightman commented Dec 4, 2024

rwightman commented Dec 5, 2024

turicas commented Dec 6, 2024

rwightman commented Dec 6, 2024

[FEATURE] Load model config AND weights from local folder (support locally cloned HF hub models) #2338

[FEATURE] Load model config AND weights from local folder (support locally cloned HF hub models) #2338

Comments

rwightman commented Nov 19, 2024 • edited Loading

rwightman commented Nov 19, 2024

rwightman commented Nov 19, 2024

turicas commented Dec 4, 2024

rwightman commented Dec 4, 2024

rwightman commented Dec 5, 2024

turicas commented Dec 6, 2024

rwightman commented Dec 6, 2024

rwightman commented Nov 19, 2024 •

edited

Loading