Replies: 1 comment
-
@drzraf it's documented; did you try what's in the documentation without success? https://huggingface.co/timm/ViT-SO400M-14-SigLIP-384#with-timm-for-image-embeddings
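For reference, a minimal sketch of the kind of image-embedding usage that model-card section describes. This is an approximation, assuming the timm model name `vit_so400m_patch14_siglip_384` and the standard timm data-config helpers, not a verbatim copy of the documented snippet:

```python
import torch
import timm
from PIL import Image

# num_classes=0 removes the classifier head, so the forward pass returns the
# pooled image embedding (shape [batch, embed_dim]) instead of class logits.
model = timm.create_model(
    'vit_so400m_patch14_siglip_384',  # assumed timm name for ViT-SO400M-14-SigLIP-384
    pretrained=True,
    num_classes=0,
).eval()

# Build the eval-time preprocessing from the model's pretrained config, so the
# input size and normalization match what the weights expect.
data_config = timm.data.resolve_model_data_config(model)
transform = timm.data.create_transform(**data_config, is_training=False)

img = Image.open('example.jpg').convert('RGB')
with torch.no_grad():
    embedding = model(transform(img).unsqueeze(0))                # (1, embed_dim)
    tokens = model.forward_features(transform(img).unsqueeze(0))  # unpooled patch tokens
```

The same pattern should apply to any timm ViT used as an embedding backbone.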
-
I have some inference code around `open_clip`, derived from their README (https://github.com/mlfoundations/open_clip/) and sketched below. But after excavating timm's code, issues, and discussions, I still can't find a way to do the same with timm.
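For context, the open_clip README-style embedding extraction referred to above looks roughly like this (the `hf-hub:` identifier is taken from the model card; treat the details as illustrative):

```python
import torch
from PIL import Image
import open_clip

# Load the SigLIP model and its preprocessing from the Hugging Face Hub.
model, _, preprocess = open_clip.create_model_and_transforms(
    'hf-hub:timm/ViT-SO400M-14-SigLIP-384')
tokenizer = open_clip.get_tokenizer('hf-hub:timm/ViT-SO400M-14-SigLIP-384')
model.eval()

image = preprocess(Image.open('example.jpg').convert('RGB')).unsqueeze(0)
text = tokenizer(['a diagram', 'a dog', 'a cat'])

with torch.no_grad():
    image_features = model.encode_image(image)   # (1, embed_dim)
    text_features = model.encode_text(text)      # (3, embed_dim)
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)
```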
In timm there are, indeed:

- a `feature_cfg` parameter passed to `create_model()`
- `ClassifierHead()` / `create_classifier`, mostly initialized by `create_model` (a `VisionTransformer` in my case of a `vit_so400m*`)
- an `accuracy` function

Simply said: the API assumes the developer already knows what `global_pool="avgmax"`, `fc_norm`, `embed_dim`, and a dozen other parameters mean, along with their semantics and their direct and indirect implications. This is not exactly an intuitive API. While it definitely sounds flexible and powerful, not having snippets (nor tests) to start from makes approaching it somewhat time-consuming.

Could someone be so kind as to provide timm's equivalent of the open_clip usage snippet?
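To make the head/pooling parameters above concrete, here is a small generic sketch (using `vit_base_patch16_224` purely as an illustration; the behavior is assumed from current timm and is not specific to this discussion):

```python
import torch
import timm

x = torch.randn(1, 3, 224, 224)

# num_classes=0 replaces the classifier head with Identity, so the model
# returns the pooled representation of size embed_dim instead of logits.
backbone = timm.create_model('vit_base_patch16_224', num_classes=0)
print(backbone(x).shape)        # torch.Size([1, 768])

# global_pool='' additionally disables pooling, so the full token sequence
# (CLS + patch tokens) comes back from the forward pass.
tokens_model = timm.create_model('vit_base_patch16_224', num_classes=0, global_pool='')
print(tokens_model(x).shape)    # torch.Size([1, 197, 768])
```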
Somewhat related: mlfoundations/open_clip#685