Add `from_huggingface` method to KerasNLP models #1294
I will check with other folks here, but I think this is something we probably will not want to pursue. We could mirror all our own presets on Hugging Face, or make it easier to bulk-convert HF checkpoints offline, but I do not think converting Hugging Face checkpoints "live" will be a good solution.

I think right now it makes sense to continue integrating with Kaggle (#1292), which will help us define an external-friendly format for our presets. Once we have that, we could consider exposing a set of tools to automatically convert Hugging Face models to our format on a best-effort basis. This would never include all models or all Hugging Face config options (I just don't see that happening feasibly), but it could be easy to use and sidestep the performance issues mentioned above.
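A minimal sketch of the kind of offline, best-effort conversion described above. `WEIGHT_MAP` and `convert_checkpoint` are illustrative names, not KerasNLP APIs, and the mapping entries are hypothetical:

```python
import numpy as np

# Static mapping from Hugging Face weight names to KerasNLP weight paths,
# maintained per supported architecture (shown here for a BERT-like model).
WEIGHT_MAP = {
    "bert.embeddings.word_embeddings.weight": "token_embedding/embeddings",
    "bert.embeddings.position_embeddings.weight": "position_embedding/embeddings",
    # ...one entry per weight tensor in the architecture
}

def convert_checkpoint(hf_state_dict, backbone):
    """Copy HF weights into a KerasNLP backbone, skipping unmapped keys."""
    keras_vars = {v.path: v for v in backbone.weights}  # Keras 3: Variable.path
    for hf_name, tensor in hf_state_dict.items():
        keras_name = WEIGHT_MAP.get(hf_name)
        if keras_name is None or keras_name not in keras_vars:
            print(f"Skipping unmapped weight: {hf_name}")  # best effort
            continue
        keras_vars[keras_name].assign(np.asarray(tensor))
```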
Hey there 🤗 I think there is a confusion in this issue between 2 different topics: the HF Hub is a platform to host and share all kinds of models, and not only `transformers` ones […].

I have actually worked on a fork to showcase what the implementation would look like: master...Wauplin:keras-nlp:huggingface-hub-integration. The integration requires the `huggingface_hub` library. Here is a Colab notebook showcasing it:

```python
import keras_nlp
from keras_nlp.models import BertClassifier
from keras_nlp.utils.preset_utils import save_to_preset

classifier = BertClassifier.from_preset("bert_base_en_uncased")
(...)  # train/retrain/fine-tune

# Save to Hugging Face Hub
save_to_preset(classifier, "hf://Wauplin/bert_base_en_uncased_retrained")

# Reload from Hugging Face Hub
classifier_reloaded = BertClassifier.from_preset("hf://Wauplin/bert_base_en_uncased_retrained")
```

Here is what it looks like once uploaded on the Hub: https://huggingface.co/Wauplin/bert_base_en_uncased_retrained/tree/main. WDYT? I am willing to help create a PR if this is of interest for the Keras team. It is essentially what's already in the fork plus some documentation and testing. On the Hugging Face side, we could make […].

Disclaimer: I work at Hugging Face as a maintainer of the `huggingface_hub` library.
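Under the hood, the `hf://` handle could be resolved with `huggingface_hub`'s download utilities. A minimal sketch, assuming `huggingface_hub` is installed; `resolve_preset_path` is a hypothetical helper, not the actual implementation in the fork:

```python
from huggingface_hub import snapshot_download

def resolve_preset_path(preset):
    """Map a preset handle to a local directory (hypothetical helper)."""
    if preset.startswith("hf://"):
        # Download the preset files from the Hub (or reuse the local cache).
        return snapshot_download(repo_id=preset.removeprefix("hf://"))
    return preset  # built-in preset name or local path
```

Saving could route through `huggingface_hub.upload_folder` in the same way, pushing the serialized preset directory to the named repo.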
Following my comment above, I've opened #1510 to continue the discussion :)
Thanks! Overall I totally agree with your comment. Let's add a […].

Re: conversion, I still do think a solid set of tooling for converting bi-directionally between the `transformers` format and the KerasNLP format is important, at least for popular architectures (Gemma, Llama, Mistral, Falcon, Bloom...). But I see more design questions there to nail down. Let's start with the Hub integration, and keep figuring out what we want for conversion tooling.

Note that for Gemma, @nkovela1 did give us a tool for HF export -> https://github.com/keras-team/keras-nlp/blob/master/tools/gemma/export_gemma_to_hf.py, so a flow of fine-tuning with Keras and exporting to vLLM, TGI, etc. is possible. But I suspect we might want to move stuff like that into the library proper at some point.
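To make the conversion question concrete, here is a minimal sketch of the kind of per-layer export step such tooling involves. It is illustrative only, not the Gemma script linked above, and assumes a Dense layer built with a bias:

```python
import torch

def export_dense(keras_dense, hf_state_dict, hf_prefix):
    """Copy one Keras Dense layer into a transformers-style state dict.

    Keras stores Dense kernels as (input_dim, output_dim); torch.nn.Linear
    stores (output_dim, input_dim), hence the transpose.
    """
    kernel, bias = keras_dense.get_weights()  # assumes use_bias=True
    hf_state_dict[f"{hf_prefix}.weight"] = torch.from_numpy(kernel.T.copy())
    hf_state_dict[f"{hf_prefix}.bias"] = torch.from_numpy(bias)
```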
Draft for public saving API -> #1512, though expect some changes. Comments welcome!
Add support for loading huggingface model checkpoints in KerasNLP backbones
Is your feature request related to a problem? Please describe.
As of now, KerasNLP backbones load pretrained weights only from their own standard checkpoints. However, there are lots of fine-tuned checkpoints on the Hugging Face Hub which already solve a wide range of problems. If we add support for HF checkpoints, we can bring Keras's multi-backend promise and KerasNLP's modular design to most of the NLP community.
Describe the solution you'd like
Implement a `from_huggingface` method that takes a checkpoint name from the Hugging Face Hub. All it will require is mapping layer names and implementing checkpoint-conversion scripts as methods; see the sketch below.
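A rough sketch of what such a method could look like; `convert_config` and `convert_weights` stand in for the per-architecture mapping described above and are not existing KerasNLP functions:

```python
@classmethod
def from_huggingface(cls, checkpoint, **kwargs):
    """Build a backbone from a Hugging Face checkpoint name (sketch)."""
    from transformers import AutoModel

    hf_model = AutoModel.from_pretrained(checkpoint)
    config = convert_config(hf_model.config)  # HF config -> backbone kwargs
    backbone = cls(**config, **kwargs)
    convert_weights(backbone, hf_model.state_dict())  # layer-name mapping
    return backbone
```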
Alternative solution
Instead of implementing a separate method, we could modify the `from_preset` method to use Hugging Face checkpoints. I'm up for contributing this feature.
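Under that alternative, a hypothetical dispatch inside `from_preset` could fall back to the Hub when the name is not a known preset (again a sketch, not existing code):

```python
@classmethod
def from_preset(cls, preset, **kwargs):
    if preset not in cls.presets:
        # Not a built-in preset: treat it as a Hugging Face checkpoint name.
        return cls.from_huggingface(preset, **kwargs)
    ...  # existing preset-loading logic unchanged
```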
cc: @abheesht17