[Bug]: Cannot load a model with flan-t5 embeddings #3581
Comments
Notice that the original reproducing script doesn't reproduce the error. I managed to minify the script a bit and reproduce the error with the following:

from pathlib import Path

from flair.data import Dictionary
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

embeddings = TransformerWordEmbeddings(
    model="google/flan-t5-small",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=False,
)
tagger = SequenceTagger(embeddings, Dictionary(), "ner")
save_path = Path("flan-t5.pt")
tagger.save(save_path)
del tagger
loaded_tagger = SequenceTagger.load(save_path)

It is important to note that the bug requires a newer version of transformers; it doesn't occur with older versions.
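(A quick sanity check before running the script: confirm which transformers version is active. 4.46.3 is simply the version from the environment section below, not a known hard boundary.)

import transformers
print(transformers.__version__)  # the failure was observed on 4.46.3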
I could also reproduce this. My observations so far:
In the Fast Tokenizer case it helped to remove the line at flair/flair/embeddings/transformer.py, line 1084 (commit 68508cc). But then you can't use the non-fast variant.
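For context, the fast and non-fast tokenizer variants discussed here can be loaded directly through transformers. A minimal sketch (the model name is taken from the repro script above; the non-fast variant additionally requires sentencepiece to be installed):

from transformers import AutoTokenizer

fast_tok = AutoTokenizer.from_pretrained("google/flan-t5-small", use_fast=True)   # T5TokenizerFast
slow_tok = AutoTokenizer.from_pretrained("google/flan-t5-small", use_fast=False)  # T5Tokenizer (sentencepiece-based)
print(type(fast_tok).__name__, type(slow_tok).__name__)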
This is already fixed via #3544. Updating to flair==0.15.0 (today's release) solves the issue.
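A minimal way to verify the fix, assuming the same save path as in the repro script (flair exposes __version__ in recent releases):

# pip install -U flair==0.15.0
import flair
print(flair.__version__)  # expect 0.15.0 or newer

from flair.models import SequenceTagger
loaded_tagger = SequenceTagger.load("flan-t5.pt")  # should no longer raise a TypeError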
Describe the bug
To Reproduce
Expected behavior
The trained model should load OK so it can be used to make predictions.
This exact code works OK if I use the xlm-roberta-large model instead of google/flan-t5-large:
embeddings = TransformerWordEmbeddings(
    model="xlm-roberta-large",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
    use_context=True,
)
Logs and Stack traces
Screenshots
No response
Additional Context
TLDR:
Training and using the trained model works great when using xlm-roberta-large.
Training works OK when using flan-t5-large, but loading the trained model fails with a TypeError.
I need some help on how to use the saved/fine-tuned model if the original embedding was based on flan-t5 (see the usage sketch below).
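For reference, a minimal sketch of the intended prediction usage once loading succeeds (e.g., after the flair upgrade mentioned in the comments). The save path and "ner" tag type come from the repro script above; the example sentence is made up:

from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load("flan-t5.pt")
sentence = Sentence("George Washington went to Washington.")
tagger.predict(sentence)
for span in sentence.get_spans("ner"):
    print(span)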
Environment
Versions:
Flair: 0.14.0
PyTorch: 2.5.1+cu121
Transformers: 4.46.3
GPU: True