I don't think generate() is intended to work automatically with torch.jit.trace(), is it? cc @gante - do we have a recommended way to trace/export generation loops?
In the meantime, I performed inference using the IPEX backend. Below is the snippet I used:
```python
model = ipex.llm.optimize(model, dtype=amp_dtype, inplace=True, deployment_mode=True)
output = model.generate(inputs)
```
When deployment_mode=True is set, it internally uses torch.jit.trace (Reference). To use the traced model for generate, some modifications have been made in IPEX (Reference).
Do we have any similar fix or approach like this in native PyTorch?
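As far as I know, native PyTorch has no direct equivalent of IPEX's deployment_mode. One workaround (a sketch, not an official API) is to trace only the model's forward pass, which returns logits, and drive the decoding loop manually in Python, since generate()'s data-dependent control flow cannot be captured by torch.jit.trace. The TinyLM below is a toy stand-in for an actual causal LM; a real Llama model would also need KV-cache handling, which this sketch omits:

```python
import torch
import torch.nn as nn

# Toy stand-in for a causal LM. With a real HF model you would trace the
# forward pass (logits out), not the generate() method.
class TinyLM(nn.Module):
    def __init__(self, vocab=16, dim=8):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.head = nn.Linear(dim, vocab)

    def forward(self, input_ids):
        return self.head(self.emb(input_ids))  # [batch, seq, vocab] logits

model = TinyLM().eval()
example = torch.zeros(1, 4, dtype=torch.long)
traced = torch.jit.trace(model, example)  # trace the forward pass only

# Greedy decoding loop kept outside the traced graph, in plain Python.
ids = torch.tensor([[1, 2, 3]])
with torch.no_grad():
    for _ in range(5):
        logits = traced(ids)
        next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        ids = torch.cat([ids, next_id], dim=-1)
print(ids.shape)  # grew by 5 tokens
```

This works here because embedding and linear layers trace in a shape-agnostic way; models whose forward branches on input values or shapes would need torch.jit.script or a fixed-length setup instead.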
Environment Info:
I am trying to run inference for the model "meta-llama/Llama-3.1-8B-Instruct" using torch.jit.trace(). Below is my code snippet.
However, I am encountering errors when I pass the traced_model to model.generate(). Is there any way to overcome this error?
However, when I try a different version of transformers (4.43.2), I get a different error:
AttributeError: 'RecursiveScriptModule' object has no attribute 'generate'
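This AttributeError is expected: torch.jit.trace records only the forward pass, and the module it returns does not carry over other Python-level methods such as transformers' generate(). A minimal demonstration with a toy module (not Llama):

```python
import torch
import torch.nn as nn

class Toy(nn.Module):
    def forward(self, x):
        return x * 2

    def generate(self, x):  # extra Python method, like transformers' generate()
        return self.forward(x) + 1

m = Toy()
traced = torch.jit.trace(m, torch.ones(2))
print(hasattr(m, "generate"))       # True
print(hasattr(traced, "generate"))  # False: only forward survives tracing
```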
Is there a fix for this issue in any version of Transformers, or an alternative approach to enable generate with a traced model?