ipex 2.3 released #725
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Hi @echarlaix. Thanks for your review. The core optimization for the llama2 model is IAKV (indirect-access KV cache), which changes the shape of the KV cache, so assisted-decoding helpers such as crop_past_key_values cannot be used. We can skip the assisted decoding tests for now and try to enable them in the future.
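For context, a minimal sketch of what skipping such a test could look like; the class and test names here are illustrative assumptions, not the actual tests touched by this PR:

```python
# Sketch only: names are illustrative, not the actual tests in this PR.
import unittest


class IPEXModelForCausalLMTest(unittest.TestCase):
    @unittest.skip("Assisted decoding is unsupported: IAKV changes the KV-cache shape")
    def test_assisted_decoding_matches_transformers(self):
        # crop_past_key_values assumes the standard (batch, num_heads, seq_len, head_dim)
        # KV-cache layout, which does not hold for the IAKV-optimized model.
        ...
```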
@@ -259,16 +257,13 @@ def test_ipex_patching_beam_search(self, test_name, model_arch, use_cache):
    GenerationConfig(max_new_tokens=4, num_beams=4, do_sample=True),
    GenerationConfig(max_new_tokens=4, num_beams=8, do_sample=True),
    GenerationConfig(max_new_tokens=4, num_beams=32, do_sample=True),
    GenerationConfig(max_new_tokens=4, do_sample=not use_cache, top_p=1.0, top_k=5, penalty_alpha=0.6),
IPEXModel does not support _contrastive_search for now; we will try to enable it in the future.
I see. Could we then add a warning stating that this is not supported (at least for transformers >= v4.39.0), and maybe upgrade it in setup.py?
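A minimal sketch of what such a warning could look like; the version bound, placement, and wording are assumptions for illustration, not the code added in this PR:

```python
# Sketch only: version bound and message are assumptions, not from this PR.
import logging

from packaging import version
from transformers import __version__ as transformers_version

logger = logging.getLogger(__name__)

if version.parse(transformers_version) >= version.parse("4.39.0"):
    logger.warning(
        "Contrastive search is not supported by the patched IPEXModel for now; "
        "only greedy search and beam search are verified."
    )
```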
Co-authored-by: Ella Charlaix <80481427+echarlaix@users.noreply.github.com>
Hi @echarlaix. Replying to your comment: I have set
Replying to your comment: it is not related to the transformers version; it is a current limitation of IAKV. I added a warning clarifying that only greedy search and beam search are verified for the patched model. Please have a look, thanks!
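For illustration, a hedged usage sketch of the two verified generation modes; the checkpoint name is hypothetical and the API usage follows optimum-intel's documented interface, not code from this PR:

```python
# Sketch only: checkpoint and generation settings are illustrative.
from transformers import AutoTokenizer

from optimum.intel import IPEXModelForCausalLM

model_id = "meta-llama/Llama-2-7b-hf"  # hypothetical example checkpoint
model = IPEXModelForCausalLM.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("Hello, my name is", return_tensors="pt")
# Greedy search and beam search are the verified modes for the patched model.
greedy_out = model.generate(**inputs, max_new_tokens=4)
beam_out = model.generate(**inputs, max_new_tokens=4, num_beams=4)
```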
Co-authored-by: Ella Charlaix <80481427+echarlaix@users.noreply.github.com>
Hi @echarlaix. I have addressed all of your requested changes. Could you please take a look at the failing ipex CI? It is a strange import error.
Fixed!
Co-authored-by: Ella Charlaix <80481427+echarlaix@users.noreply.github.com>
Hi @echarlaix. I made some changes to the llama model since ipex 2.3 has been released. The API names have changed, and assisted decoding cannot be supported for now.