ipex 2.3 released #725
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Hi @echarlaix. Thanks for your review. The core optimization for the llama2 model is IAKV (indirect-access KV cache), which changes the shape of the KV cache, so assisted-decoding helpers such as crop_past_key_values cannot be used. We can skip the assisted decoding tests for now and try to enable them in the future.
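For context, a minimal sketch of what skipping such a test could look like; the class and test names here are illustrative assumptions, not the actual tests touched by this PR:

```python
# Sketch only: names are illustrative, not the actual tests in this PR.
import unittest


class IPEXModelForCausalLMTest(unittest.TestCase):
    @unittest.skip("Assisted decoding is unsupported: IAKV changes the KV-cache shape")
    def test_assisted_decoding_matches_transformers(self):
        # crop_past_key_values assumes the standard (batch, num_heads, seq_len, head_dim)
        # KV-cache layout, which does not hold for the IAKV-optimized model.
        ...
```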
@@ -259,16 +257,13 @@ def test_ipex_patching_beam_search(self, test_name, model_arch, use_cache):
    GenerationConfig(max_new_tokens=4, num_beams=4, do_sample=True),
    GenerationConfig(max_new_tokens=4, num_beams=8, do_sample=True),
    GenerationConfig(max_new_tokens=4, num_beams=32, do_sample=True),
    GenerationConfig(max_new_tokens=4, do_sample=not use_cache, top_p=1.0, top_k=5, penalty_alpha=0.6),
IPEXModel does not support _contrastive_search for now; we will try to enable it in the future.
I see. Could we then add a warning stating that this is not supported (at least for transformers >= v4.39.0), and maybe upgrade it in setup.py?
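A minimal sketch of what such a warning could look like; the version bound, placement, and wording are assumptions for illustration, not the code added in this PR:

```python
# Sketch only: version bound and message are assumptions, not from this PR.
import logging

from packaging import version
from transformers import __version__ as transformers_version

logger = logging.getLogger(__name__)

if version.parse(transformers_version) >= version.parse("4.39.0"):
    logger.warning(
        "Contrastive search is not supported by the patched IPEXModel for now; "
        "only greedy search and beam search are verified."
    )
```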
Co-authored-by: Ella Charlaix <80481427+echarlaix@users.noreply.github.com>
Hi @echarlaix. Replying to your comment: I have set
Replying to your comment: it is not related to the transformers version; it is a current limitation of IAKV. I added a warning clarifying that only greedy search and beam search are verified for the patched model. Please have a look, thanks!
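For illustration, a hedged usage sketch of the two verified generation modes; the checkpoint name is hypothetical and the API usage follows optimum-intel's documented interface, not code from this PR:

```python
# Sketch only: checkpoint and generation settings are illustrative.
from transformers import AutoTokenizer

from optimum.intel import IPEXModelForCausalLM

model_id = "meta-llama/Llama-2-7b-hf"  # hypothetical example checkpoint
model = IPEXModelForCausalLM.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("Hello, my name is", return_tensors="pt")
# Greedy search and beam search are the verified modes for the patched model.
greedy_out = model.generate(**inputs, max_new_tokens=4)
beam_out = model.generate(**inputs, max_new_tokens=4, num_beams=4)
```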
Co-authored-by: Ella Charlaix <80481427+echarlaix@users.noreply.github.com>
Hi @echarlaix. I have addressed all of your requested changes. Could you please take a look at the failing ipex CI? It is a strange import error.
Fixed!
Co-authored-by: Ella Charlaix <80481427+echarlaix@users.noreply.github.com>
Hi @echarlaix. I made some changes to the llama model since ipex 2.3 has been released. The API names have changed, and assisted decoding cannot be supported for now.