
refactor CPU llama inference code #728

Merged into huggingface:main on Jun 7, 2024 (31 commits)
Conversation

@faaany (Contributor) commented on May 26, 2024

What does this PR do?

This PR refactors the current CPU llama inference code to make the code cleaner. The major changes are as follows:

  • introduce a new class _IPEXLlamaAttention and move the attention-related ops and attention forward code into _IPEXLlamaAttention
  • introduce a new class _IPEXLlamaMLP and move the MLP-related ops and forward code into _IPEXLlamaMLP
  • simplify _patch_llama_model
  • rename _IPEXLlamaDecoderLayerRef to _IPEXLlamaDecoderLayer
  • refactor the forward method of _IPEXLlamaAttention into gemm, rope and sdpa steps (see the sketch after this description)

Please note that this PR is based on the not-yet-merged PR #725 by Jiqing, as can be seen in the commit history.
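For readers skimming the diff, here is a rough sketch of the refactored layout. This is not the actual implementation: the constructor arguments, the method name qkv_gemm, and the placeholder bodies are assumptions made only to illustrate how the new classes relate to each other.

```python
from torch import nn


class _IPEXLlamaAttention(nn.Module):
    """Owns the attention-related ops; its forward is split into gemm, rope and sdpa steps."""

    def __init__(self, module, config):
        super().__init__()
        self.module = module  # original HF attention module whose weights are reused
        self.config = config

    def qkv_gemm(self, hidden_states):
        # Project hidden states into query/key/value (placeholder for illustration).
        raise NotImplementedError

    def rope(self, query, key, position_ids):
        # Apply rotary position embeddings to query and key (placeholder).
        raise NotImplementedError

    def sdpa(self, query, key, value, attention_mask, past_key_value):
        # Scaled dot-product attention over the (possibly cached) key/value states (placeholder).
        raise NotImplementedError

    def forward(self, hidden_states, attention_mask=None, position_ids=None, past_key_value=None, **kwargs):
        query, key, value = self.qkv_gemm(hidden_states)
        query, key = self.rope(query, key, position_ids)
        attn_output, past_key_value = self.sdpa(query, key, value, attention_mask, past_key_value)
        return attn_output, past_key_value


class _IPEXLlamaMLP(nn.Module):
    """Owns the MLP-related ops (projections and activation) and their forward."""

    def __init__(self, module, config):
        super().__init__()
        self.module = module
        self.config = config


class _IPEXLlamaDecoderLayer(nn.Module):
    """Replaces _IPEXLlamaDecoderLayerRef and composes the attention and MLP blocks."""

    def __init__(self, module, config):
        super().__init__()
        self.attn = _IPEXLlamaAttention(module.self_attn, config)
        self.mlp = _IPEXLlamaMLP(module.mlp, config)
```

Under this layout, simplifying _patch_llama_model presumably amounts to swapping each decoder layer for an _IPEXLlamaDecoderLayer rather than patching individual ops in place; the actual wiring lives in optimum/exporters/ipex/model_patcher.py and optimum/exporters/ipex/modeling_utils.py.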

@HuggingFaceDocBuilderDev commented

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Review threads (resolved) on optimum/exporters/ipex/modeling_utils.py, tests/ipex/test_modeling.py and optimum/exporters/ipex/model_patcher.py.
@echarlaix (Collaborator) left a comment

Looks good, thanks @faaany! Let's wait for #725 to be merged before merging.

@echarlaix (Collaborator) commented

#725 is now merged, would you mind rebasing @faaany?

@faaany (Contributor, Author) commented on Jun 6, 2024

> #725 is now merged, would you mind rebasing @faaany?

Cool! Rebase done, please have a review. Thanks!

@echarlaix (Collaborator) left a comment

LGTM, thanks @faaany

Review threads (resolved) on tests/ipex/test_modeling.py.
Comment on lines -234 to -238

```python
if is_transformers_version("<", _TRANSFORMERS_MIN_VERSION) or is_transformers_version(
    ">", _TRANSFORMERS_MAX_VERSION
):
    raise ImportError(
        f"Only transformers versions {_TRANSFORMERS_MIN_VERSION} ~ {_TRANSFORMERS_MAX_VERSION} are verified."
    )
```
Collaborator commented

We should keep this check, since only the transformers versions in between are supported, but it should be moved. What do you think about:

@faaany (Contributor, Author) replied

Yes, totally agree! Code updated.

@faaany (Contributor, Author) commented

Thanks for the detailed review! The rebase and merge conflicts did bring some headaches.

Review thread (resolved) on optimum/intel/utils/modeling_utils.py.
faaany and others added 2 commits on June 7, 2024 at 00:48
Co-authored-by: Ella Charlaix <80481427+echarlaix@users.noreply.github.com>
@faaany (Contributor, Author) commented on Jun 7, 2024

Hi @echarlaix, I manually checked the changes in #725 and fixed the bugs introduced by the rebase. Now all tests pass; I think we are good to go.

@echarlaix (Collaborator) left a comment

Great work, thanks a lot @faaany!

@echarlaix merged commit 36e5b23 into huggingface:main on Jun 7, 2024. 13 checks passed.