
Update to NNCF 2.11 #763

Merged: 9 commits, Jun 18, 2024
Conversation

nikita-savelyevv (Collaborator) commented Jun 12, 2024

What does this PR do?

  • Update NNCF requirement to version 2.11
  • Enable scale estimation weight compression algorithm
  • Add an awq option to OVWeightQuantizationConfig so that AWQ can be enabled by providing OVWeightQuantizationConfig(awq=True) in addition to the already present OVWeightQuantizationConfig(quant_method=QuantizationMethod.AWQ); see the sketch after this list.
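
A minimal sketch of the configuration paths described above (bits and dataset values are illustrative; note that, per the review discussion below, the awq argument was ultimately dropped in favor of quant_method):

    from optimum.intel import OVWeightQuantizationConfig
    from transformers.utils.quantization_config import QuantizationMethod

    # As proposed in this PR: AWQ as an additive boolean flag.
    proposed = OVWeightQuantizationConfig(bits=4, awq=True, dataset="wikitext2")

    # Already-present path, kept working for backward compatibility.
    existing = OVWeightQuantizationConfig(
        bits=4, quant_method=QuantizationMethod.AWQ, dataset="wikitext2"
    )

    # Scale estimation enabled via the new option; the parameter name
    # scale_estimation is an assumption based on the PR description.
    with_se = OVWeightQuantizationConfig(bits=4, scale_estimation=True, dataset="wikitext2")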

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

Comment on lines 258 to 261
if self.quant_method == QuantizationMethod.AWQ:
    self.quant_method = OVQuantizationMethod.DEFAULT
    self.awq = True
    logger.warning('Using quant_method="AWQ" is deprecated. Please use awq=True instead in the future.')
nikita-savelyevv (Collaborator, Author) commented Jun 12, 2024

I decided to change the API for applying AWQ via source code, i.e. when creating OVWeightQuantizationConfig.

Despite the fact that there is a QuantizationMethod.AWQ enum, this approach does not quite fit the overall NNCF weight-compression API, because AWQ is not a specific standalone method like, for example, hybrid quantization. It is just another algorithm, similar to Scale Estimation, that can additionally be applied on top of regular weight compression.

I've added backward-compatible logic to still apply AWQ if it is provided via the previous API.

During export via optimum-cli it already works this way, i.e. there is an --awq argument, so this change additionally aligns the Python API with the CLI; an illustrative invocation follows.
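
A hypothetical CLI invocation of the kind referred to above (the model ID, dataset, and output directory are placeholders; --awq, --weight-format, and --dataset are the existing optimum-cli export openvino options the comment mentions):

    optimum-cli export openvino \
      --model meta-llama/Llama-2-7b-hf \
      --weight-format int4 \
      --awq \
      --dataset wikitext2 \
      llama2_int4_awq_ov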


@nikita-savelyevv nikita-savelyevv marked this pull request as ready for review June 12, 2024 17:19
@AlexKoff88 AlexKoff88 requested a review from echarlaix June 13, 2024 08:01
@@ -253,8 +263,9 @@ def run(self):
     "all_layers": None if is_int8 else self.args.all_layers,
     "dataset": self.args.dataset,
     "num_samples": self.args.num_samples,
-    "quant_method": QuantizationMethod.AWQ if self.args.awq else None,
+    "awq": self.args.awq,
Collaborator
I think it originally looked like this, but @echarlaix asked to make the AWQ option unified with the rest of the HF ecosystem.

echarlaix (Collaborator)

Yes, I'd be in favor of keeping the quant_method attribute to keep compatibility with what's done in transformers. I think we can have an awq argument instead of a quant_method though, if easier: OVWeightQuantizationConfig(awq=True). Even so, I have a slight preference for something like OVWeightQuantizationConfig(quant_method="awq"), so that when (if) new quantization methods are added it will be easier than having a separate argument for each of them.

nikita-savelyevv (Collaborator, Author) commented Jun 17, 2024

@echarlaix Thanks for the idea. We could indeed keep both awq=True and quant_method="awq". We can remove the awq parameter once other relevant options become available.

@AlexKoff88 (Collaborator)

I am OK with the rest of the changes, but how the AWQ option looks should be clarified. @echarlaix, can you please comment?

@echarlaix left a comment

Comment on lines 218 to 219
if awq:
    self.quant_method = QuantizationMethod.AWQ
echarlaix (Collaborator)

I'd prefer to have only one path to enable AWQ (currently we have both quant_method and awq arguments). What do you think about something like OVWeightQuantizationConfig(quant_method="awq")?

nikita-savelyevv (Collaborator, Author)

@echarlaix Ah, I thought you meant it was fine to keep both. OK, I removed the awq argument.
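
After this change, a sketch of the single remaining path (values are illustrative):

    from optimum.intel import OVWeightQuantizationConfig
    from transformers.utils.quantization_config import QuantizationMethod

    # AWQ is now enabled only through quant_method.
    config = OVWeightQuantizationConfig(
        bits=4, quant_method=QuantizationMethod.AWQ, dataset="wikitext2"
    )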

echarlaix (Collaborator)

thanks a lot @nikita-savelyevv

@echarlaix (Collaborator)

Thanks a lot @nikita-savelyevv, will merge once the tests pass.

@echarlaix echarlaix merged commit 109c3a1 into huggingface:main Jun 18, 2024
11 of 16 checks passed