Update to NNCF 2.11 #763
Conversation
Force-pushed from 129b51e to 3ff19f7
if self.quant_method == QuantizationMethod.AWQ:
    self.quant_method = OVQuantizationMethod.DEFAULT
    self.awq = True
    logger.warning('Using quant_method="AWQ" is deprecated. Please use awq=True instead in the future.')
I decided to change the API for applying AWQ via source code, i.e. when we create OVWeightQuantizationConfig.
Although there is a QuantizationMethod.AWQ enum value, this approach does not quite fit the overall NNCF weight compression API, because AWQ is just another method, similar to Scale Estimation, that can additionally be applied alongside regular weight compression, rather than a specific standalone method like hybrid quantization, for example.
I've added backward-compatible logic to still apply AWQ if it was provided via the previous API.
During export via optimum-cli it already works this way, i.e. there is an --awq argument, so this change additionally makes the Python API more aligned with the CLI.
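The backward-compatible mapping described above can be sketched roughly as follows. This is a simplified, self-contained mock, not the actual optimum-intel class: the real OVWeightQuantizationConfig has many more parameters, and the QuantMethod enum here stands in for the actual QuantizationMethod / OVQuantizationMethod values.

```python
import warnings
from enum import Enum


class QuantMethod(str, Enum):
    # stand-ins for OVQuantizationMethod.DEFAULT / QuantizationMethod.AWQ
    DEFAULT = "default"
    AWQ = "awq"


class WeightQuantizationConfigSketch:
    """Simplified mock of the config, showing only the AWQ shim."""

    def __init__(self, quant_method=QuantMethod.DEFAULT, awq=False):
        if quant_method == QuantMethod.AWQ:
            # Legacy API path: translate quant_method="awq" to the new awq=True flag.
            warnings.warn(
                'Using quant_method="AWQ" is deprecated. Please use awq=True instead.',
                DeprecationWarning,
            )
            quant_method = QuantMethod.DEFAULT
            awq = True
        self.quant_method = quant_method
        self.awq = awq
```

With a shim like this, a legacy WeightQuantizationConfigSketch(quant_method=QuantMethod.AWQ) ends up equivalent to WeightQuantizationConfigSketch(awq=True), while emitting a deprecation warning.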
cc @AlexKoff88
optimum/commands/export/openvino.py
Outdated
@@ -253,8 +263,9 @@ def run(self):
     "all_layers": None if is_int8 else self.args.all_layers,
     "dataset": self.args.dataset,
     "num_samples": self.args.num_samples,
-    "quant_method": QuantizationMethod.AWQ if self.args.awq else None,
+    "awq": self.args.awq,
I think originally it looked like this, but @echarlaix asked to make the AWQ option unified with the rest of the HF ecosystem.
Yes, I'd be in favor of keeping the quant_method attribute to keep compatibility with what's done in transformers. I think we can have an awq argument instead of a quant_method if that's easier: OVWeightQuantizationConfig(awq=True).
Even so, I have a slight preference for something like OVWeightQuantizationConfig(quant_method="awq"), so that when (if) new quantization methods are added, it will be easier than having a separate argument for each of them.
@echarlaix Thanks for the idea. We could indeed keep both awq=True and quant_method="awq". We can remove the awq parameter once other relevant options become available.
I am ok with the rest of the changes, but the way the AWQ option looks should be clarified. @echarlaix, can you please comment?
LGTM @nikita-savelyevv
Force-pushed from d9fbac2 to b796fd0
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
if awq:
    self.quant_method = QuantizationMethod.AWQ
I'd prefer to only have one path to enable AWQ (currently we have both quant_method and awq arguments). What do you think about something like OVWeightQuantizationConfig(quant_method="awq")?
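That single-path design could look roughly like the hypothetical sketch below, where quant_method is the only switch and future methods extend one enum instead of each adding a boolean flag. The names here are illustrative, not the actual optimum-intel API.

```python
from enum import Enum


class QuantMethod(str, Enum):
    DEFAULT = "default"
    AWQ = "awq"
    # Future methods would extend this enum, e.g. SCALE_ESTIMATION = "scale_estimation",
    # without adding a new constructor argument per method.


class SinglePathConfigSketch:
    """Accepts either the enum member or its string value for quant_method."""

    def __init__(self, quant_method="default"):
        # Enum(value) normalizes "awq" -> QuantMethod.AWQ and rejects unknown strings.
        self.quant_method = QuantMethod(quant_method)
```

With this shape, SinglePathConfigSketch(quant_method="awq") and SinglePathConfigSketch(quant_method=QuantMethod.AWQ) are interchangeable, and an unsupported string raises a ValueError from the enum lookup.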
@echarlaix Ah, I thought you meant it's fine to keep both. OK, removed the awq argument.
Thanks a lot @nikita-savelyevv, will merge once the tests pass.
What does this PR do?
Adds an awq option to OVWeightQuantizationConfig so that AWQ can be enabled by providing OVWeightQuantizationConfig(awq=True), besides the already present OVWeightQuantizationConfig(quant_method=QuantizationMethod.AWQ).
Before submitting