Update to NNCF 2.11 #763
Conversation
Force-pushed from 129b51e to 3ff19f7
if self.quant_method == QuantizationMethod.AWQ:
    self.quant_method = OVQuantizationMethod.DEFAULT
    self.awq = True
    logger.warning('Using quant_method="AWQ" is deprecated. Please use awq=True instead in the future.')
I decided to change the API for applying AWQ via source code, i.e. when we create OVWeightQuantizationConfig.
Although there is a QuantizationMethod.AWQ enum value, this approach does not quite fit the overall NNCF weight compression API, because AWQ is just another method, similar to Scale Estimation, that can additionally be applied alongside regular weight compression, rather than a specific standalone method like hybrid quantization, for example.
I've added backward-compatible logic to still apply AWQ if it was provided via the previous API.
During export via optimum-cli it already works this way, i.e. there is an --awq argument, so this change additionally makes the Python API more aligned with the CLI.
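The backward-compatible mapping described above can be sketched roughly as follows. This is a simplified, self-contained mock, not the actual optimum-intel class: the real OVWeightQuantizationConfig has many more parameters, and the QuantMethod enum here stands in for the actual QuantizationMethod / OVQuantizationMethod values.

```python
import warnings
from enum import Enum


class QuantMethod(str, Enum):
    # stand-ins for OVQuantizationMethod.DEFAULT / QuantizationMethod.AWQ
    DEFAULT = "default"
    AWQ = "awq"


class WeightQuantizationConfigSketch:
    """Simplified mock of the config, showing only the AWQ shim."""

    def __init__(self, quant_method=QuantMethod.DEFAULT, awq=False):
        if quant_method == QuantMethod.AWQ:
            # Legacy API path: translate quant_method="awq" to the new awq=True flag.
            warnings.warn(
                'Using quant_method="AWQ" is deprecated. Please use awq=True instead.',
                DeprecationWarning,
            )
            quant_method = QuantMethod.DEFAULT
            awq = True
        self.quant_method = quant_method
        self.awq = awq
```

With a shim like this, a legacy WeightQuantizationConfigSketch(quant_method=QuantMethod.AWQ) ends up equivalent to WeightQuantizationConfigSketch(awq=True), while emitting a deprecation warning.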
cc @AlexKoff88
optimum/commands/export/openvino.py
Outdated
@@ -253,8 +263,9 @@ def run(self):
     "all_layers": None if is_int8 else self.args.all_layers,
     "dataset": self.args.dataset,
     "num_samples": self.args.num_samples,
-    "quant_method": QuantizationMethod.AWQ if self.args.awq else None,
+    "awq": self.args.awq,
I think originally it looked like this, but @echarlaix asked to make the AWQ option unified with the rest of the HF ecosystem.
Yes, I'd be in favor of keeping the quant_method attribute to keep compatibility with what's done in transformers. I think we can have an awq argument instead of a quant_method if that's easier: OVWeightQuantizationConfig(awq=True).
Even so, I have a slight preference for something like OVWeightQuantizationConfig(quant_method="awq"), so that when (if) new quantization methods are added, it will be easier than having a separate argument for each of them.
@echarlaix Thanks for the idea. We could indeed keep both awq=True and quant_method="awq". We can remove the awq parameter once other relevant options become available.
I am ok with the rest of the changes, but the way the AWQ option looks should be clarified. @echarlaix, can you please comment?
LGTM @nikita-savelyevv
Force-pushed from d9fbac2 to b796fd0
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
if awq:
    self.quant_method = QuantizationMethod.AWQ
I'd prefer to only have one path to enable AWQ (currently we have both quant_method and awq arguments). What do you think about something like OVWeightQuantizationConfig(quant_method="awq")?
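That single-path design could look roughly like the hypothetical sketch below, where quant_method is the only switch and future methods extend one enum instead of each adding a boolean flag. The names here are illustrative, not the actual optimum-intel API.

```python
from enum import Enum


class QuantMethod(str, Enum):
    DEFAULT = "default"
    AWQ = "awq"
    # Future methods would extend this enum, e.g. SCALE_ESTIMATION = "scale_estimation",
    # without adding a new constructor argument per method.


class SinglePathConfigSketch:
    """Accepts either the enum member or its string value for quant_method."""

    def __init__(self, quant_method="default"):
        # Enum(value) normalizes "awq" -> QuantMethod.AWQ and rejects unknown strings.
        self.quant_method = QuantMethod(quant_method)
```

With this shape, SinglePathConfigSketch(quant_method="awq") and SinglePathConfigSketch(quant_method=QuantMethod.AWQ) are interchangeable, and an unsupported string raises a ValueError from the enum lookup.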
@echarlaix Ah, I thought you meant it's fine to keep both. OK, removed the awq argument.
Thanks a lot @nikita-savelyevv, will merge once the tests pass.
What does this PR do?
Adds an awq option to OVWeightQuantizationConfig so that AWQ can be enabled by providing OVWeightQuantizationConfig(awq=True), besides the already present OVWeightQuantizationConfig(quant_method=QuantizationMethod.AWQ).
Before submitting