[Torch][WeightCompression] Add Scale Estimation data-aware support #3179
base: develop
Changed file: OpenVINO reference models (one hunk, adds a single-layer reference model):

```diff
@@ -1185,3 +1185,18 @@ def _create_ov_model(self):
         model = ov.Model([sin_result, cos_result], [position_ids])
         return model
+
+
+class MLP(OVReferenceModel):
+    def _create_ov_model(self):
+        input_node = opset.parameter([1, 32, 32], name="Input")
+
+        weights_data = np.arange(0, 32 * 32, dtype=np.float32).reshape(32, 32)
+        weights_node = opset.constant(weights_data, dtype=np.float32, name="Weights")
+
+        matmul_node = opset.matmul(input_node, weights_node, transpose_a=False, transpose_b=True, name="MatMul")
+
+        result_node = opset.result(matmul_node, name="Result")
+
+        model = ov.Model([result_node], [input_node], name="MLP_Model")
+        return model
```

Review comment (on `class MLP(OVReferenceModel):`): To be honest, it is not an MLP. There is only one layer.
Reply: done.
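Outside the OpenVINO graph, what this reference model computes can be checked with plain NumPy (a sketch for orientation only; it mirrors the `weights_data` above and the `transpose_b=True` MatMul, independent of the test harness):

```python
import numpy as np

# Same weights as the reference model: values 0..1023, row-major, shape (32, 32).
weights = np.arange(0, 32 * 32, dtype=np.float32).reshape(32, 32)

# With transpose_b=True the node computes input @ weights.T
# for an input of shape [1, 32, 32].
x = np.ones((1, 32, 32), dtype=np.float32)
y = x @ weights.T

print(y.shape)     # (1, 32, 32)
print(y[0, 0, 0])  # dot of an all-ones row with row 0 of weights: 0 + 1 + ... + 31 = 496.0
```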
New file: PyTorch mixed-precision backend tests (38 added lines; the `: int` hint is added to `get_hawq_with_backend` for consistency with its siblings):

```python
# Copyright (c) 2024 Intel Corporation
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#      http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from nncf.quantization.algorithms.weight_compression.mixed_precision import HAWQCriterion
from nncf.quantization.algorithms.weight_compression.mixed_precision import MaxVarianceCriterion
from nncf.quantization.algorithms.weight_compression.mixed_precision import MeanMaxCriterion
from nncf.quantization.algorithms.weight_compression.mixed_precision import MeanVarianceCriterion
from nncf.quantization.algorithms.weight_compression.torch_backend import PTMixedPrecisionAlgoBackend
from tests.cross_fw.test_templates.test_weights_compression_backends import TemplateTestMixedPrecisionAlgoBackend


class TestPTMixedPrecisionAlgoBackend(TemplateTestMixedPrecisionAlgoBackend):
    def get_hawq_with_backend(self, subset_size: int):
        hawq = HAWQCriterion(None, None, subset_size=subset_size)
        hawq._backend_entity = PTMixedPrecisionAlgoBackend()
        return hawq

    def get_mean_variance_with_backend(self, subset_size: int):
        mean_variance = MeanVarianceCriterion(None, None, subset_size=subset_size)
        mean_variance._backend_entity = PTMixedPrecisionAlgoBackend()
        return mean_variance

    def get_max_variance_with_backend(self, subset_size: int):
        max_variance = MaxVarianceCriterion(None, None, subset_size=subset_size)
        max_variance._backend_entity = PTMixedPrecisionAlgoBackend()
        return max_variance

    def get_mean_max_with_backend(self, subset_size: int):
        mean_max_variance = MeanMaxCriterion(None, None, subset_size=subset_size)
        mean_max_variance._backend_entity = PTMixedPrecisionAlgoBackend()
        return mean_max_variance
```
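The variance-based criteria exercised above rank layers by simple statistics. As an illustration only, and not NNCF's actual implementation (which works through the backend entity on collected statistics and a `subset_size` budget), the flavor of these metrics can be sketched on a raw weight matrix in NumPy:

```python
import numpy as np

def mean_variance(weight: np.ndarray) -> float:
    # Mean of per-output-channel variances (illustrative analogue of MeanVarianceCriterion).
    return float(np.var(weight, axis=1).mean())

def max_variance(weight: np.ndarray) -> float:
    # Maximum per-output-channel variance (illustrative analogue of MaxVarianceCriterion).
    return float(np.var(weight, axis=1).max())

def mean_max(weight: np.ndarray) -> float:
    # Mean of per-output-channel absolute maxima (illustrative analogue of MeanMaxCriterion).
    return float(np.abs(weight).max(axis=1).mean())

# A weight with one outlier, like the 10000 entries in SequentialMatmulModel,
# scores far higher than a uniform one under all three metrics.
w_uniform = np.ones((4, 4), dtype=np.float32)
w_outlier = w_uniform.copy()
w_outlier[-1, -1] = 10000.0
print(max_variance(w_uniform) < max_variance(w_outlier))  # True
```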
Changed file: PyTorch weight-compression tests. First hunk (moves `SequentialMatmulModel` up, parameterizes `MatMulModel`, adds `LinearModel`):

```diff
@@ -44,15 +44,45 @@
 UNSUPPORTED_MODES = (CompressWeightsMode.NF4, CompressWeightsMode.E2M1)
 
 
+class SequentialMatmulModel(nn.Module):
+    def __init__(self):
+        super(SequentialMatmulModel, self).__init__()
+        self.main_values = [10000, 1000, 1, 10, 10000]
+        self.layers = nn.ModuleList()
+
+        for _, main_value in enumerate(self.main_values):
+            weights_data = torch.arange(0, 16, dtype=torch.float32).reshape(4, 4)
+            weights_data[-1, -1] = main_value
+            weight_tensor = torch.tensor(weights_data)
+            layer = nn.Linear(4, 4, bias=False)
+            layer.weight = nn.Parameter(weight_tensor.t())
+            self.layers.append(layer)
+
+    def forward(self, x):
+        for layer in self.layers:
+            x = layer(x)
+        return x
+
+
 class MatMulModel(torch.nn.Module):
-    def __init__(self):
+    def __init__(self, weight: torch.Tensor = torch.ones(size=(256, 256), dtype=torch.float32)):
         super().__init__()
-        self.w = torch.nn.Parameter(torch.ones(size=(256, 256), dtype=torch.float32))
+        self.w = torch.nn.Parameter(weight)
 
     def forward(self, input):
         return input @ self.w
 
 
+class LinearModel(torch.nn.Module):
+    def __init__(self, weight: torch.Tensor = torch.ones(size=(256, 256), dtype=torch.float32)):
+        super().__init__()
+        self.linear = torch.nn.Linear(weight.shape[0], weight.shape[1], False)
+        self.linear.weight = torch.nn.Parameter(weight)
+
+    def forward(self, input):
+        return self.linear(input)
+
+
 class FunctionalModel(torch.nn.Module):
     def __init__(self):
         super().__init__()
```
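One thing worth noting about the parameterized `MatMulModel` and `LinearModel` signatures: a tensor used as a default argument is created once, when the function is defined, and shared by every call that omits the argument. For these tests that is harmless, since the default is never mutated, but in-place edits to it would leak across instances. A minimal sketch of the pitfall using a NumPy array (the same behavior holds for a default `torch.Tensor`):

```python
import numpy as np

def make_model_weight(weight=np.ones((2, 2), dtype=np.float32)):
    # The default array is evaluated exactly once, at function-definition time.
    return weight

a = make_model_weight()
b = make_model_weight()
print(a is b)   # True: both calls received the very same array object
a[0, 0] = 42.0
print(b[0, 0])  # 42.0: the in-place mutation is visible through the other reference
```

The usual idiom is `weight=None` with `weight = default_tensor() if weight is None else weight` in the body.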
Second hunk (removes `IdentityMatmul` and the old `SequentialMatmulModel` location; `get_matmul_model` now reuses the parameterized `MatMulModel`):

```diff
@@ -326,41 +356,10 @@ def test_pack_int4():
     assert torch.all(unpacked_w == w_int8)
 
 
-class IdentityMatmul(torch.nn.Module):
-    def __init__(self):
-        super().__init__()
-        self.w = torch.nn.Parameter(
-            torch.eye(3, dtype=torch.float32) * 255,
-        )
-
-    def forward(self, input):
-        return input @ self.w
-
-
-class SequentialMatmulModel(nn.Module):
-    def __init__(self):
-        super(SequentialMatmulModel, self).__init__()
-        self.main_values = [10000, 1000, 1, 10, 10000]
-        self.layers = nn.ModuleList()
-
-        for _, main_value in enumerate(self.main_values):
-            weights_data = torch.arange(0, 16, dtype=torch.float32).reshape(4, 4)
-            weights_data[-1, -1] = main_value
-            weight_tensor = torch.tensor(weights_data)
-            layer = nn.Linear(4, 4, bias=False)
-            layer.weight = nn.Parameter(weight_tensor.t())
-            self.layers.append(layer)
-
-    def forward(self, x):
-        for layer in self.layers:
-            x = layer(x)
-        return x
-
-
 class TestPTTemplateWeightCompression(TemplateWeightCompression):
     @staticmethod
     def get_matmul_model() -> torch.nn.Module:
-        return IdentityMatmul()
+        return MatMulModel(255 * torch.eye(3, dtype=torch.float32))
 
     @staticmethod
     def get_sequential_matmul_model() -> torch.nn.Module:
```
Third hunk (adds the scale-estimation model and its reference scales):

```diff
@@ -381,3 +380,46 @@ def check_weights(model: torch.nn.Module, ref_ids: List[int]) -> None:
             assert torch.numel(op.weight) == 8  # workaround to detect uint4 weights
         else:
             assert torch.numel(op.weight) == 16
+
+    @staticmethod
+    def get_model_for_test_scale_estimation():
+        return LinearModel(torch.arange(0, 32 * 32, dtype=torch.float32).reshape(32, 32))
+
+    @staticmethod
+    def get_scale_estimation_ref():
+        return torch.tensor(
+            [
+                [[2.0666666]],
+                [[3.7624271]],
+                [[5.8847833]],
+                [[8.0360603]],
+                [[10.1368332]],
+                [[12.2918606]],
+                [[14.3441496]],
+                [[16.4496689]],
+                [[18.6086369]],
+                [[20.8027000]],
+                [[22.9477024]],
+                [[25.0835018]],
+                [[27.1524105]],
+                [[29.1419849]],
+                [[31.1714401]],
+                [[33.0447121]],
+                [[35.1780472]],
+                [[37.3113823]],
+                [[39.4447136]],
+                [[41.5780487]],
+                [[43.7113838]],
+                [[45.8447189]],
+                [[47.9780464]],
+                [[50.1113815]],
+                [[52.2447128]],
+                [[54.3780441]],
+                [[56.5113831]],
+                [[58.6447144]],
+                [[60.7780533]],
+                [[62.9113808]],
+                [[65.0447083]],
+                [[67.1780548]],
+            ]
+        )
```

Review comment (on `get_scale_estimation_ref`): Is it possible to move the reference into a JSON file, as in this test:
Reply: Reduced the number of scales. It should look much better.
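For orientation, the first reference scale lines up with a plain per-channel 4-bit grid for this weight: row i of `torch.arange(0, 32 * 32).reshape(32, 32)` has absolute maximum 32 * i + 31, and 31 / 15 = 2.0666... matches the first entry, while later entries sit below the naive max / 15 values, consistent with scale estimation tuning them per channel. A NumPy sketch of the naive (pre-estimation) per-channel scales, for comparison only (the 15-level divisor is an assumption about the grid, not taken from the diff):

```python
import numpy as np

weights = np.arange(0, 32 * 32, dtype=np.float32).reshape(32, 32)

# Naive per-output-channel scale for a 4-bit grid with 15 levels
# (assumed formula: max-abs / 15; scale estimation then tunes these values).
naive_scales = np.abs(weights).max(axis=1) / 15.0

print(float(naive_scales[0]))   # ~2.0667, matching the first reference entry
print(float(naive_scales[-1]))  # 68.2, slightly above the estimated 67.178...
```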
Review comment: I would expect this to be reflected in the references, but I don't see anything outstanding. Maybe the value should be higher.

Reply: I think this test is intended to check the difference between the Torch and OV backends; it does not aim to check the algorithm's correctness. However, I agree that your proposal is good. I can add a new test that checks the error after quantization and demonstrates that the outlier channel has the lowest error relative to the others.

Review comment: Are you going to add this new test in the follow-up PR?
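The error-checking test discussed above is not part of this diff. As a sketch of the underlying idea only (assumed names, plain NumPy, a symmetric 15-level grid; nothing here is NNCF's actual scale-estimation algorithm), even a crude search over candidate scales shows why an estimated scale beats the naive max-abs choice on reconstruction error:

```python
import numpy as np

def quantize_dequantize(w, scale, levels=7):
    # Symmetric 4-bit-style grid: integers in [-levels, levels] times the scale.
    q = np.clip(np.round(w / scale), -levels, levels)
    return q * scale

def naive_scale(w, levels=7):
    return np.abs(w).max() / levels

def estimated_scale(w, levels=7):
    # Crude stand-in for scale estimation: grid search minimizing reconstruction MSE,
    # seeded with the naive scale so the result can only match or improve it.
    base = naive_scale(w, levels)
    best = base
    best_err = np.mean((w - quantize_dequantize(w, base, levels)) ** 2)
    for factor in np.linspace(0.5, 1.2, 71):
        s = base * factor
        err = np.mean((w - quantize_dequantize(w, s, levels)) ** 2)
        if err < best_err:
            best, best_err = s, err
    return best

rng = np.random.default_rng(0)
w = rng.normal(size=(32, 32)).astype(np.float32)

naive_err = np.mean((w - quantize_dequantize(w, naive_scale(w))) ** 2)
est_err = np.mean((w - quantize_dequantize(w, estimated_scale(w))) ** 2)
print(est_err <= naive_err)  # True by construction of the search
```

A real test along the lines proposed in the thread would quantize the model's layers and assert that the channel holding the outlier shows the lowest relative error.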