[CPU][ARM] Limit cases when ACL int8 convolution executor is chosen #33040
base: master
Conversation
v-Golubev left a comment:
Could you please extend the existing tests by adding a test case with per-channel dequantization?
Added per-channel test case.
src/plugins/intel_cpu/tests/functional/custom/subgraph_tests/src/arm/conv_fq.cpp
v-Golubev left a comment:
LGTM
```cpp
ov::element::Type expectedPrecision = element::f32;
#if defined(OPENVINO_ARCH_ARM64)
const auto& [inputShape, inputPrecision, quantizeIntervals, fqConstShapes, targetName] = this->GetParam();
if (fqConstShapes.empty()) {
    expectedPrecision = quantizeIntervals[0][0] < 0.f ? element::i8 : element::u8;
}
#endif
```
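The selection logic in the test above can be sketched in isolation. The following is a hypothetical, self-contained model of that check (the function name and types are illustrative, not part of the OpenVINO codebase): when the FakeQuantize constant shapes are empty the quantization is per-tensor, so an int8 convolution executor is expected, with signedness taken from the sign of the interval's low bound; non-empty shapes indicate per-channel dequantization, which falls back to f32.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Illustrative stand-in for the test's expected-precision logic.
// fqConstShapes: shapes of the FakeQuantize constants (empty == per-tensor).
// intervalLow:   low bound of the quantization interval.
std::string expectedConvPrecision(const std::vector<std::vector<int>>& fqConstShapes,
                                  float intervalLow) {
    if (!fqConstShapes.empty()) {
        // Per-channel dequantization: no int8 executor, fall back to f32.
        return "f32";
    }
    // Per-tensor case: a negative low bound implies a signed int8 range.
    return intervalLow < 0.f ? "i8" : "u8";
}
```

Usage follows the three cases the test parameterization distinguishes: signed per-tensor, unsigned per-tensor, and the per-channel fallback.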
Could you please add a short comment on why we expect these precisions? The main thing I'd expect to see here is a mention that we don't support per-channel dequantization for quantized convolution, so in that case we fall back to the f32 implementation.
Details:
Tickets: