Skip to content

[CPU] Conv-DWConv-PRelu fusing fix#33917

Open
nshchego wants to merge 1 commit intoopenvinotoolkit:masterfrom
nshchego:cpu/bug/conv_dw_prelu_fuse
Open

[CPU] Conv-DWConv-PRelu fusing fix#33917
nshchego wants to merge 1 commit intoopenvinotoolkit:masterfrom
nshchego:cpu/bug/conv_dw_prelu_fuse

Conversation

@nshchego
Copy link
Contributor

@nshchego nshchego commented Feb 1, 2026

Details:

  • Kernel jit_avx2_1x1_conv_kernel_f32_old skips DW Convolution post-op on initialization step, but tries to add it on the code generation step. This fix aligns behavior.
  • ConvDWConv test does not work as expected: it does not check any fusing. There is a condition in the FuseConvolutionAndDWConvolution: (dw_conv_input_size + dw_conv_output_size > L3_cache_size / 2) that is not satisfied in test, but test is green. Fix is adding fusing check via CpuTestWithFusing class.
  • Fusing FuseConvolutionAndDWConvolution is applicable for AVX2 only, thus this code was disabled for non-X86 platforms to reduce binary size.

Related OneDNN PR: openvinotoolkit/oneDNN#296

Tickets:

  • 173761

@nshchego nshchego requested review from a team as code owners February 1, 2026 11:55
@github-actions github-actions bot added the category: CPU OpenVINO CPU plugin label Feb 1, 2026
@nshchego nshchego force-pushed the cpu/bug/conv_dw_prelu_fuse branch 2 times, most recently from a48b5df to 1a6e1f1 Compare February 1, 2026 17:57
@nshchego nshchego force-pushed the cpu/bug/conv_dw_prelu_fuse branch from 1a6e1f1 to 8f28688 Compare February 1, 2026 18:24
@maxnick maxnick requested a review from Copilot February 2, 2026 08:39
@maxnick maxnick added this to the 2026.1 milestone Feb 2, 2026
@maxnick
Copy link
Contributor

maxnick commented Feb 2, 2026

@EgorDuplensky , could you please review?

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a bug in the Conv-DWConv-PRelu fusing optimization for AVX2 platforms and improves the associated test coverage. The fix addresses a mismatch where the oneDNN kernel skipped DW Convolution post-ops during initialization but attempted to add them during code generation. Additionally, the test suite was not properly verifying the fusing behavior due to an unsatisfied cache size condition.

Changes:

  • Updated oneDNN submodule to include the kernel initialization fix
  • Replaced makeNgraphFunction with create_ov_model across all test files for naming consistency
  • Moved Conv-DWConv test from common to x64-specific directory with proper fusing verification
  • Added platform guards to disable Conv-DWConv fusing on non-x86 platforms

Reviewed changes

Copilot reviewed 92 out of 92 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
src/plugins/intel_cpu/thirdparty/onednn Updated submodule commit to include the DW convolution post-op fix
src/plugins/intel_cpu/tests/functional/utils/cpu_test_utils.hpp Renamed function from makeNgraphFunction to create_ov_model
src/plugins/intel_cpu/tests/functional/utils/cpu_test_utils.cpp Renamed function implementation to match header
src/plugins/intel_cpu/tests/functional/custom/subgraph_tests/src/x64/conv_dw_conv.cpp New x64-specific test with proper fusing verification using CpuTestWithFusing
src/plugins/intel_cpu/tests/functional/custom/subgraph_tests/src/common/conv_dw_conv.cpp Removed generic test that didn't verify fusing correctly
src/plugins/intel_cpu/src/graph_optimizer.cpp Added platform guards and reorganized AVX2/AVX512 checks for Conv-DWConv fusing
(multiple test files) Updated function calls from makeNgraphFunction to create_ov_model

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

ov::CoordinateDiff{1, 1},
std::vector<size_t>{1, 1},
ov::op::PadType::EXPLICIT);
auto bias_const = utils::make_constant(precision, {1, out_channels , 1, 1});
Copy link

Copilot AI Feb 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra whitespace before comma in shape specification. Should be {1, out_channels, 1, 1} for consistent formatting.

Suggested change
auto bias_const = utils::make_constant(precision, {1, out_channels , 1, 1});
auto bias_const = utils::make_constant(precision, {1, out_channels, 1, 1});

Copilot uses AI. Check for mistakes.
@maxnick maxnick linked an issue Feb 2, 2026 that may be closed by this pull request
3 tasks
}

void GraphOptimizer::FuseConvolutionAndDWConvolution(Graph& graph) {
#if defined(OPENVINO_ARCH_X86) || defined(OPENVINO_ARCH_X86_64)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems we better to refactor graph optimizer a bit to avoid ifdefs in such cases. There is also an ARM pr which changes the order of some optimizations based on arch. We could 'register' optimizations like it is done in core transformation pipeline.
Not asking to implement it in scope of this pr of course

#if defined(OPENVINO_ARCH_X86) || defined(OPENVINO_ARCH_X86_64)
// There is no optimized implementation for avx512, so two avx512 convolutions
// are expected to be faster than single fused avx2 convolution
if (!impl::cpu::x64::mayiuse(impl::cpu::x64::avx2) || impl::cpu::x64::mayiuse(impl::cpu::x64::avx512_core)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use utility implication(avx2, !avx512)?


auto eltwise = utils::make_eltwise(parameters[0], secondaryInput, eltwiseType);
function = makeNgraphFunction(netType, parameters, eltwise, "Eltwise");
function = create_ov_model(netType, parameters, eltwise, "Eltwise");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't change, but It would be better to make re-factor as separate PR/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

category: CPU OpenVINO CPU plugin

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: out of bounds memory access

4 participants