@EnricoDeg
Contributor

Proposed changes

Summary:

  • Add tests to existing xdl implementation
  • Integrate scaling implementation in multiple D
  • Generalize existing b_scale for ab_scale
  • Add instances
  • Add tests for WMMA
  • Avoid code duplication in pipeline v3 between main loop and tail
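For readers unfamiliar with block-scaled GEMM, the semantics behind "ab_scale" can be sketched on the host as below. This is only an illustrative reference under stated assumptions, not the library's API: the function name `reference_gemm_ab_scale`, the `ScaleBlocks` struct, the row-major layouts, and the scale-tensor shapes (one A scale per ScaleBlockM x ScaleBlockK tile, one B scale per ScaleBlockK x ScaleBlockN tile) are all hypothetical choices made for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper type: scale-block sizes along M, N and K.
struct ScaleBlocks
{
    std::size_t m;
    std::size_t n;
    std::size_t k;
};

// Host-side reference sketch (assumed semantics, not the library's kernel):
//   C[m][n] = sum_k (A[m][k] * a_scale[m / sb.m][k / sb.k])
//                 * (B[k][n] * b_scale[k / sb.k][n / sb.n])
// All tensors are dense and row-major in this example.
std::vector<float> reference_gemm_ab_scale(const std::vector<float>& a,       // M x K
                                           const std::vector<float>& b,       // K x N
                                           const std::vector<float>& a_scale, // ceil(M/sb.m) x ceil(K/sb.k)
                                           const std::vector<float>& b_scale, // ceil(K/sb.k) x ceil(N/sb.n)
                                           std::size_t M,
                                           std::size_t N,
                                           std::size_t K,
                                           ScaleBlocks sb)
{
    // Number of scale blocks along K and N (row strides of the scale tensors).
    const std::size_t scale_k = (K + sb.k - 1) / sb.k;
    const std::size_t scale_n = (N + sb.n - 1) / sb.n;

    std::vector<float> c(M * N, 0.0f);
    for(std::size_t m = 0; m < M; ++m)
    {
        for(std::size_t n = 0; n < N; ++n)
        {
            float acc = 0.0f;
            for(std::size_t k = 0; k < K; ++k)
            {
                // Look up the scale of the block that each element falls into.
                const float as = a_scale[(m / sb.m) * scale_k + k / sb.k];
                const float bs = b_scale[(k / sb.k) * scale_n + n / sb.n];
                acc += a[m * K + k] * as * b[k * N + n] * bs;
            }
            c[m * N + n] = acc;
        }
    }
    return c;
}
```

A reference of this kind is typically what device results are compared against in unit tests; a real kernel would apply the scales per tile rather than per element.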

Checklist

Please put an x into the boxes that apply. You can also fill these out after creating the PR. If you're not sure, please don't hesitate to ask.

  • I have added tests relevant to the introduced functionality, and the unit tests are passing locally
  • I have added the test to REGRESSION_TESTS list defined at the top of CMakeLists.txt in tests/CMakeLists.txt, IF the test takes more than 30 seconds to run.
  • I have added inline documentation which helps the maintainers understand the motivation
  • I have removed the stale documentation which is no longer relevant after this pull request
  • (If this change is user-facing) I have added release notes which provide the end users with a brief summary of the improvement from this pull request
  • I have run clang-format on all changed files
  • Any dependent changes have been merged

Discussion

If this is a relatively large or complex change, feel free to start a discussion by explaining why you chose the solution you did and what alternatives you considered

@EnricoDeg EnricoDeg self-assigned this Nov 27, 2025
@EnricoDeg EnricoDeg force-pushed the streamhpc/gemm_multiply_multiply_wp branch from ab6eb0d to 064df80 Compare December 1, 2025 17:38
@EnricoDeg EnricoDeg force-pushed the streamhpc/gemm_ab_scale branch from 97badf1 to 647f08c Compare December 1, 2025 17:48
@EnricoDeg EnricoDeg force-pushed the streamhpc/gemm_multiply_multiply_wp branch from 064df80 to 47cacfe Compare December 2, 2025 09:06
@EnricoDeg EnricoDeg force-pushed the streamhpc/gemm_ab_scale branch from 647f08c to a023f2c Compare December 2, 2025 09:23
Base automatically changed from streamhpc/gemm_multiply_multiply_wp to develop December 3, 2025 15:38
EnricoDeg and others added 3 commits December 3, 2025 16:15
 - Add tests
 - Integrate scaling implementation in multiple D
 - Generalize existing b_scale for ab_scale
 - Add instances
 - Generalize implementation for ScaleBlockM, ScaleBlockN, ScaleBlockK
 - Add support for all layouts supported by xdl
 - Fix splitk xdl
* Support for preshuffle with ab scale

 - add support for b preshuffle in GridwiseGemm_wmma_cshuffle_v3_ab_scale
 - add support for AScaleLayout and BScaleLayout (can be different
   from ALayout and BLayout, respectively)
 - add Run method in v1 pipeline to support preshuffle + scaling
 - add support for preshuffle gemms in common invoker
 - Add splitk support

* Fix copyright header
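
The split-K support mentioned in the commits above refers to a general technique: the K dimension is partitioned into batches, each batch produces a partial product, and the partials are summed into the output. A minimal host-side sketch follows, assuming row-major matrices; the name `split_k_gemm` and the `k_batch` parameter are hypothetical and do not describe the kernels in this pull request.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Host-side illustration of split-K (hypothetical helper, not the library's kernel).
// K is divided into k_batch contiguous chunks; each chunk contributes a partial
// sum that is accumulated into C. k_batch must be at least 1.
std::vector<float> split_k_gemm(const std::vector<float>& a, // M x K, row-major
                                const std::vector<float>& b, // K x N, row-major
                                std::size_t M,
                                std::size_t N,
                                std::size_t K,
                                std::size_t k_batch)
{
    // Chunk length, rounded up so that all of K is covered.
    const std::size_t k_per_batch = (K + k_batch - 1) / k_batch;

    std::vector<float> c(M * N, 0.0f);
    for(std::size_t kb = 0; kb < k_batch; ++kb)
    {
        // Clamp the chunk to [0, K); trailing chunks may be empty.
        const std::size_t k_begin = std::min(K, kb * k_per_batch);
        const std::size_t k_end   = std::min(K, k_begin + k_per_batch);

        for(std::size_t m = 0; m < M; ++m)
        {
            for(std::size_t n = 0; n < N; ++n)
            {
                float partial = 0.0f;
                for(std::size_t k = k_begin; k < k_end; ++k)
                {
                    partial += a[m * K + k] * b[k * N + n];
                }
                // On a GPU this accumulation would usually be an atomic add
                // or a separate reduction pass; here it is a plain sum.
                c[m * N + n] += partial;
            }
        }
    }
    return c;
}
```

Because the partial sums are added in a different order than in a single pass over K, floating-point results can differ slightly from the non-split version, which is why tests for split-K paths usually compare with a tolerance.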
@EnricoDeg EnricoDeg force-pushed the streamhpc/gemm_ab_scale branch from c8ec0d0 to 8e62dc9 Compare December 3, 2025 16:20