-
Notifications
You must be signed in to change notification settings - Fork 257
[CK_TILE] Support more layouts for BQuant GEMM #3349
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
…w_layouts_datatypes
…w_layouts_datatypes
…w_layouts_datatypes
ThomasNing
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the contributions. LGTM except the above comments.
| static_assert(std::is_same_v<BQLayout, tensor_layout::gemm::ColumnMajor>); | ||
| // PreshuffleQuant currently assumes ColumnMajor layout | ||
| // For RowMajor, the preshuffle logic would need adjustment | ||
| static_assert(std::is_same_v<BQLayout, tensor_layout::gemm::ColumnMajor>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@amd-khushbu Please not this part as our discussion yesterday. With the preshuffle quant our tensor layout should actually only be RowMajor. We could modify that in your PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll revert the comment here to avoid creating any confusion.
| decltype(make_static_distributed_tensor<ADataType>(ABlockTileDistr{})); | ||
| using BBlockTile = | ||
| decltype(make_static_distributed_tensor<BDataType>(BBlockTileDistr{})); | ||
| decltype(make_static_distributed_tensor<ADataType>(BBlockTileDistr{})); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here should be BDataType?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be the non-pkint4 datatype, which should be the same as ADataType, but maybe it should be made more clear somehow. I'll create an alias for the type name, so it avoids confusion.
| b_dram_block_window_tmp, | ||
| [](const BDataType& b) { return b; }, | ||
| // Note: BDataType gets converted to ADataType during loading | ||
| [](const ADataType& b) { return b; }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That condition should only happen when the BDataType is pk_int4 right?
Proposed changes
Enables more layouts for BQuant GEMM. Row/Col for A is tested, and support for Row/Col for B and BQ is added.
Checklist
Please put an
xinto the boxes that apply. You can also fill these out after creating the PR. If you're not sure, please don't hesitate to ask.clang-formaton all changed filesDiscussion
If this is a relatively large or complex change, feel free to start a discussion by explaining why you chose the solution you did and what alternatives you considered