[BUG] Stream-K kernel breaks for some GEMM Problem-K #2100
Comments
Another one that failed. The common pattern among these is cluster=4x1x1, stream_k, ws.
Can you provide the full CMake config you used? Also, we recently fixed a similar issue internally (it also occurred only with larger clusters). The fix is planned to be pushed here soon. Can you check whether the following change (the one that will be upstreamed) fixes the issue for you?

    new_hw_info.max_active_clusters = hw_info.max_active_clusters;
@jackkosaian, the code in your comment suggests the change is in
Sorry, the suggestion I was trying to make was to add the code that I pasted in the comment (max_active_clusters) below the code linked. Here's the diff:

diff --git a/include/cutlass/gemm/kernel/tile_scheduler_params.h b/include/cutlass/gemm/kernel/tile_scheduler_params.h
index aa599a35..a4467e8a 100644
--- a/include/cutlass/gemm/kernel/tile_scheduler_params.h
+++ b/include/cutlass/gemm/kernel/tile_scheduler_params.h
@@ -1204,6 +1204,7 @@ struct PersistentTileSchedulerSm90StreamKParams {
     KernelHardwareInfo new_hw_info;
     new_hw_info.device_id = hw_info.device_id;
     new_hw_info.sm_count = hw_info.sm_count;
+    new_hw_info.max_active_clusters = hw_info.max_active_clusters;
     if (new_hw_info.sm_count <= 0) {
       CUTLASS_TRACE_HOST(" WARNING: Arguments do not include a valid SM count.\n"
           " For optimal performance, populate the arguments KernelHardwareInfo struct with the SM count.");

I was able to reproduce the issue you mentioned before applying this diff, and the issue went away after applying it.
Let us check that into mainline.
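For readers skimming the thread, the failure mode addressed by the diff can be sketched in isolation. The snippet below is a standalone mock and not CUTLASS code: `HwInfo`, `copy_without_clusters`, and `copy_with_clusters` are hypothetical names, and the zero defaults are an assumption of the mock. It only shows that a field-by-field copy which skips one member silently hands the default value to whatever consumes the copy.

```cpp
// Hypothetical stand-in for a hardware-info struct; NOT the CUTLASS definition.
struct HwInfo {
  int device_id = 0;
  int sm_count = 0;
  int max_active_clusters = 0;  // in this mock, 0 means "not provided"
};

// Field-by-field copy that forgets one member (the pattern before the diff).
inline HwInfo copy_without_clusters(HwInfo const& hw_info) {
  HwInfo new_hw_info;
  new_hw_info.device_id = hw_info.device_id;
  new_hw_info.sm_count = hw_info.sm_count;
  // max_active_clusters is never assigned, so it keeps its default of 0.
  return new_hw_info;
}

// Field-by-field copy that forwards every member (the pattern after the diff).
inline HwInfo copy_with_clusters(HwInfo const& hw_info) {
  HwInfo new_hw_info;
  new_hw_info.device_id = hw_info.device_id;
  new_hw_info.sm_count = hw_info.sm_count;
  new_hw_info.max_active_clusters = hw_info.max_active_clusters;
  return new_hw_info;
}
```

With a caller-supplied `max_active_clusters` of, say, 33, the first helper returns a copy in which the field is 0, while the second preserves 33. How the real scheduler reacts to a zero value is not stated in this thread.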
It will be merged in when we tag 3.8 (soon). |
Please close this issue once the fix is merged and you have verified it on your end. We will enable stream_k again, and reopen the issue if we see a problem.
GEMM Problem Shape --m=8 --n=8192 --k=8192 Does NOT Work
GEMM Problem Shape --m=8 --n=8192 --k=128 Works
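The two shapes differ only in K, the dimension whose iterations a stream-K scheduler divides among CTAs. As a rough illustration of how different the two cases look to a scheduler, the sketch below counts output tiles and K tiles per output tile. The tile extents are parameters because the issue does not state them, the helper names are hypothetical, and the arithmetic alone does not establish why one shape failed.

```cpp
#include <cstdint>

// Integer ceiling division for positive operands.
constexpr std::int64_t ceil_div(std::int64_t a, std::int64_t b) {
  return (a + b - 1) / b;
}

struct TileCounts {
  std::int64_t output_tiles;             // tiles covering the M x N output
  std::int64_t k_tiles_per_output_tile;  // K iterations needed per output tile
};

// Counts tiles for a GEMM of size m x n x k under an assumed CTA tile shape.
constexpr TileCounts count_tiles(std::int64_t m, std::int64_t n, std::int64_t k,
                                 std::int64_t tile_m, std::int64_t tile_n,
                                 std::int64_t tile_k) {
  return TileCounts{ceil_div(m, tile_m) * ceil_div(n, tile_n),
                    ceil_div(k, tile_k)};
}
```

Assuming a 128x128x64 tile purely for illustration, both shapes yield 64 output tiles, but --k=8192 gives 128 K tiles per output tile while --k=128 gives only 2, so there is far more K-direction work to split in the failing case.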