Commit da5a0b5

Remove marlin warning (vllm-project#4918)
1 parent 6287537 commit da5a0b5

1 file changed: +0 -4 lines changed

csrc/quantization/gptq_marlin/gptq_marlin.cu

Lines changed: 0 additions & 4 deletions
@@ -1519,10 +1519,6 @@ exec_config_t determine_thread_config(int prob_m, int prob_n, int prob_k,
       }
     }
 
-    printf("WARNING: Marlin kernel is reducing max_m_blocks due to small SM "
-           "GPU cache. This may "
-           "hurt performance. Consider upgrading your GPU.\n");
-
     max_m_blocks--;  // Process less M blocks per invocation to reduce cache
                      // usage
   }
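
The behavior left after this change can be sketched as a silent fallback: the M-block count is decremented until the configuration fits, and nothing is printed. The following is a minimal host-side sketch, not the code in `gptq_marlin.cu`. The names `required_bytes` and `pick_max_m_blocks`, the constants, and the linear cost model are all hypothetical and chosen only for illustration.

```cpp
#include <cassert>

// Hypothetical cost model (NOT the real Marlin one): bytes of cache needed
// to process `m_blocks` M blocks in a single invocation.
constexpr int kBytesPerMBlock = 16 * 1024;  // assumed per-block footprint
constexpr int kFixedBytes = 32 * 1024;      // assumed fixed overhead

int required_bytes(int m_blocks) {
  return kFixedBytes + m_blocks * kBytesPerMBlock;
}

// Returns the largest block count in [1, max_m_blocks] whose footprint fits
// in `budget_bytes`, or 0 if even a single block does not fit.
// The reduction is silent: no warning is printed when the count drops.
int pick_max_m_blocks(int max_m_blocks, int budget_bytes) {
  assert(max_m_blocks >= 1);
  while (max_m_blocks > 0 && required_bytes(max_m_blocks) > budget_bytes) {
    max_m_blocks--;  // process fewer M blocks per invocation
  }
  return max_m_blocks;
}
```

Under these assumed numbers, a 64 KiB budget with a starting count of 4 settles on 2 blocks without emitting any message, which mirrors the effect of dropping the `printf`.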
