[CK_TILE] Generate random tensor values with multiple threads #3324
base: develop
Pull request overview
This PR introduces multi-threaded random tensor value generation for the ck_tile library, replacing the previous single-threaded or opt-in multi-threaded approach with an always-on deterministic multi-threaded implementation. The changes ensure reproducible results across different thread counts by using a block-based distribution strategy with RNG state management via discard().
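The block-based strategy described above can be sketched roughly as follows. This is a minimal illustration, not the code from `fill.hpp`: the function name `fill_uniform_blocked`, the block size of 1024, the round-robin block assignment, and the direct mapping of one engine draw to one float are all assumptions made for the example. The only idea taken from the PR description is that each block starts from a freshly seeded engine advanced with `discard()`, so the output does not depend on which thread handles which block.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <thread>
#include <vector>

// Hypothetical helper (not ck_tile's implementation).
// Fills `out` with floats in [0, 1). Element i always receives the i-th draw
// of std::mt19937 seeded with `seed`, regardless of `num_threads`.
inline void fill_uniform_blocked(std::vector<float>& out,
                                 std::uint32_t seed,
                                 unsigned num_threads,
                                 std::size_t block_size = 1024)
{
    assert(block_size > 0);
    const std::size_t n          = out.size();
    const std::size_t num_blocks = (n + block_size - 1) / block_size;
    num_threads                  = std::max(1u, num_threads);

    auto worker = [&out, seed, n, num_blocks, num_threads, block_size](unsigned tid) {
        // Blocks are handed out round-robin; each block is independent.
        for(std::size_t b = tid; b < num_blocks; b += num_threads)
        {
            const std::size_t begin = b * block_size;
            const std::size_t end   = std::min(n, begin + block_size);

            // Re-seed and skip the draws that belong to earlier elements.
            std::mt19937 gen(seed);
            gen.discard(begin);

            for(std::size_t i = begin; i < end; ++i)
            {
                // Exactly one engine draw per element: keep the top 24 bits
                // and scale by 2^-24, which is exact in float.
                out[i] = static_cast<float>(gen() >> 8) * (1.0f / 16777216.0f);
            }
        }
    };

    std::vector<std::thread> pool;
    for(unsigned t = 1; t < num_threads; ++t)
        pool.emplace_back(worker, t);
    worker(0); // the calling thread takes part as well
    for(auto& th : pool)
        th.join();
}
```

Calling `fill_uniform_blocked(v, 42u, 1)` and `fill_uniform_blocked(v, 42u, 8)` on vectors of the same size should give identical contents. The sketch maps engine output to floats by hand rather than through `std::uniform_real_distribution`, because the number of engine draws a distribution consumes per value is implementation-defined, and a `discard()`-based scheme needs that count to be fixed.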
Key Changes
- Refactored `FillUniformDistribution` to always use multi-threading with deterministic block-based random number generation
- Added CPU core management utilities (`get_available_cpu_cores()` and `cpu_core_guard`) for testing different thread configurations
- Updated the template parameter to allow type deduction (`T = void`)
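One common way a `T = void` default enables type deduction is sketched below. The functor `FillConstant` and the helper `fill_constant_demo` are hypothetical and exist only for this illustration; the sketch assumes that `void` acts as a sentinel meaning "use the element type of the range being filled", which may differ from how `fill.hpp` resolves the type.

```cpp
#include <cassert>
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical functor, not part of ck_tile.
// With T = void the working type is deduced from the iterator's value_type;
// with an explicit T the value is first converted through T.
template <typename T = void>
struct FillConstant
{
    double value;

    template <typename ForwardIter>
    void operator()(ForwardIter first, ForwardIter last) const
    {
        using Elem = typename std::iterator_traits<ForwardIter>::value_type;
        using Via  = typename std::conditional<std::is_void<T>::value, Elem, T>::type;
        for(; first != last; ++first)
            *first = static_cast<Elem>(static_cast<Via>(value));
    }
};

// Small usage demo for the two spellings.
inline void fill_constant_demo()
{
    std::vector<float> deduced(4), via_int(4);

    FillConstant<>{1.5}(deduced.begin(), deduced.end());    // Via = float
    FillConstant<int>{1.5}(via_int.begin(), via_int.end()); // Via = int, truncates

    assert(deduced[0] == 1.5f);
    assert(via_int[0] == 1.0f);
}
```

With this pattern, call sites that previously had to spell out the element type can write an empty template argument list and let the range decide.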
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| `include/ck_tile/host/fill.hpp` | Replaced opt-in multi-threading with always-on deterministic block-based multi-threaded filling; changed template parameter to support type deduction |
| `include/ck_tile/host/joinable_thread.hpp` | Added `get_available_cpu_cores()` function and `cpu_core_guard` class for CPU affinity management in tests |
| `test/ck_tile/utility/test_fill.cpp` | New comprehensive test suite validating deterministic behavior across different sizes and thread counts |
| `test/ck_tile/utility/CMakeLists.txt` | Registered the new test executable |
| `example/ck_tile/18_flatmm/mxgemm/run_mx_flatmm.inc` | Updated to use new template syntax with type deduction |
Proposed changes
ck_tile version of commit f9bf275 (merged with #2297).
Checklist
Please put an `x` into the boxes that apply. You can also fill these out after creating the PR. If you're not sure, please don't hesitate to ask.
- `clang-format` on all changed files