Description
When applying a pass that moves transient arrays stored in GPU_Global out of a kernel, two bugs occur:
- The added input connectors fail to infer the correct type and default to "void". This results in references to void, which are invalid in C++ and cause compilation errors.
Example: when no explicit types are provided for the connectors, the generated code attempts to create a reference to void, which is not allowed, leading to a compilation error.
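Why a connector typed as void cannot be passed by reference can be checked in isolation. The following is a minimal sketch, not DaCe's generated code: the names `connector_ref` and `is_valid_connector_ref` are hypothetical and only mimic the idea of forming "reference to the connector's type". It relies on the standard trait `std::add_lvalue_reference`, which leaves `void` unchanged because no reference to void can exist.

```cpp
#include <type_traits>

// Hypothetical stand-in for how a kernel argument type could be formed
// from a connector type: "reference to T".
template <typename T>
struct connector_ref {
    using type = typename std::add_lvalue_reference<T>::type;
};

// True only if "reference to T" is really a reference type.
template <typename T>
constexpr bool is_valid_connector_ref() {
    return std::is_reference<typename connector_ref<T>::type>::value;
}

// A pointer-typed connector (e.g. double*) can be passed by reference.
static_assert(is_valid_connector_ref<double*>(),
              "reference to double* is well-formed");

// A connector that defaulted to void cannot: the trait yields plain void,
// and writing `void&` directly would be a hard compilation error.
static_assert(!is_valid_connector_ref<void>(),
              "reference to void cannot be formed");
```

Writing the type out by hand (`void& x`) instead of going through the trait is what makes the compiler reject the code outright.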
- To work around (1), I tried explicitly setting the type of the added connectors to pointer(data_descriptor.dtype), since the connector should be a reference to an array. This compiles, but it introduces a new bug: the array index is calculated incorrectly.
How I explicitly set connector types:
Now, the accessed value in mask_r12[i,j] is wrong.
How the generated code computes the index:
The correct generated code should compute the index:
mask_r12[i * N + j]
To Reproduce
Steps to reproduce the behavior:
- Check out the branch newgpucodegen.
- Run the GPU test example provided in dace/tests/npbench/misc/azimint_naive_test.py, which triggers the error.
  - Currently, the connectors are set explicitly in the pass dace/transformations/passes/move_array_out_of_kernel.py. This triggers bug (2).
- To trigger bug (1) instead, search for "dtypes.pointer" in the code (i.e. dace/transformations/passes/move_array_out_of_kernel.py) and remove the additional explicit input from any connector.
  - This will cause the auto-detection to fail and default to void.