-
Notifications
You must be signed in to change notification settings - Fork 752
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL][Fusion] Refine remapping of GEP instruction during internalization #12128
Conversation
Signed-off-by: Julian Oppermann <julian.oppermann@codeplay.com>
Converted back to draft because of the CI failure on the newly-introduced test. |
Signed-off-by: Julian Oppermann <julian.oppermann@codeplay.com>
Signed-off-by: Julian Oppermann <julian.oppermann@codeplay.com>
Signed-off-by: Julian Oppermann <julian.oppermann@codeplay.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather include less tests targeting the cases we are handling here instead of .ll
dumps. This would simplify testing and make sure we check all the boxes.
Signed-off-by: Julian Oppermann <julian.oppermann@codeplay.com>
Signed-off-by: Julian Oppermann <julian.oppermann@codeplay.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implementation looks fine, a few comments on the tests.
Overall, I do see value in having those tests also in .ll
format to get an understanding what kind of input and transformation we expect in the compiler. However, maybe we can condense the input a bit by removing unnecessary stuff or running a bit of optimization on it first.
sycl-fusion/test/internalization/promote-private-non-unit-cuda.ll
Outdated
Show resolved
Hide resolved
sycl-fusion/test/internalization/promote-private-non-unit-cuda.ll
Outdated
Show resolved
Hide resolved
sycl-fusion/test/internalization/promote-private-non-unit-hip.ll
Outdated
Show resolved
Hide resolved
Signed-off-by: Julian Oppermann <julian.oppermann@codeplay.com>
I was able to condense the spir64 lit-test quite a bit, and will continue with the CUDA and HIP versions later. Reverted this PR back to draft until then. |
Signed-off-by: Julian Oppermann <julian.oppermann@codeplay.com>
Signed-off-by: Julian Oppermann <julian.oppermann@codeplay.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests look better now, but I think we can still improve readability.
In particular for metadata, we should only keep what we need for kernel fusion/the test and remove everything else.
sycl-fusion/test/internalization/promote-private-non-unit-cuda.ll
Outdated
Show resolved
Hide resolved
sycl-fusion/test/internalization/promote-private-non-unit-cuda.ll
Outdated
Show resolved
Hide resolved
sycl-fusion/test/internalization/promote-private-non-unit-cuda.ll
Outdated
Show resolved
Hide resolved
Signed-off-by: Julian Oppermann <julian.oppermann@codeplay.com>
Signed-off-by: Julian Oppermann <julian.oppermann@codeplay.com>
Signed-off-by: Julian Oppermann <julian.oppermann@codeplay.com>
Agreed and done, I misinterpreted your earlier comment, sorry. I also shortened identifier names throughout the test cases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming looks more sensible to me now
Signed-off-by: Julian Oppermann <julian.oppermann@codeplay.com>
So far, we distinguished GEP instructions that select an element of the internalized buffer (must be remapped) and GEPs that address into an aggregate type (must not be remapped) by looking at the number of indices.
However, we can also encounter single-index GEP instructions that use a byte-offset to address into padded structures, as well as multi-index GEPs with a base pointer offset that address into an aggregate and need to be remapped.
This PR adds uses the newly added element size information (#12108) to correctly distinguish the required action for these kinds of GEP instructions. In addition to the E2E tests, target-specific lit tests derived from it are also added to demonstrate the subtle differences among SPIR-V, CUDA and HIP.