[EXP][CMDBUF] Improve CUDA Fill op implementation #1319
Merged
kbenzie merged 2 commits into oneapi-src:main from Bensuo:maxime/cuda-large-fill-pattern on Mar 14, 2024
Conversation
Linked DPC++ PR: intel/llvm#12605
Bensuo approved these changes on Feb 8, 2024
EwanC approved these changes on Feb 8, 2024
Codecov Report: All modified and coverable lines are covered by tests ✅

@@ Coverage Diff @@
##             main    #1319      +/-   ##
==========================================
- Coverage   14.82%   12.49%    -2.33%
==========================================
  Files         250      239       -11
  Lines       36220    36003      -217
  Branches     4094     4086        -8
==========================================
- Hits         5369     4498      -871
- Misses      30800    31501      +701
+ Partials       51        4       -47
EwanC added a commit to Bensuo/unified-runtime that referenced this pull request on Feb 19, 2024:
Match the CUDA change from oneapi-src#1319 in HIP.
EwanC force-pushed the maxime/cuda-large-fill-pattern branch from 4861b40 to 6545b72 on March 11, 2024 at 15:40
EwanC force-pushed the maxime/cuda-large-fill-pattern branch from 6545b72 to ee408be on March 12, 2024 at 15:43
EwanC force-pushed the maxime/cuda-large-fill-pattern branch from ee408be to 81f142c on March 14, 2024 at 14:15
EwanC added a commit to reble/llvm that referenced this pull request on Mar 14, 2024:
Adjustment of value pointer size according to pattern size. Large patterns are now broken into 1-byte chunks, as in the regular implementation.
EwanC force-pushed the maxime/cuda-large-fill-pattern branch from 81f142c to ef72b3f on March 14, 2024 at 16:40
bader pushed a commit to intel/llvm that referenced this pull request on Mar 14, 2024:
Graph support in the CUDA backend for graph buffer fill nodes has been improved in UR PR oneapi-src/unified-runtime#1319.
Co-authored-by: Ewan Crawford <ewan@codeplay.com>
Co-authored-by: Kenneth Benzie (Benie) <k.benzie@codeplay.com>
The value pointer size is now adjusted according to the pattern size.
Large patterns are broken into 1-byte chunks, as in the regular implementation.
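For context, here is a minimal sketch of how such a lowering can look with the CUDA driver API. This is not the adapter's actual code: the function name `addFillNodes` and its parameters are hypothetical, error handling is omitted, and the fill size is assumed to be a multiple of the pattern size. Small patterns (1, 2, or 4 bytes) map onto a single graph memset node; larger patterns are decomposed into one strided 1-byte memset node per pattern byte.

```cpp
// Hypothetical sketch: lowering a command-buffer fill to CUDA graph memset
// nodes. Error checking of cuGraphAddMemsetNode results is omitted.
#include <cuda.h>
#include <cstdint>
#include <cstring>

void addFillNodes(CUgraph Graph, CUdeviceptr DstPtr, const void *Pattern,
                  size_t PatternSize, size_t Size, CUcontext Ctx) {
  if (PatternSize == 1 || PatternSize == 2 || PatternSize == 4) {
    // Small patterns fit the native elementSize of a memset node, so a
    // single node covers the whole fill.
    unsigned Value = 0;
    std::memcpy(&Value, Pattern, PatternSize);

    CUDA_MEMSET_NODE_PARAMS Params = {};
    Params.dst = DstPtr;
    Params.elementSize = static_cast<unsigned>(PatternSize);
    Params.value = Value;
    Params.width = Size / PatternSize; // number of elements to write
    Params.height = 1;
    Params.pitch = 0; // unused when height == 1

    CUgraphNode Node;
    cuGraphAddMemsetNode(&Node, Graph, nullptr, 0, &Params, Ctx);
    return;
  }

  // Large patterns: one strided 1-byte memset node per pattern byte.
  // Byte I of the pattern is written at offsets I, I + PatternSize,
  // I + 2 * PatternSize, ... by expressing the stride via pitch/height.
  const size_t NumRepeats = Size / PatternSize;
  const auto *Bytes = static_cast<const uint8_t *>(Pattern);

  for (size_t I = 0; I < PatternSize; ++I) {
    CUDA_MEMSET_NODE_PARAMS Params = {};
    Params.dst = DstPtr + I;
    Params.elementSize = 1;
    Params.value = Bytes[I];
    Params.width = 1;           // one byte per "row"
    Params.height = NumRepeats; // one row per pattern repetition
    Params.pitch = PatternSize; // byte stride between consecutive rows

    CUgraphNode Node;
    cuGraphAddMemsetNode(&Node, Graph, nullptr, 0, &Params, Ctx);
  }
}
```

Using pitch and height this way expresses a strided 1-byte write per pattern byte, so arbitrary pattern sizes can still be represented with native graph memset nodes rather than a custom fill kernel.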