[EXP][CMDBUF] Improve CUDA Fill op implementation #1319
Merged
kbenzie merged 2 commits into oneapi-src:main from Bensuo:maxime/cuda-large-fill-pattern on Mar 14, 2024
Conversation
Linked DPC++ PR: intel/llvm#12605
Bensuo approved these changes on Feb 8, 2024
EwanC approved these changes on Feb 8, 2024
Codecov Report: All modified and coverable lines are covered by tests ✅

@@ Coverage Diff @@
##             main    #1319      +/-   ##
==========================================
- Coverage   14.82%   12.49%    -2.33%
==========================================
  Files         250      239       -11
  Lines       36220    36003      -217
  Branches     4094     4086        -8
==========================================
- Hits         5369     4498      -871
- Misses      30800    31501      +701
+ Partials       51        4       -47
EwanC added a commit to Bensuo/unified-runtime that referenced this pull request on Feb 19, 2024:
Match the CUDA change from oneapi-src#1319 in HIP.
EwanC force-pushed the maxime/cuda-large-fill-pattern branch from 4861b40 to 6545b72 on March 11, 2024 at 15:40
EwanC force-pushed the maxime/cuda-large-fill-pattern branch from 6545b72 to ee408be on March 12, 2024 at 15:43
EwanC force-pushed the maxime/cuda-large-fill-pattern branch from ee408be to 81f142c on March 14, 2024 at 14:15
EwanC added a commit to reble/llvm that referenced this pull request on Mar 14, 2024:
Adjustment of value pointer size according to pattern size. Large patterns are now broken into 1-byte chunks, as in the regular implementation.
EwanC force-pushed the maxime/cuda-large-fill-pattern branch from 81f142c to ef72b3f on March 14, 2024 at 16:40
bader pushed a commit to intel/llvm that referenced this pull request on Mar 14, 2024:
Graph support in the CUDA backend for graph buffer fill nodes has been improved in UR PR oneapi-src/unified-runtime#1319.
Co-authored-by: Ewan Crawford <ewan@codeplay.com>
Co-authored-by: Kenneth Benzie (Benie) <k.benzie@codeplay.com>
The value pointer size is now adjusted according to the pattern size.
Large patterns are broken into 1-byte chunks, as in the regular implementation.
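For context, here is a minimal sketch of how such a lowering can look with the CUDA driver API. This is not the adapter's actual code: the function name `addFillNodes` and its parameters are hypothetical, error handling is omitted, and the fill size is assumed to be a multiple of the pattern size. Small patterns (1, 2, or 4 bytes) map onto a single graph memset node; larger patterns are decomposed into one strided 1-byte memset node per pattern byte.

```cpp
// Hypothetical sketch: lowering a command-buffer fill to CUDA graph memset
// nodes. Error checking of cuGraphAddMemsetNode results is omitted.
#include <cuda.h>
#include <cstdint>
#include <cstring>

void addFillNodes(CUgraph Graph, CUdeviceptr DstPtr, const void *Pattern,
                  size_t PatternSize, size_t Size, CUcontext Ctx) {
  if (PatternSize == 1 || PatternSize == 2 || PatternSize == 4) {
    // Small patterns fit the native elementSize of a memset node, so a
    // single node covers the whole fill.
    unsigned Value = 0;
    std::memcpy(&Value, Pattern, PatternSize);

    CUDA_MEMSET_NODE_PARAMS Params = {};
    Params.dst = DstPtr;
    Params.elementSize = static_cast<unsigned>(PatternSize);
    Params.value = Value;
    Params.width = Size / PatternSize; // number of elements to write
    Params.height = 1;
    Params.pitch = 0; // unused when height == 1

    CUgraphNode Node;
    cuGraphAddMemsetNode(&Node, Graph, nullptr, 0, &Params, Ctx);
    return;
  }

  // Large patterns: one strided 1-byte memset node per pattern byte.
  // Byte I of the pattern is written at offsets I, I + PatternSize,
  // I + 2 * PatternSize, ... by expressing the stride via pitch/height.
  const size_t NumRepeats = Size / PatternSize;
  const auto *Bytes = static_cast<const uint8_t *>(Pattern);

  for (size_t I = 0; I < PatternSize; ++I) {
    CUDA_MEMSET_NODE_PARAMS Params = {};
    Params.dst = DstPtr + I;
    Params.elementSize = 1;
    Params.value = Bytes[I];
    Params.width = 1;           // one byte per "row"
    Params.height = NumRepeats; // one row per pattern repetition
    Params.pitch = PatternSize; // byte stride between consecutive rows

    CUgraphNode Node;
    cuGraphAddMemsetNode(&Node, Graph, nullptr, 0, &Params, Ctx);
  }
}
```

Using pitch and height this way expresses a strided 1-byte write per pattern byte, so arbitrary pattern sizes can still be represented with native graph memset nodes rather than a custom fill kernel.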