[SYCL][DOC] Add draft of sycl_ext_oneapi_async_memcpy #9439
Open: dkhaldi wants to merge 2 commits into intel:sycl from dkhaldi:sycl_ext_oneapi_async_memcpy
175 changes: 175 additions & 0 deletions
sycl/doc/extensions/proposed/sycl_ext_oneapi_async_memcpy.asciidoc
@@ -0,0 +1,175 @@ | ||
= sycl_ext_oneapi_async_memcpy | ||
|
||
:source-highlighter: coderay | ||
:coderay-linenums-mode: table | ||
|
||
// This section needs to be after the document title. | ||
:doctype: book | ||
:toc2: | ||
:toc: left | ||
:encoding: utf-8 | ||
:lang: en | ||
:dpcpp: pass:[DPC++] | ||
|
||
// Set the default source code type in this document to C++, | ||
// for syntax highlighting purposes. This is needed because | ||
// docbook uses c++ and html5 uses cpp. | ||
:language: {basebackend@docbook:c++:cpp} | ||
|
||
|
||
== Notice | ||
|
||
[%hardbreaks] | ||
Copyright (C) 2023-2023 Intel Corporation. All rights reserved. | ||
|
||
Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks | ||
of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by | ||
permission by Khronos. | ||
|
||
|
||
== Contact | ||
|
||
To report problems with this extension, please open a new issue at: | ||
|
||
https://github.com/intel/llvm/issues | ||
|
||
|
||
== Dependencies | ||
|
||
This extension is written against the SYCL 2020 revision 7 specification. All | ||
references below to the "core SYCL specification" or to section numbers in the | ||
SYCL specification refer to that revision. | ||
|
||
This extension also depends on the following SYCL extensions: | ||
|
||
* link:https://github.com/intel/llvm/pull/9186/[sycl_ext_oneapi_barrier] | ||
|
||
== Status | ||
|
||
This is a proposed extension specification, intended to gather community | ||
feedback. Interfaces defined in this specification may not be implemented yet | ||
or may be in a preliminary state. The specification itself may also change in | ||
incompatible ways before it is finalized. *Shipping software products should | ||
not rely on APIs defined in this specification.* | ||
|
||
|
||
== Overview | ||
|
||
This extension defines the | |
`sycl::ext::oneapi::experimental::async_memcpy` and | |
`sycl::ext::oneapi::experimental::async_copy` free functions to | |
generalize and replace the current `sycl::async_work_group_copy` | |
function. | |
|
||
== Specification | ||
|
||
=== Feature test macro | ||
|
||
This extension provides a feature-test macro as described in the core SYCL | ||
specification. An implementation supporting this extension must predefine the | ||
macro `SYCL_EXT_ONEAPI_ASYNC_MEMCPY` to one of the values defined in the table | ||
below. Applications can test for the existence of this macro to determine if | ||
the implementation supports this feature, or applications can test the macro's | ||
value to determine which of the extension's features the implementation | ||
supports. | ||
|
||
[%header,cols="1,5"] | ||
|=== | ||
|Value | ||
|Description | ||
|
||
|1 | ||
|The APIs of this experimental extension are not versioned, so the | ||
feature-test macro always has this value. | ||
|=== | ||
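The macro check described above can be sketched in plain C++. This is a minimal illustration, not part of the proposal: the names `has_async_memcpy`, `async_memcpy_version`, and `use_async_path` are hypothetical, and the code assumes only that a supporting implementation predefines `SYCL_EXT_ONEAPI_ASYNC_MEMCPY` as stated in this section.

```cpp
#include <cassert>

// SYCL_EXT_ONEAPI_ASYNC_MEMCPY is predefined only by implementations that
// support the extension; everything else here is a hypothetical helper.
#ifdef SYCL_EXT_ONEAPI_ASYNC_MEMCPY
constexpr bool has_async_memcpy = true;
constexpr int async_memcpy_version = SYCL_EXT_ONEAPI_ASYNC_MEMCPY;
#else
constexpr bool has_async_memcpy = false;
constexpr int async_memcpy_version = 0;  // 0 means "extension not available"
#endif

// Decide at compile time whether the extension's code path may be used.
// The table above defines a single value (1), so ">= 1" covers it.
constexpr bool use_async_path() {
  return has_async_memcpy && async_memcpy_version >= 1;
}
```

An application could branch on `use_async_path()` (or directly on the `#ifdef`) and fall back to `sycl::async_work_group_copy` when the extension is unavailable.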
|
||
|
||
=== `async_memcpy` and `async_copy` Functions | ||
`sycl::ext::oneapi::experimental::async_memcpy` is a free function | |
that asynchronously copies `numBytes` bytes from the source pointer | |
`src` to the destination pointer `dest`. | |
`sycl::ext::oneapi::experimental::async_copy` asynchronously copies | |
`count` elements of type `T` from `src` to `dest`. Both functions | |
also take a barrier object of type | |
`sycl::ext::oneapi::experimental::barrier` that can be used to wait | |
for completion of the copy. | |
|
||
This extension provides two variants of each function: a work-item | |
variant (`async_memcpy`, `async_copy`) and a group variant | |
(`group_async_memcpy`, `group_async_copy`) that takes an additional | |
`Group` template parameter and argument. The group variant is issued | |
by all the work-items in the group. It is a _group function_, as | |
defined in Section 4.17.3 of the SYCL specification. The work-item | |
variant is issued by the current work-item only. | |
|
||
[source,c++] | ||
---- | ||
namespace sycl::ext::oneapi::experimental { | ||
|
||
template <sycl::memory_scope Scope, class CompletionFunction> | ||
void async_memcpy(void* dest, const void* src, size_t numBytes, | ||
barrier<Scope, CompletionFunction> &bar); | ||
|
||
template <typename T, sycl::memory_scope Scope, class | ||
CompletionFunction> | ||
void async_copy(T* dest, const T *src, size_t count, | ||
barrier<Scope, CompletionFunction> &bar); | ||
|
||
template <typename Group, sycl::memory_scope Scope, class | ||
CompletionFunction> | ||
void group_async_memcpy(Group g, void* dest, const void* src, size_t | ||
numBytes, barrier<Scope, CompletionFunction> &bar); | ||
|
||
template <typename Group, typename T, sycl::memory_scope Scope, | ||
class CompletionFunction> | ||
void group_async_copy(Group g, T* dest, const T *src, size_t count, | ||
barrier<Scope, CompletionFunction> &bar); | ||
|
||
} // namespace sycl::ext::oneapi::experimental | ||
---- | ||
|
||
**Preconditions**: `T` must be a trivially copyable type. | ||
In the group variant APIs, the parameters `dest`, `src`, `count` or | ||
`numBytes`, and `bar` must be the same across all the work-items in | ||
`Group`. | ||
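The relationship between the byte-oriented and element-oriented signatures, and the trivially-copyable precondition, can be sketched with synchronous host-side stand-ins. This is only an illustration in standard C++: `memcpy_like` and `copy_like` are hypothetical names, they are not asynchronous, and they take no barrier argument.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for the byte-oriented form:
// untyped pointers plus a size in bytes.
inline void memcpy_like(void* dest, const void* src, std::size_t numBytes) {
  std::memcpy(dest, src, numBytes);
}

// Hypothetical stand-in for the element-oriented form:
// typed pointers plus an element count.
template <typename T>
void copy_like(T* dest, const T* src, std::size_t count) {
  // Mirrors the precondition: T must be trivially copyable.
  static_assert(std::is_trivially_copyable<T>::value,
                "T must be a trivially copyable type");
  // Copying `count` elements is the same as copying count * sizeof(T) bytes.
  memcpy_like(dest, src, count * sizeof(T));
}
```

Under this reading, `async_copy(dest, src, count, bar)` moves the same data as `async_memcpy(dest, src, count * sizeof(T), bar)`; the typed form simply lets the compiler enforce the precondition on `T`.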
|
||
=== `async_copy` Example | |
|
||
[source,c++] | ||
---- | ||
using wg_barrier = barrier<sycl::memory_scope::work_group>; | ||
T *psrc = malloc_shared<T>(N, q); | ||
T *pdest = malloc_shared<T>(N, q); | ||
|
||
q.parallel_for(..., [=](sycl::nd_item<1> it) { | |
|
||
// Allocate memory for and construct the barrier | ||
auto bar = | |
sycl::ext::oneapi::group_local_memory<wg_barrier>(it.get_group(), | ||
nthreads); | ||
|
||
async_copy(pdest, psrc, N, *bar); | |
// Use the barrier | ||
bar->arrive_and_wait(); | ||
|
||
}).wait(); | ||
---- | ||
|
||
=== `group_async_copy` Example | |
|
||
[source,c++] | ||
---- | ||
using wg_barrier = barrier<sycl::memory_scope::work_group>; | ||
T *psrc = malloc_shared<T>(N, q); | ||
T *pdest = malloc_shared<T>(N, q); | ||
|
||
q.parallel_for(..., [=](sycl::nd_item<1> it) { | |
|
||
// Allocate memory for and construct the barrier | ||
auto bar = | |
sycl::ext::oneapi::group_local_memory<wg_barrier>(it.get_group(), | ||
nthreads); | ||
|
||
group_async_copy(it.get_group(), pdest, psrc, N, *bar); | |
// Use the group barrier wait | ||
group_arrive_and_wait(it.get_group(), bar); | ||
|
||
}).wait(); | ||
---- |
I think it would be a good idea to align with SYCL 2020's USM copy functions here. For USM, `memcpy` accepts `void*` and a `numBytes`, whereas `copy` accepts `T*` and a `count`. The function you've defined here is called `memcpy` but accepts `T*`.

My recommendation would be to define both `async_memcpy` and `async_copy`.