From 9fcb4c6961b1db36186a5ae137f5ad4ff2dca9ae Mon Sep 17 00:00:00 2001
From: Victor Perez
Date: Thu, 7 Mar 2024 12:50:23 +0100
Subject: [PATCH] [SYCL][Doc] Non semantic changes to `sycl_ext_oneapi_private_alloca` (#12908)

- Rename some C++ names used in the text for clarification
- Minor rephrasing
- Add example creating a `sycl::span` out of the returned pointer

---------

Signed-off-by: Victor Perez
---
 .../sycl_ext_oneapi_private_alloca.asciidoc | 49 +++++++++++++++----
 1 file changed, 39 insertions(+), 10 deletions(-)

diff --git a/sycl/doc/extensions/experimental/sycl_ext_oneapi_private_alloca.asciidoc b/sycl/doc/extensions/experimental/sycl_ext_oneapi_private_alloca.asciidoc
index fbf37679bcb99..60ca0cb610f4e 100644
--- a/sycl/doc/extensions/experimental/sycl_ext_oneapi_private_alloca.asciidoc
+++ b/sycl/doc/extensions/experimental/sycl_ext_oneapi_private_alloca.asciidoc
@@ -120,29 +120,29 @@ using the `private_alloca` API defined in the following sections.
 [source,c++]
 ----
 namespace sycl::ext::oneapi::experimental {
-template private_ptr private_alloca(kernel_handler &kh);
 
-template private_ptr aligned_private_alloca(kernel_handler &kh);
 } // namespace sycl::ext::oneapi::experimental
 ----
 
-_Mandates_: `ElementType` must be a cv-unqualified trivial type and `SpecName`
-must be a reference to a specialization constant of integral `value_type`. In
-the case of `aligned_private_alloca`, `Alignment` must be an alignment value and
-must be a positive multiple of `alignof(ElementType)`. If `Alignment` is an
-extended alignment, it must be supported by the implementation.
+_Mandates_: `ElementType` must be a cv-unqualified trivial type and
+`SizeSpecName` must be a reference to a specialization constant of integral
+type. In the case of `aligned_private_alloca`, `Alignment` must be an alignment
+value and must be a positive multiple of `alignof(ElementType)`. If `Alignment`
+is an extended alignment, it must be supported by the implementation.
 
-_Effects_: `h.get_specialization_constant()` elements of type
+_Effects_: `kh.get_specialization_constant()` elements of type
 `ElementType` are allocated and default initialized in private memory.
 
 _Returns_: A pointer to a default initialized region of private memory of
-`h.get_specialization_constant()` elements of type
+`kh.get_specialization_constant()` elements of type
 `ElementType`. `DecorateAddress` defines whether the returned `multi_ptr` is
 decorated. In the case of `private_alloca`, the pointer is suitably aligned for
 an object of type `ElementType`. In the case of `aligned_private_alloca`, the
@@ -151,7 +151,7 @@ pointer is aligned to the specified `Alignment`.
 _Remarks_: In case of private memory exhaustion, the implementation must report
 an error in the same fashion as if the allocation size were static. In case of a
 successful call, allocated memory has automatic storage duration. Additionally,
-`SpecName` must have a default value of at least 1 and not be set to a value
+`SizeSpecName` must have a default value of at least 1 and not be set to a value
 less than 1 during program execution. Violation of these conditions results in
 undefined behaviour.
 
@@ -213,6 +213,35 @@ void run(queue q, const float *in, float *out, size_t n) {
   });
 ----
 
+=== Usage with `sycl::span`
+
+In this section, we show an example of how users could use this extension with
+`sycl::span` as a `std::array` replacement:
+
+[source,c++]
+----
+constexpr specialization_id size(1);
+
+class Kernel;
+
+// Counterpart to 'impl' in the first example using 'sycl::span'
+SYCL_EXTERNAL void impl(const float *in, float *out,
+                        sycl::span ptr);
+
+void run(queue q, const float *in, float *out, size_t n) {
+  q.submit([&](handler &h) {
+    h.set_specialization_constant(n);
+    h.parallel_for(n, [=](id<1> i, kernel_handler kh) {
+      // Create sycl::span with the returned pointer and the specialization
+      // constant used as size.
+      sycl::span tmp{
+          private_alloca(kh).get_raw(),
+          kh.get_specialization_constant()};
+      impl(in, out, tmp);
+    });
+  });
+----
+
 == Design constraints
 
 The big design constraint stems from the unknown allocation size at compile