Skip to content

Commit

Permalink
[SYCL][Doc] Add specialization constant-length alloca extension propo…
Browse files Browse the repository at this point in the history
…sal (#12660)

Document extension proposal for specialization constant length private
memory allocations. Users will be able to perform dynamic memory
allocations using specialization constants and new `private_alloca` and
`aligned_private_alloca` functions returning a `private_ptr` to a memory
region of automatic storage duration.

This is included as an experimental extension as implementation will
shortly follow once the extension is approved.

---------

Signed-off-by: Victor Perez <victor.perez@codeplay.com>
  • Loading branch information
victor-eds authored Feb 28, 2024
1 parent e19076a commit aaf7a58
Showing 1 changed file with 266 additions and 0 deletions.
266 changes: 266 additions & 0 deletions sycl/doc/extensions/experimental/sycl_ext_private_alloca.asciidoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,266 @@
= sycl_ext_oneapi_private_alloca

:source-highlighter: coderay
:coderay-linenums-mode: table

// This section needs to be after the document title.
:doctype: book
:toc2:
:toc: left
:encoding: utf-8
:lang: en
:dpcpp: pass:[DPC++]
:cpp: pass:[C++]

// Set the default source code type in this document to C++,
// for syntax highlighting purposes. This is needed because
// docbook uses c++ and html5 uses cpp.
:language: {basebackend@docbook:c++:cpp}


== Notice

[%hardbreaks]
Copyright (C) Codeplay Software Limited. All rights reserved.

Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks
of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by
permission by Khronos.


== Contact

To report problems with this extension, please open a new issue at:

https://github.com/intel/llvm/issues


== Dependencies

This extension is written against the SYCL 2020 revision 8 specification. All
references below to the "core SYCL specification" or to section numbers in the
SYCL specification refer to that revision.


== Status

This is an experimental extension specification, intended to provide early
access to features and gather community feedback. Interfaces defined in this
specification are implemented in {dpcpp}, but they are not finalized and may
change incompatibly in future versions of {dpcpp} without prior notice.
*Shipping software products should not rely on APIs defined in this
specification.*

== Backend support status

The APIs in this extension may be used only on a device that has
`aspect::ext_oneapi_private_alloca`. The application must check that the device
has this aspect before submitting a kernel using any of the APIs in this
extension. If the application fails to do this, the implementation throws a
synchronous exception with the `errc::kernel_not_supported` error code when the
kernel is submitted to the queue.

== Overview

{cpp} arrays and `std::array` can be used in SYCL code to represent fixed-size
sequences of objects. However, these constructs have a significant restriction:
the number of elements must be known at compile time. In host-code context,
users can make use of dynamic memory allocations, e.g., `std::vector`, but this
is not the case in SYCL device code.

SYCL specialization constants (SYCL Section 4.9.5.) can be used to represent
constants whose values can be set dynamically during the execution of a SYCL
application, but that will not change when a SYCL kernel function is
invoked. This way, specialization constants could be used to implement SYCL
private arrays whose size is given during the execution of the SYCL
application. There is no possible way of implementing this using `std::array`,
as the size of such containers must be known at compile time, so we propose to
define a new `private_alloca` function whose size is specified using SYCL
specialization constants.

== Specification

=== Feature test macro

This extension provides a feature-test macro as described in the core SYCL
specification. An implementation supporting this extension must predefine the
macro `SYCL_EXT_ONEAPI_PRIVATE_ALLOCA` to one of the values defined in the table
below. Applications can test for the existence of this macro to determine if the
implementation supports this feature, or applications can test the macro's value
to determine which of the extension's features the implementation supports.

[%header,cols="1,5"]
|===
|Value
|Description

|1
|The APIs of this experimental extension are not versioned, so the
feature-test macro always has this value.
|===

=== New aspect for specialization constant-length allocations

This extension adds a new device aspect:

[source,c++]
----
namespace sycl {
enum class aspect : /*unspecified*/ {
ext_oneapi_private_alloca
};
} // namespace sycl
----

The `ext_oneapi_private_alloca` aspect indicates that the device is capable of
using the `private_alloca` API defined in the following sections.

=== The `private_alloca` API

[source,c++]
----
namespace sycl::ext::oneapi::experimental {
template <typename ElementType, auto &SpecName,
access::decorated DecorateAddress>
private_ptr<ElementType, DecorateAddress>
private_alloca(kernel_handler &kh);
template <typename ElementType, std::size_t Alignment, auto &SpecName,
access::decorated DecorateAddress>
private_ptr<ElementType, DecorateAddress>
aligned_private_alloca(kernel_handler &kh);
} // namespace sycl::ext::oneapi::experimental
----

_Mandates_: `ElementType` must be a cv-unqualified trivial type and `SpecName`
must be a reference to a specialization constant of integral `value_type`. In
the case of `aligned_private_alloca`, `Alignment` must be an alignment value and
must be a positive multiple of `alignof(ElementType)`. If `Alignment` is an
extended alignment, it must be supported by the implementation.

_Effects_: `h.get_specialization_constant<size>()` elements of type
`ElementType` are allocated and default initialized in private memory.

_Returns_: A pointer to a default initialized region of private memory of
`h.get_specialization_constant<size>()` elements of type
`ElementType`. `DecorateAddress` defines whether the returned `multi_ptr` is
decorated. In the case of `private_alloca`, the pointer is suitably aligned for
an object of type `ElementType`. In the case of `aligned_private_alloca`, the
pointer is aligned to the specified `Alignment`.

_Remarks_: In case of private memory exhaustion, the implementation must report
an error in the same fashion as if the allocation size were static. In case of a
successful call, allocated memory has automatic storage duration. Additionally,
`SpecName` must have a default value of at least 1 and not be set to a value
less than 1 during program execution. Violation of these conditions results in
undefined behaviour.

== Example usage

This non-normative section shows some example usages of the extension.

=== Basic usage

[source,c++]
----
constexpr specialization_id<int> size(1);
class Kernel;
SYCL_EXTERNAL void impl(const float *in, float *out, size_t n,
decorated_private_ptr<float> ptr);
void run(queue q, const float *in, float *out, size_t n) {
q.submit([&](handler &h) {
h.set_specialization_constant<size>(n);
h.parallel_for<Kernel>(n, [=](id<1> i, kernel_handler kh) {
// Allocate memory for 'n' 'float's
auto ptr = private_alloca<float, size, access::decorated::yes>(kh);
// Use pointer in implementation
impl(in, out, kh.get_specialization_constant<size>(), ptr);
});
});
----

=== Storage duration clarification

The following example is intended to clarify storage duration of memory
allocated by `private_alloca`.

[source,c++]
----
constexpr specialization_id<int> size(1);
class Kernel;
SYCL_EXTERNAL void impl(const float *in, float *out, size_t n,
raw_private_ptr<float> ptr);
void run(queue q, const float *in, float *out, size_t n) {
q.submit([&](handler &h) {
h.set_specialization_constant<size>(n);
h.parallel_for<Kernel>(n, [=](id<1> i, kernel_handler kh) {
raw_private_ptr<float> ptr;
{
ptr = private_alloca<float, size, access::decorated::no>(kh);
// 'private_alloca' has allocated a private memory region we can use in
// this block.
impl(in, out, kh.get_specialization_constant<size>(), ptr);
}
// Memory allocated by 'private_alloca' has been deallocated.
// Dereferencing 'ptr' at this program point is undefined behaviour.
});
});
----

== Design constraints

The big design constraint stems from the unknown allocation size at compile
time. {cpp} does not support variable length arrays and complete type sizes must
be known at compile time. Thus, the free function interface returning a pointer
to private memory is the better way to represent this construct in
{cpp}. Lifetime of the underlying memory region was a concern too, but the
current design with automatic storage duration for the allocated memory region
closely follows what the user would get from a stack-allocated array.

== Issues

=== Default `DecorateAddress` value

At the time this extension was first proposed, there was no consensus for a
default value for `sycl::access::decorate` in SYCL. The SYCL specification
chooses `sycl::access::decorate::legacy` to avoid making breaking changes, but
this would not justify using that value in this extension.

Although it would be desirable to have one, the `private_alloca` extension will
not commit to a default value until the SYCL community has come to an agreement.

=== Passing size as an argument

Initial design passes size as a `sycl::specialization_id<Integral> &` template
argument and receives a `sycl::kernel_handler &` as an argument. This decision
comes from the current situation in which `sycl::specialization_id` is a unique
identifier to represent a specialization constant and `sycl::kernel_handler` is
used to query the *value* of specialization constants with
`sycl::kernel_handler::get_specialization_constant`. Having a
`sycl::specialization_constant` class representing specialization constants
would enable cleaner interfaces to this function like:

[source,c++]
----
namespace sycl::ext::oneapi::experimental {
template <typename ElementType, access::decorated DecorateAddress>
private_ptr<ElementType, DecorateAddress>
private_alloca(const specialization_constant<std::size_t> &size);
} // namespace sycl::ext::oneapi::experimental
----

== Revision history

[cols="5,15,15,70"]
[grid="rows"]
[options="header"]
|========================================
|Rev|Date|Authors|Changes
|1|2024-02-08|Victor Lomüller, Lukas Sommer, Victor Perez, Julian Oppermann, Tadej Ciglaric, Romain Biessy|*Initial draft*
|========================================

0 comments on commit aaf7a58

Please sign in to comment.