From 7cfec8894ba07dffe8acde890755e76ac6ab09ba Mon Sep 17 00:00:00 2001
From: Michael Aziz
Date: Wed, 4 Oct 2023 13:37:19 -0700
Subject: [PATCH 1/3] Create test plan for `auto_local_range` extension

Signed-off-by: Michael Aziz
---
 test_plans/auto_local_range.asciidoc | 37 ++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)
 create mode 100644 test_plans/auto_local_range.asciidoc

diff --git a/test_plans/auto_local_range.asciidoc b/test_plans/auto_local_range.asciidoc
new file mode 100644
index 000000000..ef4d447b5
--- /dev/null
+++ b/test_plans/auto_local_range.asciidoc
@@ -0,0 +1,37 @@
+:sectnums:
+:xrefstyle: short
+
+= Test plan for sycl_ext_oneapi_auto_local_range
+
+This is a test plan for the API described in
+https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_auto_local_range.asciidoc[sycl_ext_oneapi_auto_local_range].
+
+
+== Testing scope
+
+=== Device coverage
+
+All of the tests described below are performed only on the default device that
+is selected on the CTS command line.
+
+=== Feature test macro
+
+All of the tests should use `#ifdef SYCL_EXT_ONEAPI_AUTO_LOCAL_RANGE` so they can be skipped
+if feature is not supported.
+
+== Tests
+
+* All following tests run with `Dimensions` = 1, 2, 3
+
+=== auto_range function
+
+Check that `auto_range()` return type is `range`
+
+=== auto_range in parallel_for
+
+For following `parallel_for` functions:
+
+* `queue::parallel_for`
+* `handler::parallel_for`
+
+Check that `auto_range()` as the local range parameter behaves as expected: when each work item writes 1 into zero-initialized global memory all array values read after the `parallel_for` invocation are equal to 1 and the number of array values is equal to the global range.
From c23d4ec293a5187ac5e8baefe404ac9e401aeb14 Mon Sep 17 00:00:00 2001
From: 0x12CC <68250218+0x12CC@users.noreply.github.com>
Date: Thu, 5 Oct 2023 09:47:28 -0400
Subject: [PATCH 2/3] Update test_plans/auto_local_range.asciidoc

Co-authored-by: Greg Lueck
---
 test_plans/auto_local_range.asciidoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/test_plans/auto_local_range.asciidoc b/test_plans/auto_local_range.asciidoc
index ef4d447b5..6e390b1f9 100644
--- a/test_plans/auto_local_range.asciidoc
+++ b/test_plans/auto_local_range.asciidoc
@@ -34,4 +34,4 @@ For following `parallel_for` functions:
 * `queue::parallel_for`
 * `handler::parallel_for`
 
-Check that `auto_range()` as the local range parameter behaves as expected: when each work item writes 1 into zero-initialized global memory all array values read after the `parallel_for` invocation are equal to 1 and the number of array values is equal to the global range.
+Check that `auto_range()` as the local range parameter behaves as expected: when each work item writes 1 into zero-initialized global memory all array values read after the `parallel_for` invocation are equal to 1.

From c6c75674563c288a4cd185042e0c8f2513fc5d48 Mon Sep 17 00:00:00 2001
From: Michael Aziz
Date: Tue, 10 Oct 2023 10:20:18 -0700
Subject: [PATCH 3/3] Add more complicated test

Signed-off-by: Michael Aziz
---
 test_plans/auto_local_range.asciidoc | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/test_plans/auto_local_range.asciidoc b/test_plans/auto_local_range.asciidoc
index 6e390b1f9..d4dfabab0 100644
--- a/test_plans/auto_local_range.asciidoc
+++ b/test_plans/auto_local_range.asciidoc
@@ -34,4 +34,12 @@ For following `parallel_for` functions:
 * `queue::parallel_for`
 * `handler::parallel_for`
 
-Check that `auto_range()` as the local range parameter behaves as expected: when each work item writes 1 into zero-initialized global memory all array values read after the `parallel_for` invocation are equal to 1.
+Check that a kernel launched using `auto_range()` as the local range behaves as expected and can use group APIs: Create a local accumulator in each work item to sum values from an input buffer. Get the total input sum using `sycl::reduce_over_group` to accumulate the partial sums from the work items within the group. Check that this total has the expected value. Example kernel:
+
+```
+int local_accumulator = 0;
+for (int i = g.get_local_linear_id(); i < N; i += g.get_local_linear_range()) {
+  local_accumulator += input[i];
+}
+int total = sycl::reduce_over_group(g, local_accumulator, sycl::plus<>());
+```
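
Review note on the expected value in PATCH 3/3: the local size is chosen by the implementation when `auto_range()` is used, so the check only works if the total is the same for every possible work-group size. The following is a minimal host-side sketch of that property in plain C++, not SYCL code and not part of the patch. The names `strided_group_sum`, `expected_sum`, and the `local_size` parameter are hypothetical and stand in for the kernel loop, the reference value, and the implementation-chosen local range respectively.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Host-side model of one work-group running the example kernel.
// `local_size` is a hypothetical stand-in for the work-group size the
// implementation would pick for auto_range(); it must be at least 1.
int strided_group_sum(const std::vector<int>& input, std::size_t local_size) {
  std::vector<int> partial(local_size, 0);  // one accumulator per work item
  const std::size_t n = input.size();
  // Each simulated work item starts at its linear id and strides by the
  // linear size of the group, as in the kernel loop.
  for (std::size_t lid = 0; lid < local_size; ++lid) {
    for (std::size_t i = lid; i < n; i += local_size) {
      partial[lid] += input[i];
    }
  }
  // Stand-in for sycl::reduce_over_group(g, local_accumulator, sycl::plus<>()).
  return std::accumulate(partial.begin(), partial.end(), 0);
}

// Reference value a test could compare the group total against.
int expected_sum(const std::vector<int>& input) {
  return std::accumulate(input.begin(), input.end(), 0);
}
```

Under this model, `strided_group_sum(input, local_size)` equals `expected_sum(input)` for any `local_size >= 1`, including sizes larger than the input, which suggests the test can compare against a plain sum of the input without knowing the local range that was selected.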