Merge remote-tracking branch 'upstream/sycl' into 'sycl-mlir' #12293
Merged
Conversation
Add four extra option resets in `disableUnsupportedOptions` when parsing SYCL code. All four are needed to make clangd happily parse code when `-fsycl` is specified in compile commands. Without each of them, the following happens:

* without `SYCLIsDevice=false`: crash on !nullptr assert
* without `DeclareSPIRVBuiltins=false`: crash on !nullptr assert
* without `SYCLUnnamedLambda=false`: `ref_non_value` error, `'T' does not refer to a value`
* without `Triple=HostTriple`: `pp_file_not_found` error, `'gnu/stubs-32.h' file not found`

Fixes intel#11088 and clangd/clangd#1097

Also add a LIT test which parses the following code:

```cpp
#include <sycl/sycl.hpp>
sycl::queue q{};
```

and checks `symbolInfo` for the symbol `q` against a reference.
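As a rough illustration of what these resets amount to (a sketch under assumptions: the helper name and the exact `CompilerInvocation` accessors below are not taken from the patch), the idea is to clear the SYCL-device-only language options and retarget parsing at the host triple:

```cpp
#include "clang/Frontend/CompilerInvocation.h"
#include "llvm/ADT/StringRef.h"

// Hypothetical sketch of the option resets described above; the function name
// and accessor spellings are assumptions, not the actual clangd patch.
static void resetSYCLOptionsForClangd(clang::CompilerInvocation &CI,
                                      llvm::StringRef HostTriple) {
  clang::LangOptions &LangOpts = CI.getLangOpts();
  LangOpts.SYCLIsDevice = false;         // avoid device-only parsing paths
  LangOpts.DeclareSPIRVBuiltins = false; // avoid SPIR-V builtin declarations
  LangOpts.SYCLUnnamedLambda = false;    // avoid unnamed-lambda kernel naming
  CI.getTargetOpts().Triple = HostTriple.str(); // parse against the host target
}
```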
llvm/llvm-project#73077 added the `-Wswitch-default` diagnostic, but it produced false positives in templates. This PR addresses that. llvm/llvm-project#75943
We can't distinguish use-after-return (UAR) from use-after-scope (UAS), but by definition a UAR is already a UAS.
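A minimal, hypothetical C++ illustration (not from the patch) of why the two categories overlap: returning a pointer to a local makes any later dereference a use-after-return, and since the local's scope has also ended, it is simultaneously a use-after-scope.

```cpp
#include <cstdio>

// Hypothetical example: 'x' goes out of scope when leak_local returns, so any
// later dereference of the returned pointer is a use-after-return (UAR) and,
// by the same token, a use-after-scope (UAS).
int *leak_local() {
  int x = 42;
  return &x; // dangling once the stack frame is gone
}

int main() {
  int *p = leak_local();
  std::printf("%d\n", *p); // undefined behavior; a sanitizer would flag this
}
```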
Similar to `vector.transfer_read`/`vector.transfer_write`, allow 0-D vectors. This commit fixes `mlir/test/Dialect/Vector/vector-transfer-to-vector-load-store.mlir` when verifying the IR after each pattern (#74270). That test produces a temporary 0-D load/store op.
The runners-32 group is broken, for reasons... The easiest fix is to move the jobs to runners-8 (which needs to be renamed, because they're all actually 30-core machines).
`VEX_4V` does not look simpler than `VEX, VVVV`. It's kind of confusing because classes like `VEX_L` and `VEX_LIG` do not imply `VEX`, but `VEX_4V` does. For APX, we have promoted EVEX, NDD, NF and NDD_NF instructions. All four variants are in EVEX space, and NDD/NDD_NF set the VVVV fields. To extract the common fields (e.g. EVEX) into a class and set VVVV conditionally, we need VVVV not to imply other prefixes.
This change adds (runtime) bounds checks for `memref` ops using the existing `RuntimeVerifiableOpInterface`. For `memref.load` and `memref.store`, we check that the indices are in-bounds of the memref's index space. For `memref.reinterpret_cast` and `memref.subview` we check that the resulting address space is in-bounds of the input memref's address space.
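As a loose C++ analogue of what the generated verification amounts to for `memref.load`/`memref.store` (a sketch only; the pass emits MLIR runtime checks, and the names below are illustrative assumptions), each index is compared against the corresponding dimension size:

```cpp
#include <array>
#include <cstdio>
#include <cstdlib>

// Sketch of the in-bounds condition: every index must lie in [0, shape[d]).
template <std::size_t Rank>
void verifyInBounds(const std::array<std::size_t, Rank> &indices,
                    const std::array<std::size_t, Rank> &shape) {
  for (std::size_t d = 0; d < Rank; ++d) {
    if (indices[d] >= shape[d]) {
      std::fprintf(stderr, "out-of-bounds access in dimension %zu\n", d);
      std::abort();
    }
  }
}

int main() {
  verifyInBounds<2>({1, 2}, {4, 4}); // in bounds: passes silently
  verifyInBounds<2>({5, 0}, {4, 4}); // out of bounds: reports and aborts
}
```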
…74686) `clang -g -gpubnames -fdebug-types-section` now emits a .debug_names section with references to local type unit entries defined in COMDAT .debug_info sections.

```
.section .debug_info,"G",@progbits,5657452045627120676,comdat
.Ltu_begin0:
...
.section .debug_names,"",@progbits
...
// DWARF32
.long .Ltu_begin0  # Type unit 0
// DWARF64
// .long .Ltu_begin0  # Type unit 0
```

When `.Ltu_begin0` is relative to a non-prevailing .debug_info section, the relocation resolves to 0, which is a valid offset within the .debug_info section.

```
cat > a.cc <<e
struct A { int x; };
inline A foo() { return {1}; }
int main() { foo(); }
e
cat > b.cc <<e
struct A { int x; };
inline A foo() { return {1}; }
void use() { foo(); }
e
clang++ -g -gpubnames -fdebug-types-section -fuse-ld=lld a.cc b.cc -o old
```

```
% llvm-dwarfdump old
...
  Local Type Unit offsets [
    LocalTU[0]: 0x00000000
  ]
...
  Local Type Unit offsets [
    LocalTU[0]: 0x00000000 // indistinguishable from a valid offset within .debug_info
  ]
```

https://dwarfstd.org/issues/231013.1.html proposes that we use a tombstone value instead to inform consumers. This patch implements the idea. The second LocalTU entry will now use 0xffffffff.

https://reviews.llvm.org/D84825 has a TODO that we should switch the tombstone value for most `.debug_*` sections to UINT64_MAX. We have postponed the change for more than three years to let consumers migrate. At some point we shall make the change, so that .debug_names is no longer different from the other debug sections (apart from .debug_loc/.debug_ranges).

Co-authored-by: Alexander Yermolovich <ayermolo@meta.com>
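A hypothetical consumer-side sketch of the intended effect (the helper below is an assumption for illustration, not llvm-dwarfdump code): a .debug_names reader can drop local type unit entries whose offset equals the DWARF32 tombstone instead of misreading offset 0 as a valid .debug_info offset.

```cpp
#include <cstdint>
#include <cstdio>
#include <optional>

// Hypothetical helper, not part of LLVM: 0xffffffff is the DWARF32 tombstone.
std::optional<uint32_t> resolveLocalTUOffset(uint32_t RawOffset) {
  constexpr uint32_t Tombstone32 = 0xffffffffu;
  if (RawOffset == Tombstone32)
    return std::nullopt; // entry refers to a non-prevailing COMDAT type unit
  return RawOffset;      // a genuine offset into .debug_info (0 is valid)
}

int main() {
  for (uint32_t Off : {0u, 0xffffffffu}) {
    if (auto Resolved = resolveLocalTUOffset(Off))
      std::printf("LocalTU at 0x%08x\n", *Resolved);
    else
      std::printf("LocalTU dropped (tombstoned)\n");
  }
}
```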
This test demonstrates how the PhysicalStorageBuffer extension can be used end-to-end in a SPIR-V module. This module has been verified to pass serialization, deserialization, and validation with spirv-val.
The legalize action for uint_to_fp/sint_to_fp uses the source integer type, not the result FP type, so setting an action on an FP type does nothing.
Issue #76202. This reverts commit f568763.
This patch removes the explicit llvm-objdump man page. By enabling Sphinx man page output with `-DLLVM_ENABLE_SPHINX=ON` and `-DSPHINX_OUTPUT_MAN=ON`, we can generate man pages for all the LLVM binary utilities from the reStructuredText documentation. Having an additional man page upstream increases fragmentation and maintenance.
After c718336, the output of print-before/print-after was changed: a semicolon was added before the banner "*** IR Dump ...", and this change broke the chunk-print-before-all script. This change makes the script more resilient to small output format variations.
This avoids a regression in a future commit
CONFLICT (content): Merge conflict in openmp/libomptarget/test/lit.cfg
R16-R31 were added to the GPRs in llvm/llvm-project#70958. This patch supports the encoding/decoding for promoted CET instructions in EVEX space. RFC: https://discourse.llvm.org/t/rfc-design-for-apx-feature-egpr-and-ndd-support/73031/4
Fusion features are also added to the processor definition.
Add sycl-rel-** to the branches supported by the pre-commit/post-commit workflows.
We split `TuneShiftedZExtFusion` into three fusions to make them reusable and to match the GCC implementation [1]. The zexth/zextw fusions can be reused by XiangShan [2] and other commercial processors, but shifted zero extension is not so common. `macro-fusions-veyron-v1.mir` is renamed so it is not tied to a specific processor.

References:
[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637303.html
[2] https://xiangshan-doc.readthedocs.io/zh_CN/latest/frontend/decode
Before this patch, there was a regression in comment formatting due to some code formatting in bd3e8eb. This was fixed in 428660c. GitHub interprets a tab before a string as starting code formatting. The message indicating that the code formatting in a PR had been fixed was refactored into a Python multi-line string, but with a tab in front, causing these messages to be rendered as code blocks on GitHub instead of as intended. This patch builds upon the original fix to reformat the strings so that they fit within ~80-character lines and are simpler to modify in the future, hopefully removing traps like the one that caused the original issue.
… (#76125) R16-R31 were added to the GPRs in llvm/llvm-project#70958. This patch supports the encoding/decoding for promoted CMPCCXADD instructions in EVEX space. RFC: https://discourse.llvm.org/t/rfc-design-for-apx-feature-egpr-and-ndd-support/73031/4
LLVM: llvm/llvm-project@bc8c4bb SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@e63fdb8
intel#12219) Co-authored-by: Artur Gainullin <artur.gainullin@intel.com>
…2211) Apply intel#11623 to `annotated_arg` class
Signed-off-by: Victor Perez <victor.perez@codeplay.com>
…onAnalysis` Latest upstream changes made data-flow analyses interprocedural. Our `setToEntryState()` function no longer works as expected, which is causing issues. Overriding this function to restore the previous behavior is thus needed for our analysis to work. Signed-off-by: Victor Perez <victor.perez@codeplay.com>
Update tests to use `scf.reduce` instead of `scf.yield` in `scf.parallel` operations. Signed-off-by: Victor Perez <victor.perez@codeplay.com>
…s()` Define a shortcut to get the alias analysis used in the accessor analysis in order to query aliasing between accessors.
Drop following passing tests from list:

- `Basic/half_type.cpp`
- `Basic/image/image_accessor_readwrite_half.cpp`
- `Basic/image/image_write_fp16.cpp`
- `ESIMD/accessor_load_store_stateless.cpp`
- `ESIMD/api/svm_gather_scatter_64.cpp`
- `ESIMD/regression/globals.cpp`
- `InvokeSimd/Spec/ImplicitSubgroup/uniform_retval.cpp`
- `NonUniformGroups/fixed_size_group_algorithms.cpp`

Signed-off-by: Victor Perez <victor.perez@codeplay.com>
Add address space to global constants so that they can be translated to SPIR-V. Also add alignment in the process. Signed-off-by: Victor Perez <victor.perez@codeplay.com>
cgeist `-O0` E2E tests are failing. `XFAIL` them until further investigation. Signed-off-by: Victor Perez <victor.perez@codeplay.com>
- `ESIMD/fp_controls.cpp`
- `ESIMD/noinline_call_from_func.cpp`
- `ESIMD/private_memory/private_memory.cpp`
- `ESIMD/regression/bfloat16_half_vector_plus_eq_scalar.cpp`
- `ESIMD/unified_memory_api/atomic_update_acc.cpp`
- `ESIMD/unified_memory_api/atomic_update_acc_64.cpp`
- `ESIMD/unified_memory_api/atomic_update_acc_cmpxchg.cpp`
- `ESIMD/unified_memory_api/atomic_update_slm.cpp`
- `ESIMD/unified_memory_api/atomic_update_slm_acc.cpp`
- `ESIMD/unified_memory_api/atomic_update_slm_acc_cmpxchg.cpp`
- `ESIMD/unified_memory_api/atomic_update_slm_cmpxchg.cpp`
- `ESIMD/unified_memory_api/block_store_slm.cpp`
- `ESIMD/unified_memory_api/block_store_slm_acc.cpp`
- `Graph/Explicit/sub_group_prop.cpp`
- `Graph/Explicit/work_group_size_prop.cpp`
- `Graph/RecordReplay/sub_group_prop.cpp`
- `Graph/RecordReplay/work_group_size_prop.cpp`
- `InvokeSimd/Regression/address_space_cast.cpp`
- `KernelAndProgram/disable-caching.cpp`
- `Plugin/level_zero_queue_profiling.cpp`
- `Plugin/level_zero_usm_residency.cpp`

Signed-off-by: Victor Perez <victor.perez@codeplay.com>
victor-eds added the disable-lint (Skip linter check step and proceed with build jobs) and sycl-mlir (Pull requests or issues for sycl-mlir branch) labels on Jan 4, 2024.
The diff is too big to be rendered by the browser, so here's the diff for each conflicting file:

diff --git a/.github/workflows/sycl_mlir_linux_precommit.yml b/.github/workflows/sycl_mlir_linux_precommit.yml
index 23e3a20e6627..f8f64cd7a8bb 100644
--- a/.github/workflows/sycl_mlir_linux_precommit.yml
+++ b/.github/workflows/sycl_mlir_linux_precommit.yml
@@ -20,12 +22,18 @@ on:
- 'devops/containers/**'
- 'devops/actions/build_container/**'
+concurrency:
+ # Cancel a currently running workflow from the same PR, branch or tag.
+ group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
+ cancel-in-progress: true
+
jobs:
detect_changes:
uses: ./.github/workflows/sycl_detect_changes.yml
lint:
runs-on: [Linux, build]
+ if: ${{ always() && !contains(github.event.pull_request.labels.*.name, 'disable-lint') }}
container:
image: ghcr.io/intel/llvm/sycl_ubuntu2204_nightly:no-drivers
options: -u 1001:1001
@@ -56,7 +64,7 @@ jobs:
needs: [lint, detect_changes]
if: |
always()
- && (success() || contains(github.event.pull_request.labels.*.name, 'disable-lint'))
+ && (success() || needs.lint.result == 'skipped')
uses: ./.github/workflows/sycl_linux_build.yml
with:
build_ref: ${{ github.sha }}
@@ -64,6 +72,8 @@ jobs:
build_cache_root: "/__w/"
build_artifact_suffix: "default"
build_cache_suffix: "default"
+ # FIXME: UR can't find /opt/rocm/hsa with rocm-6.0.0. Previous rocm-5.7.0 had it.
+ build_image: "ghcr.io/intel/llvm/ubuntu2204_build:7ed894ab0acc8ff09262113fdb08940d22654a30"
changes: ${{ needs.detect_changes.outputs.filters }}
test:
diff --git a/mlir/lib/Dialect/LLVMIR/IR/LLVMInlining.cpp b/mlir/lib/Dialect/LLVMIR/IR/LLVMInlining.cpp
index 0cc8ea886093..65c1daee6711 100644
--- a/mlir/lib/Dialect/LLVMIR/IR/LLVMInlining.cpp
+++ b/mlir/lib/Dialect/LLVMIR/IR/LLVMInlining.cpp
@@ -50,11 +50,31 @@ static bool hasLifetimeMarkers(LLVM::AllocaOp allocaOp) {
static void
handleInlinedAllocas(Operation *call,
iterator_range<Region::iterator> inlinedBlocks) {
+ // Locate the entry block of the closest callsite ancestor that has either the
+ // IsolatedFromAbove or AutomaticAllocationScope trait. In pure LLVM dialect
+ // programs, this is the LLVMFuncOp containing the call site. However, in
+ // mixed-dialect programs, the callsite might be nested in another operation
+ // that carries one of these traits. In such scenarios, this traversal stops
+ // at the closest ancestor with either trait, ensuring visibility post
+ // relocation and respecting allocation scopes.
+ Block *callerEntryBlock = nullptr;
+ Operation *currentOp = call;
+ while (Operation *parentOp = currentOp->getParentOp()) {
+ if (parentOp->mightHaveTrait<OpTrait::IsIsolatedFromAbove>() ||
+ parentOp->mightHaveTrait<OpTrait::AutomaticAllocationScope>()) {
+ callerEntryBlock = &currentOp->getParentRegion()->front();
+ break;
+ }
+ currentOp = parentOp;
+ }
+
+ // Avoid relocating the alloca operations if the call has been inlined into
+ // the entry block already, which is typically the encompassing
+ // LLVM function, or if the relevant entry block cannot be identified.
Block *calleeEntryBlock = &(*inlinedBlocks.begin());
- Block *callerEntryBlock = &(*calleeEntryBlock->getParent()->begin());
- if (calleeEntryBlock == callerEntryBlock)
- // Nothing to do.
+ if (!callerEntryBlock || callerEntryBlock == calleeEntryBlock)
return;
+
SmallVector<std::tuple<LLVM::AllocaOp, IntegerAttr, bool>> allocasToMove;
bool shouldInsertLifetimes = false;
bool hasDynamicAlloca = false;
@@ -496,14 +516,14 @@ static void handleAccessGroups(Operation *call,
/// If `requestedAlignment` is higher than the alignment specified on `alloca`,
/// realigns `alloca` if this does not exceed the natural stack alignment.
/// Returns the post-alignment of `alloca`, whether it was realigned or not.
-static unsigned tryToEnforceAllocaAlignment(LLVM::AllocaOp alloca,
- unsigned requestedAlignment,
+static uint64_t tryToEnforceAllocaAlignment(LLVM::AllocaOp alloca,
+ uint64_t requestedAlignment,
DataLayout const &dataLayout) {
- unsigned allocaAlignment = alloca.getAlignment().value_or(1);
+ uint64_t allocaAlignment = alloca.getAlignment().value_or(1);
if (requestedAlignment <= allocaAlignment)
// No realignment necessary.
return allocaAlignment;
- unsigned naturalStackAlignmentBits = dataLayout.getStackAlignment();
+ uint64_t naturalStackAlignmentBits = dataLayout.getStackAlignment();
// If the natural stack alignment is not specified, the data layout returns
// zero. Optimistically allow realignment in this case.
if (naturalStackAlignmentBits == 0 ||
@@ -525,7 +545,7 @@ static unsigned tryToEnforceAllocaAlignment(LLVM::AllocaOp alloca,
/// the pointer, then returns the resulting post-alignment, regardless of
/// whether it was realigned or not. If no existing alignment attribute is
/// found, returns 1 (i.e., assume that no alignment is guaranteed).
-static unsigned tryToEnforceAlignment(Value value, unsigned requestedAlignment,
+static uint64_t tryToEnforceAlignment(Value value, uint64_t requestedAlignment,
DataLayout const &dataLayout) {
if (Operation *definingOp = value.getDefiningOp()) {
if (auto alloca = dyn_cast<LLVM::AllocaOp>(definingOp))
@@ -557,8 +577,8 @@ static unsigned tryToEnforceAlignment(Value value, unsigned requestedAlignment,
/// the address of the new alloca, then returns the value of the new alloca.
static Value handleByValArgumentInit(OpBuilder &builder, Location loc,
Value argument, Type elementType,
- unsigned elementTypeSize,
- unsigned targetAlignment) {
+ uint64_t elementTypeSize,
+ uint64_t targetAlignment) {
// Allocate the new value on the stack.
Value allocaOp;
{
@@ -587,7 +607,7 @@ static Value handleByValArgumentInit(OpBuilder &builder, Location loc,
/// attribute (or 1 if no align attribute was set).
static Value handleByValArgument(OpBuilder &builder, Operation *callable,
Value argument, Type elementType,
- unsigned requestedAlignment) {
+ uint64_t requestedAlignment) {
auto func = cast<LLVM::LLVMFuncOp>(callable);
LLVM::MemoryEffectsAttr memoryEffects = func.getMemoryAttr();
// If there is no memory effects attribute, assume that the function is
@@ -597,16 +617,16 @@ static Value handleByValArgument(OpBuilder &builder, Operation *callable,
memoryEffects.getArgMem() != LLVM::ModRefInfo::Mod;
// Check if there's an alignment mismatch requiring us to copy.
DataLayout dataLayout = DataLayout::closest(callable);
- unsigned minimumAlignment = dataLayout.getTypeABIAlignment(elementType);
+ uint64_t minimumAlignment = dataLayout.getTypeABIAlignment(elementType);
if (isReadOnly) {
if (requestedAlignment <= minimumAlignment)
return argument;
- unsigned currentAlignment =
+ uint64_t currentAlignment =
tryToEnforceAlignment(argument, requestedAlignment, dataLayout);
if (currentAlignment >= requestedAlignment)
return argument;
}
- unsigned targetAlignment = std::max(requestedAlignment, minimumAlignment);
+ uint64_t targetAlignment = std::max(requestedAlignment, minimumAlignment);
return handleByValArgumentInit(builder, func.getLoc(), argument, elementType,
dataLayout.getTypeSize(elementType),
targetAlignment);
@@ -632,8 +652,7 @@ struct LLVMInlinerInterface : public DialectInlinerInterface {
bool wouldBeCloned) const final {
if (!wouldBeCloned)
return false;
- auto callOp = dyn_cast<LLVM::CallOp>(call);
- if (!callOp) {
+ if (!isa<LLVM::CallOp>(call)) {
LLVM_DEBUG(llvm::dbgs()
<< "Cannot inline: call is not an LLVM::CallOp\n");
return false;
@@ -684,7 +703,6 @@ struct LLVMInlinerInterface : public DialectInlinerInterface {
return true;
}
- /// All operations in the LLVM dialect are legal to inline.
bool isLegalToInline(Operation *op, Region *, bool, IRMapping &) const final {
return true;
}
@@ -706,15 +724,13 @@ struct LLVMInlinerInterface : public DialectInlinerInterface {
/// Handle the given inlined return by replacing the uses of the call with the
/// operands of the return. This overload is called when the inlined region
/// only contains one block.
- void handleTerminator(Operation *op,
- ArrayRef<Value> valuesToRepl) const final {
+ void handleTerminator(Operation *op, ValueRange valuesToRepl) const final {
// Return will be the only terminator present.
auto returnOp = cast<LLVM::ReturnOp>(op);
// Replace the values directly with the return operands.
assert(returnOp.getNumOperands() == valuesToRepl.size());
- for (const auto &[dst, src] :
- llvm::zip(valuesToRepl, returnOp.getOperands()))
+ for (auto [dst, src] : llvm::zip(valuesToRepl, returnOp.getOperands()))
dst.replaceAllUsesWith(src);
}
@@ -724,7 +740,7 @@ struct LLVMInlinerInterface : public DialectInlinerInterface {
if (std::optional<NamedAttribute> attr =
argumentAttrs.getNamed(LLVM::LLVMDialect::getByValAttrName())) {
Type elementType = cast<TypeAttr>(attr->getValue()).getValue();
- unsigned requestedAlignment = 1;
+ uint64_t requestedAlignment = 1;
if (std::optional<NamedAttribute> alignAttr =
argumentAttrs.getNamed(LLVM::LLVMDialect::getAlignAttrName())) {
requestedAlignment = cast<IntegerAttr>(alignAttr->getValue())
diff --git a/mlir/test/lib/Transforms/TestIntRangeInference.cpp b/mlir/test/lib/Transforms/TestIntRangeInference.cpp
index 7a0c97913af4..5758f6acf2f0 100644
--- a/mlir/test/lib/Transforms/TestIntRangeInference.cpp
+++ b/mlir/test/lib/Transforms/TestIntRangeInference.cpp
@@ -40,9 +40,8 @@ static LogicalResult replaceWithConstant(DataFlowSolver &solver, OpBuilder &b,
maybeDefiningOp ? maybeDefiningOp->getDialect()
: value.getParentRegion()->getParentOp()->getDialect();
Attribute constAttr = b.getIntegerAttr(value.getType(), *maybeConstValue);
- Value constant =
- folder.getOrCreateConstant(b.getInsertionBlock(), valueDialect, constAttr,
- value.getType(), value.getLoc());
+ Value constant = folder.getOrCreateConstant(
+ b.getInsertionBlock(), valueDialect, constAttr, value.getType());
if (!constant)
return failure();
@@ -108,7 +107,7 @@ struct TestIntRangeInference
Operation *op = getOperation();
DataFlowSolver solver;
solver.load<DeadCodeAnalysis>();
- solver.load<dataflow::SparseConstantPropagation>();
+ solver.load<SparseConstantPropagation>();
solver.load<IntegerRangeAnalysis>();
if (failed(solver.initializeAndRun(op)))
return signalPassFailure();
diff --git a/sycl/include/sycl/handler.hpp b/sycl/include/sycl/handler.hpp
index 858f056d0196..5fb9d234322a 100644
--- a/sycl/include/sycl/handler.hpp
+++ b/sycl/include/sycl/handler.hpp
@@ -29,6 +29,7 @@
#include <sycl/event.hpp>
#include <sycl/exception.hpp>
#include <sycl/exception_list.hpp>
+#include <sycl/ext/intel/experimental/fp_control_kernel_properties.hpp>
#include <sycl/ext/intel/experimental/kernel_execution_properties.hpp>
#include <sycl/ext/oneapi/bindless_images_descriptor.hpp>
#include <sycl/ext/oneapi/bindless_images_interop.hpp>
@@ -913,11 +914,20 @@ private:
///
/// Stores information about kernel properties into the handler.
template <
+ typename KernelName,
typename PropertiesT = ext::oneapi::experimental::empty_properties_t>
void processProperties(PropertiesT Props) {
+ using KI = detail::KernelInfo<KernelName>;
static_assert(
ext::oneapi::experimental::is_property_list<PropertiesT>::value,
"Template type is not a property list.");
+ static_assert(
+ !PropertiesT::template has_property<
+ sycl::ext::intel::experimental::fp_control_key>() ||
+ (PropertiesT::template has_property<
+ sycl::ext::intel::experimental::fp_control_key>() &&
+ KI::isESIMD()),
+ "Floating point control property is supported for ESIMD kernels only.");
if constexpr (PropertiesT::template has_property<
sycl::ext::intel::experimental::cache_config_key>()) {
auto Config = Props.template get_property<
@@ -1111,9 +1121,12 @@ private:
};
std::optional<std::array<size_t, 3>> getMaxWorkGroups();
+ // We need to use this version to support gcc 7.5.0. Remove when minimal
+ // supported gcc version is bumped.
+ std::tuple<std::array<size_t, 3>, bool> getMaxWorkGroups_v2();
template <int Dims>
- std::optional<range<Dims>> getRoundedRange(range<Dims> UserRange) {
+ std::tuple<range<Dims>, bool> getRoundedRange(range<Dims> UserRange) {
range<Dims> RoundedRange = UserRange;
// Disable the rounding-up optimizations under these conditions:
// 1. The env var SYCL_DISABLE_PARALLEL_FOR_RANGE_ROUNDING is set.
@@ -1133,7 +1146,7 @@ private:
// Perform range rounding if rounding-up is enabled.
if (this->DisableRangeRounding())
- return {};
+ return {range<Dims>{}, false};
// Range should be a multiple of this for reasonable performance.
size_t MinFactorX = 16;
@@ -1157,8 +1170,8 @@ private:
// kernel in a 32-bit global range.
auto Dev = detail::getSyclObjImpl(detail::getDeviceFromHandler(*this));
id<Dims> MaxNWGs = [&] {
- auto PiResult = getMaxWorkGroups();
- if (!PiResult.has_value()) {
+ auto [MaxWGs, HasMaxWGs] = getMaxWorkGroups_v2();
+ if (!HasMaxWGs) {
id<Dims> Default;
for (int i = 0; i < Dims; ++i)
Default[i] = (std::numeric_limits<int32_t>::max)();
@@ -1168,7 +1181,7 @@ private:
id<Dims> IdResult;
size_t Limit = (std::numeric_limits<int>::max)();
for (int i = 0; i < Dims; ++i)
- IdResult[i] = (std::min)(Limit, (*PiResult)[Dims - i - 1]);
+ IdResult[i] = (std::min)(Limit, MaxWGs[Dims - i - 1]);
return IdResult;
}();
auto M = (std::numeric_limits<uint32_t>::max)();
@@ -1204,8 +1217,8 @@ private:
Adjust(i, MaxRange[i]);
if (!DidAdjust)
- return {};
- return RoundedRange;
+ return {range<Dims>{}, false};
+ return {RoundedRange, true};
}
/// Defines and invokes a SYCL kernel function for the specified range.
@@ -1253,9 +1266,10 @@ private:
"first argument of sycl::item type, or of a type which is "
"implicitly convertible from sycl::item");
+ using RefLambdaArgType = std::add_lvalue_reference_t<LambdaArgType>;
static_assert(
- (std::is_invocable_v<KernelType, LambdaArgType> ||
- std::is_invocable_v<KernelType, LambdaArgType, kernel_handler>),
+ (std::is_invocable_v<KernelType, RefLambdaArgType> ||
+ std::is_invocable_v<KernelType, RefLambdaArgType, kernel_handler>),
"SYCL kernel lambda/functor has an unexpected signature, it should be "
"invocable with sycl::item and optionally sycl::kernel_handler");
#endif
@@ -1273,7 +1287,8 @@ private:
#if !defined(__SYCL_DISABLE_PARALLEL_FOR_RANGE_ROUNDING__) && \
!defined(DPCPP_HOST_DEVICE_OPENMP) && \
!defined(DPCPP_HOST_DEVICE_PERF_NATIVE) && SYCL_LANGUAGE_VERSION >= 202001
- if (auto RoundedRange = getRoundedRange(UserRange)) {
+ auto [RoundedRange, HasRoundedRange] = getRoundedRange(UserRange);
+ if (HasRoundedRange) {
using NameWT = typename detail::get_kernel_wrapper_name_t<NameT>::name;
auto Wrapper =
getRangeRoundedKernelLambda<NameWT, TransformedArgType, Dims>(
@@ -1291,7 +1306,7 @@ private:
// __SYCL_ASSUME_INT can still be violated. So check the bounds
// of the user range, instead of the rounded range.
detail::checkValueRange<Dims>(UserRange);
- MNDRDesc.set(*RoundedRange);
+ MNDRDesc.set(RoundedRange);
StoreLambda<KName, decltype(Wrapper), Dims, TransformedArgType>(
std::move(Wrapper));
setType(detail::CG::Kernel);
@@ -1306,7 +1321,7 @@ private:
kernel_parallel_for_wrapper<NameT, TransformedArgType, KernelType,
PropertiesT>(KernelFunc);
#ifndef __SYCL_DEVICE_ONLY__
- processProperties<PropertiesT>(Props);
+ processProperties<NameT, PropertiesT>(Props);
detail::checkValueRange<Dims>(UserRange);
MNDRDesc.set(std::move(UserRange));
StoreLambda<NameT, KernelType, Dims, TransformedArgType>(
@@ -1361,7 +1376,7 @@ private:
kernel_parallel_for_wrapper<NameT, TransformedArgType, KernelType,
PropertiesT>(KernelFunc);
#ifndef __SYCL_DEVICE_ONLY__
- processProperties<PropertiesT>(Props);
+ processProperties<NameT, PropertiesT>(Props);
detail::checkValueRange<Dims>(ExecutionRange);
MNDRDesc.set(std::move(ExecutionRange));
StoreLambda<NameT, KernelType, Dims, TransformedArgType>(
@@ -1418,7 +1433,7 @@ private:
kernel_parallel_for_work_group_wrapper<NameT, LambdaArgType, KernelType,
PropertiesT>(KernelFunc);
#ifndef __SYCL_DEVICE_ONLY__
- processProperties<PropertiesT>(Props);
+ processProperties<NameT, PropertiesT>(Props);
detail::checkValueRange<Dims>(NumWorkGroups);
MNDRDesc.setNumWorkGroups(NumWorkGroups);
StoreLambda<NameT, KernelType, Dims, LambdaArgType>(std::move(KernelFunc));
@@ -1458,7 +1473,7 @@ private:
kernel_parallel_for_work_group_wrapper<NameT, LambdaArgType, KernelType,
PropertiesT>(KernelFunc);
#ifndef __SYCL_DEVICE_ONLY__
- processProperties<PropertiesT>(Props);
+ processProperties<NameT, PropertiesT>(Props);
nd_range<Dims> ExecRange =
nd_range<Dims>(NumWorkGroups * WorkGroupSize, WorkGroupSize);
detail::checkValueRange<Dims>(ExecRange);
@@ -1638,8 +1653,8 @@ private:
// * Provide explicit template type parameters for the call
//
// Couldn't think of a better way to achieve both.
- template <typename KernelType, typename PropertiesT, bool HasKernelHandlerArg,
- typename FuncTy>
+ template <typename KernelName, typename KernelType, typename PropertiesT,
+ bool HasKernelHandlerArg, typename FuncTy>
void unpack(_KERNELFUNCPARAM(KernelFunc), FuncTy Lambda) {
#ifdef __SYCL_DEVICE_ONLY__
detail::CheckDeviceCopyable<KernelType>();
@@ -1648,13 +1663,15 @@ private:
typename detail::GetMergedKernelProperties<KernelType,
PropertiesT>::type;
using Unpacker = KernelPropertiesUnpacker<MergedPropertiesT>;
+#ifndef __SYCL_DEVICE_ONLY__
// If there are properties provided by get method then process them.
if constexpr (ext::oneapi::experimental::detail::
HasKernelPropertiesGetMethod<
_KERNELFUNCPARAMTYPE>::value) {
- processProperties(
+ processProperties<KernelName>(
KernelFunc.get(ext::oneapi::experimental::properties_tag{}));
}
+#endif
if constexpr (HasKernelHandlerArg) {
kernel_handler KH;
Lambda(Unpacker{}, this, KernelFunc, KH);
@@ -1670,7 +1687,7 @@ private:
typename KernelName, typename KernelType,
typename PropertiesT = ext::oneapi::experimental::empty_properties_t>
void kernel_single_task_wrapper(_KERNELFUNCPARAM(KernelFunc)) {
- unpack<KernelType, PropertiesT,
+ unpack<KernelName, KernelType, PropertiesT,
detail::KernelLambdaHasKernelHandlerArgT<KernelType>::value>(
KernelFunc, [&](auto Unpacker, auto... args) {
Unpacker.template kernel_single_task_unpack<KernelName, KernelType>(
@@ -1682,7 +1699,7 @@ private:
typename KernelName, typename ElementType, typename KernelType,
typename PropertiesT = ext::oneapi::experimental::empty_properties_t>
void kernel_parallel_for_wrapper(_KERNELFUNCPARAM(KernelFunc)) {
- unpack<KernelType, PropertiesT,
+ unpack<KernelName, KernelType, PropertiesT,
detail::KernelLambdaHasKernelHandlerArgT<KernelType,
ElementType>::value>(
KernelFunc, [&](auto Unpacker, auto... args) {
@@ -1695,7 +1712,7 @@ private:
typename KernelName, typename ElementType, typename KernelType,
typename PropertiesT = ext::oneapi::experimental::empty_properties_t>
void kernel_parallel_for_work_group_wrapper(_KERNELFUNCPARAM(KernelFunc)) {
- unpack<KernelType, PropertiesT,
+ unpack<KernelName, KernelType, PropertiesT,
detail::KernelLambdaHasKernelHandlerArgT<KernelType,
ElementType>::value>(
KernelFunc, [&](auto Unpacker, auto... args) {
@@ -1729,7 +1746,7 @@ private:
// No need to check if range is out of INT_MAX limits as it's compile-time
// known constant.
MNDRDesc.set(range<1>{1});
- processProperties<PropertiesT>(Props);
+ processProperties<NameT, PropertiesT>(Props);
StoreLambda<NameT, KernelType, /*Dims*/ 1, void>(KernelFunc);
setType(detail::CG::Kernel);
#endif
@@ -2332,7 +2349,6 @@ public:
std::enable_if_t<
ext::oneapi::experimental::is_property_list<PropertiesT>::value>
single_task(PropertiesT Props, _KERNELFUNCPARAM(KernelFunc)) {
- throwIfGraphAssociatedAndKernelProperties<PropertiesT>();
single_task_lambda_impl<KernelName, KernelType, PropertiesT>(Props,
KernelFunc);
}
@@ -2343,7 +2359,6 @@ public:
ext::oneapi::experimental::is_property_list<PropertiesT>::value>
parallel_for(range<1> NumWorkItems __SYCL_ANNOTATE(range), PropertiesT Props,
_KERNELFUNCPARAM(KernelFunc) __SYCL_ANNOTATE(kernel)) {
- throwIfGraphAssociatedAndKernelProperties<PropertiesT>();
parallel_for_lambda_impl<KernelName, KernelType, 1, PropertiesT>(
NumWorkItems, Props, std::move(KernelFunc));
}
@@ -2354,7 +2369,6 @@ public:
ext::oneapi::experimental::is_property_list<PropertiesT>::value>
parallel_for(range<2> NumWorkItems __SYCL_ANNOTATE(range), PropertiesT Props,
_KERNELFUNCPARAM(KernelFunc) __SYCL_ANNOTATE(kernel)) {
- throwIfGraphAssociatedAndKernelProperties<PropertiesT>();
parallel_for_lambda_impl<KernelName, KernelType, 2, PropertiesT>(
NumWorkItems, Props, std::move(KernelFunc));
}
@@ -2365,7 +2379,6 @@ public:
ext::oneapi::experimental::is_property_list<PropertiesT>::value>
parallel_for(range<3> NumWorkItems __SYCL_ANNOTATE(range), PropertiesT Props,
_KERNELFUNCPARAM(KernelFunc) __SYCL_ANNOTATE(kernel)) {
- throwIfGraphAssociatedAndKernelProperties<PropertiesT>();
parallel_for_lambda_impl<KernelName, KernelType, 3, PropertiesT>(
NumWorkItems, Props, std::move(KernelFunc));
}
@@ -2377,7 +2390,6 @@ public:
parallel_for(nd_range<Dims> Range __SYCL_ANNOTATE(nd_range),
PropertiesT Properties,
_KERNELFUNCPARAM(KernelFunc) __SYCL_ANNOTATE(kernel)) {
- throwIfGraphAssociatedAndKernelProperties<PropertiesT>();
parallel_for_impl<KernelName>(Range, Properties, std::move(KernelFunc));
}
@@ -2392,7 +2404,6 @@ public:
parallel_for(range<1> Range, PropertiesT Properties, RestT &&...Rest) {
throwIfGraphAssociated<ext::oneapi::experimental::detail::
UnsupportedGraphFeatures::sycl_reductions>();
- throwIfGraphAssociatedAndKernelProperties<PropertiesT>();
detail::reduction_parallel_for<KernelName>(*this, Range, Properties,
std::forward<RestT>(Rest)...);
}
@@ -2406,7 +2417,6 @@ public:
parallel_for(range<2> Range, PropertiesT Properties, RestT &&...Rest) {
throwIfGraphAssociated<ext::oneapi::experimental::detail::
UnsupportedGraphFeatures::sycl_reductions>();
- throwIfGraphAssociatedAndKernelProperties<PropertiesT>();
detail::reduction_parallel_for<KernelName>(*this, Range, Properties,
std::forward<RestT>(Rest)...);
}
@@ -2420,7 +2430,6 @@ public:
parallel_for(range<3> Range, PropertiesT Properties, RestT &&...Rest) {
throwIfGraphAssociated<ext::oneapi::experimental::detail::
UnsupportedGraphFeatures::sycl_reductions>();
- throwIfGraphAssociatedAndKernelProperties<PropertiesT>();
detail::reduction_parallel_for<KernelName>(*this, Range, Properties,
std::forward<RestT>(Rest)...);
}
@@ -2481,7 +2490,6 @@ public:
int Dims, typename PropertiesT>
void parallel_for_work_group(range<Dims> NumWorkGroups, PropertiesT Props,
_KERNELFUNCPARAM(KernelFunc)) {
- throwIfGraphAssociatedAndKernelProperties<PropertiesT>();
parallel_for_work_group_lambda_impl<KernelName, KernelType, Dims,
PropertiesT>(NumWorkGroups, Props,
KernelFunc);
@@ -2492,7 +2500,6 @@ public:
void parallel_for_work_group(range<Dims> NumWorkGroups,
range<Dims> WorkGroupSize, PropertiesT Props,
_KERNELFUNCPARAM(KernelFunc)) {
- throwIfGraphAssociatedAndKernelProperties<PropertiesT>();
parallel_for_work_group_lambda_impl<KernelName, KernelType, Dims,
PropertiesT>(
NumWorkGroups, WorkGroupSize, Props, KernelFunc);
@@ -3641,17 +3648,6 @@ private:
"handler::require() before it can be used.");
}
- template <typename PropertiesT>
- std::enable_if_t<
- ext::oneapi::experimental::is_property_list<PropertiesT>::value>
- throwIfGraphAssociatedAndKernelProperties() const {
- if (!std::is_same_v<PropertiesT,
- ext::oneapi::experimental::empty_properties_t>)
- throwIfGraphAssociated<
- ext::oneapi::experimental::detail::UnsupportedGraphFeatures::
- sycl_ext_oneapi_kernel_properties>();
- }
-
// Set value of the gpu cache configuration for the kernel.
void setKernelCacheConfig(sycl::detail::pi::PiKernelCacheConfig);
diff --git a/sycl/test-e2e/Basic/built-ins/vec_common.cpp b/sycl/test-e2e/Basic/built-ins/vec_common.cpp
index 4ddb88883090..5cbc1efe98e9 100644
--- a/sycl/test-e2e/Basic/built-ins/vec_common.cpp
+++ b/sycl/test-e2e/Basic/built-ins/vec_common.cpp
@@ -1,7 +1,7 @@
// RUN: %{build} -o %t.out
// RUN: %{run} %t.out
-// RUN: %if preview-breaking-changes-supported %{ %clangxx -fsycl -fpreview-breaking-changes %s -o %t2.out %}
-// RUN: %if preview-breaking-changes-supported %{ %{run} %t2.out %}
+// RUN: %if preview-breaking-changes-supported %{ %{build} -fpreview-breaking-changes -o %t2.out %}
+// RUN: %if preview-breaking-changes-supported %{ %{run} %t2.out %}
// https://github.com/intel/llvm/issues/7585 to fix the failure:
// XFAIL: gpu
@@ -60,6 +60,88 @@ int main() {
// sycl::clamp
TEST(sycl::clamp, float, 2, EXPECTED(float, 1.0f, 2.0f), 0, va1, va2, va3);
+ TEST(sycl::clamp, float, 2, EXPECTED(float, 1.0f, 2.0f), 0, va1, 1.0f, 3.0f);
+ if (dev.has(sycl::aspect::fp64)) {
+ TEST(sycl::clamp, double, 2, EXPECTED(double, 1.0, 2.0), 0, va4, 1.0, 3.0);
+ }
+ // sycl::degrees
+ TEST(sycl::degrees, float, 3, EXPECTED(float, 180, 180, 180), 0, va5);
+ if (dev.has(sycl::aspect::fp64)) {
+ TEST(sycl::degrees, double, 3, EXPECTED(double, 180, 180, 180), 0, va6);
+ }
+ if (dev.has(sycl::aspect::fp16)) {
+ TEST(sycl::degrees, sycl::half, 3, EXPECTED(sycl::half, 180, 180, 180), 0.2,
+ va7);
+ }
+ // sycl::max
+ TEST(sycl::max, float, 2, EXPECTED(float, 3.0f, 2.0f), 0, va1, va3);
+ TEST(sycl::max, float, 2, EXPECTED(float, 1.5f, 2.0f), 0, va1, 1.5f);
+ if (dev.has(sycl::aspect::fp64)) {
+ TEST(sycl::max, double, 2, EXPECTED(double, 1.5, 2.0), 0, va4, 1.5);
+ }
+ // sycl::min
+ TEST(sycl::min, float, 2, EXPECTED(float, 1.0f, 2.0f), 0, va1, va3);
+ TEST(sycl::min, float, 2, EXPECTED(float, 1.0f, 1.5f), 0, va1, 1.5f);
+ if (dev.has(sycl::aspect::fp64)) {
+ TEST(sycl::min, double, 2, EXPECTED(double, 1.0, 1.5), 0, va4, 1.5);
+ }
+ // sycl::mix
+ TEST(sycl::mix, float, 2, EXPECTED(float, 1.6f, 2.0f), 0, va1, va3, va8);
+ TEST(sycl::mix, float, 2, EXPECTED(float, 1.4f, 2.0f), 0, va1, va3, 0.2);
+ if (dev.has(sycl::aspect::fp64)) {
+ TEST(sycl::mix, double, 2, EXPECTED(double, 3.0, 5.0), 0, va4, va9, 0.5);
+ }
+ // sycl::radians
+ TEST(sycl::radians, float, 3, EXPECTED(float, M_PI, M_PI, M_PI), 0, va10);
+ if (dev.has(sycl::aspect::fp64)) {
+ TEST(sycl::radians, double, 3, EXPECTED(double, M_PI, M_PI, M_PI), 0, va11);
+ }
+ if (dev.has(sycl::aspect::fp16)) {
+ TEST(sycl::radians, sycl::half, 3, EXPECTED(sycl::half, M_PI, M_PI, M_PI),
+ 0.002, va12);
+ }
+ // sycl::step
+ TEST(sycl::step, float, 2, EXPECTED(float, 1.0f, 1.0f), 0, va1, va3);
+ if (dev.has(sycl::aspect::fp64)) {
+ TEST(sycl::step, double, 2, EXPECTED(double, 1.0, 1.0), 0, va4, va9);
+ }
+ if (dev.has(sycl::aspect::fp16)) {
+ TEST(sycl::step, sycl::half, 3, EXPECTED(sycl::half, 1.0, 0.0, 1.0), 0,
+ va12, va13);
+ }
+ TEST(sycl::step, float, 2, EXPECTED(float, 1.0f, 0.0f), 0, 2.5f, va3);
+ if (dev.has(sycl::aspect::fp64)) {
+ TEST(sycl::step, double, 2, EXPECTED(double, 0.0f, 1.0f), 0, 6.0f, va9);
+ }
+ // sycl::smoothstep
+ TEST(sycl::smoothstep, float, 2, EXPECTED(float, 1.0f, 1.0f), 0, va8, va1,
+ va2);
+ if (dev.has(sycl::aspect::fp64)) {
+ TEST(sycl::smoothstep, double, 2, EXPECTED(double, 1.0, 1.0f), 0.00000001,
+ va4, va9, va9);
+ }
+ if (dev.has(sycl::aspect::fp16)) {
+ TEST(sycl::smoothstep, sycl::half, 3, EXPECTED(sycl::half, 1.0, 1.0, 1.0),
+ 0, va7, va12, va13);
+ }
+ TEST(sycl::smoothstep, float, 2, EXPECTED(float, 0.0553936f, 0.0f), 0.0000001,
+ 2.5f, 6.0f, va3);
+ if (dev.has(sycl::aspect::fp64)) {
+ TEST(sycl::smoothstep, double, 2, EXPECTED(double, 0.0f, 1.0f), 0, 6.0f,
+ 8.0f, va9);
+ }
+ // sign
+ TEST(sycl::sign, float, 2, EXPECTED(float, +0.0f, -1.0f), 0, va14);
+ if (dev.has(sycl::aspect::fp64)) {
+ TEST(sycl::sign, double, 2, EXPECTED(double, -0.0, 1.0), 0, va15);
+ }
+ if (dev.has(sycl::aspect::fp16)) {
+ TEST(sycl::sign, sycl::half, 3, EXPECTED(sycl::half, 1.0, 1.0, 1.0), 0,
+ va12);
+ }
+
+#ifdef __INTEL_PREVIEW_BREAKING_CHANGES
+ // sycl::clamp swizzled
TEST(sycl::clamp, float, 2, EXPECTED(float, 3.0f, 2.0f), 0,
va16.swizzle<1, 0>(), va2, va3);
TEST(sycl::clamp, float, 2, EXPECTED(float, 1.0f, 2.0f), 0, va1, va2,
@@ -70,59 +152,46 @@ int main() {
va16.swizzle<1, 0>(), va16.swizzle<1, 0>());
TEST(sycl::clamp, float, 2, EXPECTED(float, 360.0f, 180.0f), 0,
va16.swizzle<1, 0>(), va16.swizzle<1, 0>(), va16.swizzle<1, 0>());
- TEST(sycl::clamp, float, 2, EXPECTED(float, 1.0f, 2.0f), 0, va1, 1.0f, 3.0f);
TEST(sycl::clamp, float, 2, EXPECTED(float, 3.0f, 3.0f), 0,
va16.swizzle<1, 0>(), 1.0f, 3.0f);
if (dev.has(sycl::aspect::fp64)) {
- TEST(sycl::clamp, double, 2, EXPECTED(double, 1.0, 2.0), 0, va4, 1.0, 3.0);
TEST(sycl::clamp, double, 2, EXPECTED(double, 3.0, 3.0), 0,
va11.swizzle<1, 0>(), 1.0, 3.0);
}
- // sycl::degrees
- TEST(sycl::degrees, float, 3, EXPECTED(float, 180, 180, 180), 0, va5);
+ // sycl::degrees swizzled
TEST(sycl::degrees, float, 2, EXPECTED(float, 180, 180), 0,
va5.swizzle<1, 0>());
if (dev.has(sycl::aspect::fp64)) {
- TEST(sycl::degrees, double, 3, EXPECTED(double, 180, 180, 180), 0, va6);
TEST(sycl::degrees, double, 2, EXPECTED(double, 180, 180), 0,
va6.swizzle<1, 0>());
}
if (dev.has(sycl::aspect::fp16)) {
- TEST(sycl::degrees, sycl::half, 3, EXPECTED(sycl::half, 180, 180, 180), 0.2,
- va7);
TEST(sycl::degrees, sycl::half, 2, EXPECTED(sycl::half, 180, 180), 0.2,
va7.swizzle<1, 0>());
}
- // sycl::max
- TEST(sycl::max, float, 2, EXPECTED(float, 3.0f, 2.0f), 0, va1, va3);
+ // sycl::max swizzled
TEST(sycl::max, float, 2, EXPECTED(float, 360.0f, 180.0f), 0,
va16.swizzle<1, 0>(), va3);
TEST(sycl::max, float, 2, EXPECTED(float, 360.0f, 180.0f), 0, va1,
va16.swizzle<1, 0>());
- TEST(sycl::max, float, 2, EXPECTED(float, 1.5f, 2.0f), 0, va1, 1.5f);
TEST(sycl::max, float, 2, EXPECTED(float, 360.0f, 190.0f), 0,
va16.swizzle<1, 0>(), 190.0f);
if (dev.has(sycl::aspect::fp64)) {
- TEST(sycl::max, double, 2, EXPECTED(double, 1.5, 2.0), 0, va4, 1.5);
TEST(sycl::max, double, 2, EXPECTED(double, 360.0, 190.0), 0,
va17.swizzle<1, 0>(), 190.0);
}
- // sycl::min
- TEST(sycl::min, float, 2, EXPECTED(float, 1.0f, 2.0f), 0, va1, va3);
+ // sycl::min swizzled
TEST(sycl::min, float, 2, EXPECTED(float, 3.0f, 2.0f), 0,
va16.swizzle<1, 0>(), va3);
TEST(sycl::min, float, 2, EXPECTED(float, 1.0f, 2.0f), 0, va1,
va16.swizzle<1, 0>());
- TEST(sycl::min, float, 2, EXPECTED(float, 1.0f, 1.5f), 0, va1, 1.5f);
TEST(sycl::min, float, 2, EXPECTED(float, 190.0f, 180.0f), 0,
va16.swizzle<1, 0>(), 190.0f);
if (dev.has(sycl::aspect::fp64)) {
- TEST(sycl::min, double, 2, EXPECTED(double, 1.0, 1.5), 0, va4, 1.5);
TEST(sycl::min, double, 2, EXPECTED(double, 190.0f, 180.0f), 0,
va17.swizzle<1, 0>(), 190.0);
}
- // sycl::mix
- TEST(sycl::mix, float, 2, EXPECTED(float, 1.6f, 2.0f), 0, va1, va3, va8);
+ // sycl::mix swizzled
TEST(sycl::mix, float, 2, EXPECTED(float, 252.9f, 73.2f), 0,
va16.swizzle<1, 0>(), va3, va8);
TEST(sycl::mix, float, 2, EXPECTED(float, 252.9f, 73.2f), 0,
@@ -137,9 +206,7 @@ int main() {
va16.swizzle<1, 0>(), va16.swizzle<1, 0>(), va18.swizzle<0, 1>());
TEST(sycl::mix, float, 2, EXPECTED(float, 1.6f, 2.0f), 0, va1, va3,
va18.swizzle<0, 1>());
- TEST(sycl::mix, float, 2, EXPECTED(float, 1.4f, 2.0f), 0, va1, va3, 0.2);
if (dev.has(sycl::aspect::fp64)) {
- TEST(sycl::mix, double, 2, EXPECTED(double, 3.0, 5.0), 0, va4, va9, 0.5);
TEST(sycl::mix, double, 2, EXPECTED(double, 182.5, 94.0), 0,
va17.swizzle<1, 0>(), va9, 0.5);
TEST(sycl::mix, double, 2, EXPECTED(double, 180.5, 91.0), 0, va4,
@@ -147,23 +214,18 @@ int main() {
TEST(sycl::mix, double, 2, EXPECTED(double, 360.0, 180.0), 0,
va17.swizzle<1, 0>(), va17.swizzle<1, 0>(), 0.5);
}
- // sycl::radians
- TEST(sycl::radians, float, 3, EXPECTED(float, M_PI, M_PI, M_PI), 0, va10);
+ // sycl::radians swizzled
TEST(sycl::radians, float, 2, EXPECTED(float, M_PI, M_PI), 0,
va10.swizzle<1, 0>());
if (dev.has(sycl::aspect::fp64)) {
- TEST(sycl::radians, double, 3, EXPECTED(double, M_PI, M_PI, M_PI), 0, va11);
TEST(sycl::radians, double, 2, EXPECTED(double, M_PI, M_PI), 0,
va11.swizzle<1, 0>());
}
if (dev.has(sycl::aspect::fp16)) {
- TEST(sycl::radians, sycl::half, 3, EXPECTED(sycl::half, M_PI, M_PI, M_PI),
- 0.002, va12);
TEST(sycl::radians, sycl::half, 2, EXPECTED(sycl::half, M_PI, M_PI), 0.002,
va12.swizzle<1, 0>());
}
- // sycl::step
- TEST(sycl::step, float, 2, EXPECTED(float, 1.0f, 1.0f), 0, va1, va3);
+ // sycl::step swizzled
TEST(sycl::step, float, 2, EXPECTED(float, 0.0f, 0.0f), 0,
va16.swizzle<1, 0>(), va3);
TEST(sycl::step, float, 2, EXPECTED(float, 1.0f, 1.0f), 0, va1,
@@ -171,7 +233,6 @@ int main() {
TEST(sycl::step, float, 2, EXPECTED(float, 1.0f, 1.0f), 0,
va16.swizzle<1, 0>(), va16.swizzle<1, 0>());
if (dev.has(sycl::aspect::fp64)) {
- TEST(sycl::step, double, 2, EXPECTED(double, 1.0, 1.0), 0, va4, va9);
TEST(sycl::step, double, 2, EXPECTED(double, 0.0, 0.0), 0,
va17.swizzle<1, 0>(), va9);
TEST(sycl::step, double, 2, EXPECTED(double, 1.0, 1.0), 0, va4,
@@ -180,8 +241,6 @@ int main() {
va17.swizzle<1, 0>(), va17.swizzle<1, 0>());
}
if (dev.has(sycl::aspect::fp16)) {
- TEST(sycl::step, sycl::half, 3, EXPECTED(sycl::half, 1.0, 0.0, 1.0), 0,
- va12, va13);
TEST(sycl::step, sycl::half, 3, EXPECTED(sycl::half, 1.0, 0.0, 1.0), 0,
va12.swizzle<0, 1, 2>(), va13);
TEST(sycl::step, sycl::half, 3, EXPECTED(sycl::half, 1.0, 0.0, 1.0), 0,
@@ -189,17 +248,13 @@ int main() {
TEST(sycl::step, sycl::half, 3, EXPECTED(sycl::half, 1.0, 0.0, 1.0), 0,
va12.swizzle<0, 1, 2>(), va13.swizzle<0, 1, 2>());
}
- TEST(sycl::step, float, 2, EXPECTED(float, 1.0f, 0.0f), 0, 2.5f, va3);
TEST(sycl::step, float, 2, EXPECTED(float, 0.0f, 1.0f), 0, 2.5f,
va3.swizzle<1, 0>());
if (dev.has(sycl::aspect::fp64)) {
- TEST(sycl::step, double, 2, EXPECTED(double, 0.0f, 1.0f), 0, 6.0f, va9);
TEST(sycl::step, double, 2, EXPECTED(double, 1.0f, 0.0f), 0, 6.0f,
va9.swizzle<1, 0>());
}
- // sycl::smoothstep
- TEST(sycl::smoothstep, float, 2, EXPECTED(float, 1.0f, 1.0f), 0, va8, va1,
- va2);
+ // sycl::smoothstep swizzled
TEST(sycl::smoothstep, float, 2, EXPECTED(float, 1.0f, 1.0f), 0,
va8.swizzle<0, 1>(), va1, va2);
TEST(sycl::smoothstep, float, 2, EXPECTED(float, 1.0f, 1.0f), 0, va8,
@@ -215,8 +270,6 @@ int main() {
TEST(sycl::smoothstep, float, 2, EXPECTED(float, 1.0f, 1.0f), 0,
va8.swizzle<0, 1>(), va1.swizzle<0, 1>(), va2.swizzle<0, 1>());
if (dev.has(sycl::aspect::fp64)) {
- TEST(sycl::smoothstep, double, 2, EXPECTED(double, 1.0, 1.0f), 0.00000001,
- va4, va9, va9);
TEST(sycl::smoothstep, double, 2, EXPECTED(double, 1.0, 1.0f), 0.00000001,
va4.swizzle<0, 1>(), va9, va9);
TEST(sycl::smoothstep, double, 2, EXPECTED(double, 1.0, 1.0f), 0.00000001,
@@ -233,8 +286,6 @@ int main() {
va4.swizzle<0, 1>(), va9.swizzle<0, 1>(), va9.swizzle<0, 1>());
}
if (dev.has(sycl::aspect::fp16)) {
- TEST(sycl::smoothstep, sycl::half, 3, EXPECTED(sycl::half, 1.0, 1.0, 1.0),
- 0, va7, va12, va13);
TEST(sycl::smoothstep, sycl::half, 3, EXPECTED(sycl::half, 1.0, 1.0, 1.0),
0, va7.swizzle<0, 1, 2>(), va12, va13);
TEST(sycl::smoothstep, sycl::half, 3, EXPECTED(sycl::half, 1.0, 1.0, 1.0),
@@ -251,8 +302,6 @@ int main() {
0, va7.swizzle<0, 1, 2>(), va12.swizzle<0, 1, 2>(),
va13.swizzle<0, 1, 2>());
}
- TEST(sycl::smoothstep, float, 2, EXPECTED(float, 0.0553936f, 0.0f), 0.0000001,
- 2.5f, 6.0f, va3);
TEST(sycl::smoothstep, float, 2, EXPECTED(float, 0.0f, 0.0553936f), 0.0000001,
2.5f, 6.0f, va3.swizzle<1, 0>());
if (dev.has(sycl::aspect::fp64)) {
@@ -261,21 +310,18 @@ int main() {
TEST(sycl::smoothstep, double, 2, EXPECTED(double, 1.0f, 0.0f), 0, 6.0f,
8.0f, va9.swizzle<1, 0>());
}
- // sign
- TEST(sycl::sign, float, 2, EXPECTED(float, +0.0f, -1.0f), 0, va14);
+ // sign swizzled
TEST(sycl::sign, float, 2, EXPECTED(float, -1.0f, +0.0f), 0,
va14.swizzle<1, 0>());
if (dev.has(sycl::aspect::fp64)) {
- TEST(sycl::sign, double, 2, EXPECTED(double, -0.0, 1.0), 0, va15);
TEST(sycl::sign, double, 2, EXPECTED(double, 1.0, -0.0), 0,
va15.swizzle<1, 0>());
}
if (dev.has(sycl::aspect::fp16)) {
- TEST(sycl::sign, sycl::half, 3, EXPECTED(sycl::half, 1.0, 1.0, 1.0), 0,
- va12);
TEST(sycl::sign, sycl::half, 3, EXPECTED(sycl::half, 1.0, 1.0, 1.0), 0,
va12.swizzle<2, 1, 0>());
}
+#endif // __INTEL_PREVIEW_BREAKING_CHANGES
return 0;
}
diff --git a/sycl/test-e2e/Basic/built-ins/vec_geometric.cpp b/sycl/test-e2e/Basic/built-ins/vec_geometric.cpp
index 0d0acdee0c82..5b9e6fdad06e 100644
--- a/sycl/test-e2e/Basic/built-ins/vec_geometric.cpp
+++ b/sycl/test-e2e/Basic/built-ins/vec_geometric.cpp
@@ -1,7 +1,7 @@
// RUN: %{build} -o %t.out
// RUN: %{run} %t.out
-// RUN: %if preview-breaking-changes-supported %{ %clangxx -fsycl -fpreview-breaking-changes %s -o %t2.out %}
-// RUN: %if preview-breaking-changes-supported %{ %{run} %t2.out %}
+// RUN: %if preview-breaking-changes-supported %{ %{build} -fpreview-breaking-changes -o %t2.out %}
+// RUN: %if preview-breaking-changes-supported %{ %{run} %t2.out %}
// https://github.com/intel/llvm/issues/7585 to fix the failure:
// XFAIL: gpu
@@ -69,70 +69,38 @@ int main() {
VFloatD3_2);
TEST(sycl::cross, float, 4, EXPECTED(float, -1.f, -4.f, 3.f, 0.f), 0,
VFloatD4, VFloatD4_2);
- TEST(sycl::cross, float, 3, EXPECTED(float, -1.f, -4.f, 3.f), 0,
- VFloatD4.swizzle<0, 1, 2>(), VFloatD3_2);
- TEST(sycl::cross, float, 3, EXPECTED(float, -1.f, -4.f, 3.f), 0, VFloatD3,
- VFloatD4_2.swizzle<0, 1, 2>());
- TEST(sycl::cross, float, 3, EXPECTED(float, -1.f, -4.f, 3.f), 0,
- VFloatD4.swizzle<0, 1, 2>(), VFloatD4_2.swizzle<0, 1, 2>());
if (Dev.has(sycl::aspect::fp64)) {
TEST(sycl::cross, double, 3, EXPECTED(double, -1.f, -4.f, 3.f), 0,
VDoubleD3, VDoubleD3_2);
TEST(sycl::cross, double, 4, EXPECTED(double, -1.f, -4.f, 3.f, 0.f), 0,
VDoubleD4, VDoubleD4_2);
- TEST(sycl::cross, double, 3, EXPECTED(double, -1.f, -4.f, 3.f), 0,
- VDoubleD3, VDoubleD4_2.swizzle<0, 1, 2>());
- TEST(sycl::cross, double, 3, EXPECTED(double, -1.f, -4.f, 3.f), 0,
- VDoubleD4.swizzle<0, 1, 2>(), VDoubleD3_2);
- TEST(sycl::cross, double, 3, EXPECTED(double, -1.f, -4.f, 3.f), 0,
- VDoubleD4.swizzle<0, 1, 2>(), VDoubleD4_2.swizzle<0, 1, 2>());
}
TEST2(sycl::dot, float, 13.f, 0, VFloatD2, VFloatD2_2);
TEST2(sycl::dot, float, 32.f, 0, VFloatD3, VFloatD3_2);
TEST2(sycl::dot, float, 48.f, 0, VFloatD4, VFloatD4_2);
- TEST2(sycl::dot, float, 32.f, 0, VFloatD4.swizzle<0, 1, 2>(), VFloatD3_2);
- TEST2(sycl::dot, float, 32.f, 0, VFloatD3, VFloatD4_2.swizzle<0, 1, 2>());
- TEST2(sycl::dot, float, 32.f, 0, VFloatD4.swizzle<0, 1, 2>(),
- VFloatD4_2.swizzle<0, 1, 2>());
if (Dev.has(sycl::aspect::fp64)) {
TEST2(sycl::dot, double, 13, 0, VDoubleD2, VDoubleD2_2);
TEST2(sycl::dot, double, 32, 0, VDoubleD3, VDoubleD3_2);
TEST2(sycl::dot, double, 48, 0, VDoubleD4, VDoubleD4_2);
- TEST2(sycl::dot, double, 32, 0, VDoubleD4.swizzle<0, 1, 2>(), VDoubleD3_2);
- TEST2(sycl::dot, double, 32, 0, VDoubleD3, VDoubleD4_2.swizzle<0, 1, 2>());
- TEST2(sycl::dot, double, 32, 0, VDoubleD4.swizzle<0, 1, 2>(),
- VDoubleD4_2.swizzle<0, 1, 2>());
}
TEST2(sycl::length, float, 2.236068f, 1e-6, VFloatD2);
TEST2(sycl::length, float, 3.741657f, 1e-6, VFloatD3);
TEST2(sycl::length, float, 5.477225f, 1e-6, VFloatD4);
- TEST2(sycl::length, float, 3.741657f, 1e-6, VFloatD4.swizzle<0, 1, 2>());
if (Dev.has(sycl::aspect::fp64)) {
TEST2(sycl::length, double, 2.236068, 1e-6, VDoubleD2);
TEST2(sycl::length, double, 3.741657, 1e-6, VDoubleD3);
TEST2(sycl::length, double, 5.477225, 1e-6, VDoubleD4);
- TEST2(sycl::length, double, 3.741657, 1e-6, VDoubleD4.swizzle<0, 1, 2>());
}
TEST2(sycl::distance, float, 3.605551f, 1e-6, VFloatD2, VFloatD2_2);
TEST2(sycl::distance, float, 5.f, 0, VFloatD3, VFloatD3_2);
TEST2(sycl::distance, float, 5.f, 0, VFloatD4, VFloatD4_2);
- TEST2(sycl::distance, float, 5.f, 0, VFloatD4.swizzle<0, 1, 2>(), VFloatD3_2);
- TEST2(sycl::distance, float, 5.f, 0, VFloatD3, VFloatD4_2.swizzle<0, 1, 2>());
- TEST2(sycl::distance, float, 5.f, 0, VFloatD4.swizzle<0, 1, 2>(),
- VFloatD4_2.swizzle<0, 1, 2>());
if (Dev.has(sycl::aspect::fp64)) {
TEST2(sycl::distance, double, 3.605551, 1e-6, VDoubleD2, VDoubleD2_2);
TEST2(sycl::distance, double, 5.0, 0, VDoubleD3, VDoubleD3_2);
TEST2(sycl::distance, double, 5.0, 0, VDoubleD4, VDoubleD4_2);
- TEST2(sycl::distance, double, 5.0, 0, VDoubleD4.swizzle<0, 1, 2>(),
- VDoubleD3_2);
- TEST2(sycl::distance, double, 5.0, 0, VDoubleD3,
- VDoubleD4_2.swizzle<0, 1, 2>());
- TEST2(sycl::distance, double, 5.0, 0, VDoubleD4.swizzle<0, 1, 2>(),
- VDoubleD4_2.swizzle<0, 1, 2>());
}
TEST(sycl::normalize, float, 2, EXPECTED(float, 0.447213f, 0.894427f), 1e-6,
@@ -142,8 +110,6 @@ int main() {
TEST(sycl::normalize, float, 4,
EXPECTED(float, 0.182574f, 0.365148f, 0.547723f, 0.730297f), 1e-6,
VFloatD4);
- TEST(sycl::normalize, float, 3, EXPECTED(float, 0.267261, 0.534522, 0.801784),
- 1e-6, VFloatD4.swizzle<0, 1, 2>());
if (Dev.has(sycl::aspect::fp64)) {
TEST(sycl::normalize, double, 2, EXPECTED(double, 0.447213, 0.894427), 1e-6,
VDoubleD2);
@@ -152,25 +118,15 @@ int main() {
TEST(sycl::normalize, double, 4,
EXPECTED(double, 0.182574, 0.365148, 0.547723, 0.730297), 1e-6,
VDoubleD4);
- TEST(sycl::normalize, double, 3,
- EXPECTED(double, 0.267261, 0.534522, 0.801784), 1e-6,
- VDoubleD4.swizzle<0, 1, 2>());
}
TEST2(sycl::fast_distance, float, 3.605551f, 1e-6, VFloatD2, VFloatD2_2);
TEST2(sycl::fast_distance, float, 5.f, 0, VFloatD3, VFloatD3_2);
TEST2(sycl::fast_distance, float, 5.f, 0, VFloatD4, VFloatD4_2);
- TEST2(sycl::fast_distance, float, 5.f, 0, VFloatD4.swizzle<0, 1, 2>(),
- VFloatD3_2);
- TEST2(sycl::fast_distance, float, 5.f, 0, VFloatD3,
- VFloatD4_2.swizzle<0, 1, 2>());
- TEST2(sycl::fast_distance, float, 5.f, 0, VFloatD4.swizzle<0, 1, 2>(),
- VFloatD4_2.swizzle<0, 1, 2>());
TEST2(sycl::fast_length, float, 2.236068f, 1e-6, VFloatD2);
TEST2(sycl::fast_length, float, 3.741657f, 1e-6, VFloatD3);
TEST2(sycl::fast_length, float, 5.477225f, 1e-6, VFloatD4);
- TEST2(sycl::fast_length, float, 3.741657f, 1e-6, VFloatD4.swizzle<0, 1, 2>());
TEST(sycl::fast_normalize, float, 2, EXPECTED(float, 0.447213f, 0.894427f),
1e-3, VFloatD2);
@@ -179,9 +135,73 @@ int main() {
TEST(sycl::fast_normalize, float, 4,
EXPECTED(float, 0.182574f, 0.365148f, 0.547723f, 0.730297f), 1e-3,
VFloatD4);
+
+#ifdef __INTEL_PREVIEW_BREAKING_CHANGES
+ TEST(sycl::cross, float, 3, EXPECTED(float, -1.f, -4.f, 3.f), 0,
+ VFloatD4.swizzle<0, 1, 2>(), VFloatD3_2);
+ TEST(sycl::cross, float, 3, EXPECTED(float, -1.f, -4.f, 3.f), 0, VFloatD3,
+ VFloatD4_2.swizzle<0, 1, 2>());
+ TEST(sycl::cross, float, 3, EXPECTED(float, -1.f, -4.f, 3.f), 0,
+ VFloatD4.swizzle<0, 1, 2>(), VFloatD4_2.swizzle<0, 1, 2>());
+ if (Dev.has(sycl::aspect::fp64)) {
+ TEST(sycl::cross, double, 3, EXPECTED(double, -1.f, -4.f, 3.f), 0,
+ VDoubleD3, VDoubleD4_2.swizzle<0, 1, 2>());
+ TEST(sycl::cross, double, 3, EXPECTED(double, -1.f, -4.f, 3.f), 0,
+ VDoubleD4.swizzle<0, 1, 2>(), VDoubleD3_2);
+ TEST(sycl::cross, double, 3, EXPECTED(double, -1.f, -4.f, 3.f), 0,
+ VDoubleD4.swizzle<0, 1, 2>(), VDoubleD4_2.swizzle<0, 1, 2>());
+ }
+
+ TEST2(sycl::dot, float, 32.f, 0, VFloatD4.swizzle<0, 1, 2>(), VFloatD3_2);
+ TEST2(sycl::dot, float, 32.f, 0, VFloatD3, VFloatD4_2.swizzle<0, 1, 2>());
+ TEST2(sycl::dot, float, 32.f, 0, VFloatD4.swizzle<0, 1, 2>(),
+ VFloatD4_2.swizzle<0, 1, 2>());
+ if (Dev.has(sycl::aspect::fp64)) {
+ TEST2(sycl::dot, double, 32, 0, VDoubleD4.swizzle<0, 1, 2>(), VDoubleD3_2);
+ TEST2(sycl::dot, double, 32, 0, VDoubleD3, VDoubleD4_2.swizzle<0, 1, 2>());
+ TEST2(sycl::dot, double, 32, 0, VDoubleD4.swizzle<0, 1, 2>(),
+ VDoubleD4_2.swizzle<0, 1, 2>());
+ }
+
+ TEST2(sycl::length, float, 3.741657f, 1e-6, VFloatD4.swizzle<0, 1, 2>());
+ if (Dev.has(sycl::aspect::fp64)) {
+ TEST2(sycl::length, double, 3.741657, 1e-6, VDoubleD4.swizzle<0, 1, 2>());
+ }
+
+ TEST2(sycl::distance, float, 5.f, 0, VFloatD4.swizzle<0, 1, 2>(), VFloatD3_2);
+ TEST2(sycl::distance, float, 5.f, 0, VFloatD3, VFloatD4_2.swizzle<0, 1, 2>());
+ TEST2(sycl::distance, float, 5.f, 0, VFloatD4.swizzle<0, 1, 2>(),
+ VFloatD4_2.swizzle<0, 1, 2>());
+ if (Dev.has(sycl::aspect::fp64)) {
+ TEST2(sycl::distance, double, 5.0, 0, VDoubleD4.swizzle<0, 1, 2>(),
+ VDoubleD3_2);
+ TEST2(sycl::distance, double, 5.0, 0, VDoubleD3,
+ VDoubleD4_2.swizzle<0, 1, 2>());
+ TEST2(sycl::distance, double, 5.0, 0, VDoubleD4.swizzle<0, 1, 2>(),
+ VDoubleD4_2.swizzle<0, 1, 2>());
+ }
+
+ TEST(sycl::normalize, float, 3, EXPECTED(float, 0.267261, 0.534522, 0.801784),
+ 1e-6, VFloatD4.swizzle<0, 1, 2>());
+ if (Dev.has(sycl::aspect::fp64)) {
+ TEST(sycl::normalize, double, 3,
+ EXPECTED(double, 0.267261, 0.534522, 0.801784), 1e-6,
+ VDoubleD4.swizzle<0, 1, 2>());
+ }
+
+ TEST2(sycl::fast_distance, float, 5.f, 0, VFloatD4.swizzle<0, 1, 2>(),
+ VFloatD3_2);
+ TEST2(sycl::fast_distance, float, 5.f, 0, VFloatD3,
+ VFloatD4_2.swizzle<0, 1, 2>());
+ TEST2(sycl::fast_distance, float, 5.f, 0, VFloatD4.swizzle<0, 1, 2>(),
+ VFloatD4_2.swizzle<0, 1, 2>());
+
+ TEST2(sycl::fast_length, float, 3.741657f, 1e-6, VFloatD4.swizzle<0, 1, 2>());
+
TEST(sycl::fast_normalize, float, 3,
EXPECTED(float, 0.267261f, 0.534522f, 0.801784f), 1e-3,
VFloatD4.swizzle<0, 1, 2>());
+#endif // __INTEL_PREVIEW_BREAKING_CHANGES
return 0;
}
diff --git a/sycl/test-e2e/Graph/RecordReplay/buffer_ordering.cpp b/sycl/test-e2e/Graph/RecordReplay/buffer_ordering.cpp
index f12bce4091a2..1e668d47356a 100644
--- a/sycl/test-e2e/Graph/RecordReplay/buffer_ordering.cpp
+++ b/sycl/test-e2e/Graph/RecordReplay/buffer_ordering.cpp
@@ -5,8 +5,8 @@
// RUN: %{build} -o %t.out
// RUN: %{run} %t.out
-// Extra run to check for leaks in Level Zero using ZE_DEBUG
-// RUN: %if ext_oneapi_level_zero %{env ZE_DEBUG=4 %{run} %t.out 2>&1 | FileCheck %s %}
+// Extra run to check for leaks in Level Zero using UR_L0_LEAKS_DEBUG
+// RUN: %if level_zero %{env UR_L0_LEAKS_DEBUG=1 %{run} %t.out 2>&1 | FileCheck %s %}
//
// CHECK-NOT: LEAK
diff --git a/sycl/test-e2e/Matrix/XMX8/element_wise_all_sizes.cpp b/sycl/test-e2e/Matrix/XMX8/element_wise_all_sizes.cpp
index eef91f0258e4..7548bf71f24d 100644
--- a/sycl/test-e2e/Matrix/XMX8/element_wise_all_sizes.cpp
+++ b/sycl/test-e2e/Matrix/XMX8/element_wise_all_sizes.cpp
@@ -7,6 +7,10 @@
//===----------------------------------------------------------------------===//
// REQUIRES: matrix-xmx8
+// TODO: Currently fails and regularly times out on DG2. Re-enable when this has
+// been addressed.
+// UNSUPPORTED: gpu-intel-dg2
+
// RUN: %{build} -o %t.out
// RUN: %{run} %t.out
diff --git a/sycl/test-e2e/format.py b/sycl/test-e2e/format.py
index c44fadde2680..4060e0c99ab2 100644
--- a/sycl/test-e2e/format.py
+++ b/sycl/test-e2e/format.py
@@ -12,9 +12,9 @@ import os
import re
def get_triple(test, backend):
- if backend == 'ext_oneapi_cuda':
+ if backend == 'cuda':
return 'nvptx64-nvidia-cuda-syclmlir'
- if backend == 'ext_oneapi_hip':
+ if backend == 'hip':
if test.config.hip_platform == 'NVIDIA':
return 'nvptx64-nvidia-cuda-syclmlir'
else:
@@ -164,10 +164,13 @@ class SYCLEndToEndTest(lit.formats.ShTest):
# so that device might still be accessible to some of the tests yet
# we won't set the environment variable below for such scenario.
extra_env = []
- if 'ext_oneapi_level_zero:gpu' in sycl_devices and litConfig.params.get('ze_debug'):
- extra_env.append('ZE_DEBUG={}'.format(test.config.ze_debug))
+ if 'level_zero:gpu' in sycl_devices and litConfig.params.get('ur_l0_debug'):
+ extra_env.append('UR_L0_DEBUG={}'.format(test.config.ur_l0_debug))
- if 'ext_oneapi_cuda:gpu' in sycl_devices:
+ if 'level_zero:gpu' in sycl_devices and litConfig.params.get('ur_l0_leaks_debug'):
+ extra_env.append('UR_L0_LEAKS_DEBUG={}'.format(test.config.ur_l0_leaks_debug))
+
+ if 'cuda:gpu' in sycl_devices:
extra_env.append('SYCL_PI_CUDA_ENABLE_IMAGE_SUPPORT=1')
return extra_env
@@ -203,9 +206,9 @@ class SYCLEndToEndTest(lit.formats.ShTest):
# Expand device-specific condtions (%if ... %{ ... %}).
tmp_script = [ cmd ]
conditions = {x: True for x in sycl_device.split(':')}
- for op_sys in ['linux', 'windows']:
- if op_sys in test.config.available_features:
- conditions[op_sys] = True
+ for cond_features in ['linux', 'windows', 'preview-breaking-changes-supported']:
+ if cond_features in test.config.available_features:
+ conditions[cond_features] = True
tmp_script = lit.TestRunner.applySubstitutions(
tmp_script, [], conditions, recursion_limit=test.config.recursiveExpansionLimit)
69e5edd broke CI as it was not running on `sycl-mlir` PRs. Signed-off-by: Victor Perez <victor.perez@codeplay.com>
Add new:

- `Basic/image/image_accessor_readwrite.cpp`
- `Basic/image/image_write_fp16.cpp`
- `Basic/swizzle_op.cpp`
- `Basic/vector/int-convert.cpp`
- `InvokeSimd/Spec/ImplicitSubgroup/uniform_retval.cpp`
- `Regression/vec_logical_ops.cpp`
- `Regression/vec_rel_swizzle_ops.cpp`

Drop:

- `Plugin/level_zero_queue_profiling.cpp`
- `Plugin/level_zero_usm_residency.cpp`

Rename:

- `sycl/test-e2e/Basic/vector/vec_bool.cpp` -> `sycl/test-e2e/Basic/vector/bool.cpp`
- `sycl/test-e2e/Basic/vector/vector_operators.cpp` -> `sycl/test-e2e/Basic/vector/operators.cpp`
- `sycl/test-e2e/Basic/vector/scalar_vec_access.cpp` -> `sycl/test-e2e/Basic/vector/scalar_access.cpp`

Signed-off-by: Victor Perez <victor.perez@codeplay.com>
whitneywhtsang approved these changes on Jan 5, 2024
sommerlukas approved these changes on Jan 5, 2024
Opened issues arising from this rebase:
Review only files with conflict:
And commits: