-
Notifications
You must be signed in to change notification settings - Fork 769
[SYCL-MLIR] Merge from intel/llvm sycl branch #12142
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Specify their restrictions w.r.t. `align` attribute.
…4423) The explicit 'align 4' caused the pointers to be underaligned on RV64.
…408) This PR creates the wrapper class AffineForOp and adds a testcase for it. A testcase for the AffineLoadOp is also added.
… macro with parentheses. (#68618) Error message: ``` In file included from ../src/amd/addrlib/src/core/addrobject.h:21: ../src/amd/addrlib/src/core/addrcommon.h:343:13: error: expected unqualified-id out = ::_tzcnt_u32(mask); ^ /usr/lib/llvm-15/lib/clang/15.0.6/include/bmiintrin.h:74:27: note: expanded from macro '_tzcnt_u32' #define _tzcnt_u32(a) (__tzcnt_u32((a))) ``` This is because both GCC/Clang doesn't support compiling the following code: ``` #ifdef _MSC_VER #include <intrin.h> #else #include <x86intrin.h> #endif int f(int a) { return ::(_tzcnt_u32)(a); } ``` This is because the return statement expects an expression or braced init list: http://eel.is/c++draft/stmt.jump#general-1 but we really only need to care about the expression form (there's no braces in sight). Grammatically, the leading :: will parse as a qualified-id because it matches the production for nested-name-specifier: http://eel.is/c++draft/expr.prim.id.qual#nt:qualified-id That needs to be followed by an unqualified-id (http://eel.is/c++draft/expr.prim.id.unqual#nt:unqualified-id), but the open paren does not match any of the grammar productions, so this is a syntax error. Closes: llvm/llvm-project#64467 Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Also eliminate an unused variable while here.
Follow up on a post-commit review of 9468de4 (TargetInstrInfo: make getOperandLatency return optional (NFC)) by Bjorn Pettersson to fix a couple of things that are not NFC: - std::optional<T>::operator<= returns true if the first operand is a std::nullopt and second operand is T. Fix a couple of places where we assumed it would return false. - In TargetSchedule, computeInstrCost could take another codepath, returning InstrLatency instead of DefaultDefLatency. Fix one instance not accounting for this behavior.
…#73523) Updates patterns for flattening `vector.transfer_read` by relaxing the requirement that the "collapsed" indices are all zero. This enables collapsing cases like this one: ```mlir %2 = vector.transfer_read %arg4[%c0, %arg0, %arg1, %c0] ... : memref<1x43x4x6xi32>, vector<1x2x6xi32> ``` Previously only the following case would be consider for collapsing (all indices are 0): ```mlir %2 = vector.transfer_read %arg4[%c0, %c0, %c0, %c0] ... : memref<1x43x4x6xi32>, vector<1x2x6xi32> ``` Also adds some new comments and renames the `firstContiguousInnerDim` parameter as `firstDimToCollapse` (the latter better matches the actual meaning). Similar updates for `vector.transfer_write` will be implemented in a follow-up patch.
… (#73501) Count how often the value is with signext/zeroext calls when determining the preferred extension type.
This commit refactors `verifyDominanceOfContainedRegions` to iterative algorithms similar to https://reviews.llvm.org/D154925 to fix stack overflow for deeply nested regions (e.g. llvm/circt#5316). There should be no functional change except that this could result in slightly different order of verification.
Because AA does not support vectors of pointers, we have to treat pointers that are inserted into a vector as captures. We mostly already do so, but missed the case where getelementptr is used to produce a vector.
AsmParser creates dummy values when value identifiers are not going in ascending order and tries to use those dummy values when/if they are being referenced. We need to postpone this until all required data is read.
Now that tbaa tags pass is enabled by default, I would like to remove these flags. `-fno-alias-analysis` was originally intended to be useful for debugging, but as it also disables tbaa tag generation in codegen, it turned out to be too noisy. @banach-space expressed that these flags felt too non-standard. The tbaa tags pass can be toggled using `-mllvm -disable-fir-alias-tags=0`
Translate debug emission kind into LLVM (the importer already supports this).
…l fcmp (#66505)"" This reverts commit d55692d. See discussion in #66505: assertion fires in OSS build of TensorFlow.
…h instead" (#74444) Reverts llvm/llvm-project#73939 This broke libc-aarch64-ubuntu build bot https://lab.llvm.org/buildbot/#/builders/138/builds/56186
…h instead (#73939) (#74446) Same as #73939 but also fix `libc/src/string/memory_utils/op_aarch64.h` that was still using `deferred_static_assert`.
This is a minor improvement to #73491.
Add support for frame pointers in MLIR. --------- Co-authored-by: Markus Böck <markus.boeck02@gmail.com> Co-authored-by: Christian Ulmann <christianulmann@gmail.com>
ARM64EC needs to handle both ARM and x86_64 exception tables. This is achieved by separating their chunks and sorting them separately. EXCEPTION_TABLE directory references x86_64 variant, while ARM variant is exposed using CHPE metadata, which references __arm64x_extra_rfe_table and __arm64x_extra_rfe_table_size symbols.
…… (#74336) …ng member pointers. This initially landed with a broken test due to a mid-air collision with a new requirement for Environment initialization before field modeling. Have added that initialization in the test. From first landing: getMethodDecl does not handle pointers to members and returns nullptr for them. getMethodDecl contains a decade-plus-old FIXME to handle pointers to members, but two approaches I looked at for fixing it are more invasive or complex than simply swapping to getCalleeDecl. The first, have getMethodDecl call getCalleeDecl, creates a large tree of const-ness mismatches due to getMethodDecl returning a non-const value while being a const member function and getCalleeDecl only being a const member function when it returns a const value. The second, implementing an AST walk to match how CXXMemberCallExpr::getImplicitObjectArgument grabs the LHS of the binary operator, is basically reimplementing Expr::getReferencedDeclOfCallee, which is used by Expr::getCalleeDecl. We don't need another copy of that code.
…te (#74093) Avoid editing a range of DPValues and then remapping them. This occurs when we try to de-duplicate dbg.values, but then re-use the same iterator range. We can instead remap them, and then erase any duplicates. At the same time refactor the computation of seen-intrinsic hashes, and account for a peculiarity of loop-rotates existing behaviour: it will only deduplicate dbg.values that are immediately before the preheaders branch instruction, not just any dbg.value in the preheader.
…2384) This includes some commonly needed information like how to add reviewers. This is implemented as a job before the labeler, so that on a new PR the comment is added before there are any subscribers and only the author gets a nofitication. The labeler job depends on the greeter having run or having been skipped. So if the PR wasn't just opened, or it's from a regular contributor, the labeling still happens. But we can be sure that when a greeting comment is left, it's the very first thing we do.
This patch adds a target_features (TargetFeaturesAttr) to the LLVM dialect to allow setting and querying the features in use on a function. The motivation for this comes from the Arm SME dialect where we would like a convenient way to check what variants of an operation are available based on the CPU features. Intended usage: The target_features attribute is populated manually or by a pass: ```mlir func.func @example() attributes { target_features = #llvm.target_features<["+sme", "+sve", "+sme-f64f64"]> } { // ... } ``` Then within a later rewrite the attribute can be checked, and used to make lowering decisions. ```c++ // Finds the "target_features" attribute on the parent // FunctionOpInterface. auto targetFeatures = LLVM::TargetFeaturesAttr::featuresAt(op); // Check a feature. // Returns false if targetFeatures is null or the feature is not in // the list. if (!targetFeatures.contains("+sme-f64f64")) return failure(); ``` For now, this is rather simple just checks if the exact feature is in the list, though it could be possible to extend with implied features using information from LLVM.
Continuation of llvm/llvm-project#74247 to fix llvm/llvm-project#56962. Fixes verifier for (Integer Attr): ```mlir llvm.mlir.constant(1 : index) : f32 ``` and (Dense Attr): ```mlir llvm.mlir.constant(dense<100.0> : vector<1xf64>) : f32 ``` ## Integer Attr The addition that this PR makes to `LLVM::ConstantOp::verify` is meant to be exactly verifying the code in `mlir::LLVM::detail::getLLVMConstant`: https://github.com/llvm/llvm-project/blob/9f78edbd20ed922cced9482f7791deb9899a6d82/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp#L350-L353 One failure mode is when the `type` (`llvm.mlir.constant(<value>) : <type>`) is not an `Integer`, because then the `cast` in `getIntegerBitWidth` will crash: https://github.com/llvm/llvm-project/blob/dca432cb7b1c282f5dc861095813c4f40f109619/llvm/include/llvm/IR/DerivedTypes.h#L97-L99 So that's now caught in the verifier. Apart from that, I don't see anything we could check for. `sextOrTrunc` means "Sign extend or truncate to width" and that one is quite permissive. For example, the following doesn't have to be caught in the verifier as it doesn't crash during `mlir-translate -mlir-to-llvmir`: ```mlir llvm.func @main() -> f32 { %cst = llvm.mlir.constant(100 : i64) : f32 llvm.return %cst : f32 } ``` ## Dense Attr Crash if not either a MLIR Vector type or one of these: https://github.com/llvm/llvm-project/blob/9f78edbd20ed922cced9482f7791deb9899a6d82/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp#L375-L391
First draft of an extension allowing opaque types to be passed to kernels via their raw byte representation. --------- Signed-off-by: John Pennycook <john.pennycook@intel.com> Co-authored-by: Greg Lueck <gregory.m.lueck@intel.com>
Allows the use of sycl_ext_oneapi_kernel_properties extension in Sycl-Graph. Kernel properties passed to the backend using the sycl kernel object. Therefore, the graph implementation does not prevent the backend from accessing these properties. Removes exception throwing. Adds tests.
… pass (intel#12108) This PR removes some remnants of the opaque pointer transition from the internalization pass. For private internalization, we insert an `alloca` instruction to model a work-item's private storage. Currently, we have to infer the accessor's element type by looking at its users. Now, we simply pass the element size as metadata to the pass, and insert an `i8`-`alloca` with the desired size in bytes. I'm using separate `LocalSize` (= number of buffer elements associated with each work-item resp. work-group) and `ElemSize` here for now as the remapping of GEP instructions still works in terms of number-of-elements. If in the future these are rewritten to perform `i8*` arithmetic as well, we could go back to only passing a single size (then in bytes instead of number of elements). --------- Signed-off-by: Julian Oppermann <julian.oppermann@codeplay.com>
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
It is cascade removal of some unused code after removing esimd/detail/host_util.hpp and removing __ESIMD_EMU_DNS namespace. Even with this change ESIMD headers still have lots of unused HOST-only code. Signed-off-by: Klochkov, Vyacheslav N <vyacheslav.n.klochkov@intel.com>
CreateProgram and CreateKernel APIs implicitly retain program and kernels. Extra call to retain is incorrect.
…intel#12119) The changes which prevent coverage and profile options from being used for device compilations had a problem with -fsanitize=address logic. We were allowing _any_ option that happened to have 'address' as an argument as a legal pass through. Adjust this exception to be specific to -fsanitize.
Extend intel#11674 by modifying the globaloffset optimization pass to always replace uses of Loads from the `llvm.nvvm.implicit.offset` and `llvm.amdgcn.implicit.offset intrinsics` with constant zeros in the original non-offset kernel. Hence, perform the optimization even when `-enable-global-offset=true` (default). Duplicate recursively functions containing calls to the implicit offset intrinsic and let the implicit offset kernel entry point only call the original functions (i.e. do not call the functions with added offset arguments). Remove zero allocations for the original kernel entry points.
…ntel#11541) When event list is null, a barrier is still needed for all previous commands, so fix it. --------- Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com> Co-authored-by: Kenneth Benzie (Benie) <k.benzie@codeplay.com>
It passes with a newer driver so just check for that. Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
The CI E2E tests are failing because these tests are passing rather than expectedly failing * UR nightly CI - https://github.com/oneapi-src/unified-runtime/actions/runs/7160854862/job/19495692574 * intel/llvm CI - https://github.com/intel/llvm/actions/runs/7168248741/job/19516014849 Change the tests to be skipped rather than XFAIL while we investigate
Use Builder::createItem help function to create `item` object instead of adding one helper function in reduction namespace.
…cs. (intel#12005) Also handle translating these properties in pi2ur. oneapi-src/unified-runtime#1123 --------- Co-authored-by: Kenneth Benzie (Benie) <k.benzie@codeplay.com>
Educated guess merge conflict resolution before even trying to compile.
The API of mlir::InlinerInterface::handleTerminator() changed in mlir/include/mlir/Transforms/InliningUtils.h with llvm/llvm-project#72765 llvm/llvm-project@26a0b27
There are still a few problems related to |
CI failed because PR was missing Thanks a lot for the contribution @keryell! |
Failing tests:
|
I do not have the right to set tags on this repository. |
Explicit SIMD and SYCL Graphs are currently not yet supported with SYCL-MLIR, so you can just add the failing tests to the list of expectedly failing tests here, as described in the documentation. For the others, a quick investigation to see what causes the error would be useful. |
I'll close this PR now as it's outdated. |
Yes, you are right! |
Please do not squash and merge this PR.