[GEN] Update GENX branch to LLVM `3a83162` #13773

whitneywhtsang · 2024-05-13T18:41:18Z

No description provided.

We need to use InitField here, not SetField.

Referring to RISC-V, add an MI-level pass to optimize *W instructions for LoongArch.

First, it removes unneeded sext(addi.w rd, rs, 0) instructions, either because the sign-extended bits aren't consumed or because the input was already sign extended by an earlier instruction.

Then:

1. Unless explicitly disabled or the target prefers instructions with the W suffix, it removes the -w suffix from opw instructions whenever all users depend only on the lower word of the result of the instruction. The cases handled are:
   * addi.w, because it helps reduce test differences between LA32 and LA64 without being a pessimization.
2. Or, if explicitly enabled or the target prefers instructions with the W suffix, it adds the W suffix to the instruction whenever all users depend only on the lower word of the result of the instruction. The cases handled are:
   * add.d/addi.d/sub.d/mul.d.
   * slli.d with imm < 32.
   * ld.d/ld.wu.
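The "lower word" condition can be illustrated in plain C++. This is only a sketch: `add_d`, `add_w`, and `lower_word` are hypothetical helpers that model a 64-bit add and a 32-bit add with sign extension, not code from the LoongArch backend. The two results can differ in the upper 32 bits, but the lower 32 bits always agree, which is why the suffix can be changed when users only read the lower word.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a 64-bit add (a `.d` style instruction); wraps on overflow.
constexpr std::uint64_t add_d(std::uint64_t a, std::uint64_t b) { return a + b; }

// Hypothetical model of a 32-bit add whose result is sign-extended to
// 64 bits (a `.w` style instruction).
constexpr std::uint64_t add_w(std::uint64_t a, std::uint64_t b) {
  const std::uint32_t low =
      static_cast<std::uint32_t>(a) + static_cast<std::uint32_t>(b);
  // Sign-extend bit 31 into the upper half by hand.
  return (low & 0x80000000u) ? (0xFFFFFFFF00000000ull | low) : low;
}

// What a user that depends only on the lower word observes.
constexpr std::uint32_t lower_word(std::uint64_t v) {
  return static_cast<std::uint32_t>(v);
}

// The full 64-bit results differ here...
static_assert(add_d(0x7FFFFFFFull, 1) != add_w(0x7FFFFFFFull, 1),
              "upper bits differ");
// ...but the lower words are identical.
static_assert(lower_word(add_d(0x7FFFFFFFull, 1)) ==
                  lower_word(add_w(0x7FFFFFFFull, 1)),
              "lower words agree");
```

If any user reads the upper bits, the two forms are no longer interchangeable, which matches the "all users" condition in the description.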

Noticed while investigating GFNI per-element vector shifts (we can form SHL but not SRL/SRA) Alive2: https://alive2.llvm.org/ce/z/fSH-rf

Change definition of expandBitCastI128ToF128 and expandBitCastF128ToI128 to allow for simplified use in atomic load/store. Update logic to split 128-bit loads and stores in DAGCombine to also handle the f128 case where appropriate. This fixes the regressions introduced by recent atomic load/store patches.

If we don't demand the same element from both single-source shuffles (permutes), then attempt to blend the sources together first and then perform a merged permute. For vXi16 blends we have to be careful, as these are much more likely to involve byte/word vector shuffles that will result in the creation of additional shuffle instructions. This fold might be worth it for VSELECT with constant masks on AVX512 targets; I haven't investigated this yet, but I've tried to write combineBlendOfPermutes so that it is prepared for this. The PR34592 -O0 regression is an unfortunate failure to clean up with a later pass that calls SimplifyDemandedElts like -O3 does - I'm not sure how worried we should be tbh.

Followup to #89727

… with 1 or 2 args (#89624) The destructors generated by the legacy IBM `xlclang++` compiler can take 1 or 2 arguments and the differences were handled by type `cast` where it is needed. Clang now treats the `cast` here as an error after llvm/llvm-project@999d4f8 landed with `-Xextra -Werror`. The issue had been worked around by using `#pragma GCC diagnostic push/pop`. This patch defines 2 separate destructor types for 1 argument and 2 arguments respectively so `cast` is not needed.

llvm-project/llvm/lib/Target/X86/X86ISelLowering.cpp:40081:21: error: comparison of integers of different signs: 'int' and 'unsigned int' [-Werror,-Wsign-compare]
  for (int I = 0; I != NumElts; ++I) {
                  ~ ^  ~~~~~~~
1 error generated.

llvm-project/llvm/lib/Target/X86/X86ISelLowering.cpp:3582:13: error: unused function 'isBlendOrUndef' [-Werror,-Wunused-function]
static bool isBlendOrUndef(ArrayRef<int> Mask) {
            ^
1 error generated.

Handle implicit firstprivate DSAs on task generating constructs. Fixes llvm/llvm-project#64480

…#90671) Fixes llvm/llvm-project#78936

Plugins are not loaded without the -cc1 phase. Do not report them when running on an assembly file or when linking. Many build tools add these options to all driver invocations, including LLVM's build system. Fixes #88173

An `emitc.expression` can only yield a single result, but some operations which have the `CExpression` trait can have multiple results, which can result in a crash when applying the `fold-expressions` pass. This change adds a check for the single-result condition and a simple test.

Pre-commit tests for an upcoming patch.

…xts (reland #90438) (#91172) This relands #90348 with a fix for a [buildbot failure](https://lab.llvm.org/buildbot/#/builders/216/builds/38446) caused by the test being run with `-fno-rtti`.

Fixes #90285.

@nickdesaulniers

… macros (#88816) Adds more FP test macros for the upcoming test adds for #61092 and the issues opened from it: #88768, #88769, #88770, #88771, #88772. Fix bug in `{EXPECT,ASSERT}_FP_EXCEPTION`. `EXPECT_FP_EXCEPTION(0)` seems to be used to test that an exception did not happen, but it always does `EXPECT_GE(... & 0, 0)` which never fails. Update and refactor tests that break after the above bug fix. An interesting way things broke after the above change is that `ForceRoundingMode` and `quick_get_round()` were raising the inexact exception, breaking a lot of the `atan*` tests. The changes for all files other than `FPMatcher.h` and `libc/test/src/math/smoke/RoundToIntegerTest.h` should have the same semantics as before. For `RoundToIntegerTest.h`, lines 56-58 before the changes do not always hold since this test is used for functions with different exception and errno behavior like `lrint` and `lround`. I've deleted those lines for now, but tests for those cases should be added for the different nearest int functions to account for this. Adding @nickdesaulniers for review.
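The `EXPECT_FP_EXCEPTION(0)` problem comes down to masking with zero: `(flags & 0) >= 0` holds for every value of `flags`. The sketch below uses hypothetical helper names and made-up flag values rather than the real macros in `FPMatcher.h`, and `fixed_check` is just one possible corrected form, not necessarily what the patch does.

```cpp
#include <cassert>

// Made-up flag bits for the example; real code would use the FE_* macros.
constexpr int kInexact = 0x20;
constexpr int kOverflow = 0x08;

// Hypothetical model of the old check: mask the raised flags with the
// expected set, then compare with >=. For expected == 0 this is (0 >= 0).
constexpr bool old_check(int raised, int expected) {
  return (raised & expected) >= expected;
}

// One possible fix (an assumption): expecting 0 means that no flag at all
// may be raised.
constexpr bool fixed_check(int raised, int expected) {
  return expected == 0 ? raised == 0 : (raised & expected) == expected;
}

// The old form passes even though an exception was raised.
static_assert(old_check(kInexact, 0), "old check cannot fail for 0");
// The stricter form notices the unexpected exception.
static_assert(!fixed_check(kInexact, 0), "unexpected flag is reported");
static_assert(fixed_check(0, 0), "no flags raised, none expected");
static_assert(fixed_check(kInexact | kOverflow, kInexact),
              "expected flag is present");
```

Once the check can actually fail, helpers that raise the inexact exception as a side effect start breaking tests, which is the fallout the description reports.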

…n ThinLTO summaries" (#90610)" (#91194) Reverts llvm/llvm-project#90692 Breaking PPC buildbots. The bots are not meant to test LLD, but are running a test that is using an old version of LLD without the change (so is incompatible). Revert until a fix is found.

@tedwoodward

…or. (#87649) `SBProcess::GetMemoryRegionInfo` uses the `qMemoryRegionInfo` packet to get memory region info, but this is not supported in gdb-server and is causing downstream lldb test failures. This change ignores the error from `SBProcess::GetMemoryRegionInfo`. Reported by @tedwoodward @jerinphilip.

… (#91019) This reverts commit 1106644. As noted in the original patch, this was designed to be reverted once https://reviews.llvm.org/D142479 and https://reviews.llvm.org/D142660 landed, which has long since happened.

Fix the issue that `char` constants are converted to `uint64_t` in the wrong way when doing the inlining.

…h, NFC.

Need to check that the signed operand has an extra sign bit to be sure that we do not skip signedness, when trying to minimize bitwidth for smin/smax intrinsics.

…m-parameter (#91160)

…1168) Because LiveVariables has been run, we no longer need to lookup the users in MachineRegisterInfo anymore and can instead just check for the dead flag.

This commit explicitly specifies the matching mode (C library function, any non-method function, or C++ method) for the `CallDescription`s constructed in the checker `osx.MIG`. The code was simplified to use a `CallDescriptionMap` instead of a raw vector of pairs. This change won't cause major functional changes, but isn't NFC because it ensures that e.g. call descriptions for a non-method function won't accidentally match a method that has the same name. Separate commits have already performed this change in other checkers: - easy cases: e2f1cba, 6d64f8e - MallocChecker: d6d84b5 - iterator checkers: 06eedff - InvalidPtr checker: 024281d - apiModeling.llvm.ReturnValue: 97dd8e3 ... and follow-up commits will handle the remaining few checkers. My goal is to ensure that the call description mode is always explicitly specified and eliminate (or strongly restrict) the vague "may be either a method or a simple function" mode that's the current default.

Add HLSLPackOffsetAttr to save packoffset in AST. Since we have to parse the attribute manually in ParseHLSLAnnotations, we could create the ParsedAttribute with an integer offset parameter instead of a string. This avoids the string parsing that would be needed if the offset were saved as a string in HLSLPackOffsetAttr. For #57914.

llvm/llvm-project#85592 https://discourse.llvm.org/t/rfc-add-nowrap-flags-to-trunc/77453 llvm/llvm-project#88609

Adding myself to linalg dialect

…(#90217) This patch fixes an integer overflow in the SampleProfileLoader pass. The issue occurs when weights are saturated and Profi isn't being used. This patch also adds a newline to a debug message to make it more readable.

Debug-info metadata does not have a strictly defined order. Check that elements are linked to each other correctly, not that metadata appears in a particular order.

Emit named metadata "dx.version" for DXIL version. Default to DXIL 1.0

Add test coverage for missed simplification.

…tered. (#91329) PR #89664 introduced a regression: it unregistered the llvm-tblgen option `-D` for macros. The test `TestOps.cpp` failed because a macro was passed to llvm-tblgen. It caused our internal build to fail because we append `-DLOCAL_NAME` to `LLVM_TABLEGEN_FLAGS` in `llvm/lib/cmake/llvm/TableGen.cmake` as

```
list(APPEND LLVM_TABLEGEN_FLAGS "-DLOCAL_NAME")
```

And in `./llvm/lib/Target/PowerPC/PPC.td`, we check it for some downstream code as:

```
...
#ifdef LOCAL_NAME
...
#endif
```

Now we get this error message from mlir-src-sharder:

```
mlir-src-sharder -op-shard-index=1 -DLOCAL_NAME llvm-project/mlir/test/lib/Dialect/Test/TestOps.cpp --write-if-changed -o tools/mlir/test/lib/Dialect/Test/TestOps.1.cpp -d tools/mlir/test/lib/Dialect/Test/TestOps.1.cpp.d
mlir-src-sharder: Unknown command line argument '-DLOCAL_NAME'. Try: 'llvm-project/build/bin/mlir-src-sharder --help'
mlir-src-sharder: Did you mean '-I'?
```

This PR fixes the regression.

This reverts commit c5509fe.

This commit ensures that Mem2Reg reuses the `DominanceInfo` as well as block index maps to avoid expensive recomputations. Due to the recent migration to `OpBuilder`, the promotion of a slot no longer replaces blocks. Having stable blocks makes the `DominanceInfo` preservable and additionally allows caching block index maps between different promotions. Performance measurements on very large functions show up to a 4x speedup from these changes.

…ic index lists (#90897) This patch is a first pass at making consistent syntax across the `LinalgTransformOp`s that use dynamic index lists for size parameters. Previously, there were two different forms: inline types in the list, or place them in the functional style tuple. This patch goes for the latter. In order to do this, the `printPackedOrDynamicIndexList`, `printDynamicIndexList` and their `parse` counterparts were modified so that the types can be optionally provided to the corresponding custom directives. All affected ops now use tablegen `assemblyFormat`, so custom `parse`/`print` functions have been removed. There are a couple ops that will likely add dynamic size support, and once that happens it should be made sure that the assembly remains consistent with the changes in this patch. The affected ops are as follows: `pack`, `pack_greedily`, `tile_using_forall`. The `tile_using_for` and `vectorize` ops already used this syntax, but their custom assembly was removed. --------- Co-authored-by: Oleksandr "Alex" Zinenko <ftynse@gmail.com>

This allows the tests to be run against any implementation of `Solver` instead of being specific to `WatchedLiteralsSolver` as they currently are.

The binary version is four times faster than current implementation in my setup, and generally considered a better implementation. Code inspired by https://en.algorithmica.org/hpc/algorithms/gcd/ which itself is inspired by https://lemire.me/blog/2013/12/26/fastest-way-to-compute-the-greatest-common-divisor/ Fix #77648
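For readers unfamiliar with the binary GCD (Stein's algorithm), here is a minimal sketch. It is not the code from the patch: the function names are hypothetical, and the trailing-zero count is a portable loop where a tuned implementation would use a hardware count-trailing-zeros instruction.

```cpp
#include <cassert>
#include <cstdint>

// Count trailing zero bits of a nonzero value. A tuned implementation would
// use a single hardware instruction here; the loop keeps the sketch portable.
constexpr unsigned trailing_zeros(std::uint64_t v) {
  unsigned n = 0;
  while ((v & 1u) == 0) {
    v >>= 1;
    ++n;
  }
  return n;
}

// Binary GCD: replaces the divisions of the Euclidean algorithm with shifts
// and subtractions.
constexpr std::uint64_t binary_gcd(std::uint64_t a, std::uint64_t b) {
  if (a == 0) return b;
  if (b == 0) return a;
  const unsigned shift = trailing_zeros(a | b);  // shared power of two
  a >>= trailing_zeros(a);                       // make a odd
  while (b != 0) {
    b >>= trailing_zeros(b);                     // make b odd
    if (a > b) {                                 // keep a <= b
      const std::uint64_t t = a;
      a = b;
      b = t;
    }
    b -= a;                                      // odd - odd is even (or zero)
  }
  return a << shift;                             // restore the shared factor
}

static_assert(binary_gcd(48, 18) == 6, "gcd(48, 18)");
static_assert(binary_gcd(17, 5) == 1, "coprime inputs");
static_assert(binary_gcd(0, 7) == 7, "gcd with zero");
```

The loop only shifts, compares, and subtracts, which is where the speedup over a division-based loop typically comes from; the "four times faster" figure is the commit author's measurement, not something this sketch demonstrates.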

When a child process is forked with OpenMP already initialized, the child process resets its affinity mask and sets proc-bind-var to false so that the entire original affinity mask is used. This patch corrects an issue with the affinity initialization code setting affinity to compact instead of none for this special case of forked children. The test meant to catch this only tested an explicit setting of KMP_AFFINITY=none. Add a test run with no KMP_AFFINITY setting. Fixes: #91098

…hape involving dynamic dims (#89093) `fold-memref-alias-ops` bails out in presence of dynamic shapes in `memref.expand_shape` op. Handle this case.

…other type. Need to look through the SExt/ZExt scalars to be gathered, when trying to reduce their width after minbitwidth analysis to prevent permanent attempts to revectorize such gathered instructions.

`clang/tools/scan-build` is implemented in `perl`. However, given that `perl` is not mentioned as a required dependency in `GettingStarted.rst`, we should make this optional. This adds a `find_package(Perl)` check to cmake and disables the `scan-build` tests when no perl executable is found. Ideally we would also check whether dependent perl modules like `Hash::Util` are present on the system, but I don't see any pre-existing cmake macros to easily test this. So for now I go with a plain check for the `perl` package; at least this allows using `cmake -DCMAKE_DISABLE_FIND_PACKAGE_Perl=ON` to manually disable `perl` and the tests.

Signed-off-by: Whitney Tsang <whitney.tsang@intel.com>

tbaederr and others added 30 commits May 6, 2024 10:37

[clang][Interp] Fix primitive lambda capture defaults

9a521e2

We need to use InitField here, not SetField.

[LoongArch] Mark data type i32 are sign-extended. NFC

d98a785

[clang][Interp] Fix creating functions with explicit instance parameters

69d740e

[LoongArch] Rename some OptWInstrs functions. NFC

0933a7a

[DAG] Fold bitreverse(shl/srl(bitreverse(x),y)) -> srl/shl(x,y) (#89897)

522b4bf

Noticed while investigating GFNI per-element vector shifts (we can form SHL but not SRL/SRA) Alive2: https://alive2.llvm.org/ce/z/fSH-rf
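The fold in the commit title can be sanity-checked on 8-bit values in plain C++. This is a brute-force sketch with hypothetical helper names; it is independent of the DAG combine code and of the Alive2 proof linked above.

```cpp
#include <cassert>
#include <cstdint>

// Reverse the bit order of an 8-bit value.
constexpr std::uint8_t bitreverse8(std::uint8_t x) {
  std::uint8_t r = 0;
  for (int i = 0; i < 8; ++i) {
    r = static_cast<std::uint8_t>((r << 1) | (x & 1u));
    x = static_cast<std::uint8_t>(x >> 1);
  }
  return r;
}

// Shifts that stay within 8 bits (bits shifted out are dropped).
constexpr std::uint8_t shl8(std::uint8_t x, unsigned y) {
  return static_cast<std::uint8_t>(x << y);
}
constexpr std::uint8_t srl8(std::uint8_t x, unsigned y) {
  return static_cast<std::uint8_t>(x >> y);
}

// Check both directions of the fold for every 8-bit value and every
// in-range shift amount.
bool fold_holds_for_all_u8() {
  for (unsigned x = 0; x < 256; ++x) {
    for (unsigned y = 0; y < 8; ++y) {
      const std::uint8_t v = static_cast<std::uint8_t>(x);
      if (bitreverse8(shl8(bitreverse8(v), y)) != srl8(v, y)) return false;
      if (bitreverse8(srl8(bitreverse8(v), y)) != shl8(v, y)) return false;
    }
  }
  return true;
}
```

Intuitively, reversing the bits swaps the roles of the two ends of the value, so a left shift in the reversed domain discards exactly the bits that a logical right shift discards in the original domain.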

Add requires clause to risc-v clang driver tests

6217abc

Followup to #89727

[X86] Fix -Wsign-compare in X86ISelLowering.cpp (NFC)

a924199

llvm-project/llvm/lib/Target/X86/X86ISelLowering.cpp:40081:21: error: comparison of integers of different signs: 'int' and 'unsigned int' [-Werror,-Wsign-compare]
  for (int I = 0; I != NumElts; ++I) {
                  ~ ^  ~~~~~~~
1 error generated.
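A common way to silence this kind of diagnostic is to give the loop index the same unsigned type as the bound. The function below is a hypothetical stand-alone example, not the loop from `X86ISelLowering.cpp`, and the commit may have resolved the warning differently.

```cpp
#include <cassert>
#include <vector>

// Hypothetical example: count the negative entries of a shuffle-style mask.
int count_negative(const std::vector<int> &Mask) {
  const unsigned NumElts = static_cast<unsigned>(Mask.size());
  int Count = 0;
  // Writing `for (int I = 0; I != NumElts; ++I)` here would compare a signed
  // index with an unsigned bound and trigger -Wsign-compare. Using an
  // unsigned index keeps both sides of the comparison the same type.
  for (unsigned I = 0; I != NumElts; ++I) {
    if (Mask[I] < 0)
      ++Count;
  }
  return Count;
}
```

The alternative is to keep the signed index and cast the bound once, which can be preferable when the index also takes part in signed arithmetic.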

[X86] Fix -Wunused-function in X86ISelLowering.cpp (NFC)

b2c2fef

llvm-project/llvm/lib/Target/X86/X86ISelLowering.cpp:3582:13: error: unused function 'isBlendOrUndef' [-Werror,-Wunused-function]
static bool isBlendOrUndef(ArrayRef<int> Mask) {
            ^
1 error generated.

[flang][OpenMP] Support tasks' implicit firstprivate DSA (#85989)

1e9625e

Handle implicit firstprivate DSAs on task generating constructs. Fixes llvm/llvm-project#64480

[flang][OpenMP] Fix symbol handling in critical/sections constructs (…

e365ac8

…#90671) Fixes llvm/llvm-project#78936

[driver] Do not warn about unused plugin flags. (#88948)

6e31a49

Plugins are not loaded without the -cc1 phase. Do not report them when running on an assembly file or when linking. Many build tools add these options to all driver invocations, including LLVM's build system. Fixes #88173

[LAA] Update check line in test to fully match message.

148b721

[MLIR] fix _f64ElementsAttr in ir.py (#91176)

10ec0d2

[LAA] Add tests showing extra unnecessary runtime checks.

5f73d29

Pre-commit tests for an upcoming patch.

[clang][dataflow] Don't propagate result objects in unevaluated conte…

4d839d8

…xts (reland #90438) (#91172) This relands #90348 with a fix for a [buildbot failure](https://lab.llvm.org/buildbot/#/builders/216/builds/38446) caused by the test being run with `-fno-rtti`.

[NFC] Use const& avoiding copies (#90334)

d751e40

Fixes #90285.

[AggressiveInstCombine] Fix strncmp inlining (#91204)

1241e76

Fix the issue that `char` constants are converted to `uint64_t` in the wrong way when doing the inlining.
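The commit message does not say exactly how the conversion went wrong. One plausible failure mode, assumed here purely for illustration, is sign extension of a character whose high bit is set. The C standard defines `strncmp` as comparing characters as `unsigned char`, so a widened constant has to be zero-extended to match. The helper names below are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Widening through a signed 64-bit type sign-extends: the high bit of the
// character is copied into all upper bits.
constexpr std::uint64_t widen_sign_extended(signed char c) {
  return static_cast<std::uint64_t>(static_cast<std::int64_t>(c));
}

// Widening through unsigned char zero-extends, matching how strncmp
// interprets each character.
constexpr std::uint64_t widen_zero_extended(signed char c) {
  return static_cast<std::uint64_t>(static_cast<unsigned char>(c));
}

// For plain ASCII the two conversions agree...
static_assert(widen_sign_extended('a') == widen_zero_extended('a'),
              "no difference below 0x80");
// ...but for a byte with the high bit set they do not.
static_assert(widen_zero_extended(static_cast<signed char>(-128)) == 0x80,
              "zero extension keeps the byte value");
static_assert(widen_sign_extended(static_cast<signed char>(-128)) ==
                  0xFFFFFFFFFFFFFF80ull,
              "sign extension fills the upper bits");
```

`signed char` is used instead of plain `char` so that the example behaves the same on platforms where `char` is unsigned.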

[SLP][NFC]Add a test with incorrect smin analysis for minimal bitwidt…

d584df6

…h, NFC.

[SLP]Fix PR91025: correctly handle smin/smax of signed operands.

a476032

Need to check that the signed operand has an extra sign bit to be sure that we do not skip signedness, when trying to minimize bitwidth for smin/smax intrinsics.
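A rough illustration of why the spare sign bit matters, with hypothetical helper names and an 8-bit target width chosen only for the example (this is not the SLP vectorizer's analysis): the value 200 fits in 8 bits as an unsigned quantity, but it has no spare sign bit at that width, so truncating before a signed minimum changes the result.

```cpp
#include <cassert>
#include <cstdint>

// Signed minimum at the original 32-bit width.
constexpr std::int32_t smin(std::int32_t a, std::int32_t b) {
  return a < b ? a : b;
}

// Truncate to 8 bits and reinterpret the result as a signed 8-bit value,
// returned sign-extended so it can be compared with the wide result.
constexpr std::int32_t trunc_to_i8(std::int32_t v) {
  const std::int32_t low = v & 0xFF;
  return low >= 128 ? low - 256 : low;
}

// Signed minimum performed after narrowing both operands to 8 bits.
constexpr std::int32_t narrow_smin(std::int32_t a, std::int32_t b) {
  return smin(trunc_to_i8(a), trunc_to_i8(b));
}

// Operands with a spare sign bit at 8 bits: narrowing is harmless.
static_assert(narrow_smin(100, 50) == smin(100, 50), "safe to narrow");
static_assert(narrow_smin(-5, 3) == smin(-5, 3), "safe to narrow");
// 200 turns into -56 once truncated, so the narrowed smin picks the wrong
// operand.
static_assert(narrow_smin(200, 100) == -56, "narrowed result");
static_assert(smin(200, 100) == 100, "wide result");
```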

[NFC][clang-tidy]increase stability for bugprone-return-const-ref-fro…

5cb13bf

…m-parameter (#91160)

[RISCV] Check dead flag on VL def op in RISCVCoalesceVSETVLI. NFC (#9…

9d9bd76

…1168) Because LiveVariables has been run, we no longer need to lookup the users in MachineRegisterInfo anymore and can instead just check for the dead flag.

NagyDonat and others added 24 commits May 8, 2024 12:38

Typo fix; NFC

943617d

[GlobalIsel] combine ext of trunc with flags (#87115)

737e0bc

llvm/llvm-project#85592 https://discourse.llvm.org/t/rfc-add-nowrap-flags-to-trunc/77453 llvm/llvm-project#88609

Update CODEOWNERS

db4cf7c

Adding myself to linalg dialect

[LLVM][CodeGen][SVE] Add tests for vector extracts from unpacked types.

c84c74e

[Coro] Relax a debug-info test (#91401)

3ceacd8

Debug-info metadata does not have a strictly defined order. Check that elements are linked to each other correctly, not that metadata appears in a particular order.

[DirectX backend] emits metadata for DXIL version. (#88350)

665af09

Emit named metadata "dx.version" for DXIL version. Default to DXIL 1.0

[SCEV] Add tests for missed NSW preservation during loop guard handling.

40b322b

Add test coverage for missed simplification.

Revert "[HLSL] Support packoffset attribute in AST (#89836)" (#91473)

9c09b08

This reverts commit c5509fe.

[clang][dataflow] Make SolverTest a type-parameterized test. (#91455)

d6d613a

This allows the tests to be run against any implementation of `Solver` instead of being specific to `WatchedLiteralsSolver` as they currently are.

AMDGPU: Add some more ctlz_zero_undef tests

b5afda8

[mlir][fold-memref-alias-ops] Add support for folding memref.expand_s…

6ed8434

…hape involving dynamic dims (#89093) `fold-memref-alias-ops` bails out in presence of dynamic shapes in `memref.expand_shape` op. Handle this case.

[SLP]Fix PR91467: Look through scalar cast, when trying to cast to an…

2475efa

…other type. Need to look through the SExt/ZExt scalars to be gathered, when trying to reduce their width after minbitwidth analysis to prevent permanent attempts to revectorize such gathered instructions.

[bazel] Add missing dependency for 6ed8434

3a83162

Merge commit '3a8316216807d64a586b971f51695e23883331f7'

3729c80

[GENX] Update libGenISAIntrinsics

25f6fcb

Signed-off-by: Whitney Tsang <whitney.tsang@intel.com>

whitneywhtsang added the genx label May 13, 2024

whitneywhtsang requested a review from a team May 13, 2024 18:41

whitneywhtsang self-assigned this May 13, 2024

etiotto approved these changes May 14, 2024


whitneywhtsang merged commit 25f6fcb into intel:genx May 14, 2024
11 checks passed

whitneywhtsang deleted the merge branch May 14, 2024 16:01
