[GEN] Update GENX branch to LLVM 3a83162
#13773
Merged
We need to use InitField here, not SetField.
Following RISC-V, add an MI-level pass that optimizes *W instructions for LoongArch. First, it removes unneeded sext(addi.w rd, rs, 0) instructions, either because the sign-extended bits aren't consumed or because the input was already sign-extended by an earlier instruction. Then:
1. Unless explicitly disabled or the target prefers instructions with the W suffix, it removes the -w suffix from opw instructions whenever all users depend only on the lower word of the result of the instruction. The cases handled are:
   * addi.w, because it helps reduce test differences between LA32 and LA64 without being a pessimization.
2. Or, if explicitly enabled or the target prefers instructions with the W suffix, it adds the W suffix to the instruction whenever all users depend only on the lower word of the result of the instruction. The cases handled are:
   * add.d/addi.d/sub.d/mul.d.
   * slli.d with imm < 32.
   * ld.d/ld.wu.
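The property this pass relies on can be sketched in plain C++: when every user reads only the lower word, the 64-bit and 32-bit forms of an add are interchangeable. The helpers `addD`, `addW`, and `lowWord` below are hypothetical models of the instruction semantics, not code from the pass.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a 64-bit add (add.d): full-width wraparound add.
inline uint64_t addD(uint64_t A, uint64_t B) { return A + B; }

// Hypothetical model of a 32-bit add (add.w): add the low words, then
// sign-extend the 32-bit result to 64 bits.
inline uint64_t addW(uint64_t A, uint64_t B) {
  uint32_t Low = static_cast<uint32_t>(A) + static_cast<uint32_t>(B);
  return static_cast<uint64_t>(static_cast<int64_t>(static_cast<int32_t>(Low)));
}

// A user that depends only on the lower word of its input.
inline uint32_t lowWord(uint64_t V) { return static_cast<uint32_t>(V); }

// The two forms may differ in the upper 32 bits, but never in the low word.
inline void demoLowWordEquivalence() {
  const uint64_t A = 0x0000000100000000ULL, B = 0;
  assert(addD(A, B) != addW(A, B));
  assert(lowWord(addD(A, B)) == lowWord(addW(A, B)));
}
```

Because the low word is identical, either direction of the rewrite (dropping or adding the W suffix) is safe as long as no user looks at the upper bits.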
Noticed while investigating GFNI per-element vector shifts (we can form SHL but not SRL/SRA) Alive2: https://alive2.llvm.org/ce/z/fSH-rf
Change definition of expandBitCastI128ToF128 and expandBitCastF128ToI128 to allow for simplified use in atomic load/store. Update logic to split 128-bit loads and stores in DAGCombine to also handle the f128 case where appropriate. This fixes the regressions introduced by recent atomic load/store patches.
If we don't demand the same element from both single source shuffles (permutes), then attempt to blend the sources together first and then perform a merged permute. For vXi16 blends we have to be careful, as these are much more likely to involve byte/word vector shuffles that will result in the creation of additional shuffle instructions. This fold might be worth it for VSELECT with constant masks on AVX512 targets; I haven't investigated this yet, but I've tried to write combineBlendOfPermutes so that it is prepared for this. The PR34592 -O0 regression is an unfortunate failure to clean up with a later pass that calls SimplifyDemandedElts like the -O3 pipeline does - I'm not sure how worried we should be tbh.
Followup to #89727
… with 1 or 2 args (#89624) The destructors generated by the legacy IBM `xlclang++` compiler can take 1 or 2 arguments and the differences were handled by type `cast` where it is needed. Clang now treats the `cast` here as an error after llvm/llvm-project@999d4f8 landed with `-Xextra -Werror`. The issue had been worked around by using `#pragma GCC diagnostic push/pop`. This patch defines 2 separate destructor types for 1 argument and 2 arguments respectively so `cast` is not needed.
llvm-project/llvm/lib/Target/X86/X86ISelLowering.cpp:40081:21: error: comparison of integers of different signs: 'int' and 'unsigned int' [-Werror,-Wsign-compare] for (int I = 0; I != NumElts; ++I) { ~ ^ ~~~~~~~ 1 error generated.
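A minimal sketch of the usual remedy for this kind of `-Wsign-compare` error, assuming `NumElts` is an `unsigned` as the diagnostic indicates: give the induction variable the same signedness as the bound. The function `countDefined` is hypothetical and only illustrates the loop shape; the actual change in `X86ISelLowering.cpp` is not shown in this conversation.

```cpp
#include <cassert>
#include <vector>

// Counts the non-negative entries among the first NumElts mask elements.
// NumElts is unsigned, so the induction variable is unsigned too; declaring
// it as `int` is what makes `I != NumElts` a mixed-sign comparison.
inline unsigned countDefined(const std::vector<int> &Mask, unsigned NumElts) {
  assert(NumElts <= Mask.size() && "NumElts must not exceed the mask size");
  unsigned Count = 0;
  for (unsigned I = 0; I != NumElts; ++I)
    if (Mask[I] >= 0)
      ++Count;
  return Count;
}
```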
llvm-project/llvm/lib/Target/X86/X86ISelLowering.cpp:3582:13: error: unused function 'isBlendOrUndef' [-Werror,-Wunused-function] static bool isBlendOrUndef(ArrayRef<int> Mask) { ^ 1 error generated.
Handle implicit firstprivate DSAs on task generating constructs. Fixes llvm/llvm-project#64480
Plugins are not loaded without the -cc1 phase. Do not report them when running on an assembly file or when linking. Many build tools add these options to all driver invocations, including LLVM's build system. Fixes #88173
An `emitc.expression` can only yield a single result, but some operations which have the `CExpression` trait can have multiple results, which can result in a crash when applying the `fold-expressions` pass. This change adds a check for the single-result condition and a simple test.
Pre-commit tests for an upcoming patch.
…xts (reland #90438) (#91172) This relands #90348 with a fix for a [buildbot failure](https://lab.llvm.org/buildbot/#/builders/216/builds/38446) caused by the test being run with `-fno-rtti`.
Fixes #90285.
… macros (#88816) Adds more FP test macros for the upcoming test adds for #61092 and the issues opened from it: #88768, #88769, #88770, #88771, #88772. Fix bug in `{EXPECT,ASSERT}_FP_EXCEPTION`. `EXPECT_FP_EXCEPTION(0)` seems to be used to test that an exception did not happen, but it always does `EXPECT_GE(... & 0, 0)` which never fails. Update and refactor tests that break after the above bug fix. An interesting way things broke after the above change is that `ForceRoundingMode` and `quick_get_round()` were raising the inexact exception, breaking a lot of the `atan*` tests. The changes for all files other than `FPMatcher.h` and `libc/test/src/math/smoke/RoundToIntegerTest.h` should have the same semantics as before. For `RoundToIntegerTest.h`, lines 56-58 before the changes do not always hold since this test is used for functions with different exception and errno behavior like `lrint` and `lround`. I've deleted those lines for now, but tests for those cases should be added for the different nearest int functions to account for this. Adding @nickdesaulniers for review.
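The `EXPECT_FP_EXCEPTION(0)` problem can be modeled in a few lines. This is a sketch under assumptions: the flag constants and both helper functions are hypothetical stand-ins, and the stricter check is only one plausible way to express "no exception was raised", not necessarily the form used in `FPMatcher.h`.

```cpp
#include <cassert>

// Hypothetical exception flag bits, standing in for FE_* values.
constexpr int kInexact = 1;
constexpr int kOverflow = 2;

// Model of the old check: with Expected == 0 this is `(Raised & 0) >= 0`,
// which holds no matter which exceptions were actually raised.
inline bool looseExceptionCheck(int Raised, int Expected) {
  return (Raised & Expected) >= Expected;
}

// One possible stricter check (an assumption, not the upstream code):
// expecting 0 means nothing may be raised; otherwise every expected flag
// must be present.
inline bool strictExceptionCheck(int Raised, int Expected) {
  if (Expected == 0)
    return Raised == 0;
  return (Raised & Expected) == Expected;
}

inline void demoExceptionChecks() {
  // The loose form accepts a raised inexact flag when none was expected.
  assert(looseExceptionCheck(kInexact, 0));
  // The strict form rejects it.
  assert(!strictExceptionCheck(kInexact, 0));
}
```

This also shows why tests started failing once the check was tightened: any helper that silently raised the inexact exception was previously invisible to an expectation of 0.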
…n ThinLTO summaries" (#90610)" (#91194) Reverts llvm/llvm-project#90692 Breaking PPC buildbots. The bots are not meant to test LLD, but are running a test that is using an old version of LLD without the change (so is incompatible). Revert until a fix is found.
…or. (#87649) `SBProcess::GetMemoryRegionInfo` uses the `qMemoryRegionInfo` packet to get memory region info, but this is not supported in gdb-server and causes downstream lldb test failures. This change ignores the error from `SBProcess::GetMemoryRegionInfo`. Reported by @tedwoodward @jerinphilip.
… (#91019) This reverts commit 1106644. As noted in the original patch, this was designed to be reverted once https://reviews.llvm.org/D142479 and https://reviews.llvm.org/D142660 landed, which has long since happened.
Fix the issue that `char` constants are converted to `uint64_t` in the wrong way when doing the inlining.
Need to check that the signed operand has an extra sign bit to be sure that we do not skip signedness, when trying to minimize bitwidth for smin/smax intrinsics.
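A small worked example of why the extra sign bit matters, written as a sketch rather than as the vectorizer's actual logic: computing a signed minimum in 8 bits is only safe when both operands already fit in a signed 8-bit value. The helpers `fitsInSignedBits` and `sminNarrow8` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Returns true if V is representable as a signed integer of width Bits.
inline bool fitsInSignedBits(int32_t V, unsigned Bits) {
  assert(Bits >= 1 && Bits <= 32 && "unsupported bit width");
  const int64_t Lo = -(int64_t(1) << (Bits - 1));
  const int64_t Hi = (int64_t(1) << (Bits - 1)) - 1;
  return V >= Lo && V <= Hi;
}

// Signed minimum computed after truncating both operands to 8 bits, then
// sign-extended back. Only correct when both inputs fit in 8 signed bits.
inline int32_t sminNarrow8(int32_t A, int32_t B) {
  const int8_t NarrowA = static_cast<int8_t>(A);
  const int8_t NarrowB = static_cast<int8_t>(B);
  return std::min(NarrowA, NarrowB);
}

inline void demoNarrowSmin() {
  // 200 does not fit in 8 signed bits: truncation flips its sign, so the
  // narrow minimum no longer matches the wide minimum.
  assert(!fitsInSignedBits(200, 8));
  assert(sminNarrow8(200, 100) != std::min(200, 100));
}
```

When both operands pass the `fitsInSignedBits` check, the narrow and wide results agree, which is the condition the commit message describes.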
…m-parameter (#91160)
…1168) Because LiveVariables has been run, we no longer need to lookup the users in MachineRegisterInfo anymore and can instead just check for the dead flag.
This commit explicitly specifies the matching mode (C library function, any non-method function, or C++ method) for the `CallDescription`s constructed in the checker `osx.MIG`. The code was simplified to use a `CallDescriptionMap` instead of a raw vector of pairs. This change won't cause major functional changes, but isn't NFC because it ensures that e.g. call descriptions for a non-method function won't accidentally match a method that has the same name. Separate commits have already performed this change in other checkers: - easy cases: e2f1cba, 6d64f8e - MallocChecker: d6d84b5 - iterator checkers: 06eedff - InvalidPtr checker: 024281d - apiModeling.llvm.ReturnValue: 97dd8e3 ... and follow-up commits will handle the remaining few checkers. My goal is to ensure that the call description mode is always explicitly specified and eliminate (or strongly restrict) the vague "may be either a method or a simple function" mode that's the current default.
Add HLSLPackOffsetAttr to save packoffset in the AST. Since we have to parse the attribute manually in ParseHLSLAnnotations, we could create the ParsedAttribute with an integer offset parameter instead of a string. This approach avoids the string parsing that would be needed if the offset were saved as a string in HLSLPackOffsetAttr. For #57914.
Adding myself to linalg dialect
…(#90217) This patch fixes an integer overflow in the SampleProfileLoader pass. The issue occurs when weights are saturated and Profi isn't being used. This patch also adds a newline to a debug message to make it more readable.
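A generic sketch of the failure mode described above, assuming 64-bit unsigned weights: once a weight is saturated at the maximum value, a plain addition wraps around, while a saturating addition clamps. The helper `saturatingAddWeight` is hypothetical; the exact expression that overflowed in SampleProfileLoader is not shown in this conversation.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Adds two weights, clamping to the maximum instead of wrapping around.
inline uint64_t saturatingAddWeight(uint64_t A, uint64_t B) {
  const uint64_t Sum = A + B; // unsigned addition wraps on overflow
  return Sum < A ? std::numeric_limits<uint64_t>::max() : Sum;
}

inline void demoSaturatingAdd() {
  const uint64_t Max = std::numeric_limits<uint64_t>::max();
  // A plain add on a saturated weight wraps to a tiny value.
  assert(Max + 1 == 0);
  // The saturating form keeps the weight pinned at the maximum.
  assert(saturatingAddWeight(Max, 1) == Max);
}
```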
Debug-info metadata does not have a strictly defined order. Check that elements are linked to each other correctly, not that metadata appears in a particular order.
Emit named metadata "dx.version" for DXIL version. Default to DXIL 1.0
Add test coverage for missed simplification.
…tered. (#91329) PR #89664 introduced a regression: it unregistered the llvm-tblgen option `-D` for macros. The test `TestOps.cpp` failed due to passing a macro to llvm-tblgen. It caused our internal build to fail because we append `-DLOCAL_NAME` to `LLVM_TABLEGEN_FLAGS` in `llvm/lib/cmake/llvm/TableGen.cmake` as
```
list(APPEND LLVM_TABLEGEN_FLAGS "-DLOCAL_NAME")
```
And in `./llvm/lib/Target/PowerPC/PPC.td`, we check it for some downstream code as:
```
...
#ifdef LOCAL_NAME
...
#endif
```
Now we get this error message from mlir-src-sharder:
```
mlir-src-sharder -op-shard-index=1 -DLOCAL_NAME llvm-project/mlir/test/lib/Dialect/Test/TestOps.cpp --write-if-changed -o tools/mlir/test/lib/Dialect/Test/TestOps.1.cpp -d tools/mlir/test/lib/Dialect/Test/TestOps.1.cpp.d
mlir-src-sharder: Unknown command line argument '-DLOCAL_NAME'. Try: 'llvm-project/build/bin/mlir-src-sharder --help'
mlir-src-sharder: Did you mean '-I'?
```
This PR fixes the regression.
This reverts commit c5509fe.
This commit ensures that Mem2Reg reuses the `DominanceInfo` as well as block index maps to avoid expensive recomputations. Due to the recent migration to `OpBuilder`, the promotion of a slot no longer replaces blocks. Having stable blocks makes the `DominanceInfo` preservable and additionally allows caching block index maps between different promotions. Performance measurements on very large functions show up to a 4x speedup from these changes.
…ic index lists (#90897) This patch is a first pass at making consistent syntax across the `LinalgTransformOp`s that use dynamic index lists for size parameters. Previously, there were two different forms: inline types in the list, or place them in the functional style tuple. This patch goes for the latter. In order to do this, the `printPackedOrDynamicIndexList`, `printDynamicIndexList` and their `parse` counterparts were modified so that the types can be optionally provided to the corresponding custom directives. All affected ops now use tablegen `assemblyFormat`, so custom `parse`/`print` functions have been removed. There are a couple ops that will likely add dynamic size support, and once that happens it should be made sure that the assembly remains consistent with the changes in this patch. The affected ops are as follows: `pack`, `pack_greedily`, `tile_using_forall`. The `tile_using_for` and `vectorize` ops already used this syntax, but their custom assembly was removed. --------- Co-authored-by: Oleksandr "Alex" Zinenko <ftynse@gmail.com>
This allows the tests to be run against any implementation of `Solver` instead of being specific to `WatchedLiteralsSolver` as they currently are.
The binary version is four times faster than current implementation in my setup, and generally considered a better implementation. Code inspired by https://en.algorithmica.org/hpc/algorithms/gcd/ which itself is inspired by https://lemire.me/blog/2013/12/26/fastest-way-to-compute-the-greatest-common-divisor/ Fix #77648
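For reference, a minimal sketch of the binary (Stein) GCD technique on 64-bit unsigned integers. It is not the code from the patch: the function names are hypothetical, and trailing zeros are counted with a portable loop where a real implementation would likely use a hardware count-trailing-zeros operation.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Portable trailing-zero count; V must be non-zero.
inline unsigned countTrailingZeros(uint64_t V) {
  assert(V != 0 && "trailing zero count is undefined for zero");
  unsigned N = 0;
  while ((V & 1) == 0) {
    V >>= 1;
    ++N;
  }
  return N;
}

// Binary GCD: replaces division with shifts and subtraction.
inline uint64_t binaryGcd(uint64_t A, uint64_t B) {
  if (A == 0)
    return B;
  if (B == 0)
    return A;
  // Common factors of two are removed up front and restored at the end.
  const unsigned Shift = countTrailingZeros(A | B);
  A >>= countTrailingZeros(A); // A stays odd from here on
  while (B != 0) {
    B >>= countTrailingZeros(B); // make B odd
    if (A > B)
      std::swap(A, B); // keep A <= B
    B -= A;            // odd - odd is even, possibly zero
  }
  return A << Shift;
}
```

For example, `binaryGcd(48, 18)` strips one shared factor of two, reduces the odd parts 3 and 9 to 3, and returns 6.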
When a child process is forked with OpenMP already initialized, the child process resets its affinity mask and sets proc-bind-var to false so that the entire original affinity mask is used. This patch corrects an issue with the affinity initialization code setting affinity to compact instead of none for this special case of forked children. The test trying to catch this only tested an explicit setting of KMP_AFFINITY=none. Add a test run with no KMP_AFFINITY setting. Fixes: #91098
…hape involving dynamic dims (#89093) `fold-memref-alias-ops` bails out in presence of dynamic shapes in `memref.expand_shape` op. Handle this case.
…other type. Need to look through the SExt/ZExt scalars to be gathered, when trying to reduce their width after minbitwidth analysis to prevent permanent attempts to revectorize such gathered instructions.
`clang/tools/scan-build` is implemented in `perl`. However, given that `perl` is not mentioned as a required dependency in `GettingStarted.rst`, we should make this optional. This adds a `find_package(Perl)` check to cmake and disables the `scan-build` tests when no perl executable is found. Ideally we would also check whether dependent perl modules like `Hash::Util` are present on the system, but I don't see any pre-existing cmake macros to easily test this. So for now I go with a plain check for the `perl` package; at least this allows using `cmake -DCMAKE_DISABLE_FIND_PACKAGE_Perl=ON` to manually disable `perl` and the tests.
Signed-off-by: Whitney Tsang <whitney.tsang@intel.com>
etiotto
approved these changes
May 14, 2024
No description provided.