[GEN] Update GENX branch to LLVM `ed4e505` #13355

whitneywhtsang · 2024-04-10T17:24:41Z

No description provided.

…VRegisterBankInfo::getInstrMapping. This removes the special case for vectors. The default case in the second switch can handle GPR in addition to vectors. We just won't use the static ValueMapping entry.

Use the return type to measure the LMUL size for latency/throughput cost

…rhs` - When both operands are constant, the matcher runs into an infinite loop as the commutation should be applied only when LHS is a constant and RHS is not. Reviewers: arsenm Reviewed By: arsenm Pull Request: llvm/llvm-project#87426

… is vector. NFC If the type is vector, we can immediately know to use vector mapping. Previously we searched for FP uses, but then replaced it if the type was vector.

Operations must be created with the supplied builder. Otherwise, the dialect conversion / greedy pattern rewrite driver can break.

This commit fixes a crash in the dialect conversion:

```
within split at llvm-project/mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-invalid.mlir:1 offset :8:8: error: failed to legalize operation 'tosa.add'
  %0 = tosa.add %1, %arg2 : (tensor<10x10xf32>, tensor<*xf32>) -> tensor<*xf32>
       ^
within split at llvm-project/mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-invalid.mlir:1 offset :8:8: note: see current operation: %9 = "tosa.add"(%8, %arg2) : (tensor<10x10xf32>, tensor<*xf32>) -> tensor<*xf32>
mlir-opt: llvm-project/mlir/include/mlir/IR/UseDefLists.h:198: mlir::IRObjectWithUseList<mlir::OpOperand>::~IRObjectWithUseList() [OperandType = mlir::OpOperand]: Assertion `use_empty() && "Cannot destroy a value that still has uses!"' failed.
```

This commit is the proper fix for #87297 (which was reverted).

Implements: https://wg21.link/P2867R2

---------

Co-authored-by: Hristo Hristov <zingam@outlook.com>

Fixed vectors have their sext/zext operands legalized to _VL nodes, so we need to handle them in the patterns. This adds a riscv_ext_vl_oneuse pattern since we don't care about the type of extension used for the shift amount, and extends Low8BitsSplatPat to handle other _VL nodes. We don't actually need to check the mask or VL there since none of the _VL nodes have passthru operands. The remaining test cases that are widening from i8->i64 need to be handled by extending combineBinOp_VLToVWBinOp_VL. This also fixes Low8BitsSplatPat incorrectly checking the vector size instead of the element size to determine if the splat value might have been truncated below 8 bits.

This reverts commit 3ee93f4 because it broke Fuchsia Clang toolchain builders: https://logs.chromium.org/logs/fuchsia/buildbucket/cr-buildbucket/8751633430491432833/+/u/clang/build/stdout

…87338) On RV64, we legalize zexts of i1s to (vselect m, (splat_vector i64 1), (splat_vector i64 0)), where the splat_vectors are implicitly truncating. When the vselect is used by a binop we want to pull the vselect out via foldSelectWithIdentityConstant. But because vectors with an element size < i64 will truncate, isNeutralConstant will return false. This patch handles truncating splats by getting the APInt value and truncating it. We almost don't need to do this since most of the neutral elements are either one/zero/all ones, but it will make a difference for smax and smin. I wasn't able to figure out a way to write the tests in terms of select, since we need the i1 zext legalization to create a truncating splat_vector. This supersedes #87236. Fixed vectors are unfortunately not handled by this patch (since they get legalized to _VL nodes), but they don't seem to appear in the wild.
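The smax case mentioned above can be illustrated with plain integers. This is a minimal sketch, not the SelectionDAG code: the function names are hypothetical, and a `uint64_t` stands in for the APInt value. It only shows why a 64-bit splat constant has to be truncated to the element width before it is compared against the neutral element.

```cpp
#include <cstdint>

// Keep only the low `Bits` bits of a 64-bit splat constant (Bits in 1..64).
std::uint64_t truncateToWidth(std::uint64_t Value, unsigned Bits) {
  return Bits >= 64 ? Value : Value & ((std::uint64_t{1} << Bits) - 1);
}

// Neutral element of signed max at a given width: the minimum signed value,
// which has only the sign bit set.
std::uint64_t smaxNeutral(unsigned Bits) {
  return std::uint64_t{1} << (Bits - 1);
}

// Hypothetical check: does the splat constant act as the smax identity once
// it has been truncated to the element width?
bool isNeutralForSMax(std::uint64_t SplatValue, unsigned ElementBits) {
  return truncateToWidth(SplatValue, ElementBits) == smaxNeutral(ElementBits);
}
```

With i8 elements, the 64-bit constant -128 (`0xFFFFFFFFFFFFFF80`) truncates to `0x80`, the i8 minimum, so it is neutral for smax; compared as a 64-bit value it would not be.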

I'm currently developing a new version of the indexed memprof format where we deduplicate call stacks in IndexedAllocationInfo::CallStack and IndexedMemProfRecord::CallSites. We refer to call stacks with integer IDs, namely CallStackId, just as we refer to Frame with FrameId. The deduplication will cut down the profile file size by 80% in a large memprof file of mine.

As a step toward the goal, this patch teaches IndexedMemProfRecord::{serialize,deserialize} to speak Version2. A subsequent patch will add Version2 support to llvm-profdata.

The essence of the patch is to replace the serialization of a call stack, a vector of FrameIDs, with that of a CallStackId. That is:

    const IndexedAllocationInfo &N = ...;
    ...
    LE.write<uint64_t>(N.CallStack.size());
    for (const FrameId &Id : N.CallStack)
      LE.write<FrameId>(Id);

becomes:

    LE.write<CallStackId>(N.CSId);
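To make the size difference concrete, here is a self-contained sketch of the two layouts. It is not the LLVM implementation: `ByteWriter` is a hypothetical stand-in for the `LE` writer in the commit message, and treating both `FrameId` and `CallStackId` as 64-bit integers is an assumption of this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: both IDs are 64-bit integers in this sketch.
using FrameId = std::uint64_t;
using CallStackId = std::uint64_t;

// Hypothetical stand-in for the little-endian writer `LE`.
struct ByteWriter {
  std::vector<std::uint8_t> Bytes;
  void write64(std::uint64_t Value) {
    for (int I = 0; I < 8; ++I)
      Bytes.push_back(static_cast<std::uint8_t>(Value >> (8 * I)));
  }
};

// Old layout: the frame count followed by every frame ID.
std::size_t serializeInlineCallStack(ByteWriter &W,
                                     const std::vector<FrameId> &CallStack) {
  W.write64(CallStack.size());
  for (FrameId Id : CallStack)
    W.write64(Id);
  return W.Bytes.size();
}

// New layout: a single ID that refers to a deduplicated call stack.
std::size_t serializeCallStackId(ByteWriter &W, CallStackId CSId) {
  W.write64(CSId);
  return W.Bytes.size();
}
```

In this sketch a three-frame call stack takes 32 bytes in the old layout and 8 bytes in the new one; the call stack itself still has to be stored once elsewhere, which is where the deduplication pays off.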

…RE_PAUTH` (#87545) Reland #85231 after fixing build failure https://lab.llvm.org/buildbot/#/builders/186/builds/15631. Use `PRIx64` for format output of `uint64_t` as hex. Original PR description below.

This adds support for `GNU_PROPERTY_AARCH64_FEATURE_PAUTH` feature (as defined in ARM-software/abi-aa#240) handling in llvm-readobj and llvm-readelf. The following constants for supported platforms are also introduced:

- `AARCH64_PAUTH_PLATFORM_INVALID = 0x0`
- `AARCH64_PAUTH_PLATFORM_BAREMETAL = 0x1`
- `AARCH64_PAUTH_PLATFORM_LLVM_LINUX = 0x10000002`

For the llvm_linux platform, output of the tools contains descriptions of PAuth features which are enabled/disabled depending on the version value. Version value bits correspond to the following `LangOptions` defined in #85232:

- bit 0: `PointerAuthIntrinsics`;
- bit 1: `PointerAuthCalls`;
- bit 2: `PointerAuthReturns`;
- bit 3: `PointerAuthAuthTraps`;
- bit 4: `PointerAuthVTPtrAddressDiscrimination`;
- bit 5: `PointerAuthVTPtrTypeDiscrimination`;
- bit 6: `PointerAuthInitFini`.

Support for `.note.AARCH64-PAUTH-ABI-tag` is dropped since it's deleted from the spec in ARM-software/abi-aa#250.
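The bit-to-option mapping listed above can be decoded as follows. This is a rough sketch rather than the llvm-readobj code: the function name and the returned list are hypothetical, and only the bit positions and option names come from the commit message.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Option names indexed by bit position, as listed in the commit message.
static const char *const PAuthFeatureNames[] = {
    "PointerAuthIntrinsics",
    "PointerAuthCalls",
    "PointerAuthReturns",
    "PointerAuthAuthTraps",
    "PointerAuthVTPtrAddressDiscrimination",
    "PointerAuthVTPtrTypeDiscrimination",
    "PointerAuthInitFini",
};

// Hypothetical helper: names of the features whose bit is set in Version.
std::vector<std::string> enabledPAuthFeatures(std::uint64_t Version) {
  std::vector<std::string> Enabled;
  for (unsigned Bit = 0; Bit < 7; ++Bit)
    if (Version & (std::uint64_t{1} << Bit))
      Enabled.push_back(PAuthFeatureNames[Bit]);
  return Enabled;
}
```

For example, a version value of `0x5` has bits 0 and 2 set, which selects `PointerAuthIntrinsics` and `PointerAuthReturns`.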

This adds handling of the range attribute for return values of Call and Invoke in getFromRangeMetadata, and handling of arguments with the range attribute in solveBlockValueNonLocal. There is one additional check of the range metadata at line 1120 in getValueFromSimpleICmpCondition that is not covered in this PR: after llvm/llvm-project#75311 there is no test that covers that check any more, and I have not been able to create a test that triggers that code.

…ore (#87504) This commit relaxes Mem2Reg's type equality requirement for the LLVM dialect's load and store operations. For now, we only allow loads to be promoted if the reaching definition can be cast into a value of the target type. For stores, all type checks are removed, as a non-volatile store that does not write out the alloca's pointer can always be deleted.

SLES 15 ships GCC 7.5 as its default compiler, which does not support the C++17 `<charconv>` header. This results in build errors when trying to run `check-flang`. This patch addresses that and uses the older `std::stol` for the string -> number conversion to allow the SLES 15 buildbot (https://lab.llvm.org/staging/#/builders/193) to turn green.
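For reference, a string-to-number conversion built on `std::stol` might look like the sketch below. The helper name and its error handling are assumptions for illustration; the patch's actual call site is not shown in this description.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: parse Text as a base-10 long without <charconv>.
// Returns false if Text is not a number, has trailing characters, or
// does not fit in a long.
bool parseLong(const std::string &Text, long &Out) {
  try {
    std::size_t Consumed = 0;
    long Value = std::stol(Text, &Consumed, 10);
    if (Consumed != Text.size())
      return false; // trailing characters such as "42x"
    Out = Value;
    return true;
  } catch (const std::invalid_argument &) {
    return false; // no digits at all
  } catch (const std::out_of_range &) {
    return false; // value does not fit in a long
  }
}
```

Unlike `std::from_chars`, `std::stol` reports errors through exceptions and skips leading whitespace, so a wrapper like this is one way to keep the call site simple.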

…(#86098) There is an assertion that the stop condition is not satisfied for the starting point at the beginning of `computeBound`. Therefore, that case does not have to be handled later on in that function.

…on in the constructor (#86099) This commit changes the API of `ValueBoundsConstraintSet`: the stop condition is now passed to the constructor instead of `processWorklist`. That makes it easier to add items to the worklist multiple times and process them in a consistent manner. The current `ValueBoundsConstraintSet` is passed as a reference to the stop function, so that the stop function can be defined before the `ValueBoundsConstraintSet` is constructed. This change is in preparation for adding support for branches.
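The shape of that API change can be sketched outside MLIR. Everything below is hypothetical (class name, integer work items, the meaning given to "stop"); it only illustrates a stop function that is fixed at construction, receives the set by reference, and is applied the same way on every call to `processWorklist`.

```cpp
#include <deque>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical miniature of a constraint set with a worklist.
class ConstraintSetSketch {
public:
  // The stop function receives the set itself, so it can be written
  // before any set exists.
  using StopFn = std::function<bool(int, const ConstraintSetSketch &)>;

  explicit ConstraintSetSketch(StopFn StopCondition)
      : Stop(std::move(StopCondition)) {}

  void addToWorklist(int Item) { Worklist.push_back(Item); }

  // Drains the worklist; items for which the stop condition holds are
  // dropped instead of being processed (an assumption of this sketch).
  void processWorklist() {
    while (!Worklist.empty()) {
      int Item = Worklist.front();
      Worklist.pop_front();
      if (Stop(Item, *this))
        continue;
      Processed.push_back(Item);
    }
  }

  const std::vector<int> &processed() const { return Processed; }

private:
  StopFn Stop;
  std::deque<int> Worklist;
  std::vector<int> Processed;
};
```

Because the condition lives in the object, a caller can add items and call `processWorklist()` several times without having to pass the same condition each time.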

This patch moves most of the multiprecision logic to the `multiword` namespace and simplifies some logic in `BigInt`. It also fully implements the mask and count functions and increases test coverage. `math_extras.h` is also reworked to make it more concise.

This PR:

* fixes the placement of OpVariable instructions in a function (see llvm/llvm-project#66261),
* improves type inference,
* helps avoid unneeded bitcasts when validating function calls.

This makes it possible to improve existing test cases and add new ones with stricter checks. The OpVariable fix refers to the "All OpVariable instructions in a function must be the first instructions in the first block" requirement from the SPIR-V spec.

Reverts llvm/llvm-project#86137 Some aarch64 compilers seem to consider that `uint128_t` is not `is_trivially_constructible` which prevents `bit_cast`-ing.
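To show where such a trait can get in the way, here is a minimal memcpy-based bit cast with the constraint spelled out as a `static_assert`. This is a sketch under assumptions: the function name is hypothetical and it is not the libc implementation, whose exact constraints are not quoted in this description.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical bit cast: reinterpret the bytes of Src as a value of type To.
template <typename To, typename From> To bitCastSketch(const From &Src) {
  static_assert(sizeof(To) == sizeof(From), "sizes must match");
  static_assert(std::is_trivially_copyable<To>::value &&
                    std::is_trivially_copyable<From>::value,
                "both types must be trivially copyable");
  // If a compiler reports a 128-bit integer type as not trivially
  // constructible, a constraint like this one rejects the cast.
  static_assert(std::is_trivially_constructible<To>::value,
                "destination must be trivially constructible");
  To Dst;
  std::memcpy(&Dst, &Src, sizeof(To));
  return Dst;
}
```

On a platform with IEEE-754 single-precision floats, `bitCastSketch<std::uint32_t>(1.0f)` yields `0x3F800000`.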

libclc is mentioned in the list of LLVM_ENABLE_PROJECTS but it isn't actually possible to build it in-tree for various reasons. Users currently have to build it via LLVM_ENABLE_EXTERNAL_PROJECTS, which isn't very well documented.

We can't properly build in-tree because the current system needs to "see" clang and other tools at CMake configuration time. The general idea is that we could fix this in the future by moving the compilation and linking of bitcode libraries to custom commands, which would remove the dependency on CMake configuration and would allow us to build libclc after clang and other tools are built in-tree. Since that's a bigger change, it is being left for later.

Note that with this commit it's *still* not possible to properly build in-tree - this commit just fixes a few little things that are in the way. We are now able to build in-tree in the sense that it can be built as a regular LLVM sub-project, but the tools it uses to compile the libraries are still picked up from a pre-existing installation of LLVM, and not from tools built during the same build as libclc.

The things fixed by this commit include:

* Its use of CMAKE_SOURCE_DIR (i.e., assuming it was the top-level project)
  * These have been converted to PROJECT_SOURCE_DIR - should have no consequences for out-of-tree builds.
* Its prepare_builtins tool insisting on linking against the dynamic LLVM.so.
  * This has been turned from an "llvm executable" into an "llvm utility" which links against the static libraries.
  * It was also missing a link component for the IRReader library.
* Assuming an output path for its builtin libraries (dependent on the working directory)
  * This has been changed to query CMake for the library target's output file.
* The spirv-mesa3d and spirv64-mesa3d targets were enabled by default (or when asking to build 'all' libclc targets), when they require llvm-spirv as an external dependency.
  * They are now only built when the user explicitly asks for them, or when llvm-spirv is available and the user asks for 'all'.

…(#87539) Call generateWaitcnt unconditionally at the end of SIInsertWaitcnts::insertWaitcntInBlock. Even if we don't need to generate a new waitcnt instruction it has the effect of combining or removing redundant waitcnts that were already present. Tests show various small improvements in waitcnt placement.

The class `ScopedDbgInfoFormatSetter` was added as a convenient way to temporarily change the debug info format of a function or module, as part of IR printing; since this process is repeated in a number of other places, this patch uses the format-setter class in those places as well.
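The pattern behind such a class is a scoped (RAII) setter: change a flag on construction and restore the previous value on destruction. The sketch below uses hypothetical names (`FormatHolder`, `ScopedFormatSetterSketch`, `UsesNewFormat`) and is not the LLVM class itself.

```cpp
// Hypothetical stand-in for a function or module that carries a format flag.
struct FormatHolder {
  bool UsesNewFormat = false;
};

// Sets the flag for the lifetime of the guard and restores it afterwards.
template <typename T> class ScopedFormatSetterSketch {
public:
  ScopedFormatSetterSketch(T &Target, bool NewFormat)
      : Object(Target), Previous(Target.UsesNewFormat) {
    Object.UsesNewFormat = NewFormat;
  }
  ~ScopedFormatSetterSketch() { Object.UsesNewFormat = Previous; }

  ScopedFormatSetterSketch(const ScopedFormatSetterSketch &) = delete;
  ScopedFormatSetterSketch &
  operator=(const ScopedFormatSetterSketch &) = delete;

private:
  T &Object;
  bool Previous;
};

// Example use: do some work (here, just read the flag) in the old format.
bool readFlagInOldFormat(FormatHolder &Holder) {
  ScopedFormatSetterSketch<FormatHolder> Guard(Holder, false);
  return Holder.UsesNewFormat; // false while the guard is alive
}
```

The caller's flag is back to its original value as soon as `readFlagInOldFormat` returns, including on early returns or exceptions.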

Darwin targets implement -mcmodel=large by forcing all global accesses to use the GOT, instead of the ELF movz/movk sequence. That means it's compatible with PIC so the Clang driver shouldn't reject the option.

…g (#72714) This patch adds lld support for:

- Dynamic R_AARCH64_AUTH_* relocations (without including RELR compressed AUTH relocations) as described here: https://github.com/ARM-software/abi-aa/blob/main/pauthabielf64/pauthabielf64.rst#auth-variant-dynamic-relocations
- .note.AARCH64-PAUTH-ABI-tag section as defined here https://github.com/ARM-software/abi-aa/blob/main/pauthabielf64/pauthabielf64.rst#elf-marking

Depends on #72713 and #85231

---------

Co-authored-by: Peter Collingbourne <peter@pcc.me.uk>
Co-authored-by: Fangrui Song <i@maskray.me>

This is a reland of #86137 with a fix for platforms/compilers that do not support trivially constructible int128 types.

These are new in clang-19+, gcc-14+.

…le toolchain (#87684) Building the Apple way turns off plugin support, meaning we don't need to export unloadable symbols from all executables. While deadstripping effects aren't expected to change, enabling this across all tools prevents the creation of export tries. This saves us ~3.5 MB in just the universal build of `clang`.

The HOST_LINK_VERSION is a hardcoded string in Darwin clang that detects the linker version at configure time. The driver uses this information to build the correct set of arguments for the linker. This patch detects the linker version again during compiler-rt configuration and passes it to the libfuzzer tests. This allows a clang built on a machine with a new linker to run compiler-rt tests on a machine with an old linker. rdar://125932376

Take care of a TODO. This check makes sure that the fexcept_t value fits in an int value. TODO introduced in: llvm/llvm-project@9550f8b
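A compile-time check of this kind can be written as a `static_assert`. The sketch below approximates "fits in an int" with a size comparison, which is an assumption; the exact condition used in the patch is not quoted in this description.

```cpp
#include <cfenv>

// Assumption: "fits in an int" is approximated by comparing object sizes.
static_assert(sizeof(std::fexcept_t) <= sizeof(int),
              "fexcept_t value cannot fit in an int value");

// Same condition as a function, so it can also be queried at run time.
constexpr bool fexceptFitsInInt() {
  return sizeof(std::fexcept_t) <= sizeof(int);
}
```

On a platform where the condition does not hold, the translation unit fails to compile instead of silently truncating the value.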

fatal error for now Appeases build bots while being investigated.

This script+config should help us generate more consistent documentation wrt. what we currently support or not. As an example usage:

    $ ./libc/utils/docgen/docgen.py fenv.h

Will spit out an RST formatted table that can be copy+pasted into our docs. The config is not filled out entirely, but doing so and then updating our docs would be great beginner bugs for new contributors.

Having python+json generate things like docs, or headers (as imagined in https://github.com/nickdesaulniers/llvm-project/tree/hdr-gen2) is perhaps easier to work with than tablegen, and doesn't introduce a dependency on a host tool that needs to be compiled from llvm sources before building the rest of the libc. This can probably be merged with whatever we end up doing to replace libc-hdrgen.

Please use https://llvm.org/docs/CodingStandards.html#python-version-and-source-code-formatting for keeping this file formatted.

…822) When building flang out-of-tree with relative paths in LLVM_DIR, CLANG_DIR and MLIR_DIR, we need to compute the absolute paths based on the CMake build directory (i.e. where the cmake is invoked from).

The lowering of n-D vector.extract/insert ops to LLVM is not supported but if one of these accidentally reaches the vector-to-llvm conversion patterns, we end up with a kind of puzzling crash. This PR fixes that crash and gracefully bails out in those cases.

…Decimal. (#87827) I will add `toupper` implementation into it in the next PR.

Context: llvm/llvm-project#87017

- Add proxy header `libc/hdr/math_macros.h` that will:
  - include `<math.h>` in overlay mode,
  - include `"include/llvm-libc-macros/math-macros.h"` in full build mode.
- Its corresponding CMake target `libc.hdr.math_macros` will only depend on `libc.include.math` and `libc.include.llvm-libc-macros.math_macros` in full build mode.
- Replace all `#include "include/llvm-libc-macros/math-macros.h"` with `#include "hdr/math_macros.h"`.
- Add dependency to `libc.hdr.math_macros` CMake target when using `add_fp_unittest`.
- Update the remaining dependency.
- Update bazel overlay: add `libc:hdr_math_macros` target, and replace all dependencies on `libc:llvm_libc_macros_math_macros` with `libc:hdr_math_macros`.

Provide a mechanism to resolve call target information for calls from non-BAT functions to BAT functions (`YAMLProfileWriter::convert`). Make it generic for future use in BAT-to-BAT calls. Test Plan: Updated bolt/test/X86/bolt-address-translation-yaml.test Reviewers: ayermolo, maksfb, rafaelauler, dcci Reviewed By: maksfb Pull Request: llvm/llvm-project#86219

    llvm-project/flang/lib/Lower/Bridge.cpp:3775:14: error: variable 'nbDeviceResidentObject' set but not used [-Werror,-Wunused-but-set-variable]
        unsigned nbDeviceResidentObject = 0;
                 ^
    1 error generated.

This has been updated in the isa-manual riscv/riscv-isa-manual#1311

BAT writeMaps encoded the assumption that functions are only split into two fragments (hot and cold). However, BOLT supports splitting into arbitrary number of fragments. Relax that assumption and look up primary (hot) fragment explicitly. Depends on: llvm/llvm-project#86219 Test Plan: Updated bolt/test/X86/yaml-secondary-entry-discriminator.s Reviewers: ayermolo, rafaelauler, maksfb, dcci Reviewed By: maksfb, dcci Pull Request: llvm/llvm-project#87123

Emit the recorded number of blocks, not the number of basic block hashes. There might be differences in corner cases (openssl BN_BLINDING_convert_ex function). Test Plan: Updated openssl.test in rafaelauler/bolt-tests#31 Reviewers: rafaelauler, ayermolo, maksfb, dcci Reviewed By: ayermolo Pull Request: llvm/llvm-project#87830

…o (#87433) Completely skip include directives that form the filename using macros. fixes #87303
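As an illustration of the directives in question, the snippet below names a header through a macro instead of writing it literally. The macro name and the helper function are hypothetical; according to the commit message, include directives of this form are now skipped by the check.

```cpp
#include <cstddef>

// Hypothetical macro: the header name is produced by macro expansion.
#define CONTAINER_HEADER <vector>

#include CONTAINER_HEADER
#include CONTAINER_HEADER // filename formed by a macro again

// Small helper so the snippet does something observable.
std::size_t demoSize() {
  std::vector<int> Values{1, 2, 3};
  return Values.size();
}
```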

Signed-off-by: Whitney Tsang <whitney.tsang@intel.com>

topperc and others added 30 commits April 3, 2024 17:16

[RISCV] Remove G_TRUNC/ZEXT/SEXT/ANYEXT from the first switch in RISC…

7e2a1d6

…VRegisterBankInfo::getInstrMapping. This removes the special case for vectors. The default case in the second switch can handle GPR in addition to vectors. We just won't use the static ValueMapping entry.

[RISCV][TTI] Scale the cost of intrinsic stepvector with LMUL (#87301)

97523e5

Use the return type to measure the LMUL size for latency/throughput cost

[clang] Init fields added by #87357

abd05eb

[RISCV][GISel] Don't check for FP uses of IMPLICIT_DEF if the type…

a853d79

… is vector. NFC If the type is vector, we can immediately know to use vector mapping. Previously we searched for FP uses, but then replaced it if the type was vector.

[libc++] P2867R1: Remove Deprecated strstreams From C++26 (#87107)

fb635be

Implements: https://wg21.link/P2867R2

---------

Co-authored-by: Hristo Hristov <zingam@outlook.com>

[flang][OpenMP] Fix for #86393 (#87452)

698bf3d

Revert "[libc] Added transitive bindings for OffsetType (#87397)"

e8aaa3e

This reverts commit 3ee93f4 because it broke Fuchsia Clang toolchain builders: https://logs.chromium.org/logs/fuchsia/buildbucket/cr-buildbucket/8751633430491432833/+/u/clang/build/stdout

[mlir][Interfaces][NFC] ValueBoundsConstraintSet: Delete dead code …

d542cb3

…(#86098) There is an assertion that the stop condition is not satisfied for the starting point at the beginning of `computeBound`. Therefore, that case does not have to be handled later on in that function.

[clang] Remove an unintended statement, NFC

35886dc

[clangd][NFC] Delete dead code

550e09d

Revert "[libc] Refactor BigInt" (#87612)

1273591

Reverts llvm/llvm-project#86137 Some aarch64 compilers seem to consider that `uint128_t` is not `is_trivially_constructible` which prevents `bit_cast`-ing.

AArch64-Darwin: allow -mcmodel=large with (default) PIC

7a8cf95

Darwin targets implement -mcmodel=large by forcing all global accesses to use the GOT, instead of the ELF movz/movk sequence. That means it's compatible with PIC so the Clang driver shouldn't reject the option.

[reland][libc] Refactor BigInt (#87613)

71c3f5d

This is a reland of #86137 with a fix for platforms/compilers that do not support trivially constructible int128 types.

[mlir][OpenMP][NFC] Use SmallVectorImpl for function arguments (#86978)

5334b31

nickdesaulniers and others added 18 commits April 5, 2024 14:29

[libc][support][bit] use new type generic builtins (#86746)

2744a24

These are new in clang-19+, gcc-14+.
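A hedged sketch of how code can use such a builtin while still compiling on older toolchains is shown below. It assumes that `__builtin_popcountg` is one of the type-generic builtins meant here; the wrapper name and the fallback loop are illustrative and not taken from the libc sources.

```cpp
#include <cstdint>
#include <type_traits>

// Older compilers may not provide __has_builtin at all.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Hypothetical wrapper: number of set bits in an unsigned integer.
template <typename T> int countOnes(T Value) {
  static_assert(std::is_unsigned<T>::value, "unsigned types only");
#if __has_builtin(__builtin_popcountg)
  // One builtin for every unsigned width, no per-type suffix needed.
  return __builtin_popcountg(Value);
#else
  // Portable fallback: clear the lowest set bit until none remain.
  int Count = 0;
  for (; Value != 0; Value &= Value - 1)
    ++Count;
  return Count;
#endif
}
```

Both branches return the same result, so callers do not need to know which compiler version built the code.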

[flang][runtime] Support for offload build of FortranDecimal. (#87653)

b329da8

[libc][fenv] Add compile time check (#87826)

80deb82

Take care of a TODO. This check makes sure that the fexcept_t value fits in an int value. TODO introduced in: llvm/llvm-project@9550f8b

[cmake] Back out of making unsupported -no_exported_symbols linker a

fe45029

fatal error for now Appeases build bots while being investigated.

[flang][build] Fixed paths discovery for the out-of-tree build. (#87…

9202984

…822) When building flang out-of-tree with relative paths in LLVM_DIR, CLANG_DIR and MLIR_DIR, we need to compute the absolute paths based on the CMake build directory (i.e. where the cmake is invoked from).

[NFC][flang][runtime] Moved freestanding-tools.h to use it in Fortran…

3b33724

…Decimal. (#87827) I will add `toupper` implementation into it in the next PR.

[flang] Fix -Wunused-but-set-variable in Bridge.cpp (NFC)

3f2f700

    llvm-project/flang/lib/Lower/Bridge.cpp:3775:14: error: variable 'nbDeviceResidentObject' set but not used [-Werror,-Wunused-but-set-variable]
        unsigned nbDeviceResidentObject = 0;
                 ^
    1 error generated.

[RISCV] Rename OP-P to OP-VE. (#87546)

8bd3914

This has been updated in the isa-manual riscv/riscv-isa-manual#1311

[clang-tidy] Fix readability-duplicate-include for includes with macr…

ed4e505

…o (#87433) Completely skip include directives that form the filename using macros. fixes #87303

Merge commit 'ed4e505c219fe6c7464ea5a056e90d8cd94c7332'

70ca11b

whitneywhtsang self-assigned this Apr 10, 2024

whitneywhtsang added the genx label Apr 10, 2024

[GENX] Update libGenISAIntrinsics

39b8333

Signed-off-by: Whitney Tsang <whitney.tsang@intel.com>

whitneywhtsang changed the title ~~[GENX] Update GENX branch to LLVM ed4e505~~ [GEN] Update GENX branch to LLVM ed4e505 Apr 10, 2024

whitneywhtsang requested a review from a team April 10, 2024 17:41

whitneywhtsang marked this pull request as ready for review April 10, 2024 17:41

etiotto approved these changes Apr 10, 2024

whitneywhtsang requested a review from a team April 10, 2024 19:16

victor-eds approved these changes Apr 10, 2024

whitneywhtsang merged commit 39b8333 into intel:genx Apr 10, 2024
9 checks passed

whitneywhtsang deleted the merge branch April 10, 2024 19:23

whitneywhtsang mentioned this pull request Apr 16, 2024

Merge OpenAI Triton till April 26th intel/intel-xpu-backend-for-triton#884

Closed
