-
Notifications
You must be signed in to change notification settings - Fork 768
[GEN] Update GENX branch to LLVM ed4e505
#13355
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…VRegisterBankInfo::getInstrMapping. This removes the special case for vectors. The default case in the second switch can handle GPR in addition to vectors. We just won't use the static ValueMapping entry.
Use the return type to measure the LMUL size for latency/throughput cost
…rhs` - When both operands are constant, the matcher runs into an infinite loop as the commutation should be applied only when LHS is a constant and RHS is not. Reviewers: arsenm Reviewed By: arsenm Pull Request: llvm/llvm-project#87426
… is vector. NFC If the type is vector, we can immediately know to use vector mapping. Previously we searched for FP uses, but then replaced it if the type was vector.
Operations must be created with the supplied builder. Otherwise, the dialect conversion / greedy pattern rewrite driver can break. This commit fixes a crash in the dialect conversion: ``` within split at llvm-project/mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-invalid.mlir:1 offset :8:8: error: failed to legalize operation 'tosa.add' %0 = tosa.add %1, %arg2 : (tensor<10x10xf32>, tensor<*xf32>) -> tensor<*xf32> ^ within split at llvm-project/mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-invalid.mlir:1 offset :8:8: note: see current operation: %9 = "tosa.add"(%8, %arg2) : (tensor<10x10xf32>, tensor<*xf32>) -> tensor<*xf32> mlir-opt: llvm-project/mlir/include/mlir/IR/UseDefLists.h:198: mlir::IRObjectWithUseList<mlir::OpOperand>::~IRObjectWithUseList() [OperandType = mlir::OpOperand]: Assertion `use_empty() && "Cannot destroy a value that still has uses!"' failed. ``` This commit is the proper fix for #87297 (which was reverted).
Implements: https://wg21.link/P2867R2 --------- Co-authored-by: Hristo Hristov <zingam@outlook.com>
Fixed vectors have their sext/zext operands legalized to _VL nodes, so we need to handle them in the patterns. This adds a riscv_ext_vl_oneuse pattern since we don't care about the type of extension used for the shift amount, and extends Low8BitsSplatPat to handle other _VL nodes. We don't actually need to check the mask or VL there since none of the _VL nodes have passthru operands. The remaining test cases that are widening from i8->i64 need to be handled by extending combineBinOp_VLToVWBinOp_VL. This also fixes Low8BitsSplatPat incorrectly checking the vector size instead of the element size to determine if the splat value might have been truncated below 8 bits.
This reverts commit 3ee93f4 because it broke Fuchsia Clang toolchain builders: https://logs.chromium.org/logs/fuchsia/buildbucket/cr-buildbucket/8751633430491432833/+/u/clang/build/stdout
…87338) On RV64, we legalize zexts of i1s to (vselect m, (splat_vector i64 1), (splat_vector i64 0)), where the splat_vectors are implicitly truncating. When the vselect is used by a binop we want to pull the vselect out via foldSelectWithIdentityConstant. But because vectors with an element size < i64 will truncate, isNeutralConstant will return false. This patch handles truncating splats by getting the APInt value and truncating it. We almost don't need to do this since most of the neutral elements are either one/zero/all ones, but it will make a difference for smax and smin. I wasn't able to figure out a way to write the tests in terms of select, since we need the i1 zext legalization to create a truncating splat_vector. This supercedes #87236. Fixed vectors are unfortunately not handled by this patch (since they get legalized to _VL nodes), but they don't seem to appear in the wild.
I'm currently developing a new version of the indexed memprof format where we deduplicate call stacks in IndexedAllocationInfo::CallStack and IndexedMemProfRecord::CallSites. We refer to call stacks with integer IDs, namely CallStackId, just as we refer to Frame with FrameId. The deduplication will cut down the profile file size by 80% in a large memprof file of mine. As a step toward the goal, this patch teaches IndexedMemProfRecord::{serialize,deserialize} to speak Version2. A subsequent patch will add Version2 support to llvm-profdata. The essense of the patch is to replace the serialization of a call stack, a vector of FrameIDs, with that of a CallStackId. That is: const IndexedAllocationInfo &N = ...; ... LE.write<uint64_t>(N.CallStack.size()); for (const FrameId &Id : N.CallStack) LE.write<FrameId>(Id); becomes: LE.write<CallStackId>(N.CSId);
…RE_PAUTH` (#87545) Reland #85231 after fixing build failure https://lab.llvm.org/buildbot/#/builders/186/builds/15631. Use `PRIx64` for format output of `uint64_t` as hex. Original PR description below. This adds support for `GNU_PROPERTY_AARCH64_FEATURE_PAUTH` feature (as defined in ARM-software/abi-aa#240) handling in llvm-readobj and llvm-readelf. The following constants for supported platforms are also introduced: - `AARCH64_PAUTH_PLATFORM_INVALID = 0x0` - `AARCH64_PAUTH_PLATFORM_BAREMETAL = 0x1` - `AARCH64_PAUTH_PLATFORM_LLVM_LINUX = 0x10000002` For the llvm_linux platform, output of the tools contains descriptions of PAuth features which are enabled/disabled depending on the version value. Version value bits correspond to the following `LangOptions` defined in #85232: - bit 0: `PointerAuthIntrinsics`; - bit 1: `PointerAuthCalls`; - bit 2: `PointerAuthReturns`; - bit 3: `PointerAuthAuthTraps`; - bit 4: `PointerAuthVTPtrAddressDiscrimination`; - bit 5: `PointerAuthVTPtrTypeDiscrimination`; - bit 6: `PointerAuthInitFini`. Support for `.note.AARCH64-PAUTH-ABI-tag` is dropped since it's deleted from the spec in ARM-software/abi-aa#250.
This adds handling of range attribute for return values of Call and Invoke in getFromRangeMetadata and handling of argument with range attribute in solveBlockValueNonLocal. There is one additional check of the range metadata at line 1120 in getValueFromSimpleICmpCondition that is not covered in this PR as after llvm/llvm-project#75311 there is no test that cover that check any more and I have not been able to create a test that trigger that code.
…ore (#87504) This commit relaxes Mem2Reg's type equality requirement for the LLVM dialect's load and store operations. For now, we only allow loads to be promoted if the reaching definition can be casted into a value of the target type. For stores, all type checks are removed, as a non-volatile store that does not write out the alloca's pointer can always be deleted.
SLES 15 comes with a GCC 7.5 as default, which does not support the C++17 `<charconv>` header. This results in build errors when trying to run `check-flang`. This patch addresses that and uses the older `std::stol` for the string -> number conversion to allow the SLES 15 buildbot (https://lab.llvm.org/staging/#/builders/193) to turn green.
…(#86098) There is an assertion that the stop condition is not satisfied for the the starting point at the beginning of `computeBound`. Therefore, that case does not have to be handled later on in that function.
…on in the constructor (#86099) This commit changes the API of `ValueBoundsConstraintSet`: the stop condition is now passed to the constructor instead of `processWorklist`. That makes it easier to add items to the worklist multiple times and process them in a consistent manner. The current `ValueBoundsConstraintSet` is passed as a reference to the stop function, so that the stop function can be defined before the the `ValueBoundsConstraintSet` is constructed. This change is in preparation of adding support for branches.
This patch moves most of the multiprecision logic to the `multiword` namespace and simplifies some logic in `BigInt`. It also fully implements the mask and count functions and increases test coverage. `math_extras.h` is also reworked to make it more concise.
This PR: * fixes OpVariable instructions place in a function (see llvm/llvm-project#66261), * improves type inference, * helps avoiding unneeded bitcasts when validating function call's This allows to improve existing and add new test cases with more strict checks. OpVariable fix refers to "All OpVariable instructions in a function must be the first instructions in the first block" requirement from SPIR-V spec.
Reverts llvm/llvm-project#86137 Some aarch64 compilers seem to consider that `uint128_t` is not `is_trivially_constructible` which prevents `bit_cast`-ing.
libclc is mentioned in the list of LLVM_ENABLE_PROJECTS but it isn't actually possible to build it in-tree for various reasons. Users currently have to build it via LLVM_ENABLE_EXTERNAL_PROJECTS, which isn't very well documented. We can't properly build in-tree because the current system needs to "see" clang and other tools at CMake configuration time. The general idea is that we could fix this in the future by moving the compilation and linking of bitcode libraries to custom commands, which would remove the dependency on CMake configuration and would allow us to build libclc after clang and other tools are built in-tree. Since that's a bigger change, it is being left for later. Note that with this commit it's *still* not possible to properly build in-tree - this commit just fixes a few little things that are in the way. We are now able to build in-tree in the sense that it can be built as a regular LLVM sub-project, but the tools it uses to compile the libraries are still picked up from a pre-existing installation of LLVM, and not from tools built during the same build as libclc. The things fixed by this commit include: * Its use of CMAKE_SOURCE_DIR (i.e., assuming it was the top-level project) * These have been converted to PROJECT_SOURCE_DIR - should have no consequences for out-of-tree builds. * Its prepare_builtins tool insisting on linking against the dynamic LLVM.so. * This has been turned from an "llvm executable" into an "llvm utility" which links against the static libraries. * It was also missing a link component for the IRReader library. * Assuming an output path for its builtin libraries (dependent on the working directory) * This has been changed to query CMake for the library target's output file. * The spirv-mesa3d and spirv64-mesa3d targets were enabled by default (or when asking to build 'all' libclc targets), when they require llvm-spirv as an external dependency. * They are now only built when the user explicitly asks for them, or when llvm-spirv is available and the user asks for 'all'.
…(#87539) Call generateWaitcnt unconditionally at the end of SIInsertWaitcnts::insertWaitcntInBlock. Even if we don't need to generate a new waitcnt instruction it has the effect of combining or removing redundant waitcnts that were already present. Tests show various small improvements in waitcnt placement.
The class `ScopedDbgInfoFormatSetter` was added as a convenient way to temporarily change the debug info format of a function or module, as part of IR printing; since this process is repeated in a number of other places, this patch uses the format-setter class in those places as well.
Darwin targets implement -mcmodel=large by forcing all global accesses to use the GOT, instead of the ELF movz/movk sequence. That means it's compatible with PIC so the Clang driver shouldn't reject the option.
…g (#72714) This patch adds lld support for: - Dynamic R_AARCH64_AUTH_* relocations (without including RELR compressed AUTH relocations) as described here: https://github.com/ARM-software/abi-aa/blob/main/pauthabielf64/pauthabielf64.rst#auth-variant-dynamic-relocations - .note.AARCH64-PAUTH-ABI-tag section as defined here https://github.com/ARM-software/abi-aa/blob/main/pauthabielf64/pauthabielf64.rst#elf-marking Depends on #72713 and #85231 --------- Co-authored-by: Peter Collingbourne <peter@pcc.me.uk> Co-authored-by: Fangrui Song <i@maskray.me>
This is a reland of #86137 with a fix for platforms / compiler that do not support trivially constructible int128 types.
These are new in clang-19+, gcc-14+.
…le toolchain (#87684) Building the Apple way turns off plugin support, meaning we don't need to export unloadable symbols from all executables. While deadstripping effects aren't expected to change, enabling this across all tools prevents the creation of export tries. This saves us ~3.5 MB in just the universal build of `clang`.
The HOST_LINK_VERSION is a hardcoded string in Darwin clang that detects the linker version at configure time. The driver uses this information to build the correct set of arguments for the linker. This patch detects the linker version again during compiler-rt configuration and passes it to the libfuzzer tests. This allows a clang built on a machine with a new linker to run compiler-rt tests on a machine with an old linker. rdar://125932376
Take care of a TODO. This check makes sure that the fexcept_t value fits in an int value. TODO introduced in: llvm/llvm-project@9550f8b
fatal error for now Appeases build bots while being investigated.
This script+config should help us generate more consistent documentation wrt. what we currently support or not. As an example usage: $ ./libc/utils/docgen/docgen.py fenv.h Will spit out an RST formatted table that can be copy+pasted into our docs. The config is not filled out entirely, but doing so and then updating our docs would be great beginner bugs for new contributors. Having python+json generate things like docs, or headers (as imagined in https://github.com/nickdesaulniers/llvm-project/tree/hdr-gen2) is perhaps easier to work with than tablegen, and doesn't introduce a dependency on a host tool that needs to be compiled from llvm sources before building the rest of the libc. This can probably be merged with whatever we end up doing to replace libc-hdrgen. Please use https://llvm.org/docs/CodingStandards.html#python-version-and-source-code-formatting for keeping this file formatted.
…822) When building flang out-of-tree with relative paths in LLVM_DIR, CLANG_DIR and MLIR_DIR, we need to compute the absolute paths based on the CMake build directory (i.e. where the cmake is invoked from).
The lowering of n-D vector.extract/insert ops to LLVM is not supported but if one of these accidentally reaches the vector-to-llvm conversion patterns, we end up with a kind of puzzling crash. This PR fixes that crash and gracefully bails out in those cases.
…Decimal. (#87827) I will add `toupper` implementation into it in the next PR.
Context: llvm/llvm-project#87017 - Add proxy header `libc/hdr/math_macros.h` that will: - include `<math.h>` in overlay mode, - include `"include/llvm-libc-macros/math-macros.h"` in full build mode. - Its corresponding CMake target `libc.hdr.math_macros` will only depend on `libc.include.math` and `libc.include.llvm-libc-macros.math_macros` in full build mode. - Replace all `#include "include/llvm-libc-macros/math-macros.h"` with `#include "hdr/math_macros.h"`. - Add dependency to `libc.hdr.math_macros` CMake target when using `add_fp_unittest`. - Update the remaining dependency. - Update bazel overlay: add `libc:hdr_math_macros` target, and replacing all dependency on `libc:llvm_libc_macros_math_macros` with `libc:hdr_math_macros`.
Provide a mechanism to resolve call target information for calls from non-BAT functions to BAT functions (`YAMLProfileWriter::convert`). Make it generic for future use in BAT-to-BAT calls. Test Plan: Updated bolt/test/X86/bolt-address-translation-yaml.test Reviewers: ayermolo, maksfb, rafaelauler, dcci Reviewed By: maksfb Pull Request: llvm/llvm-project#86219
llvm-project/flang/lib/Lower/Bridge.cpp:3775:14: error: variable 'nbDeviceResidentObject' set but not used [-Werror,-Wunused-but-set-variable] unsigned nbDeviceResidentObject = 0; ^ 1 error generated.
This has been updated in the isa-manual riscv/riscv-isa-manual#1311
BAT writeMaps encoded the assumption that functions are only split into two fragments (hot and cold). However, BOLT supports splitting into arbitrary number of fragments. Relax that assumption and look up primary (hot) fragment explicitly. Depends on: llvm/llvm-project#86219 Test Plan: Updated bolt/test/X86/yaml-secondary-entry-discriminator.s Reviewers: ayermolo, rafaelauler, maksfb, dcci Reviewed By: maksfb, dcci Pull Request: llvm/llvm-project#87123
Emit the recorded number of blocks, not the number of basic block hashes. There might be differences in corner cases (openssl BN_BLINDING_convert_ex function). Test Plan: Updated openssl.test in rafaelauler/bolt-tests#31 Reviewers: rafaelauler, ayermolo, maksfb, dcci Reviewed By: ayermolo Pull Request: llvm/llvm-project#87830
…o (#87433) Completely skip include directives that form the filename using macros. fixes #87303
Signed-off-by: Whitney Tsang <whitney.tsang@intel.com>
ed4e505
ed4e505
etiotto
approved these changes
Apr 10, 2024
victor-eds
approved these changes
Apr 10, 2024
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.