
[GEN] Update GENX branch to LLVM ed4e505 #13355


Merged
merged 1,316 commits into from
Apr 10, 2024

Conversation

whitneywhtsang
Contributor

No description provided.

topperc and others added 30 commits April 3, 2024 17:16
…VRegisterBankInfo::getInstrMapping.

This removes the special case for vectors. The default case in the
second switch can handle GPR in addition to vectors. We just won't
use the static ValueMapping entry.
Use the return type to measure the LMUL size for latency/throughput cost
…rhs`

- When both operands are constant, the matcher runs into an infinite
  loop as the commutation should be applied only when LHS is a constant
  and RHS is not.
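
The guard described above can be modelled in a few lines. This is only a Python sketch of the idea, not the matcher code from the commit: `Const`, `Reg`, `should_commute` and `canonicalize` are hypothetical names, and `max_steps` exists only so that a non-converging rule fails loudly instead of looping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Reg:
    name: str


def should_commute(lhs, rhs):
    """Commute only when the LHS is a constant and the RHS is not."""
    return isinstance(lhs, Const) and not isinstance(rhs, Const)


def canonicalize(lhs, rhs, max_steps=8):
    """Apply the commutation until it no longer fires.

    Returns the final operands and the number of swaps performed.
    """
    steps = 0
    while should_commute(lhs, rhs):
        lhs, rhs = rhs, lhs
        steps += 1
        if steps >= max_steps:
            raise RuntimeError("commutation did not converge")
    return lhs, rhs, steps
```

With two constant operands the rule never fires, so the loop exits immediately. A rule that tested only the LHS would keep swapping the two constants back and forth.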

Reviewers: arsenm

Reviewed By: arsenm

Pull Request: llvm/llvm-project#87426
… is vector. NFC

If the type is vector, we can immediately know to use vector mapping.
Previously we searched for FP uses, but then replaced it if the type
was vector.
Operations must be created with the supplied builder. Otherwise, the
dialect conversion / greedy pattern rewrite driver can break.

This commit fixes a crash in the dialect conversion:
```
within split at llvm-project/mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-invalid.mlir:1 offset :8:8: error: failed to legalize operation 'tosa.add'
  %0 = tosa.add %1, %arg2 : (tensor<10x10xf32>, tensor<*xf32>) -> tensor<*xf32>
       ^
within split at llvm-project/mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-invalid.mlir:1 offset :8:8: note: see current operation: %9 = "tosa.add"(%8, %arg2) : (tensor<10x10xf32>, tensor<*xf32>) -> tensor<*xf32>
mlir-opt: llvm-project/mlir/include/mlir/IR/UseDefLists.h:198: mlir::IRObjectWithUseList<mlir::OpOperand>::~IRObjectWithUseList() [OperandType = mlir::OpOperand]: Assertion `use_empty() && "Cannot destroy a value that still has uses!"' failed.
```

This commit is the proper fix for #87297 (which was reverted).
Implements: https://wg21.link/P2867R2

---------

Co-authored-by: Hristo Hristov <zingam@outlook.com>
Fixed vectors have their sext/zext operands legalized to _VL nodes, so
we need to handle them in the patterns.

This adds a riscv_ext_vl_oneuse pattern since we don't care about the
type of extension used for the shift amount, and extends
Low8BitsSplatPat to handle other _VL nodes. We don't actually need to
check the mask or VL there since none of the _VL nodes have passthru
operands.

The remaining test cases that are widening from i8->i64 need to be
handled by extending combineBinOp_VLToVWBinOp_VL.

This also fixes Low8BitsSplatPat incorrectly checking the vector size
instead of the element size to determine if the splat value might have
been truncated below 8 bits.
…87338)

On RV64, we legalize zexts of i1s to (vselect m, (splat_vector i64 1),
(splat_vector i64 0)), where the splat_vectors are implicitly
truncating.

When the vselect is used by a binop we want to pull the vselect out via
foldSelectWithIdentityConstant. But because vectors with an element size
< i64 will truncate, isNeutralConstant will return false.

This patch handles truncating splats by getting the APInt value and
truncating it. We almost don't need to do this since most of the neutral
elements are either one/zero/all ones, but it will make a difference for
smax and smin.
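
To illustrate why truncation matters for smax and smin, here is a small Python model. It is an assumption-laden sketch, not the signature or logic of LLVM's `isNeutralConstant`: the operation names are plain strings, and the splat value is an unsigned Python integer that is cut down to the element width before the comparison.

```python
def truncate(value, bits):
    """Keep the low `bits` bits of `value`, like an APInt truncation."""
    return value & ((1 << bits) - 1)


def to_signed(value, bits):
    """Reinterpret an unsigned `bits`-wide value as two's complement."""
    return value - (1 << bits) if value >> (bits - 1) else value


def is_neutral_constant(op, splat_value, elt_bits):
    """Check whether a splat is the identity for `op` at the element width."""
    v = truncate(splat_value, elt_bits)
    if op in ("add", "or", "xor"):
        return v == 0
    if op == "mul":
        return v == 1
    if op == "and":
        return v == (1 << elt_bits) - 1
    if op == "smax":
        # Identity for smax is the smallest signed value of the element type.
        return to_signed(v, elt_bits) == -(1 << (elt_bits - 1))
    if op == "smin":
        # Identity for smin is the largest signed value of the element type.
        return to_signed(v, elt_bits) == (1 << (elt_bits - 1)) - 1
    raise ValueError(f"unsupported op: {op}")
```

For example, the 64-bit splat value `0x80` is not neutral for a 64-bit smax, but once truncated to 8 bits it is the signed minimum and therefore neutral.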

I wasn't able to figure out a way to write the tests in terms of select,
since we need the i1 zext legalization to create a truncating
splat_vector.

This supersedes #87236. Fixed vectors are unfortunately not handled by
this patch (since they get legalized to _VL nodes), but they don't seem
to appear in the wild.
I'm currently developing a new version of the indexed memprof format
where we deduplicate call stacks in IndexedAllocationInfo::CallStack
and IndexedMemProfRecord::CallSites.  We refer to call stacks with
integer IDs, namely CallStackId, just as we refer to Frame with
FrameId.  The deduplication will cut down the profile file size by 80%
in a large memprof file of mine.

As a step toward the goal, this patch teaches
IndexedMemProfRecord::{serialize,deserialize} to speak Version2.  A
subsequent patch will add Version2 support to llvm-profdata.

The essence of the patch is to replace the serialization of a call
stack, a vector of FrameIDs, with that of a CallStackId.  That is:

  const IndexedAllocationInfo &N = ...;
  ...
  LE.write<uint64_t>(N.CallStack.size());
  for (const FrameId &Id : N.CallStack)
    LE.write<FrameId>(Id);

becomes:

  LE.write<CallStackId>(N.CSId);
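
The size effect of that change can be sketched in Python. The assumptions here are mine, not the patch's: `FrameId` and `CallStackId` are treated as 64-bit little-endian integers, and `intern_call_stack` hands out sequential IDs purely for illustration, which need not match how the real format derives a `CallStackId`.

```python
import struct


def serialize_v1(call_stack):
    """Version-1 style: a 64-bit count, then one 64-bit FrameId per frame."""
    out = struct.pack("<Q", len(call_stack))
    for frame_id in call_stack:
        out += struct.pack("<Q", frame_id)
    return out


def serialize_v2(cs_id):
    """Version-2 style: a single 64-bit CallStackId."""
    return struct.pack("<Q", cs_id)


def intern_call_stack(table, call_stack):
    """Hypothetical ID assignment: sequential IDs keyed by the frame tuple."""
    return table.setdefault(tuple(call_stack), len(table))
```

A three-frame call stack takes 32 bytes in the first form and 8 bytes in the second, and identical stacks share one ID, so the frames themselves only need to be stored once elsewhere.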
…RE_PAUTH` (#87545)

Reland #85231 after fixing build failure
https://lab.llvm.org/buildbot/#/builders/186/builds/15631.
Use `PRIx64` for format output of `uint64_t` as hex.
Original PR description below.

This adds support for `GNU_PROPERTY_AARCH64_FEATURE_PAUTH` feature (as
defined in ARM-software/abi-aa#240) handling in
llvm-readobj and llvm-readelf. The following constants for supported
platforms are also introduced:

- `AARCH64_PAUTH_PLATFORM_INVALID = 0x0`
- `AARCH64_PAUTH_PLATFORM_BAREMETAL = 0x1`
- `AARCH64_PAUTH_PLATFORM_LLVM_LINUX = 0x10000002`

For the llvm_linux platform, output of the tools contains descriptions
of PAuth features which are enabled/disabled depending on the version
value. Version value bits correspond to the following `LangOptions`
defined in #85232:

- bit 0: `PointerAuthIntrinsics`;
- bit 1: `PointerAuthCalls`;
- bit 2: `PointerAuthReturns`;
- bit 3: `PointerAuthAuthTraps`;
- bit 4: `PointerAuthVTPtrAddressDiscrimination`;
- bit 5: `PointerAuthVTPtrTypeDiscrimination`;
- bit 6: `PointerAuthInitFini`.
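
The bit layout above can be decoded mechanically. The following Python sketch only maps version bits to the option names listed; `decode_pauth_version` and `enabled_features` are hypothetical helpers and do not reproduce the actual output format of llvm-readobj or llvm-readelf.

```python
# Bit index -> LangOptions name, in the order given in the list above.
PAUTH_VERSION_BITS = [
    "PointerAuthIntrinsics",
    "PointerAuthCalls",
    "PointerAuthReturns",
    "PointerAuthAuthTraps",
    "PointerAuthVTPtrAddressDiscrimination",
    "PointerAuthVTPtrTypeDiscrimination",
    "PointerAuthInitFini",
]


def decode_pauth_version(version):
    """Return (name, enabled) pairs for bits 0..6 of the version value."""
    return [
        (name, bool((version >> bit) & 1))
        for bit, name in enumerate(PAUTH_VERSION_BITS)
    ]


def enabled_features(version):
    """Return only the names whose bit is set."""
    return [name for name, on in decode_pauth_version(version) if on]
```

For instance, a version value of `0b101` enables `PointerAuthIntrinsics` and `PointerAuthReturns` and leaves the other five features disabled.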

Support for `.note.AARCH64-PAUTH-ABI-tag` is dropped since it's deleted
from the spec in ARM-software/abi-aa#250.
This adds handling of range attribute for return values of Call and
Invoke in getFromRangeMetadata and handling of argument with range
attribute in solveBlockValueNonLocal.
There is one additional check of the range metadata at line 1120 in
getValueFromSimpleICmpCondition that is not covered in this PR: after
llvm/llvm-project#75311 no test covers that check any more, and I have
not been able to create a test that triggers that code.
…ore (#87504)

This commit relaxes Mem2Reg's type equality requirement for the LLVM
dialect's load and store operations. For now, we only allow loads to be
promoted if the reaching definition can be cast into a value of the
target type.

For stores, all type checks are removed, as a non-volatile store that
does not write out the alloca's pointer can always be deleted.
SLES 15 ships GCC 7.5 as its default compiler, which does not support the
C++17 `<charconv>` header. This results in build errors when trying to
run `check-flang`.
This patch addresses that and uses the older `std::stol` for the string
-> number conversion to allow the SLES 15 buildbot
(https://lab.llvm.org/staging/#/builders/193) to turn green.
…(#86098)

`computeBound` asserts at its beginning that the stop condition is not
satisfied for the starting point. Therefore, that case does not have to
be handled later on in that function.
…on in the constructor (#86099)

This commit changes the API of `ValueBoundsConstraintSet`: the stop
condition is now passed to the constructor instead of `processWorklist`.
That makes it easier to add items to the worklist multiple times and
process them in a consistent manner. The current
`ValueBoundsConstraintSet` is passed as a reference to the stop
function, so that the stop function can be defined before the
`ValueBoundsConstraintSet` is constructed.

This change is in preparation of adding support for branches.
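
The shape of that API change can be sketched with a toy worklist in Python. This is not the MLIR `ValueBoundsConstraintSet` interface: `ConstraintSet`, `add`, `process_worklist` and the `operands_of` callback are hypothetical stand-ins that only show the stop condition being fixed at construction time and receiving the set as an argument.

```python
class ConstraintSet:
    """Toy worklist whose stop condition is fixed at construction time."""

    def __init__(self, stop_condition):
        # The stop function takes (value, constraint_set), so it can be
        # defined before the set itself exists.
        self.stop_condition = stop_condition
        self.worklist = []
        self.visited = []

    def add(self, value):
        """Queue a value for processing."""
        self.worklist.append(value)

    def process_worklist(self, operands_of):
        """Drain the worklist, not traversing past values that satisfy the stop condition."""
        while self.worklist:
            value = self.worklist.pop()
            self.visited.append(value)
            if self.stop_condition(value, self):
                continue
            self.worklist.extend(operands_of(value))
```

Because the condition lives in the object, items can be added and the worklist processed several times, and every round applies the same stop rule.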
This patch moves most of the multiprecision logic to the `multiword`
namespace and simplifies some logic in `BigInt`. It also fully
implements the mask and count functions and increases test coverage.

`math_extras.h` is also reworked to make it more concise.
This PR:
* fixes OpVariable instructions place in a function (see
llvm/llvm-project#66261),
* improves type inference,
* helps avoid unneeded bitcasts when validating function call's

This makes it possible to improve existing test cases and add new ones
with stricter checks. The OpVariable fix refers to the "All OpVariable
instructions in a function must be the first instructions in the first
block" requirement from the SPIR-V spec.
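
The placement rule quoted from the SPIR-V spec can be illustrated with a simplified Python model. This is an assumption-heavy sketch and not the backend's implementation: a function is a list of blocks, each block is a list of opcode-first strings, and block labels are left out of the model.

```python
def hoist_op_variables(blocks):
    """Move every OpVariable to the start of the first block, keeping relative order."""
    variables = [
        inst for block in blocks for inst in block if inst.startswith("OpVariable")
    ]
    rest = [
        [inst for inst in block if not inst.startswith("OpVariable")]
        for block in blocks
    ]
    if rest:
        rest[0] = variables + rest[0]
    return rest
```

After the rewrite, no block other than the first contains an `OpVariable`, and in the first block they precede every other instruction.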
Reverts llvm/llvm-project#86137

Some aarch64 compilers seem to consider that `uint128_t` is not
`is_trivially_constructible` which prevents `bit_cast`-ing.
libclc is mentioned in the list of LLVM_ENABLE_PROJECTS but it isn't
actually possible to build it in-tree for various reasons. Users
currently have to build it via LLVM_ENABLE_EXTERNAL_PROJECTS, which
isn't very well documented.

We can't properly build in-tree because the current system needs to
"see" clang and other tools at CMake configuration time. The general
idea is that we could fix this in the future by moving the compilation
and linking of bitcode libraries to custom commands, which would remove
the dependency on CMake configuration and would allow us to build libclc
after clang and other tools are built in-tree. Since that's a bigger
change, it is being left for later.

Note that with this commit it's *still* not possible to properly build
in-tree - this commit just fixes a few little things that are in the
way. We are now able to build in-tree in the sense that it can be built
as a regular LLVM sub-project, but the tools it uses to compile the
libraries are still picked up from a pre-existing installation of LLVM,
and not from tools built during the same build as libclc.

The things fixed by this commit include:

* Its use of CMAKE_SOURCE_DIR (i.e., assuming it was the top-level
  project)
  * These have been converted to PROJECT_SOURCE_DIR - should have no
    consequences for out-of-tree builds.
* Its prepare_builtins tool insisting on linking against the dynamic
  LLVM.so.
  * This has been turned from an "llvm executable" into an "llvm utility"
    which links against the static libraries.
  * It was also missing a link component for the IRReader library.
* Assuming an output path for its builtin libraries (dependent on the
  working directory)
  * This has been changed to query CMake for the library target's output
    file.
* The spirv-mesa3d and spirv64-mesa3d targets were enabled by default
  (or when asking to build 'all' libclc targets), when they require
  llvm-spirv as an external dependency.
  * They are now only built when the user explicitly asks for them, or
    when llvm-spirv is available and the user asks for 'all'.
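
The new selection rule for the SPIR-V targets can be written down as a short Python sketch. The real logic lives in CMake. `select_targets` is a hypothetical helper, and the non-SPIR-V target names in the usage note are illustrative only.

```python
# Targets that need llvm-spirv as an external dependency.
SPIRV_TARGETS = {"spirv-mesa3d", "spirv64-mesa3d"}


def select_targets(requested, all_targets, have_llvm_spirv):
    """Resolve the requested target list, which is either ['all'] or explicit names."""
    if requested == ["all"]:
        # 'all' includes the SPIR-V targets only when llvm-spirv is available.
        return [
            t for t in all_targets if have_llvm_spirv or t not in SPIRV_TARGETS
        ]
    # Explicitly requested targets are always honoured.
    return list(requested)
```

With `all_targets = ["amdgcn--", "nvptx64--", "spirv-mesa3d", "spirv64-mesa3d"]`, asking for `["all"]` without llvm-spirv yields only the first two, while asking for `["spirv-mesa3d"]` by name still selects it.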
…(#87539)

Call generateWaitcnt unconditionally at the end of
SIInsertWaitcnts::insertWaitcntInBlock. Even if we don't need to
generate a new waitcnt instruction it has the effect of combining or
removing redundant waitcnts that were already present. Tests show
various small improvements in waitcnt placement.
The class `ScopedDbgInfoFormatSetter` was added as a convenient way to
temporarily change the debug info format of a function or module, as
part of IR printing; since this process is repeated in a number of other
places, this patch uses the format-setter class in those places as well.
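
The pattern behind such a format-setter class is a scoped save, set and restore. In Python the closest idiom is a context manager. The sketch below is an analogy only: `Module`, `is_new_dbg_info_format` and `scoped_dbg_info_format` are hypothetical names and are not claimed to mirror the LLVM class's members.

```python
from contextlib import contextmanager


class Module:
    """Stand-in for an IR container with a debug-info format flag."""

    def __init__(self, new_format):
        self.is_new_dbg_info_format = new_format


@contextmanager
def scoped_dbg_info_format(module, new_format):
    """Temporarily switch the format and restore the old one on exit."""
    old = module.is_new_dbg_info_format
    module.is_new_dbg_info_format = new_format
    try:
        yield module
    finally:
        # Restore even if printing raised an exception.
        module.is_new_dbg_info_format = old
```

Wrapping each printing site in `with scoped_dbg_info_format(m, False): ...` replaces the repeated manual save and restore code.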
Darwin targets implement -mcmodel=large by forcing all global accesses to use
the GOT, instead of the ELF movz/movk sequence. That means it's compatible with
PIC so the Clang driver shouldn't reject the option.
…g (#72714)

This patch adds lld support for:

- Dynamic R_AARCH64_AUTH_* relocations (without including RELR compressed AUTH
relocations) as described here:
https://github.com/ARM-software/abi-aa/blob/main/pauthabielf64/pauthabielf64.rst#auth-variant-dynamic-relocations

- .note.AARCH64-PAUTH-ABI-tag section as defined here
https://github.com/ARM-software/abi-aa/blob/main/pauthabielf64/pauthabielf64.rst#elf-marking

Depends on #72713 and #85231

---------

Co-authored-by: Peter Collingbourne <peter@pcc.me.uk>
Co-authored-by: Fangrui Song <i@maskray.me>
This is a reland of #86137 with a fix for platforms/compilers that do
not support trivially constructible int128 types.
nickdesaulniers and others added 18 commits April 5, 2024 14:29
…le toolchain (#87684)

Building the Apple way turns off plugin support, meaning we don't need
to export unloadable symbols from all executables. While deadstripping
effects aren't expected to change, enabling this across all tools
prevents the creation of export tries. This saves us ~3.5 MB in just the
universal build of `clang`.
The HOST_LINK_VERSION is a hardcoded string in Darwin clang that detects
the linker version at configure time. The driver uses this information
to build the correct set of arguments for the linker. This patch detects
the linker version again during compiler-rt configuration and passes it
to the libfuzzer tests. This allows a clang built on a machine with a
new linker to run compiler-rt tests on a machine with an old linker.

rdar://125932376
Take care of a TODO. This check makes sure that the fexcept_t value fits
in an int value.

TODO introduced in:
llvm/llvm-project@9550f8b
fatal error for now

Appeases build bots while being investigated.
This script+config should help us generate more consistent documentation wrt.
what we currently support or not.

As an example usage:

    $ ./libc/utils/docgen/docgen.py fenv.h

Will spit out an RST formatted table that can be copy+pasted into our docs.

The config is not filled out entirely, but doing so and then updating our docs
would be great beginner bugs for new contributors.

Having python+json generate things like docs, or headers (as imagined in
https://github.com/nickdesaulniers/llvm-project/tree/hdr-gen2) is perhaps
easier to work with than tablegen, and doesn't introduce a dependency on a host
tool that needs to be compiled from llvm sources before building the rest of
the libc. This can probably be merged with whatever we end up doing to replace
libc-hdrgen.

Please use

https://llvm.org/docs/CodingStandards.html#python-version-and-source-code-formatting
for keeping this file formatted.
…822)

When building flang out-of-tree with relative paths in LLVM_DIR,
CLANG_DIR and MLIR_DIR, we need to compute the absolute paths
based on the CMake build directory (i.e. where the cmake is invoked
from).
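
The path computation described here amounts to resolving a possibly relative directory against the build directory. The actual change is in CMake, so the following is just a Python sketch of the same rule. `resolve_dir` is a hypothetical helper, and `posixpath` is used so the behaviour does not depend on the host OS.

```python
import posixpath


def resolve_dir(path, build_dir):
    """Return `path` if it is absolute, else resolve it against `build_dir`."""
    if posixpath.isabs(path):
        return posixpath.normpath(path)
    return posixpath.normpath(posixpath.join(build_dir, path))
```

For example, `resolve_dir("../llvm/lib/cmake/llvm", "/work/flang-build")` gives `/work/llvm/lib/cmake/llvm`, while an absolute `LLVM_DIR` is returned unchanged.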
The lowering of n-D vector.extract/insert ops to LLVM is not supported
but if one of these accidentally reaches the vector-to-llvm conversion
patterns, we end up with a kind of puzzling crash. This PR fixes that
crash and gracefully bails out in those cases.
…Decimal. (#87827)

I will add `toupper` implementation into it in the next PR.
Context: llvm/llvm-project#87017

- Add proxy header `libc/hdr/math_macros.h` that will:
  - include `<math.h>` in overlay mode,
  - include `"include/llvm-libc-macros/math-macros.h"` in full build mode.
- Its corresponding CMake target `libc.hdr.math_macros` will only depend
  on `libc.include.math` and `libc.include.llvm-libc-macros.math_macros`
  in full build mode.
- Replace all `#include "include/llvm-libc-macros/math-macros.h"` with
  `#include "hdr/math_macros.h"`.
- Add dependency to `libc.hdr.math_macros` CMake target when using
  `add_fp_unittest`.
- Update the remaining dependency.
- Update bazel overlay: add the `libc:hdr_math_macros` target and replace
  all dependencies on `libc:llvm_libc_macros_math_macros` with
  `libc:hdr_math_macros`.
Provide a mechanism to resolve call target information for calls from non-BAT
functions to BAT functions (`YAMLProfileWriter::convert`). Make it generic for
future use in BAT-to-BAT calls.

Test Plan: Updated bolt/test/X86/bolt-address-translation-yaml.test

Reviewers: ayermolo, maksfb, rafaelauler, dcci

Reviewed By: maksfb

Pull Request: llvm/llvm-project#86219
llvm-project/flang/lib/Lower/Bridge.cpp:3775:14:
error: variable 'nbDeviceResidentObject' set but not used [-Werror,-Wunused-but-set-variable]
    unsigned nbDeviceResidentObject = 0;
             ^
1 error generated.
BAT writeMaps encoded the assumption that functions are only split into
two fragments (hot and cold). However, BOLT supports splitting into an
arbitrary number of fragments. Relax that assumption and look up the
primary (hot) fragment explicitly.

Depends on: llvm/llvm-project#86219

Test Plan: Updated bolt/test/X86/yaml-secondary-entry-discriminator.s

Reviewers: ayermolo, rafaelauler, maksfb, dcci

Reviewed By: maksfb, dcci

Pull Request: llvm/llvm-project#87123
Emit the recorded number of blocks, not the number of basic block
hashes. There might be differences in corner cases (openssl
BN_BLINDING_convert_ex function).

Test Plan:
Updated openssl.test in rafaelauler/bolt-tests#31

Reviewers: rafaelauler, ayermolo, maksfb, dcci

Reviewed By: ayermolo

Pull Request: llvm/llvm-project#87830
…o (#87433)

Completely skip include directives that form the filename using macros.

fixes #87303
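
As a rough illustration of skipping such directives, the Python sketch below scans source text and keeps only includes whose filename is written literally as `"..."` or `<...>`. It is a simplified model with hypothetical names (`literal_includes` and the two regexes) and is not the tool's actual preprocessor logic.

```python
import re

# Matches an include directive and captures everything after the keyword.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*(.*)$')
# Matches a literal filename in quotes or angle brackets.
LITERAL_RE = re.compile(r'^(?:"([^"]+)"|<([^>]+)>)')


def literal_includes(source):
    """Return filenames of include directives that name their file literally."""
    found = []
    for line in source.splitlines():
        directive = INCLUDE_RE.match(line)
        if not directive:
            continue
        literal = LITERAL_RE.match(directive.group(1))
        if literal is None:
            # The filename is formed by a macro: skip the directive entirely.
            continue
        found.append(literal.group(1) or literal.group(2))
    return found
```

Given `#include <vector>`, `#include HDR` and `#include "b.h"`, the function reports `vector` and `b.h` and ignores the macro-formed one.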
@whitneywhtsang whitneywhtsang self-assigned this Apr 10, 2024
@whitneywhtsang whitneywhtsang added the genx Pull requests or issues for genx branch label Apr 10, 2024
Signed-off-by: Whitney Tsang <whitney.tsang@intel.com>
@whitneywhtsang whitneywhtsang changed the title [GENX] Update GENX branch to LLVM ed4e505 [GEN] Update GENX branch to LLVM ed4e505 Apr 10, 2024
@whitneywhtsang whitneywhtsang requested a review from a team April 10, 2024 17:41
@whitneywhtsang whitneywhtsang marked this pull request as ready for review April 10, 2024 17:41
@whitneywhtsang whitneywhtsang requested a review from a team April 10, 2024 19:16
@whitneywhtsang whitneywhtsang merged commit 39b8333 into intel:genx Apr 10, 2024
9 checks passed
@whitneywhtsang whitneywhtsang deleted the merge branch April 10, 2024 19:23