[GENX] Update GENX branch to LLVM `5e5a22c` #12112

whitneywhtsang · 2023-12-07T15:58:23Z

No description provided.

This showed up when simplifying a large testcase, where the cfi directives became out of sync with the procs they enclose. Now restricted to platforms that support .subsections_via_symbols. This reverts commit 797b68c. Fixes: #72802 Differential revision: https://reviews.llvm.org/D153167 rdar://111459507

This is the part-2 change to improve codegen for vec_fabs. In this patch, v16f16 and v32f16 fabs are improved. There will be at least two follow-up patches after this one: 1) fixing the ISEL crash when fabs.v32f16 uses custom lowering with AVX512; 2) better expansion for v16f16 and v32f16 types on AVX1 subtargets.

We can use short forward branch to conditionally negate if the value is negative.

Update tests in "vector-contract-matvec-transforms.mlir" so that they are consistent with similar tests in:
* "vector-contract-to-outerproduct-transforms.mlir".

This is to enable further refactoring in a follow-up patch, namely to:
* remove duplication (this will be much easier once consistent naming is used),
* extend tests in "vector-contract-matvec-transforms.mlir" with cases for scalable vectors,
* merge "vector-contract-matvec-transforms.mlir" and "vector-contract-to-outerproduct-transforms.mlir" (there's no need for 2 different files testing identical transformations).

Overview of changes in this patch:
1. Simplify the test by removing MemRef wrappers - this test verifies Vector -> Vector transformations and MemRefs are not needed.
2. Use (m, k) indices instead of (i, j).
3. Rename function names.

This is part of a larger effort to improve test coverage for scalable vectors in the Vector dialect. Implements #72834.

`atomic` is required to be followed by a special `atomic clause`, so this patch manages the parsing of that. We are representing each of the variants of the atomic construct as separate kinds, because they have distinct rules/application/etc, and this should make it easier to check rules in the future.

…ntrinisc (#70362)" This reverts commit f79676a.

… definitions and declarations (#72452) https://reviews.llvm.org/D128482 regressed certain cases of VTT emission which are no longer hidden with -fvisibility=hidden. Fix this regression by marking both declarations and definitions. Fixes [clang codegen][regression] VTT definitions missing dso_local/hidden/etc markings #72451

The bots have been running smoothly for a while. Check out https://libcxx.efcs.ca/cistats.html for more info.

Jakub Jelínek reports: As mentioned in https://gcc.gnu.org/PR112563, the new DECLARE_WRAPPER macro added in 37445e9 and amended in 85d3873 doesn't work on SPARC/Solaris with Solaris as. While clang and GNU as when used from GCC seem to be forgiving on most architectures and allow both %function and @function (with the latter not being allowed on ARM/AArch64 I believe because @ is assembler comment start there), Solaris as doesn't allow the %function form. Fix it by using %function only for ARM. Co-developed-by: Jakub Jelínek <jakub@redhat.com> Reported-by: Jakub Jelínek <jakub@redhat.com> Closes: llvm/llvm-project#72970

These assertions can only be triggered by bugs in the algorithm's implementation; all user inputs should be handled gracefully.

OutputUnformattedBlock and InputUnformattedBlock are not used.

…2576) Build clang with the host compiler and ccache enabled in order to speed up the phase 1 builds. This helps reduce the amount of time spent running on the non-free builders.

…le reader and writer (#73026) Test fixture `MaybeSparseInstrProfTest` parameterizes InstrProfWriter by whether output is sparse or not. This test fixture has 20 test cases, and 6 of them don't use the profile reader and writer. Undoing the parameterization for these test cases will reduce redundant tests. This is one clean-up PR. (A few more clean-ups are to come soon, but they are not inter-dependent.)

This will help users analyze whether high register usage is coming from inability of scheduler to reduce RP, or from sacrificing good RP to improve ILP.

The NonNeg flag was being Anded with the Exact flag.

Baremetal targets tend to implement their own runtime support for sanitizers. Clang driver gatekeeping of allowed sanitizer types is counterproductive. This change allows anything that does not crash and burn in compilation, and leaves any potential runtime issues for the user to figure out.

This skips the build of all the unittests and llvm/clang tools, reducing the number of ninja targets from 4,826 to 3,816 in phase 1 and phase 2.

Fixes #71498

…acePrinter. (#73029) Make some methods of StackTracePrinter that will have a common implementation non-virtual.

Adds a new Implementation of StackTracePrinter that only emits symbolizer markup. Currently this change only affects Fuchsia OS. Should be NFC.

… test of value profiles (#73038) This patch factors out the common code among three similar test cases. The input data and test logic are pretty similar. Parameterize the differences (prof-weight and endianness) as advised in llvm/llvm-project#72611. - Remove duplicated tests

removed two unused methods, removed obsoleted FIXME

Remove the run lines that check for the FIR lowering. HLFIR lowering produces the same check lines.

…ebugger (#71564) Use "__this" in DataMemberRecord so that the VS debugger can parse it normally. Fixes #71562

…ests (#73035) HLFIR lowering has been set by default now and FIR lowering support will be removed in the near future. This patch removes the specific FIR check lines on enter/exit data tests.

Update regex to _explicitly_ show which exp versions are added. The previous regex used `exp[^e]` to avoid matching calls like: `@llvm.experimental.stepvector`. Note: ArmPL Mappings for scalable types are not yet utilized (e.g., `llvm.exp10.nxv2f64`, `llvm.exp10.nxv4f32`), as the `replace-with-veclib` pass needs improvements.
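To illustrate the difference between the two styles of pattern, here is a minimal C++ sketch using `std::regex`. Only `exp[^e]` comes from the description above; the explicit pattern and the helper names `matchesOld` and `matchesExplicit` are assumptions made for illustration and are not the regex used in the patch.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Old-style pattern from the description: "exp" followed by any
// character other than 'e', which skips "experimental".
inline bool matchesOld(const std::string &S) {
  return std::regex_search(S, std::regex("exp[^e]"));
}

// Hypothetical explicit pattern (an assumption, not taken from the patch):
// each exp variant is listed by name, so the intent is visible at a glance.
inline bool matchesExplicit(const std::string &S) {
  return std::regex_search(S, std::regex(R"(llvm\.(exp|exp2|exp10)\.)"));
}
```

Both helpers reject `@llvm.experimental.stepvector`, but only the explicit form states which intrinsic names are expected to match.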

…(#72526) The code in the CloneInstructionsIntoPredec... function modified by this patch has a long history that dates back to 2011, see d715ec8. There, when folding branches, all dbg.value intrinsics seen when folding would be saved and then re-inserted at the end of whatever was folded. Over the last 12 years this behaviour has been preserved.

However, IMO it's bad behaviour. If we have:

    inst1
    dbg.value1
    inst2
    dbg.value2

And we fold that sequence into a different block, then we would want the instructions and variable assignments to appear in the same order. However because of this old behaviour, the dbg.values are sunk, and we get:

    inst1
    inst2
    dbg.value1
    dbg.value2

This clustering of dbg.values can make assignments to the same variable invisible, as well as reducing the coverage of other assignments.

This patch relaxes the CloneInstructions... function and allows it to clone and update dbg.values in-place, causing them to appear in the original order in the destination block. I've added some extra dbg.values to the updated test: without the changes to the pass, the dbg.values sink into a blob ahead of the select. The RemoveDIs code can't cope with this right now so I've removed the "--try..." flag, restored in a commit to land in a couple of hours.

(Metadata changes to make the LLVM-IR parser not drop the debug-info for it being out of date. The RemoveDIs related RUN line has been removed because it was spuriously passing due to the debug-info being dropped).
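The ordering difference described in this commit message can be modelled with a small C++ sketch. It treats a block as a list of strings and is not the LLVM implementation; the names `Seq`, `isDbgValue`, `cloneSinkingDbgValues` and `cloneInPlace` are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

using Seq = std::vector<std::string>;

// Hypothetical classifier: an entry is a debug record if its name
// starts with "dbg.value".
inline bool isDbgValue(const std::string &S) {
  return S.rfind("dbg.value", 0) == 0;
}

// Old behaviour as described: debug records are saved while cloning
// and re-inserted after everything else.
inline Seq cloneSinkingDbgValues(const Seq &Src) {
  Seq Out, Dbg;
  for (const std::string &I : Src)
    (isDbgValue(I) ? Dbg : Out).push_back(I);
  Out.insert(Out.end(), Dbg.begin(), Dbg.end());
  return Out;
}

// New behaviour as described: every entry is cloned in its original position.
inline Seq cloneInPlace(const Seq &Src) { return Src; }
```

For the input `inst1, dbg.value1, inst2, dbg.value2`, the first function clusters both debug records at the end, while the second keeps each one next to the instruction it follows.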

…n type of function. (#69724) Import of a function with `auto` return type that is expanded to a `SubstTemplateTypeParmType` could fail if the function itself is the template specialization where the parameter was replaced.

Explicitly include some headers or forward-declare types, in preparation for removing an include that pulls in many transitive headers.

If GOTSym is not defined, we cannot call `GOTSym.getBlock()`. It failed with:
```
Assertion failed: (Base->isDefined() && "Not a defined symbol"), function getBlock, file JITLink.h, line 554.
```

Manually add clobbers for various register combinations to tests. This highlights incorrect shrink-wrapping, with the StoreSwiftAsyncContext expansion clobbering a live register.

The new libcxx workflow is run and fails in non-LLVM repos. Skip it, similar to other workflows.

There have been some minor but pervasive changes to the generated CHECK lines, so regenerate all of them, to minimize future diffs.

Avoid some spurious changes in a future patch.

…ction (#73311) Floating point properties are a combination of target OS, target architecture and compiler support.
- Adding target OS detection,
- Moving floating point type detection to its own file.

This is in preparation of adding support for `_Float16` which requires testing compiler **version** and target architecture.

Signed-off-by: Tsang, Whitney <whitney.tsang@intel.com>

whitneywhtsang · 2023-12-09T04:43:36Z

Merge when triton-dse repo is ready to move to 34f165d6633713a2c9b667927c3baaf018157ef1.

jroelofs and others added 30 commits November 21, 2023 10:33

[AMDGPU] NFC. Run auto-update on a few tests

b072ec5

[RISCV] Add rv32 command line to short-forward-branch-opt.ll. NFC

ce61274

[RISCV] Use short forward branch for ISD::ABS.

7a6fd49

We can use short forward branch to conditionally negate if the value is negative.
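As a rough source-level picture of the idea (not the RISC-V lowering itself), a conditional negate looks like the C++ sketch below. The function name `absViaConditionalNegate` is hypothetical; the short forward branch corresponds to skipping the single negate when the value is already non-negative.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: computes |X| with a conditional negate.
// The negation is done in unsigned arithmetic so that INT64_MIN wraps
// back to INT64_MIN instead of overflowing.
inline int64_t absViaConditionalNegate(int64_t X) {
  if (X < 0) // branch forward over the negate when X >= 0
    X = static_cast<int64_t>(0 - static_cast<uint64_t>(X));
  return X;
}
```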

Revert "[SVE2.1][Clang][LLVM]Add BFloat16 builtin in Clang and LLVM i…

e1ee0e8

…ntrinisc (#70362)" This reverts commit f79676a.

[RISCV] Replace XLenVT in RV64 only pattern with i64. NFC

c9fd76f

[libc++] Promote android to supported. (#72949)

46a8479

The bots have been running smoothly for a while. Check out https://libcxx.efcs.ca/cistats.html for more info.

[Bazel][clang] Fix build for e6ef315

f544533

[libc++][hardening] Categorize all ryu assertions as internal (#71853)

bed1a5b

These assertions can only be triggered by bugs in the algorithm's implementation; all user inputs should be handled gracefully.

[flang] Remove dead code and update test (NFC) (#73004)

8cf6e94

OutputUnformattedBlock and InputUnformattedBlock are not used.

workflows/release-binaries: Do a preliminary build to fill ccache (#7…

e746b56

…2576) Build clang with the host compiler and ccache enabled in order to speed up the phase 1 builds. This helps reduce the amount of time spent running on the non-free builders.

[AMDGPU] NFC: Add flag to disable clustered low occupancy phase (#73025)

5b2fee8

This will help users analyze whether high register usage is coming from inability of scheduler to reduce RP, or from sacrificing good RP to improve ILP.

[SelectionDAG] Fix copy/paste mistake in SDNodeFlags::intersectWith

6a082ed

The NonNeg flag was being Anded with the Exact flag.
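A minimal sketch of this kind of slip, using a simplified stand-in struct rather than LLVM's actual `SDNodeFlags`. The struct name `Flags` and its layout are assumptions for illustration only.

```cpp
#include <cassert>

// Hypothetical, simplified flags struct; not LLVM's SDNodeFlags.
struct Flags {
  bool Exact = false;
  bool NonNeg = false;

  // Keep only the flags that are set on both sides.
  void intersectWith(const Flags &Other) {
    Exact &= Other.Exact;
    // A copy/paste slip of the kind described would read
    // `NonNeg &= Other.Exact;` here, tying NonNeg to the wrong flag.
    NonNeg &= Other.NonNeg;
  }
};
```

With the slip in place, intersecting two flag sets where only one side has `NonNeg` but both have `Exact` would wrongly keep `NonNeg` set.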

[clang][NFC] Reorder Atomic builtins to be consistent. (#72718)

752c21b

test-release.sh: Only build the clang target in stage 1 and 2 (#72703)

907ed77

This skips the build of all the unittests and llvm/clang tools, reducing the number of ninja targets from 4,826 to 3,816 in phase 1 and phase 2.

[libc++] Make common_iterator's data member private (#72564)

ca9b1d1

Fixes #71498

[NFC sanitizer_symbolizer] Make some functions non virtual in StackTr…

23c84fb

…acePrinter. (#73029) Make some methods of StackTracePrinter that will have a common implementation non-virtual.

[sanitizer_symbolizer] Add MarkupStackTracePrinter (#73032)

cc21287

Adds a new Implementation of StackTracePrinter that only emits symbolizer markup. Currently this change only affects Fuchsia OS. Should be NFC.

[mlir][sparse] code cleanup (#73047)

d2d2928

removed two unused methods, removed obsoleted FIXME

[flang][openacc][NFC] Remove run line for FIR only checks (#73050)

d82b521

Remove the run lines that check for the FIR lowering. HLFIR lowering produces the same check lines.

Supports viewing class member variables in lambda when using the vs d…

7c3c243

…ebugger (#71564) Use "__this" in DataMemberRecord so that the VS debugger can parse it normally. Fixes #71562

[flang][openacc][NFC] Check only HLFIR lowering for enter/exit data t…

c384888

…ests (#73035) HLFIR lowering has been set by default now and FIR lowering support will be removed in the near future. This patch removes the specific FIR check lines on enter/exit data tests.

paschalis-mpeis and others added 14 commits November 24, 2023 12:24

Remove extraneous ` in AttrDocs.td

e8cd401

[libc][NFC] Remove dead code (#73315)

dc9787c

[clang] Classify vector types in __builtin_classify_type (#73299)

a79a561

[CodeGen] Make some includes explicit (NFC)

7eeedc1

Explicitly include some headers or forward-declare types, in preparation for removing an include that pulls in many transitive headers.

[llvm-jitlink] Avoid assertion failure in make_error parameter

c8562e8

If GOTSym is not defined, we cannot call `GOTSym.getBlock()`. It failed with:
```
Assertion failed: (Base->isDefined() && "Not a defined symbol"), function getBlock, file JITLink.h, line 554.
```
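The guard implied by this message can be sketched as follows. The `Block` and `Symbol` types here are simplified stand-ins written for this example, not the real JITLink classes, and `describeBlock` is a hypothetical helper.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins that mirror the names in the assertion message.
struct Block {
  std::string Name;
};

class Symbol {
public:
  Symbol() = default;
  explicit Symbol(Block *B) : Base(B) {}

  bool isDefined() const { return Base != nullptr; }

  // Only valid for defined symbols, as in the assertion quoted above.
  Block &getBlock() const {
    assert(isDefined() && "Not a defined symbol");
    return *Base;
  }

private:
  Block *Base = nullptr;
};

// Build text for an error message without tripping the assertion:
// check isDefined() before touching getBlock().
inline std::string describeBlock(const Symbol &Sym) {
  if (!Sym.isDefined())
    return "<undefined symbol>";
  return Sym.getBlock().Name;
}
```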

[JITLink] Fix typos: symobls -> symbols (NFC)

962829b

[AArch64] Add artificial clobbers to swift async context test.

820b358

Manually add clobbers for various register combinations to tests. This highlights incorrect shrink-wrapping, with the StoreSwiftAsyncContext expansion clobbering a live register.

[CI] Skip libcxx in non-llvm repo (#73282)

85ee351

The new libcxx workflow is run and fails in non-LLVM repos. Skip it, similar to other workflows.

[SCEV] Regenerate test checks (NFC)

88f7dc1

There have been some minor but pervasive changes to the generated CHECK lines, so regenerate all of them, to minimize future diffs.

[CVP] Regenerate test checks (NFC)

50c298f

Avoid some spurious changes in a future patch.

whitneywhtsang added the genx label Dec 7, 2023

whitneywhtsang self-assigned this Dec 7, 2023

whitneywhtsang force-pushed the merge branch 2 times, most recently from 12c04ad to aa06c2e Compare December 7, 2023 23:24

whitneywhtsang requested a review from pengtu December 8, 2023 01:21

pengtu approved these changes Dec 8, 2023

Merge commit '5e5a22caf88ac1ccfa8dc5720295fdeba0ad9372'

db6ab73

whitneywhtsang force-pushed the merge branch from a468919 to 0dd4c2d Compare December 8, 2023 11:27

whitneywhtsang and others added 3 commits December 8, 2023 17:02

[GENX] Use opaque pointer

150c0a8

Signed-off-by: Tsang, Whitney <whitney.tsang@intel.com>

[GENX] Update GenIntrinsicEnum.h GenIntrinsics.h

970ef1a

[GENX] Pass 0 as cache control option for 2DBlock[Read|Write]

728ce37

Signed-off-by: Tsang, Whitney <whitney.tsang@intel.com>

whitneywhtsang force-pushed the merge branch from cf01699 to 728ce37 Compare December 9, 2023 01:03

etiotto marked this pull request as ready for review December 13, 2023 19:14

etiotto merged commit 728ce37 into intel:genx Dec 13, 2023

whitneywhtsang deleted the merge branch December 14, 2023 12:33
