Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GENX] Update GENX branch to LLVM 4017f04 #12711

Merged
merged 1,236 commits into from
Feb 14, 2024
Merged

Conversation

whitneywhtsang
Copy link
Contributor

No description provided.

cyndyishida and others added 30 commits January 26, 2024 16:12
A distinction that doesn't _usually_ matter is that the
MachO::SymbolKind is really a mapping of entries in TBD files not
symbols. To better understand this, rename the enum so it represents an
encoding mapped to TBDs as opposed to symbols alone.

For example, it can be a bit confusing that "GlobalSymbol" is a enum
value when all of those values can represent a GlobalSymbol.
Use definitions from `<linux/mman.h>` to dispatch arch-specific flag
values.
For example, `MCL_CURRENT/MCL_FUTURE/MCL_ONFAULT` are different on
different architectures.
This patch refactors the instantiation of BenchmarkMeasure within all
the unit tests to use BenchmarkMeasure::Create rather than through
direct struct instantialization. This allows us to change what values
are stored in BenchmarkMeasure without getting compiler warnings on
every instantiation in the unit tests, and is also just a cleanup in
general as the Create function didn't seem to exist at the time the unit
tests were originally written.
The use of SmallDenseSet saves 0.39% of heap allocations during the
compilation of a large preprocessed file, namely X86ISelLowering.cpp,
for the X86 target.  During the experiment, WL.size() was 2 or less
99.9% of the time.  The inline size of 4 should accommodate up to 2
entries at the 3/4 occupancy rate.
Rushing this one out before vacation starts. Refactoring on top of
#66505
On Darwin, the Makefile already (ad-hoc) signs everything it builds.
There's also no need to use lldb_codesign for this.
In trying to set up python headers in an out-of-tree bazel MLIR project,
I encountered the `pybind11_bazel` project, and found that the
`@python_runtime` target used here is not defined by it.

Instead, it seems that `@python_runtime` is an alias used in some
projects like Tensorflow (see

https://github.com/tensorflow/tensorflow/blob/322936ffdd96ee59e27d028467fe458859cf3855/third_party/python_runtime/BUILD#L7-L7),
where it is aliased to `@local_config_python`. In fact,
`@local_config_python` is defined by `@pybind11_bazel`, and so it seems
that this layer of indirection no longer serves a purpose, and instead
just prevents anyone who doesn't clone Tensorflow's config from using
the python bindings here.

This commit updates the dependent targets to their canonical de-aliased
equivalents, and I suspect this will not even break any downstream users
since the new target is defined in those projects already.

Without this change, running, for example

```
bazel build @llvm-project//mlir:MLIRBindingsPythonCore
```

gives the error

```
no such package '@python_runtime//': The repository '@python_runtime'
could not be resolved: Repository '@python_runtime' is not defined and
referenced by '@llvm-project//mlir:MLIRBindingsPythonCore'
```

Minimal reproduction in https://github.com/j2kun/test_mlir_bazel_pybind,
which, when pointing to a local LLVM repository that has this change
(see `bazel/import_llvm.bzl` in that repository), results in that build
succeeding.

Hat tip to Maksim Levental for going on an hours-long investigation with
me to figure this out.
On Gentoo, libc++ is indeed in /usr/include/c++/*, but libstdc++ is at
e.g. /usr/lib/gcc/x86_64-pc-linux-gnu/14/include/g++-v14.

Use '/include/g++' as it should be unique enough. Note that the omission of
a trailing slash is intentional to match g++-*.

See llvm/llvm-project#78534 (comment).

Reviewed by: mgorny
Closes: llvm/llvm-project#79264

Signed-off-by: Sam James <sam@gentoo.org>
…nside a constraint scope (#79568)

We preserve the trailing requires-expression during the lambda
expression transformation. In order to get those referenced parameters
inside a requires-expression properly resolved to the instantiated
decls, we intended to inject these 'original' `ParmVarDecls` to the
current instantiaion scope, at `Sema::SetupConstraintScope`.

The previous approach seems to overlook nested instantiation chains,
leading to the crash within a nested lambda followed by a requires
clause.

This fixes llvm/llvm-project#73418.
classifyComplexElementType() doesn't return a std::optional anymore.
This patch bumps the mlgo-utils version to 19.0.0 as 18.0.0 got branched
recently.
Implements https://isocpp.org/files/papers/P2662R3.pdf

The feature is exposed as an extension in older language modes.
Mangling is not yet supported and that is something we will have to do before release.
…s (#79371)

This pull request would solve
llvm/llvm-project#78449 .
There is also a discussion about this on stackoverflow:
https://stackoverflow.com/questions/77832658/stdtype-identity-to-support-several-variadic-argument-lists
.

The following program is well formed:
```cpp
#include <type_traits>

template <typename... T>
struct args_tag
{
    using type = std::common_type_t<T...>;
};

template <typename... T>
void bar(args_tag<T...>, std::type_identity_t<T>..., int, std::type_identity_t<T>...) {}

// example
int main() {
    bar(args_tag<int, int>{}, 4, 8, 15, 16, 23);
}
```

but Clang rejects it, while GCC and MSVC doesn't. The reason for this is
that, in `Sema::DeduceTemplateArguments` we are not prepared for this
case.

# Substitution/deduction of parameter packs
The logic that handles substitution when we have explicit template
arguments (`SubstituteExplicitTemplateArguments`) does not work here,
since the types of the pack are not pushed to `ParamTypes` before the
loop starts that does the deduction.
The other "candidate" that may could have handle this case would be the
loop that does the deduction for trailing packs, but we are not dealing
with trailing packs here.

# Solution proposed in this PR
The solution proposed in this PR works similar to the trailing pack
deduction. The main difference here is the end of the deduction cycle.
When a non-trailing template pack argument is found, whose type is not
explicitly specified and the next type is not a pack type, the length of
the previously deduced pack is retrieved (let that length be `s`). After
that the next `s` arguments are processed in the same way as in the case
of non trailing packs.

# Another possible solution
There is another possible approach that would be less efficient. In the
loop when we get to an element of `ParamTypes` that is a pack and could
be substituted because the type is deduced from a previous argument,
then `s` number of arg types would be inserted before the current
element of `ParamTypes` type. Then we would "cancel" the processing of
the current element, first process the previously inserted elements and
the after that re-process the current element.
Basically we would do what `SubstituteExplicitTemplateArguments` does
but during deduction.

# Adjusted test cases
In
`clang/test/CXX/temp/temp.fct.spec/temp.deduct/temp.deduct.call/p1-0x.cpp`
there is a test case named `test_pack_not_at_end` that should work, but
still does not. This test case is relevant because the note for the
error message has changed.
This is what the test case looks like currently:
```cpp
template<typename ...Types>
void pack_not_at_end(tuple<Types...>, Types... values, int); // expected-note {{<int *, double *> vs. <int, int>}}

void test_pack_not_at_end(tuple<int*, double*> t2) {
  pack_not_at_end(t2, 0, 0, 0); // expected-error {{no match}}
  // FIXME: Should the "original argument type must match deduced parameter
  // type" rule apply here?
  pack_not_at_end<int*, double*>(t2, 0, 0, 0); // ok
}

```

The previous note said (before my changes):
```
deduced conflicting types for parameter 'Types' (<int *, double *> vs. <>)
````
The current note says (after my changesand also clang 14 would say this
if the pack was not trailing):
```
deduced conflicting types for parameter 'Types' (<int *, double *> vs. <int, int>)
```
GCC says: 
```
error: no matching function for call to ‘pack_not_at_end(std::tuple<int*, double*>&, int, int, int)’
   70 |     pack_not_at_end(t2, 0, 0, 9); // expected-error {{no match}}
````

---------

Co-authored-by: cor3ntin <corentinjabot@gmail.com>
Co-authored-by: Erich Keane <ekeane@nvidia.com>
This patch is aiming at resolving the below missed-optimization case. 

### Code
```
define <8 x i64> @vwadd_mask_v8i32(<8 x i32> %x, <8 x i64> %y) {
    %mask = icmp slt <8 x i32> %x, <i32 42, i32 42, i32 42, i32 42, i32 42, i32 42, i32 42, i32 42>
    %a = select <8 x i1> %mask, <8 x i32> %x, <8 x i32> zeroinitializer
    %sa = sext <8 x i32> %a to <8 x i64>
    %ret = add <8 x i64> %sa, %y
    ret <8 x i64> %ret
}
```

### Before this patch
[Compiler Explorer](https://godbolt.org/z/cd1bKTrx6)
```
vwadd_mask_v8i32:
        li      a0, 42
        vsetivli        zero, 8, e32, m2, ta, ma
        vmslt.vx        v0, v8, a0
        vmv.v.i v10, 0
        vmerge.vvm      v16, v10, v8, v0
        vwadd.wv        v8, v12, v16
        ret
```

### After this patch
```
vwadd_mask_v8i32:
        li a0, 42
        vsetivli zero, 8, e32, m2, ta, ma
        vmslt.vx v0, v8, a0
        vsetvli zero, zero, e32, m2, tu, mu
        vwadd.wv v12, v12, v8, v0.t
        vmv4r.v v8, v12
        ret
```
This pattern could be found in a reduction with a widening destination

Specifically, we first do a fold like `(vwadd.wv y, (vmerge cond, x, 0))
-> (vwadd.wv y, x, y, cond)`, then do pattern matching on it.
…(#79657)

The `map` clause in OpenMP allows structure components to be specified
(unlike other clauses). Structure components do get their own symbols,
but these are not meant to be instantiated.

When a component reference is passed as an argument to the omp.target
op, it gets a corresponding parameter in the target op's entry block.
The original symbols are then bound to the same kind of an extended
value as before, but the value is now based on the parameters.

To handle structure components more gracefully, put their symbols on the
list of mapped objects, but skip them when creating extended values.

Fixes llvm/llvm-project#79478.
llvm-project/llvm/lib/Target/RISCV/RISCVISelLowering.cpp:13754:12:
error: unused variable 'Opc' [-Werror,-Wunused-variable]
  unsigned Opc = N->getOpcode();
           ^
1 error generated.
This patch implements cloning for VPlans and recipes. Cloning is used in
the epilogue vectorization path, to clone the VPlan for the main vector
loop. This means we won't re-use a VPlan when executing the VPlan for
the epilogue vector loop, which in turn will enable us to perform
optimizations based on UF & VF.
Annotating tokens can invalid the stack of Peaked tokens.
tbaederr and others added 21 commits January 30, 2024 11:25
Reverts llvm/llvm-project#78120

Buildbot is broken:

llvm/lib/Support/RISCVISAInfo.cpp:910:18: error: call to deleted
constructor of 'llvm::Error'
          return E;
                 ^
… LLVMIR (#79828)

There is no `SHL` used in canonicalization in `arith`

---------

Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>
Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
reserveRegisterTuples is slow because it uses MCRegAliasIterator and
hence ends up reserving the same aliased registers many times. This
patch changes getReservedRegs not to use it for reserving SGPRs, VGPRs
and AGPRs. Instead it iterates through base register classes, which
should come closer to reserving each register once only.

Overall this speeds up the time to run check-llvm-codegen-amdgpu in my
Release build from 18.4 seconds to 16.9 seconds (all timings +/- 0.2).
…mparison (#79698)

This is a follow-up for the comparison of constraints on out-of-line
function template definitions. We require the instantiation of a
ParmVarDecl while transforming the expression if that Decl gets
referenced by a DeclRefExpr. However, we're not actually performing the
class or function template instantiation at the time of such comparison.
Therefore, let's map these parameters to themselves so that they get
preserved after the substitution.

Fixes llvm/llvm-project#74447.
…ressed' switch code. NFC.

Stop clang-format trying to expand manually compressed lookup switch() code - if it still fits into 80col, then keep it to a single line instead of expanding across multiple lines each.
…tructions

Minor correction for #79775 - noticed in EXPENSIVE_CHECKS builds
`OpaqueValueExpr` doesn't necessarily contain a source expression.
Particularly, after #78041, it is used to carry the type and the value
kind of a non-type template argument of floating-point type or referring
to a subobject (those are so called `StructuralValue` arguments).

This fixes #79575.
Those were deprecated and basically not used anymore after we renamed
them in batch. This patch removes the macros entirely.
…U (#79322)

This patch tries to better explain the differences between the
`IsTargetDevice` and `IsGPU` flags of the `OpenMPIRBuilderConfig`.
Another round of additional tests for
llvm/llvm-project#7863
with different sext/zext and use variants.
…ass template explict specializations (#78720)

According to [[dcl.type.elab]
p2](http://eel.is/c++draft/dcl.type.elab#2):
> If an
[elaborated-type-specifier](http://eel.is/c++draft/dcl.type.elab#nt:elaborated-type-specifier)
is the sole constituent of a declaration, the declaration is ill-formed
unless it is an explicit specialization, an explicit instantiation or it
has one of the following forms [...]

Consider the following:
```cpp
template<typename T>
struct A 
{
    template<typename U>
    struct B;
};

template<>
template<typename U>
struct A<int>::B; // intel#1
```
The _elaborated-type-specifier_ at `intel#1` declares an explicit
specialization (which is itself a template). We currently (incorrectly)
reject this, and this PR fixes that.

I moved the point at which _elaborated-type-specifiers_ with
_nested-name-specifiers_ are diagnosed from `ParsedFreeStandingDeclSpec`
to `ActOnTag` for two reasons: `ActOnTag` isn't called for explicit
instantiations and partial/explicit specializations, and because it's
where we determine if a member specialization is being declared.

With respect to diagnostics, I am currently issuing the diagnostic
without marking the declaration as invalid or returning early, which
results in more diagnostics that I think is necessary. I would like
feedback regarding what the "correct" behavior should be here.
This prevents having to use double parentheses in common cases.
The <__threading_support> header is a huge beast and it's really
difficult to navigate. I find myself struggling to find what I want
every time I have to open it, and I've been considering splitting it up
for years for that reason.

This patch aims not to contain any functional change. The various
implementations of the threading base are simply moved to separate
headers and then the individual headers are simplified in mechanical
ways. For example, we used to have redundant declarations of all the
functions at the top of `__threading_support`, and those are removed
since they are not needed anymore. The various #ifdefs are also
simplified and removed when they become unnecessary.

Finally, this patch adds documentation for the API we expect from any
threading implementation.
…(#79871)

Some of the checks in sfinae_helpers.h were not used anymore since we
refactored the std::tuple implementation and were now dead code. This
patch removes the code.
This macro is unnecessary with `basic_string& operator=(value_type
__c)`.
Signed-off-by: Whitney Tsang <whitney.tsang@intel.com>
@whitneywhtsang whitneywhtsang changed the title [GENX] Update GENX branch to LLVM 0784b1e [GENX] Update GENX branch to LLVM 4017f04 Feb 14, 2024
@whitneywhtsang whitneywhtsang requested a review from a team February 14, 2024 04:24
@whitneywhtsang whitneywhtsang self-assigned this Feb 14, 2024
@whitneywhtsang whitneywhtsang added the genx Pull requests or issues for genx branch label Feb 14, 2024
@whitneywhtsang whitneywhtsang merged commit 23e2542 into intel:genx Feb 14, 2024
11 checks passed
@whitneywhtsang whitneywhtsang deleted the merge branch February 14, 2024 14:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
genx Pull requests or issues for genx branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.