[GENX] Update GENX branch to LLVM 4017f04
#12711
Merged
A distinction that doesn't _usually_ matter is that MachO::SymbolKind is really a mapping of entries in TBD files, not of symbols. To make this clearer, rename the enum so it represents an encoding mapped to TBDs as opposed to symbols alone. For example, it can be a bit confusing that "GlobalSymbol" is an enum value when all of those values can represent a GlobalSymbol.
Use definitions from `<linux/mman.h>` to dispatch arch-specific flag values. For example, `MCL_CURRENT/MCL_FUTURE/MCL_ONFAULT` are different on different architectures.
This patch refactors the instantiation of BenchmarkMeasure within all the unit tests to use BenchmarkMeasure::Create rather than direct struct initialization. This allows us to change what values are stored in BenchmarkMeasure without getting compiler warnings on every instantiation in the unit tests, and is also just a cleanup in general, as the Create function didn't seem to exist at the time the unit tests were originally written.
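A minimal sketch of why a factory function insulates call sites from field changes. The `Measure` struct below is a hypothetical, simplified stand-in, not the real `BenchmarkMeasure` definition; its field names and the `Create` signature are assumptions made for illustration.

```cpp
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for a benchmark measurement record.
// The fields are illustrative assumptions, not the actual LLVM layout.
struct Measure {
  std::string Key;
  double PerInstructionValue = 0.0;
  double PerSnippetValue = 0.0;

  // Factory: callers pass only the values they know. If a field is added
  // later, only this function needs to decide its default.
  static Measure Create(std::string Key, double Value) {
    Measure M;
    M.Key = std::move(Key);
    M.PerInstructionValue = Value;
    M.PerSnippetValue = Value;
    return M;
  }
};
```

A test written as `Measure::Create("latency", 1.5)` keeps compiling unchanged when the struct grows, whereas a brace-initialized `Measure{"latency", 1.5, 1.5}` may start to warn about missing field initializers.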
The use of SmallDenseSet saves 0.39% of heap allocations during the compilation of a large preprocessed file, namely X86ISelLowering.cpp, for the X86 target. During the experiment, WL.size() was 2 or less 99.9% of the time. The inline size of 4 should accommodate up to 2 entries at the 3/4 occupancy rate.
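The arithmetic behind "inline size of 4 accommodates up to 2 entries" can be sketched as follows. This assumes the table grows as soon as the new entry count would reach three quarters of the bucket count (`NewNumEntries * 4 >= NumBuckets * 3`); the helper name `maxEntriesBeforeGrowth` is hypothetical and not part of LLVM.

```cpp
#include <cstddef>

// Largest number of entries a table with NumBuckets buckets can hold before
// the next insertion would trigger growth, ASSUMING growth fires once
// NewNumEntries * 4 >= NumBuckets * 3 (a 3/4 occupancy limit).
// Equivalent to the largest N with N * 4 < NumBuckets * 3.
constexpr std::size_t maxEntriesBeforeGrowth(std::size_t NumBuckets) {
  return NumBuckets == 0 ? 0 : (3 * NumBuckets - 1) / 4;
}

// With 4 inline buckets, the third insertion would hit the limit
// (3 * 4 >= 4 * 3), so only 2 entries fit inline.
static_assert(maxEntriesBeforeGrowth(4) == 2, "4 buckets hold 2 entries");
static_assert(maxEntriesBeforeGrowth(8) == 5, "8 buckets hold 5 entries");
```

Under that assumption, an inline size of 4 is the smallest power of two that covers the observed case of `WL.size()` being 2 or less.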
…aryFunctionsChecker (#78895)
Rushing this one out before vacation starts. Refactoring on top of #66505
On Darwin, the Makefile already (ad-hoc) signs everything it builds. There's also no need to use lldb_codesign for this.
In trying to set up python headers in an out-of-tree bazel MLIR project, I encountered the `pybind11_bazel` project, and found that the `@python_runtime` target used here is not defined by it. Instead, it seems that `@python_runtime` is an alias used in some projects like Tensorflow (see https://github.com/tensorflow/tensorflow/blob/322936ffdd96ee59e27d028467fe458859cf3855/third_party/python_runtime/BUILD#L7-L7), where it is aliased to `@local_config_python`. In fact, `@local_config_python` is defined by `@pybind11_bazel`, and so it seems that this layer of indirection no longer serves a purpose, and instead just prevents anyone who doesn't clone Tensorflow's config from using the python bindings here.

This commit updates the dependent targets to their canonical de-aliased equivalents, and I suspect this will not even break any downstream users since the new target is defined in those projects already.

Without this change, running, for example

```
bazel build @llvm-project//mlir:MLIRBindingsPythonCore
```

gives the error

```
no such package '@python_runtime//': The repository '@python_runtime' could not be resolved: Repository '@python_runtime' is not defined and referenced by '@llvm-project//mlir:MLIRBindingsPythonCore'
```

Minimal reproduction in https://github.com/j2kun/test_mlir_bazel_pybind, which, when pointing to a local LLVM repository that has this change (see `bazel/import_llvm.bzl` in that repository), results in that build succeeding.

Hat tip to Maksim Levental for going on an hours-long investigation with me to figure this out.
See #78920. This reverts commit ce3e767.
On Gentoo, libc++ is indeed in /usr/include/c++/*, but libstdc++ is at e.g. /usr/lib/gcc/x86_64-pc-linux-gnu/14/include/g++-v14. Use '/include/g++' as it should be unique enough. Note that the omission of a trailing slash is intentional to match g++-*. See llvm/llvm-project#78534 (comment). Reviewed by: mgorny Closes: llvm/llvm-project#79264 Signed-off-by: Sam James <sam@gentoo.org>
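The effect of matching on `/include/g++` without a trailing slash can be illustrated with a small substring check. The helper name `looksLikeLibStdCxxPath` and the sample file names are hypothetical; this is a sketch of the matching idea, not the code changed by the commit.

```cpp
#include <string_view>

// Hypothetical helper: treats a path as a libstdc++ header location if it
// contains "/include/g++". The missing trailing slash is deliberate so that
// versioned directories such as "g++-v14" also match.
constexpr bool looksLikeLibStdCxxPath(std::string_view Path) {
  return Path.find("/include/g++") != std::string_view::npos;
}

// Gentoo-style libstdc++ location from the commit message: matches.
static_assert(looksLikeLibStdCxxPath(
    "/usr/lib/gcc/x86_64-pc-linux-gnu/14/include/g++-v14/vector"));
// A path under /usr/include/c++/ does not contain the marker.
static_assert(!looksLikeLibStdCxxPath("/usr/include/c++/v1/vector"));
```

Had the pattern been `/include/g++/`, the `g++-v14` directory would not match, which is why the trailing slash is omitted.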
…nside a constraint scope (#79568) We preserve the trailing requires-expression during the lambda expression transformation. In order to get the parameters referenced inside a requires-expression properly resolved to the instantiated decls, we intended to inject these 'original' `ParmVarDecls` into the current instantiation scope, at `Sema::SetupConstraintScope`. The previous approach seems to overlook nested instantiation chains, leading to the crash within a nested lambda followed by a requires clause. This fixes llvm/llvm-project#73418.
classifyComplexElementType() doesn't return a std::optional anymore.
This patch bumps the mlgo-utils version to 19.0.0 as 18.0.0 got branched recently.
Implements https://isocpp.org/files/papers/P2662R3.pdf The feature is exposed as an extension in older language modes. Mangling is not yet supported and that is something we will have to do before release.
…s (#79371)

This pull request would solve llvm/llvm-project#78449. There is also a discussion about this on stackoverflow: https://stackoverflow.com/questions/77832658/stdtype-identity-to-support-several-variadic-argument-lists.

The following program is well formed:

```cpp
#include <type_traits>

template <typename... T>
struct args_tag
{
    using type = std::common_type_t<T...>;
};

template <typename... T>
void bar(args_tag<T...>, std::type_identity_t<T>..., int, std::type_identity_t<T>...) {}

// example
int main() {
    bar(args_tag<int, int>{}, 4, 8, 15, 16, 23);
}
```

but Clang rejects it, while GCC and MSVC don't. The reason for this is that, in `Sema::DeduceTemplateArguments`, we are not prepared for this case.

# Substitution/deduction of parameter packs

The logic that handles substitution when we have explicit template arguments (`SubstituteExplicitTemplateArguments`) does not work here, since the types of the pack are not pushed to `ParamTypes` before the loop that does the deduction starts. The other "candidate" that could have handled this case would be the loop that does the deduction for trailing packs, but we are not dealing with trailing packs here.

# Solution proposed in this PR

The solution proposed in this PR works similarly to the trailing pack deduction. The main difference here is the end of the deduction cycle. When a non-trailing template pack argument is found, whose type is not explicitly specified and the next type is not a pack type, the length of the previously deduced pack is retrieved (let that length be `s`). After that the next `s` arguments are processed in the same way as in the case of non-trailing packs.

# Another possible solution

There is another possible approach that would be less efficient. In the loop, when we get to an element of `ParamTypes` that is a pack and could be substituted because the type is deduced from a previous argument, `s` arg types would be inserted before the current element of `ParamTypes`. Then we would "cancel" the processing of the current element, first process the previously inserted elements, and then re-process the current element. Basically we would do what `SubstituteExplicitTemplateArguments` does, but during deduction.

# Adjusted test cases

In `clang/test/CXX/temp/temp.fct.spec/temp.deduct/temp.deduct.call/p1-0x.cpp` there is a test case named `test_pack_not_at_end` that should work, but still does not. This test case is relevant because the note for the error message has changed. This is what the test case looks like currently:

```cpp
template<typename ...Types>
void pack_not_at_end(tuple<Types...>, Types... values, int); // expected-note {{<int *, double *> vs. <int, int>}}

void test_pack_not_at_end(tuple<int*, double*> t2) {
  pack_not_at_end(t2, 0, 0, 0); // expected-error {{no match}}
  // FIXME: Should the "original argument type must match deduced parameter
  // type" rule apply here?
  pack_not_at_end<int*, double*>(t2, 0, 0, 0); // ok
}
```

The previous note said (before my changes):

```
deduced conflicting types for parameter 'Types' (<int *, double *> vs. <>)
```

The current note says (after my changes, and clang 14 would also say this if the pack was not trailing):

```
deduced conflicting types for parameter 'Types' (<int *, double *> vs. <int, int>)
```

GCC says:

```
error: no matching function for call to ‘pack_not_at_end(std::tuple<int*, double*>&, int, int, int)’
   70 |     pack_not_at_end(t2, 0, 0, 9); // expected-error {{no match}}
```

---------

Co-authored-by: cor3ntin <corentinjabot@gmail.com>
Co-authored-by: Erich Keane <ekeane@nvidia.com>
As it breaks buildkite CI
This patch aims to resolve the missed-optimization case below.

### Code

```
define <8 x i64> @vwadd_mask_v8i32(<8 x i32> %x, <8 x i64> %y) {
    %mask = icmp slt <8 x i32> %x, <i32 42, i32 42, i32 42, i32 42, i32 42, i32 42, i32 42, i32 42>
    %a = select <8 x i1> %mask, <8 x i32> %x, <8 x i32> zeroinitializer
    %sa = sext <8 x i32> %a to <8 x i64>
    %ret = add <8 x i64> %sa, %y
    ret <8 x i64> %ret
}
```

### Before this patch

[Compiler Explorer](https://godbolt.org/z/cd1bKTrx6)

```
vwadd_mask_v8i32:
        li      a0, 42
        vsetivli        zero, 8, e32, m2, ta, ma
        vmslt.vx        v0, v8, a0
        vmv.v.i v10, 0
        vmerge.vvm      v16, v10, v8, v0
        vwadd.wv        v8, v12, v16
        ret
```

### After this patch

```
vwadd_mask_v8i32:
        li      a0, 42
        vsetivli        zero, 8, e32, m2, ta, ma
        vmslt.vx        v0, v8, a0
        vsetvli zero, zero, e32, m2, tu, mu
        vwadd.wv        v12, v12, v8, v0.t
        vmv4r.v v8, v12
        ret
```

This pattern could be found in a reduction with a widening destination. Specifically, we first do a fold like `(vwadd.wv y, (vmerge cond, x, 0)) -> (vwadd.wv y, x, y, cond)`, then do pattern matching on it.
…(#79657) The `map` clause in OpenMP allows structure components to be specified (unlike other clauses). Structure components do get their own symbols, but these are not meant to be instantiated. When a component reference is passed as an argument to the omp.target op, it gets a corresponding parameter in the target op's entry block. The original symbols are then bound to the same kind of an extended value as before, but the value is now based on the parameters. To handle structure components more gracefully, put their symbols on the list of mapped objects, but skip them when creating extended values. Fixes llvm/llvm-project#79478.
llvm-project/llvm/lib/Target/RISCV/RISCVISelLowering.cpp:13754:12: error: unused variable 'Opc' [-Werror,-Wunused-variable]
  unsigned Opc = N->getOpcode();
           ^
1 error generated.
This patch implements cloning for VPlans and recipes. Cloning is used in the epilogue vectorization path, to clone the VPlan for the main vector loop. This means we won't re-use a VPlan when executing the VPlan for the epilogue vector loop, which in turn will enable us to perform optimizations based on UF & VF.
Annotating tokens can invalidate the stack of peeked tokens.
Reverts llvm/llvm-project#78120

Buildbot is broken:
llvm/lib/Support/RISCVISAInfo.cpp:910:18: error: call to deleted constructor of 'llvm::Error'
          return E;
                 ^
… LLVMIR (#79828) There is no `SHL` used in canonicalization in `arith` --------- Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com> Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
reserveRegisterTuples is slow because it uses MCRegAliasIterator and hence ends up reserving the same aliased registers many times. This patch changes getReservedRegs not to use it for reserving SGPRs, VGPRs and AGPRs. Instead it iterates through base register classes, which should come closer to reserving each register once only. Overall this speeds up the time to run check-llvm-codegen-amdgpu in my Release build from 18.4 seconds to 16.9 seconds (all timings +/- 0.2).
…mparison (#79698) This is a follow-up for the comparison of constraints on out-of-line function template definitions. We require the instantiation of a ParmVarDecl while transforming the expression if that Decl gets referenced by a DeclRefExpr. However, we're not actually performing the class or function template instantiation at the time of such comparison. Therefore, let's map these parameters to themselves so that they get preserved after the substitution. Fixes llvm/llvm-project#74447.
…ressed' switch code. NFC. Stop clang-format trying to expand manually compressed lookup switch() code - if it still fits into 80col, then keep it to a single line instead of expanding across multiple lines each.
…tructions Minor correction for #79775 - noticed in EXPENSIVE_CHECKS builds
`OpaqueValueExpr` doesn't necessarily contain a source expression. Particularly, after #78041, it is used to carry the type and the value kind of a non-type template argument of floating-point type or referring to a subobject (those are so called `StructuralValue` arguments). This fixes #79575.
Those were deprecated and basically not used anymore after we renamed them in batch. This patch removes the macros entirely.
…U (#79322) This patch tries to better explain the differences between the `IsTargetDevice` and `IsGPU` flags of the `OpenMPIRBuilderConfig`.
Another round of additional tests for llvm/llvm-project#7863 with different sext/zext and use variants.
…ass template explicit specializations (#78720)

According to [[dcl.type.elab] p2](http://eel.is/c++draft/dcl.type.elab#2):

> If an [elaborated-type-specifier](http://eel.is/c++draft/dcl.type.elab#nt:elaborated-type-specifier) is the sole constituent of a declaration, the declaration is ill-formed unless it is an explicit specialization, an explicit instantiation or it has one of the following forms [...]

Consider the following:

```cpp
template<typename T>
struct A
{
    template<typename U>
    struct B;
};

template<>
template<typename U>
struct A<int>::B; // #1
```

The _elaborated-type-specifier_ at `#1` declares an explicit specialization (which is itself a template). We currently (incorrectly) reject this, and this PR fixes that.

I moved the point at which _elaborated-type-specifiers_ with _nested-name-specifiers_ are diagnosed from `ParsedFreeStandingDeclSpec` to `ActOnTag` for two reasons: `ActOnTag` isn't called for explicit instantiations and partial/explicit specializations, and because it's where we determine if a member specialization is being declared.

With respect to diagnostics, I am currently issuing the diagnostic without marking the declaration as invalid or returning early, which results in more diagnostics than I think are necessary. I would like feedback regarding what the "correct" behavior should be here.
This prevents having to use double parentheses in common cases.
The <__threading_support> header is a huge beast and it's really difficult to navigate. I find myself struggling to find what I want every time I have to open it, and I've been considering splitting it up for years for that reason. This patch aims not to contain any functional change. The various implementations of the threading base are simply moved to separate headers and then the individual headers are simplified in mechanical ways. For example, we used to have redundant declarations of all the functions at the top of `__threading_support`, and those are removed since they are not needed anymore. The various #ifdefs are also simplified and removed when they become unnecessary. Finally, this patch adds documentation for the API we expect from any threading implementation.
…(#79871) Some of the checks in sfinae_helpers.h were not used anymore since we refactored the std::tuple implementation and were now dead code. This patch removes the code.
This macro is unnecessary with `basic_string& operator=(value_type __c)`.
Signed-off-by: Whitney Tsang <whitney.tsang@intel.com>
0784b1e
4017f04
etiotto
approved these changes
Feb 14, 2024