Implement D TypeSystem and DWARFASTParser #2

ljmf00 · 2021-12-23T06:52:53Z

No description provided.

By creating the header and latch blocks up front and adding blocks and recipes in between those 2 blocks we ensure that the entry and exits of the plan remain valid throughout construction. In order to avoid test changes and keep printing of the plans the same, we use the new header block instead of creating a new block on the first iteration of the loop traversing the original loop. We also fold the latch into its predecessor. This is a follow up to a post-commit suggestion in D114586. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D115793

This finishes the GetSupportedArchitectureAtIndex migration. There are opportunities to simplify this even further, but I am going to leave that to the platform owners. Differential Revision: https://reviews.llvm.org/D116028

…ades The test is currently marked XFAIL for mingw environments, but latest mingw-w64 got support for timespec_get: mingw-w64/mingw-w64@e62a0a9 The CI environment will probably be upgraded to a state where this test is passing only after 14.x is branched in the llvm-project monorepo. If we'd just go from having an XFAIL to no marking at all (when CI is passing), we'd have to update both main and 14.x branches in sync exactly when the CI runners are updated to a newer version. Instead, mark the test as temporarily unsupported (so it doesn't cause failed builds when the CI environment is updated); after the CI environments are upgraded to such a state, we can remove the UNSUPPORTED marking to start requiring it to pass on the main branch, without needing to synchronize that change to anything else. Differential Revision: https://reviews.llvm.org/D116132

While there's little value in polishing the old config system, I ran into this function and was confused for a while, while grepping around and trying to wrap my head around things. Differential Revision: https://reviews.llvm.org/D116131

The paths to the compiler and to the python executable may need to be quoted (if they're installed into e.g. C:\Program Files). All testing commands that are executed expect a gcc compatible command line interface, while clang-cl uses different command line options. In the original testing config, if the chosen compiler was clang-cl, it was replaced with clang++ by looking for such an executable in the path. For the new from-scratch test configs, I instead chose to add "--driver-mode=g++" to flags - invoking "clang-cl --driver-mode=g++" has the same effect as invoking "clang++", without needing to run any heuristics for picking a different compiler executable. Differential Revision: https://reviews.llvm.org/D111202

After D116148 the memccpy gets optimized away and the expected uninitialized memory access does not occur. Make sure the call does not get optimized away.

…se of OpenMP task construct Currently variables appearing inside shared clause of OpenMP task construct are not visible inside lldb debugger. After the current patch, lldb is able to show the variable ``` * thread #1, name = 'a.out', stop reason = breakpoint 1.1 frame #0: 0x0000000000400934 a.out`.omp_task_entry. [inlined] .omp_outlined.(.global_tid.=0, .part_id.=0x000000000071f0d0, .privates.=0x000000000071f0e8, .copy_fn.=(a.out`.omp_task_privates_map. at testshared.cxx:8), .task_t.=0x000000000071f0c0, __context=0x000000000071f0f0) at testshared.cxx:10:34 7 else { 8 #pragma omp task shared(svar) firstprivate(n) 9 { -> 10 printf("Task svar = %d\n", svar); 11 printf("Task n = %d\n", n); 12 svar = fib(n - 1); 13 } (lldb) p svar (int) $0 = 9 ``` Reviewed By: djtodoro Differential Revision: https://reviews.llvm.org/D115510

Introduce initial support for using libkvm on FreeBSD. The library can be used as an alternate implementation for processing kernel coredumps but it can also be used to access live kernel memory through specifying "/dev/mem" as the core file, i.e.: lldb --core /dev/mem /boot/kernel/kernel Differential Revision: https://reviews.llvm.org/D116005

…lee saved range Currently the return address ABI registers s[30:31], which fall in the call clobbered register range, are added as a live-in on the function entry to preserve their value when we have calls so that it gets saved and restored around the calls. But the DWARF unwind information (CFI) needs to track where the return address resides in a frame and the above approach makes it difficult to track the return address when the CFI information is emitted during the frame lowering, due to the involvement of understanding the control flow. This patch moves the return address ABI registers s[30:31] into callee saved registers range and stops adding live-in for return address registers, so that the CFI machinery will know where the return address resides when CSR save/restore happen during the frame lowering. Doing the above poses an issue: the return instruction now uses the undefined register `sgpr30_sgpr31`. This is resolved by hiding the return address register use by the return instruction through the `SI_RETURN` pseudo instruction, which doesn't take any input operands, until the `SI_RETURN` pseudo gets lowered to the `S_SETPC_B64_return` during the `expandPostRAPseudo()`. As an added benefit, this patch simplifies overall return instruction handling. Note: The AMDGPU CFI changes are there only in the downstream code and another version of this patch will be posted for review for the downstream code. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D114652

…ges::end. As discussed with ldionne. The problem with this static_assert is that it makes ranges::begin a pitfall for anyone ever to use inside a constraint or decltype. Many Ranges things, such as ranges::size, are specified as "Does X if X is well-formed, or else Y if Y is well-formed, or else `ranges::end(t) - ranges::begin(t)` if that is well-formed, or else..." And if there's a static_assert hidden inside `ranges::begin(t)`, then you get a hard error as soon as you ask the question -- even if the answer would have been "no, that's not well-formed"! Constraining on `requires { t + 0; }` or `requires { t + N; }` is verboten because of https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103700 . For ranges::begin, we can just decay to a pointer even in the incomplete-type case. For ranges::end, we can safely constrain on `sizeof(*t)`. Yes, this means that an array of incomplete type has a `ranges::begin` but no `ranges::end`... just like an unbounded array of complete type. This is a valid manifestation of IFNDR. All of the new libcxx/test/std/ cases are mandatory behavior, as far as I'm aware. Tests for the IFNDR cases in ranges::begin and ranges::end remain in `libcxx/test/libcxx/`. The similar tests for ranges::empty and ranges::data were simply wrong, AFAIK. Differential Revision: https://reviews.llvm.org/D115838

Suggest converting `std::string::rfind()` calls to `absl::StartsWith()` where possible.

gen_ast_dump_json_test.py adds these lines of whitespace. Precommit it to avoid spurious diffs in future changes.

Regenerate test checks to reduce diff for an upcoming patch.

…er` into anonymous namespace Just to keep code consistent as `OpenMPAtomicUpdateChecker` is defined in anonymous namespace. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D116068

Over in D114631 I turned this debug-info feature on by default, for x86_64 only. I'd previously stripped out the clang cc1 option that controlled it in 651122f, unfortunately that turned out to not be completely effective, and the two things deleted in this patch continued to keep it off-by-default. Oooff. As a follow-up, this patch removes the last few things to do with ValueTrackingVariableLocations from clang, which was the original purpose of D114631. In an ideal world, if this patch causes you trouble you'd revert 3c04507 instead, which was where this behaviour was supposed to start being the default, although that might not be practical any more.

…] to callee saved range" This reverts commit 9075009. Failed amdgpu runtime buildbot # 3514

Add callback to enable us to test target nodes if they are splat vectors Added some basic X86ISD::VBROADCAST + X86ISD::VBROADCAST_LOAD handling

Previously, the folding assumed that it always operates on scalar types. Differential Revision: https://reviews.llvm.org/D116151

Clang is gaining `auto(x)` support in D113393; sadly there seems to be no feature-test macro for it. Zhihao is opening a core issue for that macro. Use `_LIBCPP_AUTO_CAST` where C++20 specifies we should use `auto(x)`; stop using `__decay_copy(x)` in those places. In fact, remove `__decay_copy` entirely. As of C++20, it's purely a paper specification tool signifying "Return just `x`, but it was perfect-forwarded, so we understand you're going to have to call its move-constructor sometimes." I believe there's no reason we'd ever need to do its operation explicitly in code. This heisenbugs away a test failure on MinGW; see D112214. Differential Revision: https://reviews.llvm.org/D115686

Missed by MSVC

Apply the formatting rules that were applied to the libc/src directory to the libc/test directory, as well as the files in libc/utils that are included by the tests. This does not include automated enforcement. Reviewed By: sivachandra, lntue Differential Revision: https://reviews.llvm.org/D116127

There is no way to programmatically configure the list of disabled and enabled patterns in the canonicalizer pass, other than to duplicate the whole pass. This patch exposes the `disabledPatterns` and `enabledPatterns` options. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D116055

Previously, we defined a struct named `RootOrderingCost`, which stored the cost (a pair consisting of the depth of the connector and a tie breaking ID), as well as the connector itself. This created some confusion, because we would sometimes write, e.g., `cost.cost.first` (the first `cost` referring to the struct, the second one referring to the `cost` field, and `first` referring to the depth). In order to address this confusion, here we rename `RootOrderingCost` to `RootOrderingEntry` (keeping the fields and their names as-is). This clarification exposed non-determinism in the optimal branching algorithm. When choosing the best local parent, we were previously only considering its depth (`cost.first`) and not the tie-breaking ID (`cost.second`). This led to non-deterministic choice of the parent when multiple potential parents had the same depth. The solution is to compare both the depth and the tie-breaking ID. Testing: Rely on existing unit tests. Non-determinism is hard to unit-test. Reviewed By: rriddle, Mogball Differential Revision: https://reviews.llvm.org/D116079
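The tie-breaking comparison described above can be illustrated with a small sketch. The `Entry` struct and `pickBest` function are hypothetical stand-ins, not the names used in the patch; the only assumption carried over from the message is that the cost is a (depth, tie-breaking ID) pair.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for the renamed entry: a cost made of
// (depth, tie-breaking ID) plus a placeholder for the connector.
struct Entry {
  std::pair<unsigned, unsigned> cost; // first = depth, second = tie-breaking ID
  int connector;                      // placeholder for the real connector value
};

// Picks the entry with the smallest cost. std::pair's operator< compares
// `first` and then `second`, so entries with equal depth are ordered by
// their ID and the result does not depend on iteration order.
inline const Entry &pickBest(const std::vector<Entry> &candidates) {
  assert(!candidates.empty() && "need at least one candidate");
  const Entry *best = &candidates.front();
  for (const Entry &e : candidates)
    if (e.cost < best->cost)
      best = &e;
  return *best;
}
```

Comparing only `cost.first` here would make the winner among equal-depth entries depend on the order in which candidates are visited, which is the kind of non-determinism the message describes.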

I missed two instances of "SetUp" being replaced by "set_up" and "TearDown" being replaced by "tear_down" when finalizing the formatting change. This fixes that. Differential Revision: https://reviews.llvm.org/D116178

Segmentation fault in ompt_tsan_dependences function due to an unchecked NULL pointer dereference is as follows: ``` ThreadSanitizer:DEADLYSIGNAL ==140865==ERROR: ThreadSanitizer: SEGV on unknown address 0x000000000050 (pc 0x7f217c2d3652 bp 0x7ffe8cfc7e00 sp 0x7ffe8cfc7d90 T140865) ==140865==The signal is caused by a READ memory access. ==140865==Hint: address points to the zero page. /usr/bin/addr2line: DWARF error: could not find variable specification at offset 1012a /usr/bin/addr2line: DWARF error: could not find variable specification at offset 133b5 /usr/bin/addr2line: DWARF error: could not find variable specification at offset 1371a /usr/bin/addr2line: DWARF error: could not find variable specification at offset 13a58 #0 ompt_tsan_dependences(ompt_data_t*, ompt_dependence_t const*, int) /ptmp/bhararit/llvm-project/openmp/tools/archer/ompt-tsan.cpp:1004 (libarcher.so+0x15652) #1 __kmpc_doacross_post /ptmp/bhararit/llvm-project/openmp/runtime/src/kmp_csupport.cpp:4280 (libomp.so+0x74d98) #2 .omp_outlined. for_ordered_01.c:? (for_ordered_01.exe+0x5186cb) #3 __kmp_invoke_microtask /ptmp/bhararit/llvm-project/openmp/runtime/src/z_Linux_asm.S:1166 (libomp.so+0x14e592) #4 __kmp_invoke_task_func /ptmp/bhararit/llvm-project/openmp/runtime/src/kmp_runtime.cpp:7556 (libomp.so+0x909ad) #5 __kmp_fork_call /ptmp/bhararit/llvm-project/openmp/runtime/src/kmp_runtime.cpp:2284 (libomp.so+0x8461a) #6 __kmpc_fork_call /ptmp/bhararit/llvm-project/openmp/runtime/src/kmp_csupport.cpp:308 (libomp.so+0x6db55) #7 main ??:? (for_ordered_01.exe+0x51828f) #8 __libc_start_main ??:? (libc.so.6+0x24349) #9 _start /home/abuild/rpmbuild/BUILD/glibc-2.26/csu/../sysdeps/x86_64/start.S:120 (for_ordered_01.exe+0x4214e9) ThreadSanitizer can not provide additional info. 
SUMMARY: ThreadSanitizer: SEGV /ptmp/bhararit/llvm-project/openmp/tools/archer/ompt-tsan.cpp:1004 in ompt_tsan_dependences(ompt_data_t*, ompt_dependence_t const*, int) ==140865==ABORTING ``` To reproduce the error, use the following OpenMP code snippet: ``` /* initialise testMatrixInt Matrix, cols, r and c */ #pragma omp parallel private(r,c) shared(testMatrixInt) { #pragma omp for ordered(2) for (r=1; r < rows; r++) { for (c=1; c < cols; c++) { #pragma omp ordered depend(sink:r-1, c+1) depend(sink:r-1,c-1) testMatrixInt[r][c] = (testMatrixInt[r-1][c] + testMatrixInt[r-1][c-1]) % cols ; #pragma omp ordered depend (source) } } } ``` Compilation: ``` clang -g -stdlib=libc++ -fsanitize=thread -fopenmp -larcher test_case.c ``` It seems that the changes introduced by the commit https://reviews.llvm.org/D114005 cause this particular SEGV while using Archer. Reviewed By: protze.joachim Differential Revision: https://reviews.llvm.org/D115328

For code below: { r7 = addasl(r3,r0,#2) r8 = addasl(r3,r2,#2) r5 = memw(r3+r0<<#2) r6 = memw(r3+r2<<#2) } { p1 = cmp.gtu(r6,r5) if (p1.new) memw(r8+#0) = r5 if (p1.new) memw(r7+#0) = r6 } { r0 = mux(p1,r2,r4) } In packetizer, a new packet is created for the cmp instruction since there aren't enough resources in the previous packet. Also it is determined that the cmp stalls by 2 cycles since it depends on the prior load of r5. In the current packetizer implementation, the predicated store is evaluated for whether it can go in the same packet as the compare, and since the compare stalls, the stall of the predicated store does not matter and it can go in the same packet as the cmp. However, the predicated store will stall for more cycles because of its dependence on the addasl instruction, and to avoid that stall we can put it in a new packet. Improve the packetizer to check if an instruction being added to a packet will stall longer than the instructions already in the packet and, if so, create a new packet.

This patch generalizes the logic to represent an EncodingDataType from a DWARF tag, decoupling from the current clang-specific DWARF parser, allowing other languages to be integrated without code duplication. Signed-off-by: Luís Ferreira <contact@lsferreira.net>

We experienced some deadlocks when logging multi-threaded builds, e.g. `make -j16`, with `scan-build`'s intercept-build tool. ``` (gdb) bt #0 0x00007f2bb3aff110 in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0 #1 0x00007f2bb3af70a3 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0 #2 0x00007f2bb3d152e4 in ?? () #3 0x00007ffcc5f0cc80 in ?? () #4 0x00007f2bb3d2bf5b in ?? () from /lib64/ld-linux-x86-64.so.2 #5 0x00007f2bb3b5da27 in ?? () from /lib/x86_64-linux-gnu/libc.so.6 #6 0x00007f2bb3b5dbe0 in exit () from /lib/x86_64-linux-gnu/libc.so.6 #7 0x00007f2bb3d144ee in ?? () #8 0x746e692f706d742f in ?? () #9 0x692d747065637265 in ?? () #10 0x2f653631326b3034 in ?? () #11 0x646d632e35353532 in ?? () #12 0x0000000000000000 in ?? () ``` I think gcc's exit call caused the injected `libear.so` to be unloaded by `ld`, which in turn called `void on_unload() __attribute__((destructor))`. That tried to acquire an already locked mutex, which was left locked by the `bear_report_call()` call, which probably encountered some error and returned early without unlocking the mutex. All of this is speculation, since from the backtrace I could not verify whether frames 2 and 3 in fact correspond to the `libear.so` module. But I think it's a fairly safe bet. So, hereby I'm releasing the held mutex on *all paths*, even if some failure happens. PS: I would use lock_guards, but it's C. Reviewed-by: NoQ Differential Revision: https://reviews.llvm.org/D118439
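For comparison, the C++ idiom the message alludes to (`std::lock_guard`) gives the same "unlock on all paths" guarantee automatically. This is only a sketch of that idiom, not the C code of the patch; `ReportMutex` and `reportCall` are hypothetical names.

```cpp
#include <cassert>
#include <mutex>

static std::mutex ReportMutex; // hypothetical stand-in for the library's mutex

// Returns false on failure. The guard's destructor unlocks the mutex on
// every return path, including the early one.
inline bool reportCall(bool simulateFailure) {
  std::lock_guard<std::mutex> Guard(ReportMutex);
  if (simulateFailure)
    return false; // early return: the mutex is still released
  return true;
}
```

In C the equivalent is an explicit unlock before each return, or a single exit label that performs the unlock.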

There is a clangd crash at `__memcmp_avx2_movbe`. Short problem description is below. The method `HeaderIncludes::addExistingInclude` stores `Include` objects by reference at 2 places: `ExistingIncludes` (primary storage) and `IncludesByPriority` (pointer to the object's location at ExistingIncludes). `ExistingIncludes` is a map where the value is a `SmallVector`. A new element is inserted by `push_back`. The operation might resize the vector. As a result, pointers stored at `IncludesByPriority` might become invalid. Typical stack trace ``` frame #0: 0x00007f11460dcd94 libc.so.6`__memcmp_avx2_movbe + 308 frame #1: 0x00000000004782b8 clangd`llvm::StringRef::compareMemory(Lhs=" \"t2.h\"", Rhs="", Length=6) at StringRef.h:76:22 frame #2: 0x0000000000701253 clangd`llvm::StringRef::compare(this=0x00007f10de7d8610, RHS=(Data = "", Length = 7166742329480737377)) const at StringRef.h:206:34 * frame #3: 0x00000000007603ab clangd`llvm::operator<(llvm::StringRef, llvm::StringRef)(LHS=(Data = "\"t2.h\"", Length = 6), RHS=(Data = "", Length = 7166742329480737377)) at StringRef.h:907:23 frame #4: 0x0000000002d0ad9f clangd`clang::tooling::HeaderIncludes::insert(this=0x00007f10de7fb1a0, IncludeName=(Data = "t2.h\"", Length = 4), IsAngled=false) const at HeaderIncludes.cpp:365:22 frame #5: 0x00000000012ebfdd clangd`clang::clangd::IncludeInserter::insert(this=0x00007f10de7fb148, VerbatimHeader=(Data = "\"t2.h\"", Length = 6)) const at Headers.cpp:262:70 ``` A unit test for the crash was created (`HeaderIncludesTest.RepeatedIncludes`). The proposed solution is to use std::list instead of llvm::SmallVector. Test Plan ``` ./tools/clang/unittests/Tooling/ToolingTests --gtest_filter=HeaderIncludesTest.RepeatedIncludes ``` Reviewed By: sammccall Differential Revision: https://reviews.llvm.org/D118755
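The reason `std::list` helps can be shown with a reduced model. The `Includes` struct below is a hypothetical simplification using standard containers, not the actual `HeaderIncludes` class; it only assumes the two-container layout described above.

```cpp
#include <cassert>
#include <list>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified model of the two containers described above.
struct Include {
  std::string Name;
};

struct Includes {
  // Primary storage. std::list never relocates existing elements when a
  // new one is appended, so pointers to them stay valid.
  std::map<std::string, std::list<Include>> Existing;
  // Secondary index holding pointers into the primary storage.
  std::vector<const Include *> ByPriority;

  void add(const std::string &Key, const std::string &Name) {
    std::list<Include> &Bucket = Existing[Key];
    Bucket.push_back(Include{Name});
    ByPriority.push_back(&Bucket.back());
  }
};
```

With a vector-like bucket, a `push_back` that reallocates would leave every earlier pointer in `ByPriority` dangling, which matches the garbage `Length` values visible in the stack trace.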

This patch fixes a data race in IOHandlerProcessSTDIO. The race happens between the main thread and the event handling thread. The main thread is running the IOHandler (IOHandlerProcessSTDIO::Run()) when an event comes in that makes us pop the process IO handler, which involves cancelling the IOHandler (IOHandlerProcessSTDIO::Cancel). The latter calls SetIsDone(true), which modifies m_is_done. At the same time, we have the main thread reading the variable through GetIsDone(). This patch avoids the race by using a mutex to synchronize the two threads. On the event thread, in the IOHandlerProcessSTDIO::Cancel method, we obtain the lock before changing the value of m_is_done. On the main thread, in IOHandlerProcessSTDIO::Run(), we obtain the lock before reading the value of m_is_done. Additionally, we delay calling SetIsDone until after the loop exits, to avoid a potential race between the two writes. Write of size 1 at 0x00010b66bb68 by thread T7 (mutexes: write M2862, write M718324145051843688): #0 lldb_private::IOHandler::SetIsDone(bool) IOHandler.h:90 (liblldb.15.0.0git.dylib:arm64+0x971d84) #1 IOHandlerProcessSTDIO::Cancel() Process.cpp:4382 (liblldb.15.0.0git.dylib:arm64+0x5ddfec) #2 lldb_private::Debugger::PopIOHandler(std::__1::shared_ptr<lldb_private::IOHandler> const&) Debugger.cpp:1156 (liblldb.15.0.0git.dylib:arm64+0x3cb2a8) #3 lldb_private::Debugger::RemoveIOHandler(std::__1::shared_ptr<lldb_private::IOHandler> const&) Debugger.cpp:1063 (liblldb.15.0.0git.dylib:arm64+0x3cbd2c) #4 lldb_private::Process::PopProcessIOHandler() Process.cpp:4487 (liblldb.15.0.0git.dylib:arm64+0x5c583c) #5 lldb_private::Debugger::HandleProcessEvent(std::__1::shared_ptr<lldb_private::Event> const&) Debugger.cpp:1549 (liblldb.15.0.0git.dylib:arm64+0x3ceabc) #6 lldb_private::Debugger::DefaultEventHandler() Debugger.cpp:1622 (liblldb.15.0.0git.dylib:arm64+0x3cf2c0) #7 std::__1::__function::__func<lldb_private::Debugger::StartEventHandlerThread()::$_2, 
std::__1::allocator<lldb_private::Debugger::StartEventHandlerThread()::$_2>, void* ()>::operator()() function.h:352 (liblldb.15.0.0git.dylib:arm64+0x3d1bd8) #8 lldb_private::HostNativeThreadBase::ThreadCreateTrampoline(void*) HostNativeThreadBase.cpp:62 (liblldb.15.0.0git.dylib:arm64+0x4c71ac) #9 lldb_private::HostThreadMacOSX::ThreadCreateTrampoline(void*) HostThreadMacOSX.mm:18 (liblldb.15.0.0git.dylib:arm64+0x29ef544) Previous read of size 1 at 0x00010b66bb68 by main thread: #0 lldb_private::IOHandler::GetIsDone() IOHandler.h:92 (liblldb.15.0.0git.dylib:arm64+0x971db8) #1 IOHandlerProcessSTDIO::Run() Process.cpp:4339 (liblldb.15.0.0git.dylib:arm64+0x5ddc7c) #2 lldb_private::Debugger::RunIOHandlers() Debugger.cpp:982 (liblldb.15.0.0git.dylib:arm64+0x3cb48c) #3 lldb_private::CommandInterpreter::RunCommandInterpreter(lldb_private::CommandInterpreterRunOptions&) CommandInterpreter.cpp:3298 (liblldb.15.0.0git.dylib:arm64+0x506478) #4 lldb::SBDebugger::RunCommandInterpreter(bool, bool) SBDebugger.cpp:1166 (liblldb.15.0.0git.dylib:arm64+0x53604) #5 Driver::MainLoop() Driver.cpp:634 (lldb:arm64+0x100006294) #6 main Driver.cpp:853 (lldb:arm64+0x100007344) Differential revision: https://reviews.llvm.org/D120762
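The locking scheme described in this message can be sketched in isolation. The `Handler` class is a hypothetical reduction, not lldb's `IOHandlerProcessSTDIO`; it only assumes a boolean flag written by one thread and read in a loop by another.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical, reduced model of a handler whose "done" flag is written by
// an event thread (cancel) and read by the main thread (run).
class Handler {
public:
  // Called from the event thread: take the lock before writing the flag.
  void cancel() {
    std::lock_guard<std::mutex> Lock(Mutex);
    IsDone = true;
  }

  // Called from the main thread: take the lock before each read of the flag.
  // Runs until cancelled or until the iteration budget is used up and
  // returns the number of iterations performed.
  unsigned run(unsigned MaxIterations) {
    unsigned Count = 0;
    while (Count < MaxIterations) {
      {
        std::lock_guard<std::mutex> Lock(Mutex);
        if (IsDone)
          break;
      }
      ++Count; // placeholder for the real I/O work
    }
    return Count;
  }

private:
  std::mutex Mutex;
  bool IsDone = false;
};
```

The lock is held only around the flag access, not around the work inside the loop, so a cancel request is never blocked for longer than one check.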

…view Since the threads/frame view is taking only a small part on the right side of the screen, only a part of the function name of each frame is visible. It seems rather wasteful to spell out 'frame' there when it's obvious that it is a frame, it's better to use the space for more of the function name. Differential Revision: https://reviews.llvm.org/D122998

Detected on many lld tests with -fsanitize-memory-use-after-dtor. Also https://lab.llvm.org/buildbot/#/builders/sanitizer-x86_64-linux-fast after D122869 will report a lot of them. Threads may outlive static variables. Even if ~__thread_specific_ptr() does nothing, lifetime of members ends with ~ and accessing the value is UB https://eel.is/c++draft/basic.life#1 ``` ==9214==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x557e1cec4539 in __libcpp_tls_set ../include/c++/v1/__threading_support:428:12 #1 0x557e1cec4539 in set_pointer ../include/c++/v1/thread:196:5 #2 0x557e1cec4539 in void* std::__msan::__thread_proxy< std::__msan::tuple<...>, llvm::parallel::detail::(anonymous namespace)::ThreadPoolExecutor::ThreadPoolExecutor(llvm::ThreadPoolStrategy)::'lambda'()::operator()() const::'lambda'()> >(void*) ../include/c++/v1/thread:285:27 Memory was marked as uninitialized #0 0x557e10a0759d in __sanitizer_dtor_callback compiler-rt/lib/msan/msan_interceptors.cpp:940:5 #1 0x557e1d8c478d in std::__msan::__thread_specific_ptr<std::__msan::__thread_struct>::~__thread_specific_ptr() libcxx/include/thread:188:1 #2 0x557e10a07dc0 in MSanCxaAtExitWrapper(void*) compiler-rt/lib/msan/msan_interceptors.cpp:1151:3 ``` The test needs D123979 or -fsanitize-memory-param-retval enabled by default. Reviewed By: ldionne, #libc Differential Revision: https://reviews.llvm.org/D122864

…ified offset and its parents or children with spcified depth." This reverts commit a3b7cb0. symbol-offset.test fails under MSAN: [ 1] ; RUN: llvm-pdbutil yaml2pdb %p/Inputs/symbol-offset.yaml --pdb=%t.pdb [FAIL] llvm-pdbutil yaml2pdb <REDACTED>/llvm/test/tools/llvm-pdbutil/Inputs/symbol-offset.yaml --pdb=<REDACTED>/tmp/symbol-offset.test/symbol-offset.test.tmp.pdb ==9283==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x55f975e5eb91 in __libcpp_tls_set <REDACTED>/include/c++/v1/__threading_support:428:12 #1 0x55f975e5eb91 in set_pointer <REDACTED>/include/c++/v1/thread:196:5 #2 0x55f975e5eb91 in void* std::__msan::__thread_proxy<std::__msan::tuple<std::__msan::unique_ptr<std::__msan::__thread_struct, std::__msan::default_delete<std::__msan::__thread_struct> >, llvm::parallel::detail::(anonymous namespace)::ThreadPoolExecutor::ThreadPoolExecutor(llvm::ThreadPoolStrategy)::'lambda'()::operator()() const::'lambda'()> >(void*) <REDACTED>/include/c++/v1/thread:285:27 #3 0x7f74a1e55b54 in start_thread (<REDACTED>/libpthread.so.0+0xbb54) (BuildId: 64752de50ebd1a108f4b3f8d0d7e1a13) #4 0x7f74a1dc9f7e in clone (<REDACTED>/libc.so.6+0x13cf7e) (BuildId: 7cfed7708e5ab7fcb286b373de21ee76)

This reverts commit c274b6e. The x86_64 debian bot got a failure with this patch, https://lab.llvm.org/buildbot#builders/68/builds/33078 where SymbolFile/DWARF/x86/DW_TAG_variable-DW_AT_decl_file-DW_AT_abstract_origin-crosscu1.s is crashing here - #2 0x0000000000425a9f SignalHandler(int) Signals.cpp:0:0 #3 0x00007f57160e9140 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14140) #4 0x00007f570d911e43 lldb_private::SourceManager::GetFile(lldb_private::FileSpec const&) crtstuff.c:0:0 #5 0x00007f570d914270 lldb_private::SourceManager::DisplaySourceLinesWithLineNumbers(lldb_private::FileSpec const&, unsigned int, unsigned int, unsigned int, unsigned int, char const*, lldb_private::Stream*, lldb_private::SymbolContextList const*) crtstuff.c:0:0 #6 0x00007f570da662c8 lldb_private::StackFrame::GetStatus(lldb_private::Stream&, bool, bool, bool, char const*) crtstuff.c:0:0 I don't get a failure here on my Mac; I'll review this method more closely tomorrow.

…ned form The DWARF spec says: Any debugging information entry representing the declaration of an object, module, subprogram or type may have DW_AT_decl_file, DW_AT_decl_line and DW_AT_decl_column attributes, each of whose value is an unsigned integer ^^^^^^^^ constant. If however, a producer happens to emit DW_AT_decl_file / DW_AT_decl_line using a signed integer form, llvm-dwarfdump crashes, like so: (... snip ...) 0x000000b4: DW_TAG_structure_type DW_AT_name ("test_struct") DW_AT_byte_size (136) DW_AT_decl_file (llvm-dwarfdump: (... snip ...)/llvm/include/llvm/ADT/Optional.h:197: T& llvm::optional_detail::OptionalStorage<T, true>::getValue() & [with T = long unsigned int]: Assertion `hasVal' failed. PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace. Stack dump: 0. Program arguments: /opt/rocm/llvm/bin/llvm-dwarfdump ./testsuite/outputs/gdb.rocm/lane-pc-vega20/lane-pc-vega20-kernel.so #0 0x000055cc8e78315f PrintStackTraceSignalHandler(void*) Signals.cpp:0:0 #1 0x000055cc8e780d3d SignalHandler(int) Signals.cpp:0:0 #2 0x00007f8f2cae8420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420) #3 0x00007f8f2c58d00b raise /build/glibc-SzIz7B/glibc-2.31/signal/../sysdeps/unix/sysv/linux/raise.c:51:1 #4 0x00007f8f2c56c859 abort /build/glibc-SzIz7B/glibc-2.31/stdlib/abort.c:81:7 #5 0x00007f8f2c56c729 get_sysdep_segment_value /build/glibc-SzIz7B/glibc-2.31/intl/loadmsgcat.c:509:8 #6 0x00007f8f2c56c729 _nl_load_domain /build/glibc-SzIz7B/glibc-2.31/intl/loadmsgcat.c:970:34 #7 0x00007f8f2c57dfd6 (/lib/x86_64-linux-gnu/libc.so.6+0x33fd6) #8 0x000055cc8e58ceb9 llvm::DWARFDie::dump(llvm::raw_ostream&, unsigned int, llvm::DIDumpOptions) const (/opt/rocm/llvm/bin/llvm-dwarfdump+0x2e0eb9) #9 0x000055cc8e58bec3 llvm::DWARFDie::dump(llvm::raw_ostream&, unsigned int, llvm::DIDumpOptions) const (/opt/rocm/llvm/bin/llvm-dwarfdump+0x2dfec3) #10 0x000055cc8e5b28a3 llvm::DWARFCompileUnit::dump(llvm::raw_ostream&, 
llvm::DIDumpOptions) (.part.21) DWARFCompileUnit.cpp:0:0 Likewise with DW_AT_call_file / DW_AT_call_line. The problem is that the code in llvm/lib/DebugInfo/DWARF/DWARFDie.cpp dumping these attributes assumes that FormValue.getAsUnsignedConstant() returns an armed optional. If in debug mode, we get an assertion like the above. If in release mode, and asserts are compiled out, then we proceed as if the optional had a value, running into undefined behavior, printing whatever random value. Fix this by checking whether the optional returned by FormValue.getAsUnsignedConstant() has a value, like done in other places. In addition, DWARFVerifier.cpp is validating DW_AT_call_file / DW_AT_decl_file, but not DW_AT_call_line / DW_AT_decl_line. This commit fixes that too. The llvm-dwarfdump/X86/verify_file_encoding.yaml testcase is extended to cover these cases. Current llvm-dwarfdump crashes running the newly-extended test. "make check-llvm-tools-llvm-dwarfdump" shows no regressions, on x86-64 GNU/Linux. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D129392
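The "check the optional before using it" pattern can be sketched with `std::optional`. The `FormValue` struct below is a hypothetical stand-in that only mimics the behaviour described above (signed forms yield no unsigned constant); `describeDeclLine` and the `<invalid form>` text are illustrative assumptions, not llvm-dwarfdump output.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical stand-in for a DWARF form value: an unsigned form yields a
// value, any other form yields an empty optional.
struct FormValue {
  bool IsUnsignedForm;
  std::uint64_t Raw;

  std::optional<std::uint64_t> getAsUnsignedConstant() const {
    if (!IsUnsignedForm)
      return std::nullopt;
    return Raw;
  }
};

// Formats a line attribute. The optional is dereferenced only after it has
// been checked; the fallback string is an illustrative placeholder.
inline std::string describeDeclLine(const FormValue &FV) {
  if (std::optional<std::uint64_t> Line = FV.getAsUnsignedConstant())
    return std::to_string(*Line);
  return "<invalid form>";
}
```

Dereferencing an empty optional without the check is undefined behavior when assertions are compiled out, which is the release-mode failure the message describes.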

Casting a pointer to a suitably large integral type by reinterpret-cast should result in the same value as by using the `__builtin_bit_cast()`. The compiler exploits this: https://godbolt.org/z/zMP3sG683 However, the analyzer does not bind the same symbolic value to these expressions, resulting in weird situations, such as failing equality checks and even results in crashes: https://godbolt.org/z/oeMP7cj8q Previously, in the `RegionStoreManager::getBinding()` even if `T` was non-null, we replaced it with `TVR->getValueType()` in case the `MR` was `TypedValueRegion`. It doesn't make much sense to auto-detect the type if the type is already given. By not doing the auto-detection, we would just do the right thing and perform the load by that type. This means that we will cast the value to that type. So, in this patch, I'm proposing to do auto-detection only if the type was null. Here is a snippet of code, annotated by the previous and new dump values. `LocAsInteger` should wrap the `SymRegion`, since we want to load the address as if it was an integer. In none of the following cases should type auto-detection be triggered, hence we should eventually reach an `evalCast()` to lazily cast the loaded value into that type. 
```lang=C++ void LValueToRValueBitCast_dumps(void *p, char (*array)[8]) { clang_analyzer_dump(p); // remained: &SymRegion{reg_$0<void * p>} clang_analyzer_dump(array); // remained: {{&SymRegion{reg_$1<char (*)[8] array>} clang_analyzer_dump((unsigned long)p); // remained: {{&SymRegion{reg_$0<void * p>} [as 64 bit integer]}} clang_analyzer_dump(__builtin_bit_cast(unsigned long, p)); <--------- change #1 // previously: {{&SymRegion{reg_$0<void * p>}}} // now: {{&SymRegion{reg_$0<void * p>} [as 64 bit integer]}} clang_analyzer_dump((unsigned long)array); // remained: {{&SymRegion{reg_$1<char (*)[8] array>} [as 64 bit integer]}} clang_analyzer_dump(__builtin_bit_cast(unsigned long, array)); <--------- change #2 // previously: {{&SymRegion{reg_$1<char (*)[8] array>}}} // now: {{&SymRegion{reg_$1<char (*)[8] array>} [as 64 bit integer]}} } ``` Reviewed By: xazax.hun Differential Revision: https://reviews.llvm.org/D136603
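The equivalence the analyzer is now expected to model can be checked at run time with a small sketch. `sameIntegerValue` is a hypothetical helper; the sketch assumes a compiler that provides `__builtin_bit_cast` (Clang and recent GCC do) and a platform where converting a pointer to an integer preserves its bit pattern, which the language leaves implementation-defined.

```cpp
#include <cassert>
#include <cstdint>

// Compares the two ways of turning a pointer into an integer of the same
// size. On mainstream targets both are expected to yield the same value.
inline bool sameIntegerValue(void *p) {
  static_assert(sizeof(std::uintptr_t) == sizeof(void *),
                "bit cast requires equally sized types");
  std::uintptr_t viaCast = reinterpret_cast<std::uintptr_t>(p);
  std::uintptr_t viaBitCast = __builtin_bit_cast(std::uintptr_t, p);
  return viaCast == viaBitCast;
}
```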

The Assignment Tracking debug-info feature is outlined in this RFC: https://discourse.llvm.org/t/rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir Add initial revision of assignment tracking analysis pass --------------------------------------------------------- This patch squashes five individually reviewed patches into one: #1 https://reviews.llvm.org/D136320 #2 https://reviews.llvm.org/D136321 #3 https://reviews.llvm.org/D136325 #4 https://reviews.llvm.org/D136331 #5 https://reviews.llvm.org/D136335 Patch #1 introduces 2 new files: AssignmentTrackingAnalysis.h and .cpp. The two subsequent patches modify those files only. Patch #4 plumbs the analysis into SelectionDAG, and patch #5 is a collection of tests for the analysis as a whole. The analysis was broken up into smaller chunks for review purposes but for the most part the tests were written using the whole analysis. It would be possible to break up the tests for patches #1 through #3 for the purpose of landing the patches separately. However, most of them would require an update for each patch. In addition, patch #4 - which connects the analysis to SelectionDAG - is required by all of the tests. If there is build-bot trouble, we might try a different landing sequence. Analysis problem and goal ------------------------- Variable values can be stored in memory, or available as SSA values, or both. Using the Assignment Tracking metadata, it's not possible to determine a variable location just by looking at a debug intrinsic in isolation. Instructions without any metadata can change the location of a variable. The meaning of dbg.assign intrinsics changes depending on whether there are linked instructions, and where they are relative to those instructions. So we need to analyse the IR and convert the embedded information into a form that SelectionDAG can consume to produce debug variable locations in MIR. 
The solution is a dataflow analysis which, aiming to maximise the memory location coverage for variables, outputs a mapping of instruction positions to variable location definitions. API usage --------- The analysis is named `AssignmentTrackingAnalysis`. It is added as a required pass for SelectionDAGISel when assignment tracking is enabled. The results of the analysis are exposed via `getResults` using the returned `const FunctionVarLocs *`'s const methods: const VarLocInfo *single_locs_begin() const; const VarLocInfo *single_locs_end() const; const VarLocInfo *locs_begin(const Instruction *Before) const; const VarLocInfo *locs_end(const Instruction *Before) const; void print(raw_ostream &OS, const Function &Fn) const; Debug intrinsics can be ignored after running the analysis. Instead, variable location definitions that occur between an instruction `Inst` and its predecessor (or block start) can be found by looping over the range: locs_begin(Inst), locs_end(Inst) Similarly, variables with a memory location that is valid for their lifetime can be iterated over using the range: single_locs_begin(), single_locs_end() Further detail -------------- For an explanation of the dataflow implementation and the integration with SelectionDAG, please see the reviews linked at the top of this commit message. Reviewed By: jmorse
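As a rough illustration of the iteration pattern described under "API usage", here is a self-contained mock. The `Instruction`, `VarLocInfo`, and `FunctionVarLocs` types below are hypothetical stand-ins written for this sketch (the real LLVM classes carry much more state), and `addSingle`, `addBefore`, and `describeLocsBefore` are invented helpers. Only the four `*_begin`/`*_end` accessor names come from the commit message.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the LLVM types of the same name.
struct Instruction { std::string Name; };
struct VarLocInfo { std::string Variable; std::string Location; };

// Mock of the result object: same accessor names as in the commit message,
// but the storage and the add* helpers are assumptions made for this sketch.
class FunctionVarLocs {
  std::vector<VarLocInfo> Singles;  // locations valid for a variable's lifetime
  std::vector<VarLocInfo> PerInst;  // definitions attached to instruction positions
  std::map<const Instruction *, std::pair<std::size_t, std::size_t>> Ranges;

public:
  void addSingle(VarLocInfo Loc) { Singles.push_back(std::move(Loc)); }

  // Record the definitions that occur just before `Before` (call once per instruction).
  void addBefore(const Instruction *Before, const std::vector<VarLocInfo> &Locs) {
    std::size_t Begin = PerInst.size();
    PerInst.insert(PerInst.end(), Locs.begin(), Locs.end());
    Ranges[Before] = {Begin, PerInst.size()};
  }

  const VarLocInfo *single_locs_begin() const { return Singles.data(); }
  const VarLocInfo *single_locs_end() const { return Singles.data() + Singles.size(); }

  // Unknown instructions yield an empty range.
  const VarLocInfo *locs_begin(const Instruction *Before) const {
    auto It = Ranges.find(Before);
    return PerInst.data() + (It == Ranges.end() ? 0 : It->second.first);
  }
  const VarLocInfo *locs_end(const Instruction *Before) const {
    auto It = Ranges.find(Before);
    return PerInst.data() + (It == Ranges.end() ? 0 : It->second.second);
  }
};

// Loop over the range [locs_begin(Inst), locs_end(Inst)) as the message describes.
std::vector<std::string> describeLocsBefore(const FunctionVarLocs &Results,
                                            const Instruction &Inst) {
  std::vector<std::string> Out;
  for (const VarLocInfo *It = Results.locs_begin(&Inst),
                        *End = Results.locs_end(&Inst);
       It != End; ++It)
    Out.push_back(It->Variable + " -> " + It->Location);
  return Out;
}
```

A consumer would call `describeLocsBefore` once per instruction while walking a block, and walk `single_locs_begin()` to `single_locs_end()` once per function.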

…est unittest

Need to finalize the DIBuilder to avoid leak sanitizer errors like this:

Direct leak of 48 byte(s) in 1 object(s) allocated from:
    #0 0x55c99ea1761d in operator new(unsigned long)
    #1 0x55c9a518ae49 in operator new
    #2 0x55c9a518ae49 in llvm::MDTuple::getImpl(...)
    #3 0x55c9a4f1b1ec in getTemporary
    #4 0x55c9a4f1b1ec in llvm::DIBuilder::createFunction(...)


The motivation for this change is a workload generated by the XLA compiler targeting NVIDIA GPUs.

This kernel has a few hundred i8 loads and stores. Merging is critical for performance.

The current LSV doesn't merge these well because it only considers instructions within a block of 64 loads+stores. This limit is necessary to contain the O(n^2) behavior of the pass. I'm hesitant to increase the limit, because this pass is already one of the slowest parts of compiling an XLA program.

So we rewrite basically the whole thing to use a new algorithm. Before, we compared every load/store to every other to see if they're consecutive. The insight (from tra@) is that this is redundant. If we know the offset from PtrA to PtrB, then we don't need to compare PtrC to both of them in order to tell whether C may be adjacent to A or B.

So that's what we do. When scanning a basic block, we maintain a list of chains, where we know the offset from every element in the chain to the first element in the chain. Each instruction gets compared only to the leaders of all the chains.

In the worst case, this is still O(n^2), because all chains might be of length 1. To prevent compile time blowup, we only consider the 64 most recently used chains. Thus we do no more comparisons than before, but we have the potential to make much longer chains.

This rewrite affects many tests. The changes to tests fall into two categories.

1. The old code had what appears to be a bug when deciding whether a misaligned vectorized load is fast. Suppose TTI reports that load <4 x i32> align 4 has relative speed 1, and suppose that load i32 align 4 has relative speed 32.

   The intent of the code seems to be that we prefer the scalar load, because it's faster. But the old code would choose the vectorized load. accessIsMisaligned would set RelativeSpeed to 0 for the scalar load (and not even call into TTI to get the relative speed), because the scalar load is aligned.

   After this patch, we will prefer the scalar load if it's faster.

2. This patch changes the logic for how we vectorize. Usually this results in vectorizing more.

Explanation of changes to tests:

- AMDGPU/adjust-alloca-alignment.ll: #1
- AMDGPU/flat_atomic.ll: #2, we vectorize more.
- AMDGPU/int_sideeffect.ll: #2, there are two possible locations for the call to @foo, and the pass is brittle to this. Before, we'd vectorize in case 1 and not case 2. Now we vectorize in case 2 and not case 1. So we just move the call.
- AMDGPU/adjust-alloca-alignment.ll: #2, we vectorize more
- AMDGPU/insertion-point.ll: #2 we vectorize more
- AMDGPU/merge-stores-private.ll: #1 (undoes changes from git rev 86f9117, which appear to have hit the bug from #1)
- AMDGPU/multiple_tails.ll: #1
- AMDGPU/vect-ptr-ptr-size-mismatch.ll: Fix alignment (I think related to #1 above).
- AMDGPU CodeGen: I have difficulty commenting on these changes, but many of them look like #2, we vectorize more.
- NVPTX/4x2xhalf.ll: Fix alignment (I think related to #1 above).
- NVPTX/vectorize_i8.ll: We don't generate <3 x i8> vectors on NVPTX because they're not legal (and eventually get split)
- X86/correct-order.ll: #2, we vectorize more, probably because of changes to the chain-splitting logic.
- X86/subchain-interleaved.ll: #2, we vectorize more
- X86/vector-scalar.ll: #2, we can now vectorize scalar float + <1 x float>
- X86/vectorize-i8-nested-add-inseltpoison.ll: Deleted the nuw test because it was nonsensical. It was doing `add nuw %v0, -1`, but this is equivalent to `add nuw %v0, 0xffff'ffff`, which is equivalent to asserting that %v0 == 0.
- X86/vectorize-i8-nested-add.ll: Same as nested-add-inseltpoison.ll

Differential Revision: https://reviews.llvm.org/D149893
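The chain-gathering idea from the message above (compare each access only to chain leaders, and keep only the most recently used chains alive) can be sketched in isolation. This is not the LoadStoreVectorizer code. `Access`, `offsetFrom`, `Chain`, and `gatherChains` are hypothetical names for this example, and the "same base" test stands in for whatever pointer-difference query the real pass performs.

```cpp
#include <cstddef>
#include <cstdint>
#include <list>
#include <optional>
#include <utility>
#include <vector>

// Toy model of a load/store: two accesses are comparable only if they share a base.
struct Access {
  int Base;
  std::int64_t Offset;
};

// Stand-in for the expensive "what is the offset from Leader to A?" query.
std::optional<std::int64_t> offsetFrom(const Access &Leader, const Access &A) {
  if (Leader.Base != A.Base)
    return std::nullopt;
  return A.Offset - Leader.Offset;
}

struct Chain {
  Access Leader;
  // (offset from the leader, index of the access in the block)
  std::vector<std::pair<std::int64_t, std::size_t>> Members;
};

// Scan a block once, comparing each access only to the leaders of live chains.
// At most MaxChains chains stay live; the least recently used one is retired.
std::vector<Chain> gatherChains(const std::vector<Access> &Block,
                                std::size_t MaxChains = 64) {
  std::list<Chain> Live; // front = most recently used
  std::vector<Chain> Done;

  for (std::size_t I = 0; I < Block.size(); ++I) {
    bool Placed = false;
    for (auto It = Live.begin(); It != Live.end(); ++It) {
      if (auto Off = offsetFrom(It->Leader, Block[I])) {
        It->Members.emplace_back(*Off, I);
        Live.splice(Live.begin(), Live, It); // mark as most recently used
        Placed = true;
        break;
      }
    }
    if (Placed)
      continue;

    // No leader matched: start a new chain led by this access.
    Chain Fresh{Block[I], {}};
    Fresh.Members.emplace_back(0, I);
    Live.push_front(std::move(Fresh));
    if (Live.size() > MaxChains) {
      Done.push_back(std::move(Live.back()));
      Live.pop_back();
    }
  }

  for (Chain &C : Live)
    Done.push_back(std::move(C));
  return Done;
}
```

Each access costs at most `MaxChains` leader comparisons, so the total work stays bounded even when every chain has length 1, while a chain that keeps getting hits can grow well past 64 members.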

Running this on Amazon Ubuntu the final backtrace is:

```
(lldb) thread backtrace
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
  * frame #0: 0x0000aaaaaaaa07d0 a.out`func_c at main.c:10:3
    frame #1: 0x0000aaaaaaaa07c4 a.out`func_b at main.c:14:3
    frame #2: 0x0000aaaaaaaa07b4 a.out`func_a at main.c:18:3
    frame #3: 0x0000aaaaaaaa07a4 a.out`main(argc=<unavailable>, argv=<unavailable>) at main.c:22:3
    frame #4: 0x0000fffff7b373fc libc.so.6`___lldb_unnamed_symbol2962 + 108
    frame #5: 0x0000fffff7b374cc libc.so.6`__libc_start_main + 152
    frame #6: 0x0000aaaaaaaa06b0 a.out`_start + 48
```

This causes the test to fail because of the extra ___lldb_unnamed_symbol2962 frame (an inlined function?).

To fix this, strictly check all the frames in main.c then for the rest just check we find __libc_start_main and _start in that order regardless of other frames in between.

Reviewed By: omjavaid

Differential Revision: https://reviews.llvm.org/D154204
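The relaxed check described above (exact match on the leading frames, then an in-order search that tolerates extra frames) can be sketched as a small standalone function. This is not the lldb test code, which is not shown here. `framesMatch` and its parameter names are made up for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns true when Frames starts with exactly StrictPrefix and the remaining
// frames contain every name in OrderedTail, in that order, with any number of
// unrelated frames allowed in between.
bool framesMatch(const std::vector<std::string> &Frames,
                 const std::vector<std::string> &StrictPrefix,
                 const std::vector<std::string> &OrderedTail) {
  if (Frames.size() < StrictPrefix.size())
    return false;

  // Strict part: position-by-position equality.
  for (std::size_t I = 0; I < StrictPrefix.size(); ++I)
    if (Frames[I] != StrictPrefix[I])
      return false;

  // Relaxed part: OrderedTail must be a subsequence of the remaining frames.
  std::size_t Next = 0;
  for (std::size_t I = StrictPrefix.size();
       I < Frames.size() && Next < OrderedTail.size(); ++I)
    if (Frames[I] == OrderedTail[Next])
      ++Next;

  return Next == OrderedTail.size();
}
```

With the backtrace above, a prefix of `func_c`, `func_b`, `func_a`, `main` and a tail of `__libc_start_main`, `_start` would match whether or not the `___lldb_unnamed_symbol2962` frame is present.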

…tput

The crash happens in clang::driver::tools::SplitDebugName when Output is InputInfo::Nothing. It doesn't happen with standalone clang driver because output is created in Driver::BuildJobsForActionNoCache.

Example backtrace:

```
* thread #1, name = 'clangd', stop reason = hit program assert
  * frame #0: 0x00007ffff5c4eacf libc.so.6`raise + 271
    frame #1: 0x00007ffff5c21ea5 libc.so.6`abort + 295
    frame #2: 0x00007ffff5c21d79 libc.so.6`__assert_fail_base.cold.0 + 15
    frame #3: 0x00007ffff5c47426 libc.so.6`__assert_fail + 70
    frame #4: 0x000055555dc0923c clangd`clang::driver::InputInfo::getFilename(this=0x00007fffffff9398) const at InputInfo.h:84:5
    frame #5: 0x000055555dcd0d8d clangd`clang::driver::tools::SplitDebugName(JA=0x000055555f6c6a50, Args=0x000055555f6d0b80, Input=0x00007fffffff9678, Output=0x00007fffffff9398) at CommonArgs.cpp:1275:40
    frame #6: 0x000055555dc955a5 clangd`clang::driver::tools::Clang::ConstructJob(this=0x000055555f6c69d0, C=0x000055555f6c64a0, JA=0x000055555f6c6a50, Output=0x00007fffffff9398, Inputs=0x00007fffffff9668, Args=0x000055555f6d0b80, LinkingOutput=0x0000000000000000) const at Clang.cpp:5690:33
    frame #7: 0x000055555dbf6b54 clangd`clang::driver::Driver::BuildJobsForActionNoCache(this=0x00007fffffffb5e0, C=0x000055555f6c64a0, A=0x000055555f6c6a50, TC=0x000055555f6c4be0, BoundArch=(Data = 0x0000000000000000, Length = 0), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=1, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:5618:10
    frame #8: 0x000055555dbf4ef0 clangd`clang::driver::Driver::BuildJobsForAction(this=0x00007fffffffb5e0, C=0x000055555f6c64a0, A=0x000055555f6c6a50, TC=0x000055555f6c4be0, BoundArch=(Data = 0x0000000000000000, Length = 0), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=1, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:5306:26
    frame #9: 0x000055555dbeb590 clangd`clang::driver::Driver::BuildJobs(this=0x00007fffffffb5e0, C=0x000055555f6c64a0) const at Driver.cpp:4844:5
    frame #10: 0x000055555dbe6b0f clangd`clang::driver::Driver::BuildCompilation(this=0x00007fffffffb5e0, ArgList=ArrayRef<const char *> @ 0x00007fffffffb268) at Driver.cpp:1496:3
    frame #11: 0x000055555b0cc0d9 clangd`clang::createInvocation(ArgList=ArrayRef<const char *> @ 0x00007fffffffbb38, Opts=CreateInvocationOptions @ 0x00007fffffffbb90) at CreateInvocationFromCommandLine.cpp:53:52
    frame #12: 0x000055555b378e7b clangd`clang::clangd::buildCompilerInvocation(Inputs=0x00007fffffffca58, D=0x00007fffffffc158, CC1Args=size=0) at Compiler.cpp:116:44
    frame #13: 0x000055555895a6c8 clangd`clang::clangd::(anonymous namespace)::Checker::buildInvocation(this=0x00007fffffffc760, TFS=0x00007fffffffe570, Contents= Has Value=false ) at Check.cpp:212:9
    frame #14: 0x0000555558959cec clangd`clang::clangd::check(File=(Data = "build/test.cpp", Length = 64), TFS=0x00007fffffffe570, Opts=0x00007fffffffe600) at Check.cpp:486:34
    frame #15: 0x000055555892164a clangd`main(argc=4, argv=0x00007fffffffecd8) at ClangdMain.cpp:993:12
    frame #16: 0x00007ffff5c3ad85 libc.so.6`__libc_start_main + 229
    frame #17: 0x00005555585bbe9e clangd`_start + 46
```

Test Plan: ninja ClangDriverTests && tools/clang/unittests/Driver/ClangDriverTests

Differential Revision: https://reviews.llvm.org/D154602

fhahn and others added 30 commits December 22, 2021 12:44

[lldb/python] Avoid more dangling pointers in python glue code

2efc689

[lldb] Use GetSupportedArchitectures on darwin platforms

e7c48f3

This finishes the GetSupportedArchitectureAtIndex migration. There are opportunities to simplify this even further, but I am going to leave that to the platform owners. Differential Revision: https://reviews.llvm.org/D116028

[msan] Break optimization in memccpy tests

a9bb97e

After D116148 the memccpy gets optimized away and the expected uninitialized memory access does not occur. Make sure the call does not get optimized away.

[clang-tidy] abseil-string-find-startswith: detect s.rfind(z, 0) == 0

fd8fc5e

Suggest converting `std::string::rfind()` calls to `absl::StartsWith()` where possible.
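To see why `s.rfind(z, 0) == 0` is a prefix test, note that `rfind` with a start position of 0 can only report a match at index 0. The sketch below compares that idiom against an explicit prefix comparison using only the standard library. The two function names are hypothetical, and `absl::StartsWith` is deliberately not called so that the example stays self-contained.

```cpp
#include <string>

// The idiom the check detects: rfind limited to position 0 either returns 0
// (S begins with Prefix) or std::string::npos.
bool startsWithViaRfind(const std::string &S, const std::string &Prefix) {
  return S.rfind(Prefix, 0) == 0;
}

// An explicit prefix comparison, used here as the reference behaviour.
bool startsWithViaCompare(const std::string &S, const std::string &Prefix) {
  return S.size() >= Prefix.size() &&
         S.compare(0, Prefix.size(), Prefix) == 0;
}
```

Both helpers agree on the edge cases too: an empty prefix always matches, and a prefix longer than the string never does.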

[JSONNodeDumper] Regenerate test checks (NFC)

da007a3

gen_ast_dump_json_test.py adds these lines of whitespace. Precommit it to avoid spurious diffs in future changes.

[OpenMP] Regenerate test checks (NFC)

0fe1ccc

Regenerate test checks to reduce diff for an upcoming patch.

[NFC][Clang] Move function implementation of `OpenMPAtomicUpdateCheck…

a364e8f

…er` into anonymous namespace Just to keep code consistent as `OpenMPAtomicUpdateChecker` is defined in anonymous namespace. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D116068

Revert "[AMDGPU] Move call clobbered return address registers s[30:31…

09b5329

…] to callee saved range" This reverts commit 9075009. Failed amdgpu runtime buildbot # 3514

[DAG][X86] Add TargetLowering::isSplatValueForTargetNode override

4639461

Add a callback that lets us test whether target nodes are splat vectors. Added some basic X86ISD::VBROADCAST + X86ISD::VBROADCAST_LOAD handling.

[mlir][arith] Fix CmpIOP folding for vector types.

4a10457

Previously, the folding assumed that it always operates on scalar types. Differential Revision: https://reviews.llvm.org/D116151

[gn build] Port cb8a0b0

ece75e2

Remove superfluous semicolon.

8b58344

Missed by MSVC

[mlir] Fix missing namespace (NFC)

db68e6a

[NFC][AMDGPU][CostModel] Add tests for AMDGPU cost model.

deaedab

[NFC][AMDGPU][CostModel] Add tests for AMDGPU cost model, part 2.

a2120f6

[libc][obvious] fix formatting mistake

79abf89

I missed two instances of "SetUp" being replaced by "set_up" and "TearDown" being replaced by "tear_down" when finalizing the formatting change. This fixes that. Differential Revision: https://reviews.llvm.org/D116178

[mlir] Update BUILD.bazel to include scf_tests

ad761f0

ljmf00 added 4 commits December 27, 2021 02:51

[lldb] Add mapping for DType kind and lldb::BasicType

806ad10

[lldb] Implement Dump for D TypeSystem

e429ca3

[lldb] Add minimal type information for builtin types in D TypeSystem

806fe93

[lldb] Add target triple on D TypeSystem

38e5f71

ljmf00 force-pushed the lldb-d/implement-typesystem-d branch from 9504086 to 5a953be Compare December 30, 2021 03:14

ljmf00 force-pushed the lldb-d/implement-typesystem-d branch from a1d3583 to 7d4b672 Compare January 5, 2022 02:03

ljmf00 added 7 commits January 5, 2022 18:17

[lldb] Implement GetBitSize for builtin types in D TypeSystem

51638b5

[lldb] Add type name mapping for DType

83aeeb6

[lldb] Use bit size on DWARF encoding

b60e906

[lldb] Add hardcoded bitsize version of real builtin type

6b4138d

[lldb] Add support for the rest of the builtin types

47403dd

[lldb] Add support for Derived types

6f1192a

ljmf00 force-pushed the lldb-d/implement-typesystem-d branch from 7d4b672 to 6f1192a Compare January 14, 2022 21:51


ljmf00 commented Dec 23, 2021
