Improve type lookup using .debug_names parent chain #108907

jeffreytan81 · 2024-09-17T01:15:52Z

Summary

This PR improves SymbolFileDWARF::FindTypes() by utilizing the newly added parent chain DW_IDX_parent in .debug_names. The proposal was originally discussed in this RFC.

Implementation

To leverage the parent chain for SymbolFileDWARF::FindTypes(), this PR adds a new API: GetTypesWithQuery in DWARFIndex base class. The API performs the same function as GetTypes with additional filtering using TypeQuery. Since this only introduces filtering, the callback mechanisms at all call sites remain unchanged. A default implementation is given in DWARFIndex class which parses debug info and performs the matching. In the DebugNameDWARFIndex override, the parent_contexts in the TypeQuery is cross checked with parent chain in .debug_names for for much faster filtering before fallback to base implementation for final filtering.

Unlike the GetFullyQualifiedType API, which fully consumes the DW_IDX_parent parent chain for exact matching, these new APIs perform partial subset matching for type/namespace queries. This is necessary to support queries involving anonymous or inline namespaces. For instance, a user might request NS1::NS2::NS3::Foo, while the index table's parent chain might contain NS1::inline_NS2::NS3::Foo, which would fail exact matching.

Performance Results

In one of our internal target using .debug_names + split dwarf. Expanding a "this" pointer in locals view in VSCode:
94s => 48s. (Not sure why I got 94s this time instead of 70s last week).

github-actions · 2024-09-17T01:19:56Z

✅ With the latest revision this PR passed the C/C++ code formatter.

llvmbot · 2024-09-17T03:03:18Z

@llvm/pr-subscribers-lldb

Author: None (jeffreytan81)

Changes

Context

We have a customer reporting scenario that expanding "this" pointer for a non-trivial class method context takes more than 70~90 seconds. This is a linux target with .debug_names index table enabled and using split dwarf dwo files to avoid debug info overflow.

Profiling shows the majority of the bottleneck is spent in SymbolFileDWARF::FindTypes() and SymbolFileDWARF:: FindNamespace() two functions searching types/namespaces in .debug_names index table with base name, then parsing/loading the underlying dwo files which can be very slow.

Summary

This PR improves SymbolFileDWARF::FindTypes() and SymbolFileDWARF::FindNamespace() by using the newly added parent chain DW_IDX_parent in .debug_names. The proposal is originally discussed in this RFC.

This PR also implements a new algorithm in DebugNamesDWARFIndex::SameAsEntryATName() that can greatly avoid parsing the dwo files.

Implementation

To leverage parent chain for SymbolFileDWARF::FindTypes and SymbolFileDWARF:: FindNamespace, the PR adds two new APIs GetTypesWithParents and GetNamespacesWithParents in DWARFIndex base class. These two APIs are performing the same semantics as GetTypes/GetNamespaces with extra filtering if parent_names parameter is not empty. Since only filtering is performed, the callback to all the callsites are not changed. A default implementation in DWARFIndex class is implemented to parse type context from each base name matched DIE. In DebugNameDWARFIndex override, it uses parent chain to perform much faster searching.

Unlike GetFullyQualifiedType API which is the first one consuming DW_IDX_parent parent chain, these two new APIs do not perform exact matching, other than partial subset matching for the type/namespace queries. This is required to support anonymous/inline namespace queries from client. For example, users may ask for NS1::NS2::NS3::Foo while index table's parent chain may contain NS1::inline_NS2::NS3::Foo which would fail the exact matching.

Another optimization performed is inside DebugNamesDWARFIndex::SameAsEntryATName(). The old implementation uses GetNonSkeletonUnit() which triggers parsing/loading of underlying dwo files which proves to be very expensive which we want to avoid. The new implementation tries to avoid touching dwo as much as possible. There are two ideas:

Build <pool_entry_range => NameEntry> mapping table (built lazily as needed apparently), then we can query check the name of the parent entry.
Searching the name the entry tries to match from current NameIndex, then check if the parent entry can be found from the name entry.

Per my testing/profiling, the 1) does not seem to improve wall-time because the cost to build the mapping table can be high. 2) has proved to be very fast.

Performance result

The other landed PR reduces user's scenario wall-time from 70s => 50s
Using GetTypesWithParents/GetNamespacesWithParents reduces 50s => 13s
Using the new algorithm to avoid touching dwo files in DebugNamesDWARFIndex::SameAsEntryATName() reduces 13s => 1.4s

Overall, these changes reduce from 70s => 1.4s

Patch is 26.47 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/108907.diff

6 Files Affected:

(modified) lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.cpp (+59)
(modified) lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.h (+22)
(modified) lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp (+151-14)
(modified) lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.h (+30)
(modified) lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp (+107-73)
(modified) lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h (+2)

diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.cpp
index eafddbad03f57b..3101b1e137fbf3 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.cpp
@@ -126,3 +126,62 @@ bool DWARFIndex::GetFullyQualifiedTypeImpl(
     return callback(die);
   return true;
 }
+
+void DWARFIndex::GetNamespacesWithParents(
+    ConstString name, llvm::ArrayRef<llvm::StringRef> parent_names,
+    llvm::function_ref<bool(DWARFDIE die)> callback) {
+  GetNamespaces(name, [&](DWARFDIE die) {
+    return ProcessDieMatchParentNames(name, parent_names, die, callback);
+  });
+}
+
+void DWARFIndex::GetTypesWithParents(
+    ConstString name, llvm::ArrayRef<llvm::StringRef> parent_names,
+    llvm::function_ref<bool(DWARFDIE die)> callback) {
+  GetTypes(name, [&](DWARFDIE die) {
+    return ProcessDieMatchParentNames(name, parent_names, die, callback);
+  });
+}
+
+bool DWARFIndex::ProcessDieMatchParentNames(
+    ConstString name, llvm::ArrayRef<llvm::StringRef> query_parent_names,
+    DWARFDIE die, llvm::function_ref<bool(DWARFDIE die)> callback) {
+  std::vector<lldb_private::CompilerContext> type_context =
+      die.GetTypeLookupContext();
+  if (type_context.empty()) {
+    // If both type_context and query_parent_names and empty we have a match.
+    // Otherwise, this one does not match and we keep on searching.
+    if (query_parent_names.empty())
+      return callback(die);
+    return true;
+  }
+
+  // Type lookup context includes the current DIE as the last element.
+  // so revert it for easy matching.
+  std::reverse(type_context.begin(), type_context.end());
+
+  // type_context includes the name of the current DIE while query_parent_names
+  // doesn't. So start check from index 1 for dwarf_decl_ctx.
+  uint32_t i = 1, j = 0;
+  while (i < type_context.size() && j < query_parent_names.size()) {
+    // If type_context[i] has no name, skip it.
+    // e.g. this can happen for anonymous namespaces.
+    if (type_context[i].name.IsNull() || type_context[i].name.IsEmpty()) {
+      ++i;
+      continue;
+    }
+    // If the name doesn't match, skip it.
+    // e.g. this can happen for inline namespaces.
+    if (query_parent_names[j] != type_context[i].name) {
+      ++i;
+      continue;
+    }
+    ++i;
+    ++j;
+  }
+  // If not all query_parent_names were found in type_context.
+  // This die does not meet the criteria, try next one.
+  if (j != query_parent_names.size())
+    return true;
+  return callback(die);
+}
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.h b/lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.h
index cb3ae8a06d7885..1f457fda0282f5 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.h
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.h
@@ -64,6 +64,23 @@ class DWARFIndex {
   virtual void
   GetNamespaces(ConstString name,
                 llvm::function_ref<bool(DWARFDIE die)> callback) = 0;
+
+  /// Get type DIEs whose base name match \param name with \param parent_names
+  /// in its decl parent chain as subset.  A base implementation is provided,
+  /// Specializations should override this if they are able to provide a faster
+  /// implementation.
+  virtual void
+  GetTypesWithParents(ConstString name,
+                      llvm::ArrayRef<llvm::StringRef> parent_names,
+                      llvm::function_ref<bool(DWARFDIE die)> callback);
+  /// Get namespace DIEs whose base name match \param name with \param
+  /// parent_names in its decl parent chain as subset.  A base implementation is
+  /// provided. Specializations should override this if they are able to provide
+  /// a faster implementation.
+  virtual void
+  GetNamespacesWithParents(ConstString name,
+                           llvm::ArrayRef<llvm::StringRef> parent_names,
+                           llvm::function_ref<bool(DWARFDIE die)> callback);
   virtual void
   GetFunctions(const Module::LookupInfo &lookup_info, SymbolFileDWARF &dwarf,
                const CompilerDeclContext &parent_decl_ctx,
@@ -115,6 +132,11 @@ class DWARFIndex {
   bool
   GetFullyQualifiedTypeImpl(const DWARFDeclContext &context, DWARFDIE die,
                             llvm::function_ref<bool(DWARFDIE die)> callback);
+
+  /// Check if the \a die has \a parent_names in its decl parent chain.
+  bool ProcessDieMatchParentNames(
+      ConstString name, llvm::ArrayRef<llvm::StringRef> parent_names,
+      DWARFDIE die, llvm::function_ref<bool(DWARFDIE die)> callback);
 };
 } // namespace dwarf
 } // namespace lldb_private::plugin
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
index 32d8a92305aafa..76c80011791de6 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
@@ -12,6 +12,8 @@
 #include "Plugins/SymbolFile/DWARF/LogChannelDWARF.h"
 #include "Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.h"
 #include "lldb/Core/Module.h"
+#include "lldb/Core/Progress.h"
+#include "lldb/Utility/LLDBLog.h"
 #include "lldb/Utility/RegularExpression.h"
 #include "lldb/Utility/Stream.h"
 #include "llvm/ADT/Sequence.h"
@@ -301,7 +303,8 @@ using Entry = llvm::DWARFDebugNames::Entry;
 /// If any parent does not have an `IDX_parent`, or the Entry data is corrupted,
 /// nullopt is returned.
 std::optional<llvm::SmallVector<Entry, 4>>
-getParentChain(Entry entry, uint32_t max_parents) {
+getParentChain(Entry entry,
+               uint32_t max_parents = std::numeric_limits<uint32_t>::max()) {
   llvm::SmallVector<Entry, 4> parent_entries;
 
   do {
@@ -374,25 +377,40 @@ void DebugNamesDWARFIndex::GetFullyQualifiedType(
   m_fallback.GetFullyQualifiedType(context, callback);
 }
 
+bool DebugNamesDWARFIndex::SameAsEntryATName(
+    llvm::StringRef query_name, const DebugNames::Entry &entry) const {
+  auto maybe_dieoffset = entry.getDIEUnitOffset();
+  if (!maybe_dieoffset)
+    return false;
+
+  // [Optimization] instead of parsing the entry from dwo file, we simply
+  // check if the query_name can point to an entry of the same DIE offset.
+  // This greatly reduced number of dwo file parsed and thus improved the
+  // performance.
+  for (const DebugNames::Entry &query_entry :
+       entry.getNameIndex()->equal_range(query_name)) {
+    auto query_dieoffset = query_entry.getDIEUnitOffset();
+    if (!query_dieoffset)
+      continue;
+
+    if (*query_dieoffset == *maybe_dieoffset) {
+      return true;
+    } else if (*query_dieoffset > *maybe_dieoffset) {
+      // The pool entries of the same name are sequentially cluttered together
+      // so if the query name from `query_name` is after the target entry, this
+      // is definitely not the correct one, we can stop searching.
+      return false;
+    }
+  }
+  return false;
+}
+
 bool DebugNamesDWARFIndex::SameParentChain(
     llvm::ArrayRef<llvm::StringRef> parent_names,
     llvm::ArrayRef<DebugNames::Entry> parent_entries) const {
-
   if (parent_entries.size() != parent_names.size())
     return false;
 
-  auto SameAsEntryATName = [this](llvm::StringRef name,
-                                  const DebugNames::Entry &entry) {
-    // Peek at the AT_name of `entry` and test equality to `name`.
-    auto maybe_dieoffset = entry.getDIEUnitOffset();
-    if (!maybe_dieoffset)
-      return false;
-    DWARFUnit *unit = GetNonSkeletonUnit(entry);
-    if (!unit)
-      return false;
-    return name == unit->PeekDIEName(unit->GetOffset() + *maybe_dieoffset);
-  };
-
   // If the AT_name of any parent fails to match the expected name, we don't
   // have a match.
   for (auto [parent_name, parent_entry] :
@@ -402,6 +420,36 @@ bool DebugNamesDWARFIndex::SameParentChain(
   return true;
 }
 
+bool DebugNamesDWARFIndex::WithinParentChain(
+    llvm::ArrayRef<llvm::StringRef> query_parent_names,
+    llvm::ArrayRef<DebugNames::Entry> parent_chain) const {
+  if (parent_chain.size() < query_parent_names.size())
+    return false;
+
+  size_t query_idx = 0, chain_idx = 0;
+  while (query_idx < query_parent_names.size() &&
+         chain_idx < parent_chain.size()) {
+    if (SameAsEntryATName(query_parent_names[query_idx],
+                          parent_chain[chain_idx])) {
+      ++query_idx;
+      ++chain_idx;
+    } else {
+      // Name does not match, try next parent_chain entry if the current entry
+      // is namespace because the current one can be an inline namespace.
+      if (parent_chain[chain_idx].tag() != DW_TAG_namespace)
+        return false;
+
+      ++chain_idx;
+      if (query_parent_names.size() - query_idx >
+          parent_chain.size() - chain_idx) {
+        // Parent chain has not enough entries, we can't possibly have a match.
+        return false;
+      }
+    }
+  }
+  return query_idx == query_parent_names.size();
+}
+
 void DebugNamesDWARFIndex::GetTypes(
     ConstString name, llvm::function_ref<bool(DWARFDIE die)> callback) {
   for (const DebugNames::Entry &entry :
@@ -444,6 +492,95 @@ void DebugNamesDWARFIndex::GetNamespaces(
   m_fallback.GetNamespaces(name, callback);
 }
 
+void DebugNamesDWARFIndex::GetNamespacesWithParents(
+    ConstString name, llvm::ArrayRef<llvm::StringRef> parent_names,
+    llvm::function_ref<bool(DWARFDIE die)> callback) {
+  Progress progress("Get namespace from index for %s", name.GetCString());
+  for (const DebugNames::Entry &entry :
+       m_debug_names_up->equal_range(name.GetStringRef())) {
+    lldb_private::dwarf::Tag entry_tag = entry.tag();
+    if (entry_tag == DW_TAG_namespace ||
+        entry_tag == DW_TAG_imported_declaration) {
+      std::optional<llvm::SmallVector<Entry, 4>> parent_chain =
+          getParentChain(entry);
+      if (!parent_chain) {
+        // Fallback: use the base class implementation.
+        if (!ProcessEntry(entry, [&](DWARFDIE die) {
+              return ProcessDieMatchParentNames(name, parent_names, die,
+                                                callback);
+            }))
+          return;
+        continue;
+      }
+
+      // If parent_chain is shorter than parent_names, we can't possibly match.
+      if (parent_chain->size() < parent_names.size())
+        continue;
+      else if (parent_chain->size() == parent_names.size()) {
+        if (SameParentChain(parent_names, *parent_chain) &&
+            !ProcessEntry(entry, callback))
+          return;
+      } else {
+        // When parent_chain has more entries than parent_names we still
+        // possibly match if parent_chain contains inline namespaces.
+        if (WithinParentChain(parent_names, *parent_chain) &&
+            !ProcessEntry(entry, callback))
+          return;
+      }
+    }
+  }
+
+  m_fallback.GetNamespaces(name, callback);
+}
+
+void DebugNamesDWARFIndex::GetTypesWithParents(
+    ConstString name, llvm::ArrayRef<llvm::StringRef> parent_names,
+    llvm::function_ref<bool(DWARFDIE die)> callback) {
+  if (parent_names.empty())
+    return GetTypes(name, callback);
+
+  // For each entry, grab its parent chain and check if we have a match.
+  for (const DebugNames::Entry &entry : m_debug_names_up->equal_range(name)) {
+    if (!isType(entry.tag()))
+      continue;
+
+    // If we get a NULL foreign_tu back, the entry doesn't match the type unit
+    // in the .dwp file, or we were not able to load the .dwo file or the DWO ID
+    // didn't match.
+    std::optional<DWARFTypeUnit *> foreign_tu = GetForeignTypeUnit(entry);
+    if (foreign_tu && foreign_tu.value() == nullptr)
+      continue;
+
+    std::optional<llvm::SmallVector<Entry, 4>> parent_chain =
+        getParentChain(entry);
+    if (!parent_chain) {
+      // Fallback: use the base class implementation.
+      if (!ProcessEntry(entry, [&](DWARFDIE die) {
+            return ProcessDieMatchParentNames(name, parent_names, die,
+                                              callback);
+          }))
+        return;
+      continue;
+    }
+
+    // If parent_chain is shorter than parent_names, we can't possibly match.
+    if (parent_chain->size() < parent_names.size())
+      continue;
+    else if (parent_chain->size() == parent_names.size()) {
+      if (SameParentChain(parent_names, *parent_chain) &&
+          !ProcessEntry(entry, callback))
+        return;
+    } else {
+      // When parent_chain has more entries than parent_names we still possibly
+      // match if parent_chain contains inline namespaces.
+      if (WithinParentChain(parent_names, *parent_chain) &&
+          !ProcessEntry(entry, callback))
+        return;
+    }
+  }
+  m_fallback.GetTypesWithParents(name, parent_names, callback);
+}
+
 void DebugNamesDWARFIndex::GetFunctions(
     const Module::LookupInfo &lookup_info, SymbolFileDWARF &dwarf,
     const CompilerDeclContext &parent_decl_ctx,
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.h b/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.h
index cb15c1d4f994b3..1f9c87c76a99f8 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.h
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.h
@@ -52,6 +52,14 @@ class DebugNamesDWARFIndex : public DWARFIndex {
                 llvm::function_ref<bool(DWARFDIE die)> callback) override;
   void GetNamespaces(ConstString name,
                      llvm::function_ref<bool(DWARFDIE die)> callback) override;
+  void
+  GetTypesWithParents(ConstString name,
+                      llvm::ArrayRef<llvm::StringRef> parent_names,
+                      llvm::function_ref<bool(DWARFDIE die)> callback) override;
+  void GetNamespacesWithParents(
+      ConstString name, llvm::ArrayRef<llvm::StringRef> parent_names,
+      llvm::function_ref<bool(DWARFDIE die)> callback) override;
+
   void GetFunctions(const Module::LookupInfo &lookup_info,
                     SymbolFileDWARF &dwarf,
                     const CompilerDeclContext &parent_decl_ctx,
@@ -118,6 +126,28 @@ class DebugNamesDWARFIndex : public DWARFIndex {
   bool SameParentChain(llvm::ArrayRef<llvm::StringRef> parent_names,
                        llvm::ArrayRef<DebugNames::Entry> parent_entries) const;
 
+  /// Returns true if \a parent_names entries are within \a parent_chain.
+  /// This is diffferent from SameParentChain() which checks for exact match.
+  /// This function is required because \a parent_chain can contain inline
+  /// namespace entries which may not be specified in parent_names by client.
+  ///
+  /// \param[in] parent_names
+  ///   The list of parent names to check for.
+  ///
+  /// \param[in] parent_chain
+  ///   The fully qualified parent chain entries from .debug_names index table
+  ///   to check against.
+  ///
+  /// \returns
+  ///   True if all \a parent_names entries are can be sequentially found inside
+  ///   \a parent_chain, otherwise False.
+  bool WithinParentChain(llvm::ArrayRef<llvm::StringRef> parent_names,
+                         llvm::ArrayRef<DebugNames::Entry> parent_chain) const;
+
+  /// Returns true if .debug_names pool entry \p entry has name \p query_name.
+  bool SameAsEntryATName(llvm::StringRef query_name,
+                         const DebugNames::Entry &entry) const;
+
   static void MaybeLogLookupError(llvm::Error error,
                                   const DebugNames::NameIndex &ni,
                                   llvm::StringRef name);
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
index f721ca00fd3559..cc2497b7e484ba 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
@@ -2731,6 +2731,22 @@ uint64_t SymbolFileDWARF::GetDebugInfoSize(bool load_all_debug_info) {
   return debug_info_size;
 }
 
+llvm::SmallVector<llvm::StringRef>
+SymbolFileDWARF::GetTypeQueryParentNames(TypeQuery query) {
+  std::vector<lldb_private::CompilerContext> &query_decl_context =
+      query.GetContextRef();
+  llvm::SmallVector<llvm::StringRef> parent_names;
+  if (!query_decl_context.empty()) {
+    auto rbegin = query_decl_context.rbegin(), rend = query_decl_context.rend();
+    for (auto rit = rbegin + 1; rit != rend; ++rit)
+      if ((rit->kind & CompilerContextKind::AnyType) !=
+              CompilerContextKind::Invalid &&
+          !rit->name.IsEmpty())
+        parent_names.push_back(rit->name.GetStringRef());
+  }
+  return parent_names;
+}
+
 void SymbolFileDWARF::FindTypes(const TypeQuery &query, TypeResults &results) {
 
   // Make sure we haven't already searched this SymbolFile before.
@@ -2748,45 +2764,50 @@ void SymbolFileDWARF::FindTypes(const TypeQuery &query, TypeResults &results) {
 
   std::lock_guard<std::recursive_mutex> guard(GetModuleMutex());
 
+  TypeQuery query_full(query);
+  llvm::SmallVector<llvm::StringRef> parent_names =
+      GetTypeQueryParentNames(query_full);
   bool have_index_match = false;
-  m_index->GetTypes(type_basename, [&](DWARFDIE die) {
-    // Check the language, but only if we have a language filter.
-    if (query.HasLanguage()) {
-      if (!query.LanguageMatches(GetLanguageFamily(*die.GetCU())))
-        return true; // Keep iterating over index types, language mismatch.
-    }
+  m_index->GetTypesWithParents(
+      query.GetTypeBasename(), parent_names, [&](DWARFDIE die) {
+        // Check the language, but only if we have a language filter.
+        if (query.HasLanguage()) {
+          if (!query.LanguageMatches(GetLanguageFamily(*die.GetCU())))
+            return true; // Keep iterating over index types, language mismatch.
+        }
 
-    // Check the context matches
-    std::vector<lldb_private::CompilerContext> die_context;
-    if (query.GetModuleSearch())
-      die_context = die.GetDeclContext();
-    else
-      die_context = die.GetTypeLookupContext();
-    assert(!die_context.empty());
-    if (!query.ContextMatches(die_context))
-      return true; // Keep iterating over index types, context mismatch.
-
-    // Try to resolve the type.
-    if (Type *matching_type = ResolveType(die, true, true)) {
-      if (matching_type->IsTemplateType()) {
-        // We have to watch out for case where we lookup a type by basename and
-        // it matches a template with simple template names. Like looking up
-        // "Foo" and if we have simple template names then we will match
-        // "Foo<int>" and "Foo<double>" because all the DWARF has is "Foo" in
-        // the accelerator tables. The main case we see this in is when the
-        // expression parser is trying to parse "Foo<int>" and it will first do
-        // a lookup on just "Foo". We verify the type basename matches before
-        // inserting the type in the results.
-        auto CompilerTypeBasename =
-            matching_type->GetForwardCompilerType().GetTypeName(true);
-        if (CompilerTypeBasename != query.GetTypeBasename())
-          return true; // Keep iterating over index types, basename mismatch.
-      }
-      have_index_match = true;
-      results.InsertUnique(matching_type->shared_from_this());
-    }
-    return !results.Done(query); // Keep iterating if we aren't done.
-  });
+        // Check the context matches
+        std::vector<lldb_private::CompilerContext> die_context;
+        if (query.GetModuleSearch())
+          die_context = die.GetDeclContext();
+        else
+          die_context = die.GetTypeLookupContext();
+        assert(!die_context.empty());
+        if (!query.ContextMatches(die_context))
+          return true; // Keep iterating over index types, context mismatch.
+
+        // Try to resolve the type.
+        if (Type *matching_type = ResolveType(die, true, true)) {
+          if (matching_type->IsTemplateType()) {
+            // We have to watch out for case where we lookup a type by basename
+            // and it matches a template with simple template names. Like
+            // looking up "Foo" and if we have simple template names then we
+            // will match "Foo<int>" and "Foo<double>" because all the DWARF has
+            // is "Foo" in the accelerator tables. The main case we see this in
+            // is when the expression parser is trying to parse "Foo<int>" and
+            // it will first do a lookup on just "Foo". We verify the type
+            // basename matches before inserting the type in...
[truncated]

jeffreytan81 · 2024-09-18T23:48:56Z

Ping, anyone wants to review this? Thanks

felipepiovezan

Very cool work! I've left some high level thoughts and questions inline. Focused only on the namespace bit for now, and didn't do a code review, just algorithm ideas

felipepiovezan · 2024-09-19T14:19:37Z

lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp

+      // The pool entries of the same name are sequentially cluttered together
+      // so if the query name from `query_name` is after the target entry, this
+      // is definitely not the correct one, we can stop searching.
+      return false;


does the spec guarantee this sorting of DIE offsets?

felipepiovezan · 2024-09-19T14:41:11Z

lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp

-    DWARFUnit *unit = GetNonSkeletonUnit(entry);
-    if (!unit)
-      return false;
-    return name == unit->PeekDIEName(unit->GetOffset() + *maybe_dieoffset);


I think this is the core of your proposal, right? In the original implementation that I did, the vast majority of the savings was thanks to being able to do PeekDIEName. I would strongly recommend we benchmark this new implementation is the non-dwo case as well.

One way to check is to build with regular object files (the non dwo case) and repeat the experiments I described in this reply:
https://discourse.llvm.org/t/rfc-improve-dwarf-5-debug-names-type-lookup-parsing-speed/74151/38?u=felipe
And instrumenting this portion of the code: 837bc0f

Sounds good. Let me try to build a target with split dwarf disable and measure the perf (not sure if I can use the same target because I might get debug_info or debug_str overflow).

@felipepiovezan , I am having trouble finding a context without split dwarf to measure the performance. Our target is way too big to build without split dwarf (linker error). I am trying to debug clang++ per your benchmark, but the evaluation is very fast 293ms which I do not think it is slow enough to measure perf. How do you get 5~6 seconds from this?

devbig623:~/llvm-sand/build/Debug/fbcode-x86_64/toolchain$ lldb -o "b CodeGenFunction::GenerateCode" -o r -o "timeit expr Fn" -- ~/llvm-sand/build/Debug/fbcode-x86_64/toolchain/bin/clang++ -std=c++17 -pthread -glldb -c ~/personal/main.cpp -o ~/personal/a (lldb) target create "/home/jeffreytan/llvm-sand/build/Debug/fbcode-x86_64/toolchain/bin/clang++" Current executable set to '/home/jeffreytan/llvm-sand/build/Debug/fbcode-x86_64/toolchain/bin/clang++' (x86_64). (lldb) settings set -- target.run-args "-std=c++17" "-pthread" "-glldb" "-c" "/home/jeffreytan/personal/main.cpp" "-o" "/home/jeffreytan/personal/a" (lldb) b CodeGenFunction::GenerateCode Breakpoint 1: where = clang++`clang::CodeGen::CodeGenFunction::GenerateCode(clang::GlobalDecl, llvm::Function*, clang::CodeGen::CGFunctionInfo const&) + 59 at CodeGenFunction.cpp:1440:3, address = 0x00000000083c507b (lldb) r Process 233320 launched: '/home/jeffreytan/llvm-sand/build/Debug/fbcode-x86_64/toolchain/bin/clang++' (x86_64) Process 233320 stopped * thread #1, name = 'clang++', stop reason = breakpoint 1.1 frame #0: 0x000055555d91907b clang++`clang::CodeGen::CodeGenFunction::GenerateCode(this=0x00007fffffff5bb0, GD=GlobalDecl @ 0x00007fffffff5b48, Fn=0x000002921787f228, FnInfo=0x0000029218308200) at CodeGenFunction.cpp:1440:3 1437 1438 void CodeGenFunction::GenerateCode(GlobalDecl GD, llvm::Function *Fn, 1439 const CGFunctionInfo &FnInfo) { -> 1440 assert(Fn && "generating code for null Function"); 1441 const FunctionDecl *FD = cast<FunctionDecl>(GD.getDecl()); 1442 CurGD = GD; 1443 (lldb) timeit expr Fn (llvm::Function *) $0 = 0x000002921787f228 time: 0m0.293s

Nevermind, I manually strip down the internal target debug info from 7GB => 400MB so that it can be linked without split dwarf.

Tested with PeekDIEName() vs this PR in two different code paths:

In the first path: PeekDIEName 1.2s This PR: 1.4s In the second path: PeekDIEName 1.999s This PR: 1.720s

So they are about the same if not using split dwarf. And majority of the saving comes from split dwarf which avoiding parsing dwo files really matters.

but the evaluation is very fast 293ms which I do not think it is slow enough to measure perf. How do you get 5~6 seconds from this?

Oh, my bad, to get to 5s you would have to revert the original parent chain patch. It is still interesting that you are getting 0.2s for that experiment when I was getting 1.4s (after the parent chain patches). I have some time today and will try to repro this out of curiosity.

But I am glad this doesn't seem to regress perf!

felipepiovezan · 2024-09-19T14:46:52Z

lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp

+      ++chain_idx;
+    } else {
+      // Name does not match, try next parent_chain entry if the current entry
+      // is namespace because the current one can be an inline namespace.


do we have a way of guaranteeing it is?
If the comparison that failed to match is with a real (non inline) namespace, we would return a false result here.

Yeah, I would love to but I do not think .debug_names tag has information to filter it out. It simply encodes DW_TAG_namespace tag:

Name 1748 { Hash: 0xB3EF34A4 String: 0x0000ae01 "inline_ns" Entry @ 0xb2d9 { Abbrev: 0x5 Tag: DW_TAG_namespace DW_IDX_die_offset: 0x00015df7 DW_IDX_parent: Entry @ 0xbab8 } }

I guess we could try to PeekDieName() to parse this DIE from dwo files to check if it is an inline vs non-inline namespace but that is the expensive code path this PR wants to avoid.

Would that help? I don't see anything in the DIE that let's us identify inline namespaces...

https://godbolt.org/z/GT98Yfbqv

Cool, did not know godbolt can show dwarfdump output.

DWARFASTParserClang identifies namespace DIE with DW_AT_export_symbols as inline namespace:

llvm-project/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp

Lines 3472 to 3473 in 81c3499

bool is_inline =

die.GetAttributeValueAsUnsigned(DW_AT_export_symbols, 0) != 0;

Ok, good to know we have a mechanism to detect inline namespaces.

That still leaves the question of how to avoid false results though. I understand peeking at the DIE would be expensive for dwos, but I don't think we can sacrifice the correctness of the answer for these queries. One possibility to explore is to add another IDX entry for namespaces, which would include whether they export symbols or not.

The design is made to perform as much filtering as possible in DebugNamesDWARFIndex::GetTypesWithQuery(), it does not have to be accurate, because it will eventually delegate to base class DWARFIndex::ProcessTypeDieMatchQuery() which performs accurate semantics matching as the original callsite.

felipepiovezan · 2024-09-19T14:47:55Z

lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp

+bool DebugNamesDWARFIndex::WithinParentChain(
+    llvm::ArrayRef<llvm::StringRef> query_parent_names,
+    llvm::ArrayRef<DebugNames::Entry> parent_chain) const {
+  if (parent_chain.size() < query_parent_names.size())


this check and the one at the end of the loop can be combined into a single check at the start of the loop

felipepiovezan · 2024-09-19T14:55:01Z

lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp

+
+      if (parent_chain->size() < parent_names.size())
+        continue;
+      else if (parent_chain->size() == parent_names.size()) {


I'd probably fold this optimization inside WithinParentChain

Sounds good. To make sure I understand, are you suggesting we always call WithinParentChain() from here, and WithinParentChain()'s implementation should dispatch to SameParentChain() if the entries are of same size?

Yup, that's right! The reason for the suggestion is that this is mostly an optimization of the WithinParentChain method, so it makes sense to live in there (also removes some code duplication)

labath

Ping, anyone wants to review this? Thanks

The usual ping time is one week, and this is definitely one of the more nuanced patches, which usually take a longer time to review.

This isn't really a review, just noting something that caught my eye.

I'm also not sure where's the "<pool_entry_range => NameEntry> mapping table" you mention in the description. Could you point me to it?

labath · 2024-09-19T14:47:19Z

lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp

+      // The pool entries of the same name are sequentially cluttered together
+      // so if the query name from `query_name` is after the target entry, this
+      // is definitely not the correct one, we can stop searching.


That may be the case for clang, but I don't think anything guarantees that. This sounds like the kind of thing that we could communicate through the augmentation string of the debug_names table though...

I.e. I think we shouldn't do this unconditionally, but only if the producer signals this through the augmentation string. We could possibly even retcon the LLVM0700 string which has been emitted by llvm/clang from the beginning of time (if indeed the output has been sorted the whole time).

We should probably also add some tests/comments/etc. to the producer code to make sure they realize we're relying on that.

@labath, @felipepiovezan, this assumption (pool entries of the same name are sequentially cluttered together) is actually guaranteed by dwarf5 specification:

Page 141:

The entry pool contains all the index entries, grouped by name. The second column of the name list points to the first index entry for the name, and all the index entries for that name are placed one after the other.

Also in 6.1.1.4.8 Entry Pool:

Gaps are not allowed between entries in a series (that is, the entries for a single 10 name must all be contiguous), but there may be gaps between series.

So if the entry's die_offset is before the current query_name's die_offset in the entry pool, entry can't be in pool entries chunk that current query_name entry points to, so won't match.

I can add quoting of these sections in the comment if you think will make it clear.

I think you're reading too much into the specification. That's definitely not how I would interpret those sections.

The whole of page 141 is devoted to describing the physical layout of the debug_names section, and that's the context in which I'd read that sentence. It just says that entries in the entry pool should follow one after the other -- i.e., with no intervening padding. It doesn't say anything about the debug info entries (in the debug_info) section which are referred to by the debug_names entries. In fact, at this point in the spec, it hasn't even been established how the index entries refer to DIEs -- that happens six pages later (somewhere around your second quote).

However, the second quote basically restates the same thing. There can be no gaps between entries for the same name (which makes sense, as you couldn't parse them otherwise -- you reach the next one by passing over the first one), but there can be gaps between different names (and that also makes sense -- you reach the first entry by following the pointer from the name table, so it does not have to be glued to the previous list).

The non-normative text after it confirms this theory:

For example, a producer/consumer combination may find it useful to maintain alignment.

The padding is there to (optionally) maintain alignment of the entries in the debug_names section. You can't maintain alignment by sorting entries.

In addition, this sorting of entries by die offset is very arbitrary and a remnant (introduced by me) from the apple tables. Since the DIE offset is relative to the CU start, it means that entries from different CUs would be interleaved just because they have interleaving relative offsets.

Bottom line: I'm certain that if the DWARF spec wanted to guarantee this sorting, it would use a much more obvious language, and I suspect that it would choose a different sorting key (e.g. group all entries from the same CU and then sort by offset).

I think this part of the patch should be split off to a separate PR.

Yup, I agree with @labath 's interpretation of the spec here.

Here's what I would suggest:

Make a patch just changes the implementation of SameAsEntryATName. This will benefit your use case (dwo) by itself and we can measure the perf impact of not being able to relying on the sorting of entries in all cases.

Then we can branch the work into two parallel lines of investigation:

1.1. [If sorting entries turns out to be important] we can create a series of patches exploring the augmentation string idea from @labath

1.2. One patch for GetNamespacesWithParents and then one patch for GetTypesWithParents

@labath, @felipepiovezan, ah, you are right. This check is a bad implementation. The optimization I want to express here is that: the "pool entries" in .debug_names are cluttered together (which is stated by spec) while the code incorrectly checks that the "die offsets" in .debug_info pointed to by .debug_names "pool entries" are sorted (which definitely has no guarantee).

From the existing API it is not that simple to express the intention I want to express, and the performance testing shows this check does not affect wall-time (at least in our cases) so I will remove this check in this iteration but add a TODO to look into in future.

For splitting PRs, that makes sense. This PR just shares big picture and gathers feedback so that I can refactor more effectively.

Okay, that explains a few things, but then I'm not sure what you want to achieve with this check. It sounds to me like this consecutiveness is already guaranteed by the parser (it keeps parsing until it reaches the end-of-list entry). Maybe you could provide an example of the kind of situation that you want to detect/rule out?

@labath, let me see if I can explain well:

The purpose of SameAsEntryATName(query_name, entry) is to check whether the pool entry has the name query_name.

Name table encodes data like this:

<name_table_entry for "query_name"> --> <consecutive_list_of_pool_entry>

In the name table, the SameAsEntryATName(query_name, entry) function in this PR attempts to verify if the entry is within a list of <consecutive_list_of_pool_entry> above.

The "incorrectly implemented check" in this PR tries to bail out early by checking whether the first Entry Offset pointed to by <name_table_entry for "query_name"> is greater than or equal to the entry's offset in the "Entry Pool". If this condition is met, it indicates that the entry is located before the <consecutive_list_of_pool_entry> in the "Entry Pool", meaning it cannot be part of the <consecutive_list_of_pool_entry> we are searching for. Consequently, the code can bail out early without iterating through the <consecutive_list_of_pool_entry> to check for a match.

Per testing, this early bail out does not affect much performance (I guess because <consecutive_list_of_pool_entry> is small for most query_name), so it is not an important optimization. For split dwarf, avoid touching dwo files is the key performance mover in this PR.

Okay, I think I've finally understood what you mean now. You want to reverse-engineer the name of an debug_names entry (check whether it matches the expected string) by looking at whether its contained in the range of entries belonging to that name.

That should be doable (and it's quite clever actually), but we don't currently have an API which would make that possible. If necessary, I think we can add one, though.

Michael137 · 2024-09-17T08:15:34Z

lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.cpp

+  // so revert it for easy matching.
+  std::reverse(type_context.begin(), type_context.end());
+
+  // type_context includes the name of the current DIE while query_parent_names


This feels very similar to what we're doing in TypeQuery::ContextMatches. Any chance one can unify them? Or re-use some of the logic? Haven't compared them in-depth to be sure it's a good idea though

The .debug_names search is what returns results to TypeQuery::ContextMatches, but we should have functions that can do the same matching. The difference here in .debug_names is we are trying to live off only the contents of .debug_names without using and DIEs. TypeQuery::ContextMatches uses DIEs. So we might be able to unify some of the context matching once the context has been extracted from the .debug_names or from the DWARF, but that is about as far as we should go.

@Michael137 , @clayborg, yep, the algorithm is similar. I tried to reuse TypeQuery::ContextMatches from DebugNamesDWARFIndex before, but then find it hard because, as Greg pointed out, the CompilerContext is parsed out from .debug_names parent chain, which is lazily searched in the new algorithm.

However, I should be able to reuse TypeQuery::ContextMatches in this base implementation. Let me give a try.

Michael137 · 2024-09-19T18:27:24Z

Is it worth splitting this into two PRs (one for the type lookup and one for namespaces)?

clayborg · 2024-09-19T19:54:54Z

lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.cpp

@@ -126,3 +126,62 @@ bool DWARFIndex::GetFullyQualifiedTypeImpl(
    return callback(die);
  return true;
 }
+
+void DWARFIndex::GetNamespacesWithParents(
+    ConstString name, llvm::ArrayRef<llvm::StringRef> parent_names,


Instead of just an array of names, we should try to use the llvm::ArrayRef<CompilerContext>. The .debug_names table has the DW_TAG in each entry and we can further reduce the number of things and possibly not even need to check the name if the DW_TAG_ doesn't match of the CompilerContext type isn't set to Any.

clayborg · 2024-09-19T19:55:28Z

lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.cpp

+}
+
+void DWARFIndex::GetTypesWithParents(
+    ConstString name, llvm::ArrayRef<llvm::StringRef> parent_names,


Same here, replace llvm::ArrayRef<llvm::StringRef> parent_names with llvm::ArrayRef<CompilerContext>

clayborg · 2024-09-19T19:55:51Z

lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.cpp

+}
+
+bool DWARFIndex::ProcessDieMatchParentNames(
+    ConstString name, llvm::ArrayRef<llvm::StringRef> query_parent_names,


Same here, replace llvm::ArrayRef<llvm::StringRef> parent_names with llvm::ArrayRef<CompilerContext>

clayborg · 2024-09-19T19:58:51Z

lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.cpp

+  // so revert it for easy matching.
+  std::reverse(type_context.begin(), type_context.end());
+
+  // type_context includes the name of the current DIE while query_parent_names


The .debug_names search is what returns results to TypeQuery::ContextMatches, but we should have functions that can do the same matching. The difference here in .debug_names is we are trying to live off only the contents of .debug_names without using and DIEs. TypeQuery::ContextMatches uses DIEs. So we might be able to unify some of the context matching once the context has been extracted from the .debug_names or from the DWARF, but that is about as far as we should go.

jeffreytan81 · 2024-09-19T21:20:37Z

I'm also not sure where's the "<pool_entry_range => NameEntry> mapping table" you mention in the description. Could you point me to it?

@labath, that is the first prototype I tried, but benchmark shows it did not improve any performance because the time to build this mapping offsets any query gain later so this PR did not use it and uses the second idea by searching parent entry's matching names which is implemented in this PR.

labath · 2024-09-20T13:02:44Z

Is it worth splitting this into two PRs (one for the type lookup and one for namespaces)?

Maybe, but I think it may even be better to split it along the optimization-type axis (the die offset thingy vs. the parent chain).

labath · 2024-09-20T13:06:23Z

I'm also not sure where's the "<pool_entry_range => NameEntry> mapping table" you mention in the description. Could you point me to it?

@labath, that is the first prototype I tried, but benchmark shows it did not improve any performance because the time to build this mapping offsets any query gain later so this PR did not use it and uses the second idea by searching parent entry's matching names which is implemented in this PR.

Okay, that explains why I couldn't find it, but then I guess the patch description is out of date, as it still contains this:

The new implementation minimizes the need to touch DWO files by employing two strategies:

Build a <pool_entry_range => NameEntry> mapping table (built lazily as needed), allowing us to check the name of the parent entry.

Can you update it to reflect the current state of the patch (you may choose to keep this in some "alternatives considered" section, but I wouldn't say that's necessary)?

jeffreytan81 · 2024-09-23T02:38:21Z

I'm also not sure where's the "<pool_entry_range => NameEntry> mapping table" you mention in the description. Could you point me to it?

@labath, that is the first prototype I tried, but benchmark shows it did not improve any performance because the time to build this mapping offsets any query gain later so this PR did not use it and uses the second idea by searching parent entry's matching names which is implemented in this PR.

Okay, that explains why I couldn't find it, but then I guess the patch description is out of date, as it still contains this:

The new implementation minimizes the need to touch DWO files by employing two strategies:

Build a <pool_entry_range => NameEntry> mapping table (built lazily as needed), allowing us to check the name of the parent entry.

Can you update it to reflect the current state of the patch (you may choose to keep this in some "alternatives considered" section, but I wouldn't say that's necessary)?

That's fair. I listed the other option in case someone is asking but it did not clearly mention that the option is not implemented in this PR.

I will update the PR with all the feedback including splitting PR.

jeffreytan81 · 2024-09-24T00:51:32Z

This PR has been updated to only contain the type lookup using .debug_names parent chain.
Later PRs will add namespace changes and SameAsATName algorithm change.

jeffreytan81 · 2024-09-24T23:36:46Z

I have no idea how to do stack PR review well in github. Here is the second PR with namespace lookup diffing from my repo:
jeffreytan81#2 if anyone wants to see. Unfortunately, if I am diffing from master llvm-project, it is showing both combined changes from both PRs instead of only the second changes.

labath

This sort of makes sense to me, though I'd definitely feel better with another set of eyes looking it over.

labath · 2024-09-25T14:15:41Z

lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp

@@ -301,7 +301,8 @@ using Entry = llvm::DWARFDebugNames::Entry;
 /// If any parent does not have an `IDX_parent`, or the Entry data is corrupted,
 /// nullopt is returned.
 std::optional<llvm::SmallVector<Entry, 4>>
-getParentChain(Entry entry, uint32_t max_parents) {
+getParentChain(Entry entry,
+               uint32_t max_parents = std::numeric_limits<uint32_t>::max()) {
  llvm::SmallVector<Entry, 4> parent_entries;


Not really a job for this patch, but I'm noticed that this function will return nothing if the parent chain was cut short (e.g. because one of the intermediate types is in a type unit). It'd probably be better to return a partial list (which is marked as partial), as that can still be useful for some filtering.

Do you know what branch in this function will cut short during "intermediate types is in a type unit" situation? Is it "entry.hasParentInformation()" returning false, or "entry.getParentDIEEntry()" return failure?

The first one. Basically, if you try something like clang++ -o - -c -g -gpubnames -fdebug-types-section -x c++ - <<<"struct A { struct B {}; }; A a; A::B b;", you'll see that the entry for B has no DW_IDX_parent attribute (I think it could have it, but we'd have to figure out what exactly it should refer to).

labath · 2024-09-25T14:22:54Z

lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp

+  size_t query_idx = 0, chain_idx = 0;
+  while (query_idx < query_contexts.size() && chain_idx < parent_chain.size()) {
+    if (query_contexts.size() - query_idx > parent_chain.size() - chain_idx) {
+      // Parent chain has not enough entries, we can't possibly have a match.


Suggested change

// Parent chain has not enough entries, we can't possibly have a match.

// Parent chain does not have enough entries, we can't possibly have a match.

labath · 2024-09-25T14:25:53Z

lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp

+
+  size_t query_idx = 0, chain_idx = 0;
+  while (query_idx < query_contexts.size() && chain_idx < parent_chain.size()) {
+    if (query_contexts.size() - query_idx > parent_chain.size() - chain_idx) {


Maybe you could ditch the index variables and replace ++idx with ref = ref.drop_front() ?
That way this check will just be ref1.size() > ref2.size() ?

labath · 2024-09-25T14:32:07Z

lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp

+  std::vector<lldb_private::CompilerContext> &query_decl_context =
+      query.GetContextRef();
+  llvm::SmallVector<CompilerContext> parent_contexts;
+  if (!query_decl_context.empty()) {
+    auto rbegin = query_decl_context.rbegin(), rend = query_decl_context.rend();
+    // Reverse the query decl context to match parent chain.
+    // Skip the last entry, it is the type we are looking for.
+    for (auto rit = rbegin + 1; rit != rend; ++rit)
+      // Skip any context without name because .debug_names might not encode
+      // them. (e.g. annonymous namespace)
+      if ((rit->kind & CompilerContextKind::AnyType) !=
+              CompilerContextKind::Invalid &&
+          !rit->name.IsEmpty())
+        parent_contexts.push_back(*rit);
+  }
+  return parent_contexts;
+}


Suggested change

std::vector<lldb_private::CompilerContext> &query_decl_context =

query.GetContextRef();

llvm::SmallVector<CompilerContext> parent_contexts;

if (!query_decl_context.empty()) {

auto rbegin = query_decl_context.rbegin(), rend = query_decl_context.rend();

// Reverse the query decl context to match parent chain.

// Skip the last entry, it is the type we are looking for.

for (auto rit = rbegin + 1; rit != rend; ++rit)

// Skip any context without name because .debug_names might not encode

// them. (e.g. annonymous namespace)

if ((rit->kind & CompilerContextKind::AnyType) !=

CompilerContextKind::Invalid &&

!rit->name.IsEmpty())

parent_contexts.push_back(*rit);

}

return parent_contexts;

}

llvm::ArrayRef<lldb_private::CompilerContext> query_decl_context =

query.GetContextRef();

llvm::SmallVector<CompilerContext> parent_contexts;

if (!query_decl_context.empty()) {

// Reverse the query decl context to match parent chain.

// Skip the last entry, it is the type we are looking for.

for (CompilerContext ctx: llvm::reverse(query_decl_context.drop_back())) {

// Skip any context without name because .debug_names might not encode

// them. (e.g. annonymous namespace)

if ((ctx.kind & CompilerContextKind::AnyType) !=

CompilerContextKind() &&

!ctx.name.IsEmpty())

parent_contexts.push_back(ctx);

}

return parent_contexts;

}

labath · 2024-09-25T14:36:02Z

lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp

+    if (parent_chain->size() < parent_contexts.size())
+      continue;


This is already checked inside WithinParentChain. I doubt that checking it here again improves performace in any way

labath · 2024-09-25T14:38:49Z

I have no idea how to do stack PR review well in github. Here is the second PR with namespace lookup diffing from my repo: jeffreytan81#2 if anyone wants to see. Unfortunately, if I am diffing from master llvm-project, it is showing both combined changes from both PRs instead of only the second changes.

The best thing I've seen so far is to create another PR with exactly two commits (one of them is the dependent PR, and another is the actual change) and then tell the reviewers to only look at the second part.

kusmour

Just some nits

kusmour · 2024-09-26T20:44:10Z

lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.h

+
+  /// Check if the type \a die can meet the requirements of \a query.
+  bool
+  ProcessTypeDieMatchQuery(TypeQuery &query, DWARFDIE die,


Looks like other function names use all cap for DIE

Suggested change

ProcessTypeDieMatchQuery(TypeQuery &query, DWARFDIE die,

ProcessTypeDIEMatchQuery(TypeQuery &query, DWARFDIE die,

kusmour · 2024-09-26T20:45:40Z

lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.h

@@ -118,6 +122,36 @@ class DebugNamesDWARFIndex : public DWARFIndex {
  bool SameParentChain(llvm::ArrayRef<llvm::StringRef> parent_names,
                       llvm::ArrayRef<DebugNames::Entry> parent_entries) const;

+  bool SameParentChain(llvm::ArrayRef<CompilerContext> parent_names,


Suggested change

bool SameParentChain(llvm::ArrayRef<CompilerContext> parent_names,

bool SameParentChain(llvm::ArrayRef<CompilerContext> parent_contexts,

Jlalond

Some Minor comments, but I think Pavel pointed out everything important. LGTM

Jlalond · 2024-10-07T17:39:24Z

lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp

+bool DebugNamesDWARFIndex::SameAsEntryContext(
+    const CompilerContext &query_context,
+    const DebugNames::Entry &entry) const {
+  // TODO: check dwarf tag matches.


If this is a long term todo, do we have a github issue that reflects this (and if we do, can we add the number in teh comment)

No, it is not a long term todo. My 3rd PR in the stack will implement it.

Jlalond · 2024-10-07T17:45:34Z

lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp

+      query.GetContextRef();
+  llvm::SmallVector<CompilerContext> parent_contexts;
+  if (!query_decl_context.empty()) {
+    // Skip the last entry, it is the type we are looking for.


This comment is confusing

the last element contains what we have, but then we reverse the 0..N-1 part of the list and capture this data is the check if it's invalid and that we have a non-invalid/null name.

I can figure out what you're doing but I think we can rework some of the comments.

// Skip the last entry as it's the type we're finding parents for.

Also, why are we reversing it? That is not very apparent to me

The vector returned from TypeQuery::GetContextRef() stores from outmost to innermost namespace which is reversed from what's we wanted (and .debug_names encoded in parent chain).

## Summary This PR improves `SymbolFileDWARF::FindTypes()` by utilizing the newly added parent chain `DW_IDX_parent` in `.debug_names`. The proposal was originally discussed in [this RFC](https://discourse.llvm.org/t/rfc-improve-dwarf-5-debug-names-type-lookup-parsing-speed/74151). ## Implementation To leverage the parent chain for `SymbolFileDWARF::FindTypes()`, this PR adds a new API: `GetTypesWithQuery` in `DWARFIndex` base class. The API performs the same function as `GetTypes` with additional filtering using `TypeQuery`. Since this only introduces filtering, the callback mechanisms at all call sites remain unchanged. A default implementation is given in `DWARFIndex` class which parses debug info and performs the matching. In the `DebugNameDWARFIndex` override, the parent_contexts in the `TypeQuery` is cross checked with parent chain in `.debug_names` for for much faster filtering before fallback to base implementation for final filtering. Unlike the `GetFullyQualifiedType` API, which fully consumes the `DW_IDX_parent` parent chain for exact matching, these new APIs perform partial subset matching for type/namespace queries. This is necessary to support queries involving anonymous or inline namespaces. For instance, a user might request `NS1::NS2::NS3::Foo`, while the index table's parent chain might contain `NS1::inline_NS2::NS3::Foo`, which would fail exact matching. ## Performance Results In one of our internal target using `.debug_names` + split dwarf. Expanding a "this" pointer in locals view in VSCode: 94s => 48s. (Not sure why I got 94s this time instead of 70s last week). --------- Co-authored-by: jeffreytan81 <jeffreytan@fb.com>

## Summary This PR is a continuation of #108907 by using `.debug_names` parent chain faster lookup for namespaces. ## Implementation Similar to #108907. This PR adds a new API: `GetNamespacesWithParents` in `DWARFIndex` base class. The API performs the same function as `GetNamespaces()` with additional filtering using parents `CompilerDeclContext`. A default implementation is given in `DWARFIndex` class which parses debug info and performs the matching. In the `DebugNameDWARFIndex` override, parents `CompilerDeclContext` is cross checked with parent chain in `.debug_names` for much faster filtering before fallback to base implementation for final filtering. ## Performance Results For the same benchmark used in #108907, this PR improves: 48s => 28s --------- Co-authored-by: jeffreytan81 <jeffreytan@fb.com>

## Summary This PR is a continuation of llvm#108907 by using `.debug_names` parent chain faster lookup for namespaces. ## Implementation Similar to llvm#108907. This PR adds a new API: `GetNamespacesWithParents` in `DWARFIndex` base class. The API performs the same function as `GetNamespaces()` with additional filtering using parents `CompilerDeclContext`. A default implementation is given in `DWARFIndex` class which parses debug info and performs the matching. In the `DebugNameDWARFIndex` override, parents `CompilerDeclContext` is cross checked with parent chain in `.debug_names` for much faster filtering before fallback to base implementation for final filtering. ## Performance Results For the same benchmark used in llvm#108907, this PR improves: 48s => 28s --------- Co-authored-by: jeffreytan81 <jeffreytan@fb.com>

In llvm#108907, the index classes started filtering the DIEs according to the full type query (instead of just the base name). This means that the checks in SymbolFileDWARF are now redundant. I've also moved the non-redundant checks so that now all checking is done in the DWARFIndex class and the caller can expect to get the final filtered list of types.

In #108907, the index classes started filtering the DIEs according to the full type query (instead of just the base name). This means that the checks in SymbolFileDWARF are now redundant. I've also moved the non-redundant checks so that now all checking is done in the DWARFIndex class and the caller can expect to get the final filtered list of types.

@WanYi

Summary: This diff fixes T199918951 by changing lldb type and namespace lookup to use .debug_names parent chain. It has incorporated latest feedback from Apple/Google. We need to land it internally because: * Unblock Ads partners from dogfooding (currently, they keep on hitting this long delay which disrupts the dogfooding experience) * Dogfooding to flush out any new issues Note: I will keep update the internal patch the same way as the single thread stepping deadlock timeout PR so that we won't get any merge conflict. It includes 4 commits squashed into one diff: * This reverts the internal patch from @WanYi to "Revert PeekDIEName related patch" * Note: `PeekDIEName()` has been removed my the new changes so we won't suffer the original crash anymore. * Improve type lookup review: llvm#108907 * Improve namespace lookup review: llvm#110062 * Use a better algorithm for SameAsEntryContext: Pending =================== Original PR Summary =================== A customer has reported that expanding the "this" pointer for a non-trivial class method context takes more than **70–90 seconds**. This issue occurs on a Linux target with the `.debug_names` index table enabled and using split DWARF (DWO) files to avoid debug info overflow. Profiling shows that the majority of the bottleneck occurs in the `SymbolFileDWARF::FindTypes()` and `SymbolFileDWARF::FindNamespace()` functions, which search for types and namespaces in the `.debug_names` index table by base name, then parse/load the underlying DWO files—an operation that can be very slow. The whole profile trace is pretty big, but here is a part of the hot path: ``` --61.01%-- clang::ASTNodeImporter::VisitRecordDecl(clang::RecordDecl*) --61.00%-- clang::ASTNodeImporter::ImportDeclParts(clang::NamedDecl*, clang::DeclContext*&, clang::DeclContext*&, clang::DeclarationName&, clang::NamedDecl*&, clang::SourceLocation&) clang::ASTImporter::ImportContext(clang::DeclContext*) clang::ASTImporter::Import(clang::Decl*) lldb_private::ClangASTImporter::ASTImporterDelegate::Imported(clang::Decl*, clang::Decl*) lldb_private::ClangASTImporter::BuildNamespaceMap(clang::NamespaceDecl const*) lldb_private::ClangASTSource::CompleteNamespaceMap( std::shared_ptr<std::vector<std::pair<std::shared_ptr<lldb_private::Module>, lldb_private::CompilerDeclContext>, std::allocator<std::pair<std::shared_ptr<lldb_private::Module>, lldb_private::CompilerDeclContext>>> >&, lldb_private::ConstString, std::shared_ptr<std::vector<std::pair<std::shared_ptr<lldb_private::Module>, lldb_private::CompilerDeclContext>, std::allocator<std::pair<std::shared_ptr<lldb_private::Module>, lldb_private::CompilerDeclContext>>> >&) const ) lldb_private::plugin::dwarf::SymbolFileDWARF::FindNamespace(lldb_private::ConstString, lldb_private::CompilerDeclContext const&, bool) lldb_private::plugin::dwarf::DebugNamesDWARFIndex::GetNamespaces(lldb_private::ConstString, llvm::function_ref<bool (lldb_private::plugin::dwarf::DWARFDIE)>) --60.29%-- lldb_private::plugin::dwarf::DebugNamesDWARFIndex::GetDIE(llvm::DWARFDebugNames::Entry const&) const --46.11%-- lldb_private::plugin::dwarf::DWARFUnit::GetDIE(unsigned long) --46.11%-- lldb_private::plugin::dwarf::DWARFUnit::ExtractDIEsIfNeeded() --45.77%-- lldb_private::plugin::dwarf::DWARFUnit::ExtractDIEsRWLocked() --23.22%-- lldb_private::plugin::dwarf::DWARFDebugInfoEntry::Extract(lldb_private::DWARFDataExtractor const&, lldb_private::plugin::dwarf::DWARFUnit const&, unsigned long*) --10.04%-- lldb_private::plugin::dwarf::DWARFFormValue::SkipValue(llvm::dwarf::Form, lldb_private::DWARFDataExtractor const&, unsigned long*, lldb_private::plugin::dwarf::DWARFUnit const*) ``` This PR improves `SymbolFileDWARF::FindTypes()` and `SymbolFileDWARF::FindNamespace()` by utilizing the newly added parent chain `DW_IDX_parent` in `.debug_names`. The proposal was originally discussed in [this RFC](https://discourse.llvm.org/t/rfc-improve-dwarf-5-debug-names-type-lookup-parsing-speed/74151). This PR also introduces a new algorithm in `DebugNamesDWARFIndex::SameAsEntryATName()` that significantly reduces the need to parse DWO files. To leverage the parent chain for `SymbolFileDWARF::FindTypes()` and `SymbolFileDWARF::FindNamespace()`, this PR adds two new APIs: `GetTypesWithParents` and `GetNamespacesWithParents` in the `DWARFIndex` base class. These APIs perform the same function as `GetTypes`/`GetNamespaces`, with additional filtering if the `parent_names` parameter is not empty. Since this only introduces filtering, the callback mechanisms at all call sites remain unchanged. A default implementation in the `DWARFIndex` class parses the type context from each base name matching DIE, then filter by parent_names. In the `DebugNameDWARFIndex` override, the parent_names is cross checked with parent chain in .debug_names for for much faster filtering. Unlike the `GetFullyQualifiedType` API, which fully consumes the `DW_IDX_parent` parent chain for exact matching, these new APIs perform partial subset matching for type/namespace queries. This is necessary to support queries involving anonymous or inline namespaces. For instance, a user might request `NS1::NS2::NS3::Foo`, while the index table's parent chain might contain `NS1::inline_NS2::NS3::Foo`, which would fail exact matching. Another optimization is implemented in `DebugNamesDWARFIndex::SameAsEntryATName()`. The old implementation used `GetNonSkeletonUnit()`, which triggered parsing/loading of underlying DWO files—a very costly operation we aim to avoid. The new implementation minimizes the need to touch DWO files by employing two strategies: 1. Build a `<pool_entry_range => NameEntry>` mapping table (built lazily as needed), allowing us to check the name of the parent entry. 2. Search for the name the entry is trying to match from the current `NameIndex`, then check if the parent entry can be found from the name entry. In my testing and profiling, strategy (1) did not improve wall-time, as the cost of building the mapping table was high. However, strategy (2) proved to be very fast. ========================================================= Test Plan: * `ninja check-lldb` are passing. * It improves T199918951 from around 110s+ => 1.8s. Reviewers: toyang, wanyi, jsamouh, jaihari, jalalonde, #lldb_team Reviewed By: wanyi Subscribers: wanyi, #lldb_team Differential Revision: https://phabricator.intern.facebook.com/D63440813 Tasks: T199918951

jeffreytan81 requested review from labath, clayborg, felipepiovezan, jimingham, kusmour and zhyty September 17, 2024 03:02

jeffreytan81 marked this pull request as ready for review September 17, 2024 03:02

jeffreytan81 requested a review from JDevlieghere as a code owner September 17, 2024 03:02

llvmbot added the lldb label Sep 17, 2024

felipepiovezan reviewed Sep 19, 2024

View reviewed changes

labath reviewed Sep 19, 2024

View reviewed changes

Michael137 reviewed Sep 19, 2024

View reviewed changes

clayborg reviewed Sep 19, 2024

View reviewed changes

jeffreytan81 force-pushed the improve_type_namespace_lookup branch from 1398459 to 4c72021 Compare September 24, 2024 00:32

jeffreytan81 changed the title ~~Improve type and namespace lookup using parent chain~~ Improve type lookup using .debug_names parent chain Sep 24, 2024

jeffreytan81 mentioned this pull request Sep 24, 2024

Improve namespace lookup using .debug_names parent chain jeffreytan81/llvm-project#2

Closed

labath reviewed Sep 25, 2024

View reviewed changes

jeffreytan81 force-pushed the improve_type_namespace_lookup branch 2 times, most recently from bb42917 to 3f92956 Compare September 25, 2024 22:39

jeffreytan81 mentioned this pull request Sep 26, 2024

Improve namespace lookup using .debug_names parent chain #110062

Merged

kusmour reviewed Sep 26, 2024

View reviewed changes

jeffreytan81 force-pushed the improve_type_namespace_lookup branch from 3f92956 to 750362e Compare September 26, 2024 22:48

jeffreytan81 requested a review from Jlalond October 4, 2024 20:30

Jlalond approved these changes Oct 7, 2024

View reviewed changes

jeffreytan81 added 2 commits October 9, 2024 12:58

Improve type query using .debug_names parent chain

37bcf44

Address review feedback

47b20f4

jeffreytan81 force-pushed the improve_type_namespace_lookup branch from d0a402a to 47b20f4 Compare October 9, 2024 19:59

jeffreytan81 merged commit 61a4678 into llvm:main Oct 9, 2024
7 checks passed

labath mentioned this pull request Nov 20, 2024

[lldb/DWARF] Remove duplicate type filtering #116989

Merged

	bool is_inline =
	die.GetAttributeValueAsUnsigned(DW_AT_export_symbols, 0) != 0;

	// Parent chain has not enough entries, we can't possibly have a match.
	// Parent chain does not have enough entries, we can't possibly have a match.

	ProcessTypeDieMatchQuery(TypeQuery &query, DWARFDIE die,
	ProcessTypeDIEMatchQuery(TypeQuery &query, DWARFDIE die,

	bool SameParentChain(llvm::ArrayRef<CompilerContext> parent_names,
	bool SameParentChain(llvm::ArrayRef<CompilerContext> parent_contexts,

Improve type lookup using .debug_names parent chain #108907

Improve type lookup using .debug_names parent chain #108907

Conversation

jeffreytan81 commented Sep 17, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Implementation

Performance Results

Uh oh!

github-actions bot commented Sep 17, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Sep 17, 2024

Context

Summary

Implementation

Performance result

Uh oh!

jeffreytan81 commented Sep 18, 2024

Uh oh!

felipepiovezan left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jeffreytan81 Sep 24, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

labath left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jeffreytan81 Sep 19, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

felipepiovezan Sep 20, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jeffreytan81 commented Sep 17, 2024 •

edited

Loading

github-actions bot commented Sep 17, 2024 •

edited

Loading

jeffreytan81 Sep 24, 2024 •

edited

Loading

jeffreytan81 Sep 19, 2024 •

edited

Loading

felipepiovezan Sep 20, 2024 •

edited

Loading