Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve type lookup using .debug_names parent chain #108907

Merged
merged 2 commits into from
Oct 9, 2024

Conversation

jeffreytan81
Copy link
Contributor

@jeffreytan81 jeffreytan81 commented Sep 17, 2024

Summary

This PR improves SymbolFileDWARF::FindTypes() by utilizing the newly added parent chain DW_IDX_parent in .debug_names. The proposal was originally discussed in this RFC.

Implementation

To leverage the parent chain for SymbolFileDWARF::FindTypes(), this PR adds a new API: GetTypesWithQuery in DWARFIndex base class. The API performs the same function as GetTypes with additional filtering using TypeQuery. Since this only introduces filtering, the callback mechanisms at all call sites remain unchanged. A default implementation is given in DWARFIndex class which parses debug info and performs the matching. In the DebugNameDWARFIndex override, the parent_contexts in the TypeQuery is cross checked with parent chain in .debug_names for for much faster filtering before fallback to base implementation for final filtering.

Unlike the GetFullyQualifiedType API, which fully consumes the DW_IDX_parent parent chain for exact matching, these new APIs perform partial subset matching for type/namespace queries. This is necessary to support queries involving anonymous or inline namespaces. For instance, a user might request NS1::NS2::NS3::Foo, while the index table's parent chain might contain NS1::inline_NS2::NS3::Foo, which would fail exact matching.

Performance Results

In one of our internal target using .debug_names + split dwarf. Expanding a "this" pointer in locals view in VSCode:
94s => 48s. (Not sure why I got 94s this time instead of 70s last week).

Copy link

github-actions bot commented Sep 17, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 17, 2024

@llvm/pr-subscribers-lldb

Author: None (jeffreytan81)

Changes

Context

We have a customer reporting scenario that expanding "this" pointer for a non-trivial class method context takes more than 70~90 seconds. This is a linux target with .debug_names index table enabled and using split dwarf dwo files to avoid debug info overflow.

Profiling shows the majority of the bottleneck is spent in SymbolFileDWARF::FindTypes() and SymbolFileDWARF:: FindNamespace() two functions searching types/namespaces in .debug_names index table with base name, then parsing/loading the underlying dwo files which can be very slow.

Summary

This PR improves SymbolFileDWARF::FindTypes() and SymbolFileDWARF::FindNamespace() by using the newly added parent chain DW_IDX_parent in .debug_names. The proposal is originally discussed in this RFC.

This PR also implements a new algorithm in DebugNamesDWARFIndex::SameAsEntryATName() that can greatly avoid parsing the dwo files.

Implementation

To leverage parent chain for SymbolFileDWARF::FindTypes and SymbolFileDWARF:: FindNamespace, the PR adds two new APIs GetTypesWithParents and GetNamespacesWithParents in DWARFIndex base class. These two APIs are performing the same semantics as GetTypes/GetNamespaces with extra filtering if parent_names parameter is not empty. Since only filtering is performed, the callback to all the callsites are not changed. A default implementation in DWARFIndex class is implemented to parse type context from each base name matched DIE. In DebugNameDWARFIndex override, it uses parent chain to perform much faster searching.

Unlike GetFullyQualifiedType API which is the first one consuming DW_IDX_parent parent chain, these two new APIs do not perform exact matching, other than partial subset matching for the type/namespace queries. This is required to support anonymous/inline namespace queries from client. For example, users may ask for NS1::NS2::NS3::Foo while index table's parent chain may contain NS1::inline_NS2::NS3::Foo which would fail the exact matching.

Another optimization performed is inside DebugNamesDWARFIndex::SameAsEntryATName(). The old implementation uses GetNonSkeletonUnit() which triggers parsing/loading of underlying dwo files which proves to be very expensive which we want to avoid. The new implementation tries to avoid touching dwo as much as possible. There are two ideas:

  1. Build <pool_entry_range => NameEntry> mapping table (built lazily as needed apparently), then we can query check the name of the parent entry.
  2. Searching the name the entry tries to match from current NameIndex, then check if the parent entry can be found from the name entry.

Per my testing/profiling, the 1) does not seem to improve wall-time because the cost to build the mapping table can be high. 2) has proved to be very fast.

Performance result

  • The other landed PR reduces user's scenario wall-time from 70s => 50s
  • Using GetTypesWithParents/GetNamespacesWithParents reduces 50s => 13s
  • Using the new algorithm to avoid touching dwo files in DebugNamesDWARFIndex::SameAsEntryATName() reduces 13s => 1.4s

Overall, these changes reduce from 70s => 1.4s


Patch is 26.47 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/108907.diff

6 Files Affected:

  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.cpp (+59)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.h (+22)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp (+151-14)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.h (+30)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp (+107-73)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h (+2)
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.cpp
index eafddbad03f57b..3101b1e137fbf3 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.cpp
@@ -126,3 +126,62 @@ bool DWARFIndex::GetFullyQualifiedTypeImpl(
     return callback(die);
   return true;
 }
+
+void DWARFIndex::GetNamespacesWithParents(
+    ConstString name, llvm::ArrayRef<llvm::StringRef> parent_names,
+    llvm::function_ref<bool(DWARFDIE die)> callback) {
+  GetNamespaces(name, [&](DWARFDIE die) {
+    return ProcessDieMatchParentNames(name, parent_names, die, callback);
+  });
+}
+
+void DWARFIndex::GetTypesWithParents(
+    ConstString name, llvm::ArrayRef<llvm::StringRef> parent_names,
+    llvm::function_ref<bool(DWARFDIE die)> callback) {
+  GetTypes(name, [&](DWARFDIE die) {
+    return ProcessDieMatchParentNames(name, parent_names, die, callback);
+  });
+}
+
+bool DWARFIndex::ProcessDieMatchParentNames(
+    ConstString name, llvm::ArrayRef<llvm::StringRef> query_parent_names,
+    DWARFDIE die, llvm::function_ref<bool(DWARFDIE die)> callback) {
+  std::vector<lldb_private::CompilerContext> type_context =
+      die.GetTypeLookupContext();
+  if (type_context.empty()) {
+    // If both type_context and query_parent_names and empty we have a match.
+    // Otherwise, this one does not match and we keep on searching.
+    if (query_parent_names.empty())
+      return callback(die);
+    return true;
+  }
+
+  // Type lookup context includes the current DIE as the last element.
+  // so revert it for easy matching.
+  std::reverse(type_context.begin(), type_context.end());
+
+  // type_context includes the name of the current DIE while query_parent_names
+  // doesn't. So start check from index 1 for dwarf_decl_ctx.
+  uint32_t i = 1, j = 0;
+  while (i < type_context.size() && j < query_parent_names.size()) {
+    // If type_context[i] has no name, skip it.
+    // e.g. this can happen for anonymous namespaces.
+    if (type_context[i].name.IsNull() || type_context[i].name.IsEmpty()) {
+      ++i;
+      continue;
+    }
+    // If the name doesn't match, skip it.
+    // e.g. this can happen for inline namespaces.
+    if (query_parent_names[j] != type_context[i].name) {
+      ++i;
+      continue;
+    }
+    ++i;
+    ++j;
+  }
+  // If not all query_parent_names were found in type_context.
+  // This die does not meet the criteria, try next one.
+  if (j != query_parent_names.size())
+    return true;
+  return callback(die);
+}
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.h b/lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.h
index cb3ae8a06d7885..1f457fda0282f5 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.h
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.h
@@ -64,6 +64,23 @@ class DWARFIndex {
   virtual void
   GetNamespaces(ConstString name,
                 llvm::function_ref<bool(DWARFDIE die)> callback) = 0;
+
+  /// Get type DIEs whose base name match \param name with \param parent_names
+  /// in its decl parent chain as subset.  A base implementation is provided,
+  /// Specializations should override this if they are able to provide a faster
+  /// implementation.
+  virtual void
+  GetTypesWithParents(ConstString name,
+                      llvm::ArrayRef<llvm::StringRef> parent_names,
+                      llvm::function_ref<bool(DWARFDIE die)> callback);
+  /// Get namespace DIEs whose base name match \param name with \param
+  /// parent_names in its decl parent chain as subset.  A base implementation is
+  /// provided. Specializations should override this if they are able to provide
+  /// a faster implementation.
+  virtual void
+  GetNamespacesWithParents(ConstString name,
+                           llvm::ArrayRef<llvm::StringRef> parent_names,
+                           llvm::function_ref<bool(DWARFDIE die)> callback);
   virtual void
   GetFunctions(const Module::LookupInfo &lookup_info, SymbolFileDWARF &dwarf,
                const CompilerDeclContext &parent_decl_ctx,
@@ -115,6 +132,11 @@ class DWARFIndex {
   bool
   GetFullyQualifiedTypeImpl(const DWARFDeclContext &context, DWARFDIE die,
                             llvm::function_ref<bool(DWARFDIE die)> callback);
+
+  /// Check if the \a die has \a parent_names in its decl parent chain.
+  bool ProcessDieMatchParentNames(
+      ConstString name, llvm::ArrayRef<llvm::StringRef> parent_names,
+      DWARFDIE die, llvm::function_ref<bool(DWARFDIE die)> callback);
 };
 } // namespace dwarf
 } // namespace lldb_private::plugin
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
index 32d8a92305aafa..76c80011791de6 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
@@ -12,6 +12,8 @@
 #include "Plugins/SymbolFile/DWARF/LogChannelDWARF.h"
 #include "Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.h"
 #include "lldb/Core/Module.h"
+#include "lldb/Core/Progress.h"
+#include "lldb/Utility/LLDBLog.h"
 #include "lldb/Utility/RegularExpression.h"
 #include "lldb/Utility/Stream.h"
 #include "llvm/ADT/Sequence.h"
@@ -301,7 +303,8 @@ using Entry = llvm::DWARFDebugNames::Entry;
 /// If any parent does not have an `IDX_parent`, or the Entry data is corrupted,
 /// nullopt is returned.
 std::optional<llvm::SmallVector<Entry, 4>>
-getParentChain(Entry entry, uint32_t max_parents) {
+getParentChain(Entry entry,
+               uint32_t max_parents = std::numeric_limits<uint32_t>::max()) {
   llvm::SmallVector<Entry, 4> parent_entries;
 
   do {
@@ -374,25 +377,40 @@ void DebugNamesDWARFIndex::GetFullyQualifiedType(
   m_fallback.GetFullyQualifiedType(context, callback);
 }
 
+bool DebugNamesDWARFIndex::SameAsEntryATName(
+    llvm::StringRef query_name, const DebugNames::Entry &entry) const {
+  auto maybe_dieoffset = entry.getDIEUnitOffset();
+  if (!maybe_dieoffset)
+    return false;
+
+  // [Optimization] instead of parsing the entry from dwo file, we simply
+  // check if the query_name can point to an entry of the same DIE offset.
+  // This greatly reduced number of dwo file parsed and thus improved the
+  // performance.
+  for (const DebugNames::Entry &query_entry :
+       entry.getNameIndex()->equal_range(query_name)) {
+    auto query_dieoffset = query_entry.getDIEUnitOffset();
+    if (!query_dieoffset)
+      continue;
+
+    if (*query_dieoffset == *maybe_dieoffset) {
+      return true;
+    } else if (*query_dieoffset > *maybe_dieoffset) {
+      // The pool entries of the same name are sequentially cluttered together
+      // so if the query name from `query_name` is after the target entry, this
+      // is definitely not the correct one, we can stop searching.
+      return false;
+    }
+  }
+  return false;
+}
+
 bool DebugNamesDWARFIndex::SameParentChain(
     llvm::ArrayRef<llvm::StringRef> parent_names,
     llvm::ArrayRef<DebugNames::Entry> parent_entries) const {
-
   if (parent_entries.size() != parent_names.size())
     return false;
 
-  auto SameAsEntryATName = [this](llvm::StringRef name,
-                                  const DebugNames::Entry &entry) {
-    // Peek at the AT_name of `entry` and test equality to `name`.
-    auto maybe_dieoffset = entry.getDIEUnitOffset();
-    if (!maybe_dieoffset)
-      return false;
-    DWARFUnit *unit = GetNonSkeletonUnit(entry);
-    if (!unit)
-      return false;
-    return name == unit->PeekDIEName(unit->GetOffset() + *maybe_dieoffset);
-  };
-
   // If the AT_name of any parent fails to match the expected name, we don't
   // have a match.
   for (auto [parent_name, parent_entry] :
@@ -402,6 +420,36 @@ bool DebugNamesDWARFIndex::SameParentChain(
   return true;
 }
 
+bool DebugNamesDWARFIndex::WithinParentChain(
+    llvm::ArrayRef<llvm::StringRef> query_parent_names,
+    llvm::ArrayRef<DebugNames::Entry> parent_chain) const {
+  if (parent_chain.size() < query_parent_names.size())
+    return false;
+
+  size_t query_idx = 0, chain_idx = 0;
+  while (query_idx < query_parent_names.size() &&
+         chain_idx < parent_chain.size()) {
+    if (SameAsEntryATName(query_parent_names[query_idx],
+                          parent_chain[chain_idx])) {
+      ++query_idx;
+      ++chain_idx;
+    } else {
+      // Name does not match, try next parent_chain entry if the current entry
+      // is namespace because the current one can be an inline namespace.
+      if (parent_chain[chain_idx].tag() != DW_TAG_namespace)
+        return false;
+
+      ++chain_idx;
+      if (query_parent_names.size() - query_idx >
+          parent_chain.size() - chain_idx) {
+        // Parent chain has not enough entries, we can't possibly have a match.
+        return false;
+      }
+    }
+  }
+  return query_idx == query_parent_names.size();
+}
+
 void DebugNamesDWARFIndex::GetTypes(
     ConstString name, llvm::function_ref<bool(DWARFDIE die)> callback) {
   for (const DebugNames::Entry &entry :
@@ -444,6 +492,95 @@ void DebugNamesDWARFIndex::GetNamespaces(
   m_fallback.GetNamespaces(name, callback);
 }
 
+void DebugNamesDWARFIndex::GetNamespacesWithParents(
+    ConstString name, llvm::ArrayRef<llvm::StringRef> parent_names,
+    llvm::function_ref<bool(DWARFDIE die)> callback) {
+  Progress progress("Get namespace from index for %s", name.GetCString());
+  for (const DebugNames::Entry &entry :
+       m_debug_names_up->equal_range(name.GetStringRef())) {
+    lldb_private::dwarf::Tag entry_tag = entry.tag();
+    if (entry_tag == DW_TAG_namespace ||
+        entry_tag == DW_TAG_imported_declaration) {
+      std::optional<llvm::SmallVector<Entry, 4>> parent_chain =
+          getParentChain(entry);
+      if (!parent_chain) {
+        // Fallback: use the base class implementation.
+        if (!ProcessEntry(entry, [&](DWARFDIE die) {
+              return ProcessDieMatchParentNames(name, parent_names, die,
+                                                callback);
+            }))
+          return;
+        continue;
+      }
+
+      // If parent_chain is shorter than parent_names, we can't possibly match.
+      if (parent_chain->size() < parent_names.size())
+        continue;
+      else if (parent_chain->size() == parent_names.size()) {
+        if (SameParentChain(parent_names, *parent_chain) &&
+            !ProcessEntry(entry, callback))
+          return;
+      } else {
+        // When parent_chain has more entries than parent_names we still
+        // possibly match if parent_chain contains inline namespaces.
+        if (WithinParentChain(parent_names, *parent_chain) &&
+            !ProcessEntry(entry, callback))
+          return;
+      }
+    }
+  }
+
+  m_fallback.GetNamespaces(name, callback);
+}
+
+void DebugNamesDWARFIndex::GetTypesWithParents(
+    ConstString name, llvm::ArrayRef<llvm::StringRef> parent_names,
+    llvm::function_ref<bool(DWARFDIE die)> callback) {
+  if (parent_names.empty())
+    return GetTypes(name, callback);
+
+  // For each entry, grab its parent chain and check if we have a match.
+  for (const DebugNames::Entry &entry : m_debug_names_up->equal_range(name)) {
+    if (!isType(entry.tag()))
+      continue;
+
+    // If we get a NULL foreign_tu back, the entry doesn't match the type unit
+    // in the .dwp file, or we were not able to load the .dwo file or the DWO ID
+    // didn't match.
+    std::optional<DWARFTypeUnit *> foreign_tu = GetForeignTypeUnit(entry);
+    if (foreign_tu && foreign_tu.value() == nullptr)
+      continue;
+
+    std::optional<llvm::SmallVector<Entry, 4>> parent_chain =
+        getParentChain(entry);
+    if (!parent_chain) {
+      // Fallback: use the base class implementation.
+      if (!ProcessEntry(entry, [&](DWARFDIE die) {
+            return ProcessDieMatchParentNames(name, parent_names, die,
+                                              callback);
+          }))
+        return;
+      continue;
+    }
+
+    // If parent_chain is shorter than parent_names, we can't possibly match.
+    if (parent_chain->size() < parent_names.size())
+      continue;
+    else if (parent_chain->size() == parent_names.size()) {
+      if (SameParentChain(parent_names, *parent_chain) &&
+          !ProcessEntry(entry, callback))
+        return;
+    } else {
+      // When parent_chain has more entries than parent_names we still possibly
+      // match if parent_chain contains inline namespaces.
+      if (WithinParentChain(parent_names, *parent_chain) &&
+          !ProcessEntry(entry, callback))
+        return;
+    }
+  }
+  m_fallback.GetTypesWithParents(name, parent_names, callback);
+}
+
 void DebugNamesDWARFIndex::GetFunctions(
     const Module::LookupInfo &lookup_info, SymbolFileDWARF &dwarf,
     const CompilerDeclContext &parent_decl_ctx,
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.h b/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.h
index cb15c1d4f994b3..1f9c87c76a99f8 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.h
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.h
@@ -52,6 +52,14 @@ class DebugNamesDWARFIndex : public DWARFIndex {
                 llvm::function_ref<bool(DWARFDIE die)> callback) override;
   void GetNamespaces(ConstString name,
                      llvm::function_ref<bool(DWARFDIE die)> callback) override;
+  void
+  GetTypesWithParents(ConstString name,
+                      llvm::ArrayRef<llvm::StringRef> parent_names,
+                      llvm::function_ref<bool(DWARFDIE die)> callback) override;
+  void GetNamespacesWithParents(
+      ConstString name, llvm::ArrayRef<llvm::StringRef> parent_names,
+      llvm::function_ref<bool(DWARFDIE die)> callback) override;
+
   void GetFunctions(const Module::LookupInfo &lookup_info,
                     SymbolFileDWARF &dwarf,
                     const CompilerDeclContext &parent_decl_ctx,
@@ -118,6 +126,28 @@ class DebugNamesDWARFIndex : public DWARFIndex {
   bool SameParentChain(llvm::ArrayRef<llvm::StringRef> parent_names,
                        llvm::ArrayRef<DebugNames::Entry> parent_entries) const;
 
+  /// Returns true if \a parent_names entries are within \a parent_chain.
+  /// This is diffferent from SameParentChain() which checks for exact match.
+  /// This function is required because \a parent_chain can contain inline
+  /// namespace entries which may not be specified in parent_names by client.
+  ///
+  /// \param[in] parent_names
+  ///   The list of parent names to check for.
+  ///
+  /// \param[in] parent_chain
+  ///   The fully qualified parent chain entries from .debug_names index table
+  ///   to check against.
+  ///
+  /// \returns
+  ///   True if all \a parent_names entries are can be sequentially found inside
+  ///   \a parent_chain, otherwise False.
+  bool WithinParentChain(llvm::ArrayRef<llvm::StringRef> parent_names,
+                         llvm::ArrayRef<DebugNames::Entry> parent_chain) const;
+
+  /// Returns true if .debug_names pool entry \p entry has name \p query_name.
+  bool SameAsEntryATName(llvm::StringRef query_name,
+                         const DebugNames::Entry &entry) const;
+
   static void MaybeLogLookupError(llvm::Error error,
                                   const DebugNames::NameIndex &ni,
                                   llvm::StringRef name);
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
index f721ca00fd3559..cc2497b7e484ba 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
@@ -2731,6 +2731,22 @@ uint64_t SymbolFileDWARF::GetDebugInfoSize(bool load_all_debug_info) {
   return debug_info_size;
 }
 
+llvm::SmallVector<llvm::StringRef>
+SymbolFileDWARF::GetTypeQueryParentNames(TypeQuery query) {
+  std::vector<lldb_private::CompilerContext> &query_decl_context =
+      query.GetContextRef();
+  llvm::SmallVector<llvm::StringRef> parent_names;
+  if (!query_decl_context.empty()) {
+    auto rbegin = query_decl_context.rbegin(), rend = query_decl_context.rend();
+    for (auto rit = rbegin + 1; rit != rend; ++rit)
+      if ((rit->kind & CompilerContextKind::AnyType) !=
+              CompilerContextKind::Invalid &&
+          !rit->name.IsEmpty())
+        parent_names.push_back(rit->name.GetStringRef());
+  }
+  return parent_names;
+}
+
 void SymbolFileDWARF::FindTypes(const TypeQuery &query, TypeResults &results) {
 
   // Make sure we haven't already searched this SymbolFile before.
@@ -2748,45 +2764,50 @@ void SymbolFileDWARF::FindTypes(const TypeQuery &query, TypeResults &results) {
 
   std::lock_guard<std::recursive_mutex> guard(GetModuleMutex());
 
+  TypeQuery query_full(query);
+  llvm::SmallVector<llvm::StringRef> parent_names =
+      GetTypeQueryParentNames(query_full);
   bool have_index_match = false;
-  m_index->GetTypes(type_basename, [&](DWARFDIE die) {
-    // Check the language, but only if we have a language filter.
-    if (query.HasLanguage()) {
-      if (!query.LanguageMatches(GetLanguageFamily(*die.GetCU())))
-        return true; // Keep iterating over index types, language mismatch.
-    }
+  m_index->GetTypesWithParents(
+      query.GetTypeBasename(), parent_names, [&](DWARFDIE die) {
+        // Check the language, but only if we have a language filter.
+        if (query.HasLanguage()) {
+          if (!query.LanguageMatches(GetLanguageFamily(*die.GetCU())))
+            return true; // Keep iterating over index types, language mismatch.
+        }
 
-    // Check the context matches
-    std::vector<lldb_private::CompilerContext> die_context;
-    if (query.GetModuleSearch())
-      die_context = die.GetDeclContext();
-    else
-      die_context = die.GetTypeLookupContext();
-    assert(!die_context.empty());
-    if (!query.ContextMatches(die_context))
-      return true; // Keep iterating over index types, context mismatch.
-
-    // Try to resolve the type.
-    if (Type *matching_type = ResolveType(die, true, true)) {
-      if (matching_type->IsTemplateType()) {
-        // We have to watch out for case where we lookup a type by basename and
-        // it matches a template with simple template names. Like looking up
-        // "Foo" and if we have simple template names then we will match
-        // "Foo<int>" and "Foo<double>" because all the DWARF has is "Foo" in
-        // the accelerator tables. The main case we see this in is when the
-        // expression parser is trying to parse "Foo<int>" and it will first do
-        // a lookup on just "Foo". We verify the type basename matches before
-        // inserting the type in the results.
-        auto CompilerTypeBasename =
-            matching_type->GetForwardCompilerType().GetTypeName(true);
-        if (CompilerTypeBasename != query.GetTypeBasename())
-          return true; // Keep iterating over index types, basename mismatch.
-      }
-      have_index_match = true;
-      results.InsertUnique(matching_type->shared_from_this());
-    }
-    return !results.Done(query); // Keep iterating if we aren't done.
-  });
+        // Check the context matches
+        std::vector<lldb_private::CompilerContext> die_context;
+        if (query.GetModuleSearch())
+          die_context = die.GetDeclContext();
+        else
+          die_context = die.GetTypeLookupContext();
+        assert(!die_context.empty());
+        if (!query.ContextMatches(die_context))
+          return true; // Keep iterating over index types, context mismatch.
+
+        // Try to resolve the type.
+        if (Type *matching_type = ResolveType(die, true, true)) {
+          if (matching_type->IsTemplateType()) {
+            // We have to watch out for case where we lookup a type by basename
+            // and it matches a template with simple template names. Like
+            // looking up "Foo" and if we have simple template names then we
+            // will match "Foo<int>" and "Foo<double>" because all the DWARF has
+            // is "Foo" in the accelerator tables. The main case we see this in
+            // is when the expression parser is trying to parse "Foo<int>" and
+            // it will first do a lookup on just "Foo". We verify the type
+            // basename matches before inserting the type in...
[truncated]

@jeffreytan81
Copy link
Contributor Author

Ping, anyone wants to review this? Thanks

Copy link
Contributor

@felipepiovezan felipepiovezan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very cool work! I've left some high level thoughts and questions inline. Focused only on the namespace bit for now, and didn't do a code review, just algorithm ideas

// The pool entries of the same name are sequentially cluttered together
// so if the query name from `query_name` is after the target entry, this
// is definitely not the correct one, we can stop searching.
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does the spec guarantee this sorting of DIE offsets?

DWARFUnit *unit = GetNonSkeletonUnit(entry);
if (!unit)
return false;
return name == unit->PeekDIEName(unit->GetOffset() + *maybe_dieoffset);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the core of your proposal, right? In the original implementation that I did, the vast majority of the savings was thanks to being able to do PeekDIEName. I would strongly recommend we benchmark this new implementation is the non-dwo case as well.

One way to check is to build with regular object files (the non dwo case) and repeat the experiments I described in this reply:
https://discourse.llvm.org/t/rfc-improve-dwarf-5-debug-names-type-lookup-parsing-speed/74151/38?u=felipe
And instrumenting this portion of the code: 837bc0f

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Let me try to build a target with split dwarf disable and measure the perf (not sure if I can use the same target because I might get debug_info or debug_str overflow).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@felipepiovezan , I am having trouble finding a context without split dwarf to measure the performance. Our target is way too big to build without split dwarf (linker error). I am trying to debug clang++ per your benchmark, but the evaluation is very fast 293ms which I do not think it is slow enough to measure perf. How do you get 5~6 seconds from this?

devbig623:~/llvm-sand/build/Debug/fbcode-x86_64/toolchain$ lldb -o "b CodeGenFunction::GenerateCode" -o r -o "timeit expr Fn" --  ~/llvm-sand/build/Debug/fbcode-x86_64/toolchain/bin/clang++ -std=c++17 -pthread -glldb -c ~/personal/main.cpp -o ~/personal/a
(lldb) target create "/home/jeffreytan/llvm-sand/build/Debug/fbcode-x86_64/toolchain/bin/clang++"
Current executable set to '/home/jeffreytan/llvm-sand/build/Debug/fbcode-x86_64/toolchain/bin/clang++' (x86_64).
(lldb) settings set -- target.run-args  "-std=c++17" "-pthread" "-glldb" "-c" "/home/jeffreytan/personal/main.cpp" "-o" "/home/jeffreytan/personal/a"
(lldb) b CodeGenFunction::GenerateCode
Breakpoint 1: where = clang++`clang::CodeGen::CodeGenFunction::GenerateCode(clang::GlobalDecl, llvm::Function*, clang::CodeGen::CGFunctionInfo const&) + 59 at CodeGenFunction.cpp:1440:3, address = 0x00000000083c507b
(lldb) r
Process 233320 launched: '/home/jeffreytan/llvm-sand/build/Debug/fbcode-x86_64/toolchain/bin/clang++' (x86_64)
Process 233320 stopped
* thread #1, name = 'clang++', stop reason = breakpoint 1.1
    frame #0: 0x000055555d91907b clang++`clang::CodeGen::CodeGenFunction::GenerateCode(this=0x00007fffffff5bb0, GD=GlobalDecl @ 0x00007fffffff5b48, Fn=0x000002921787f228, FnInfo=0x0000029218308200) at CodeGenFunction.cpp:1440:3
   1437
   1438	void CodeGenFunction::GenerateCode(GlobalDecl GD, llvm::Function *Fn,
   1439	                                   const CGFunctionInfo &FnInfo) {
-> 1440	  assert(Fn && "generating code for null Function");
   1441	  const FunctionDecl *FD = cast<FunctionDecl>(GD.getDecl());
   1442	  CurGD = GD;
   1443
(lldb) timeit expr Fn
(llvm::Function *) $0 = 0x000002921787f228

time: 0m0.293s

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind, I manually strip down the internal target debug info from 7GB => 400MB so that it can be linked without split dwarf.

Tested with PeekDIEName() vs this PR in two different code paths:

In the first path: 
PeekDIEName  1.2s 
This PR: 1.4s

In the second path:
PeekDIEName  1.999s
This PR: 1.720s

So they are about the same if not using split dwarf. And majority of the saving comes from split dwarf which avoiding parsing dwo files really matters.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but the evaluation is very fast 293ms which I do not think it is slow enough to measure perf. How do you get 5~6 seconds from this?

Oh, my bad, to get to 5s you would have to revert the original parent chain patch. It is still interesting that you are getting 0.2s for that experiment when I was getting 1.4s (after the parent chain patches). I have some time today and will try to repro this out of curiosity.

But I am glad this doesn't seem to regress perf!

++chain_idx;
} else {
// Name does not match, try next parent_chain entry if the current entry
// is namespace because the current one can be an inline namespace.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have a way of guaranteeing it is?
If the comparison that failed to match is with a real (non inline) namespace, we would return a false result here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I would love to but I do not think .debug_names tag has information to filter it out. It simply encodes DW_TAG_namespace tag:

   Name 1748 {
      Hash: 0xB3EF34A4
      String: 0x0000ae01 "inline_ns"
      Entry @ 0xb2d9 {
        Abbrev: 0x5
        Tag: DW_TAG_namespace
        DW_IDX_die_offset: 0x00015df7
        DW_IDX_parent: Entry @ 0xbab8
      }
    }

I guess we could try to PeekDieName() to parse this DIE from dwo files to check if it is an inline vs non-inline namespace but that is the expensive code path this PR wants to avoid.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would that help? I don't see anything in the DIE that let's us identify inline namespaces...

https://godbolt.org/z/GT98Yfbqv

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, did not know godbolt can show dwarfdump output.

DWARFASTParserClang identifies namespace DIE with DW_AT_export_symbols as inline namespace:

bool is_inline =
die.GetAttributeValueAsUnsigned(DW_AT_export_symbols, 0) != 0;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, good to know we have a mechanism to detect inline namespaces.

That still leaves the question of how to avoid false results though. I understand peeking at the DIE would be expensive for dwos, but I don't think we can sacrifice the correctness of the answer for these queries. One possibility to explore is to add another IDX entry for namespaces, which would include whether they export symbols or not.

Copy link
Contributor Author

@jeffreytan81 jeffreytan81 Sep 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The design is made to perform as much filtering as possible in DebugNamesDWARFIndex::GetTypesWithQuery(), it does not have to be accurate, because it will eventually delegate to base class DWARFIndex::ProcessTypeDieMatchQuery() which performs accurate semantics matching as the original callsite.

bool DebugNamesDWARFIndex::WithinParentChain(
llvm::ArrayRef<llvm::StringRef> query_parent_names,
llvm::ArrayRef<DebugNames::Entry> parent_chain) const {
if (parent_chain.size() < query_parent_names.size())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this check and the one at the end of the loop can be combined into a single check at the start of the loop


if (parent_chain->size() < parent_names.size())
continue;
else if (parent_chain->size() == parent_names.size()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably fold this optimization inside WithinParentChain

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. To make sure I understand, are you suggesting we always call WithinParentChain() from here, and WithinParentChain()'s implementation should dispatch to SameParentChain() if the entries are of same size?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, that's right! The reason for the suggestion is that this is mostly an optimization of the WithinParentChain method, so it makes sense to live in there (also removes some code duplication)

Copy link
Collaborator

@labath labath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping, anyone wants to review this? Thanks

The usual ping time is one week, and this is definitely one of the more nuanced patches, which usually take a longer time to review.

This isn't really a review, just noting something that caught my eye.

I'm also not sure where's the "<pool_entry_range => NameEntry> mapping table" you mention in the description. Could you point me to it?

Comment on lines 399 to 401
// The pool entries of the same name are sequentially cluttered together
// so if the query name from `query_name` is after the target entry, this
// is definitely not the correct one, we can stop searching.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That may be the case for clang, but I don't think anything guarantees that. This sounds like the kind of thing that we could communicate through the augmentation string of the debug_names table though...

I.e. I think we shouldn't do this unconditionally, but only if the producer signals this through the augmentation string. We could possibly even retcon the LLVM0700 string which has been emitted by llvm/clang from the beginning of time (if indeed the output has been sorted the whole time).

We should probably also add some tests/comments/etc. to the producer code to make sure they realize we're relying on that.

Copy link
Contributor Author

@jeffreytan81 jeffreytan81 Sep 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@labath, @felipepiovezan, this assumption (pool entries of the same name are sequentially cluttered together) is actually guaranteed by dwarf5 specification:

Page 141:

The entry pool contains all the index entries, grouped by name. The second column of the name list points to the first index entry for the name, and all the index entries for that name are placed one after the other.

Also in 6.1.1.4.8 Entry Pool:

Gaps are not allowed between entries in a series (that is, the entries for a single
10 name must all be contiguous), but there may be gaps between series.

So if the entry's die_offset is before the current query_name's die_offset in the entry pool, entry can't be in pool entries chunk that current query_name entry points to, so won't match.

I can add quoting of these sections in the comment if you think will make it clear.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're reading too much into the specification. That's definitely not how I would interpret those sections.

The whole of page 141 is devoted to describing the physical layout of the debug_names section, and that's the context in which I'd read that sentence. It just says that entries in the entry pool should follow one after the other -- i.e., with no intervening padding. It doesn't say anything about the debug info entries (in the debug_info) section which are referred to by the debug_names entries. In fact, at this point in the spec, it hasn't even been established how the index entries refer to DIEs -- that happens six pages later (somewhere around your second quote).

However, the second quote basically restates the same thing. There can be no gaps between entries for the same name (which makes sense, as you couldn't parse them otherwise -- you reach the next one by passing over the first one), but there can be gaps between different names (and that also makes sense -- you reach the first entry by following the pointer from the name table, so it does not have to be glued to the previous list).

The non-normative text after it confirms this theory:

For example, a producer/consumer combination may find it useful to maintain alignment.

The padding is there to (optionally) maintain alignment of the entries in the debug_names section. You can't maintain alignment by sorting entries.

In addition, this sorting of entries by die offset is very arbitrary and a remnant (introduced by me) from the apple tables. Since the DIE offset is relative to the CU start, it means that entries from different CUs would be interleaved just because they have interleaving relative offsets.

Bottom line: I'm certain that if the DWARF spec wanted to guarantee this sorting, it would use a much more obvious language, and I suspect that it would choose a different sorting key (e.g. group all entries from the same CU and then sort by offset).

I think this part of the patch should be split off to a separate PR.

Copy link
Contributor

@felipepiovezan felipepiovezan Sep 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, I agree with @labath 's interpretation of the spec here.

Here's what I would suggest:

  1. Make a patch just changes the implementation of SameAsEntryATName. This will benefit your use case (dwo) by itself and we can measure the perf impact of not being able to relying on the sorting of entries in all cases.

Then we can branch the work into two parallel lines of investigation:

1.1. [If sorting entries turns out to be important] we can create a series of patches exploring the augmentation string idea from @labath

1.2. One patch for GetNamespacesWithParents and then one patch for GetTypesWithParents

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@labath, @felipepiovezan, ah, you are right. This check is a bad implementation. The optimization I want to express here is that: the "pool entries" in .debug_names are cluttered together (which is stated by spec) while the code incorrectly checks that the "die offsets" in .debug_info pointed to by .debug_names "pool entries" are sorted (which definitely has no guarantee).

From the existing API it is not that simple to express the intention I want to express, and the performance testing shows this check does not affect wall-time (at least in our cases) so I will remove this check in this iteration but add a TODO to look into in future.

For splitting PRs, that makes sense. This PR just shares big picture and gathers feedback so that I can refactor more effectively.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, that explains a few things, but then I'm not sure what you want to achieve with this check. It sounds to me like this consecutiveness is already guaranteed by the parser (it keeps parsing until it reaches the end-of-list entry). Maybe you could provide an example of the kind of situation that you want to detect/rule out?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@labath, let me see if I can explain well:

The purpose of SameAsEntryATName(query_name, entry) is to check whether the pool entry has the name query_name.

Name table encodes data like this:

<name_table_entry for "query_name"> --> <consecutive_list_of_pool_entry>

In the name table, the SameAsEntryATName(query_name, entry) function in this PR attempts to verify if the entry is within a list of <consecutive_list_of_pool_entry> above.

The "incorrectly implemented check" in this PR tries to bail out early by checking whether the first Entry Offset pointed to by <name_table_entry for "query_name"> is greater than or equal to the entry's offset in the "Entry Pool". If this condition is met, it indicates that the entry is located before the <consecutive_list_of_pool_entry> in the "Entry Pool", meaning it cannot be part of the <consecutive_list_of_pool_entry> we are searching for. Consequently, the code can bail out early without iterating through the <consecutive_list_of_pool_entry> to check for a match.

Per testing, this early bail out does not affect much performance (I guess because <consecutive_list_of_pool_entry> is small for most query_name), so it is not an important optimization. For split dwarf, avoid touching dwo files is the key performance mover in this PR.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I think I've finally understood what you mean now. You want to reverse-engineer the name of an debug_names entry (check whether it matches the expected string) by looking at whether its contained in the range of entries belonging to that name.

That should be doable (and it's quite clever actually), but we don't currently have an API which would make that possible. If necessary, I think we can add one, though.

// so revert it for easy matching.
std::reverse(type_context.begin(), type_context.end());

// type_context includes the name of the current DIE while query_parent_names
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels very similar to what we're doing in TypeQuery::ContextMatches. Any chance one can unify them? Or re-use some of the logic? Haven't compared them in-depth to be sure it's a good idea though

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The .debug_names search is what returns results to TypeQuery::ContextMatches, but we should have functions that can do the same matching. The difference here in .debug_names is we are trying to live off only the contents of .debug_names without using and DIEs. TypeQuery::ContextMatches uses DIEs. So we might be able to unify some of the context matching once the context has been extracted from the .debug_names or from the DWARF, but that is about as far as we should go.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Michael137 , @clayborg, yep, the algorithm is similar. I tried to reuse TypeQuery::ContextMatches from DebugNamesDWARFIndex before, but then find it hard because, as Greg pointed out, the CompilerContext is parsed out from .debug_names parent chain, which is lazily searched in the new algorithm.

However, I should be able to reuse TypeQuery::ContextMatches in this base implementation. Let me give a try.

@Michael137
Copy link
Member

Is it worth splitting this into two PRs (one for the type lookup and one for namespaces)?

@@ -126,3 +126,62 @@ bool DWARFIndex::GetFullyQualifiedTypeImpl(
return callback(die);
return true;
}

void DWARFIndex::GetNamespacesWithParents(
ConstString name, llvm::ArrayRef<llvm::StringRef> parent_names,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of just an array of names, we should try to use the llvm::ArrayRef<CompilerContext>. The .debug_names table has the DW_TAG in each entry and we can further reduce the number of things and possibly not even need to check the name if the DW_TAG_ doesn't match of the CompilerContext type isn't set to Any.

}

void DWARFIndex::GetTypesWithParents(
ConstString name, llvm::ArrayRef<llvm::StringRef> parent_names,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, replace llvm::ArrayRef<llvm::StringRef> parent_names with llvm::ArrayRef<CompilerContext>

}

bool DWARFIndex::ProcessDieMatchParentNames(
ConstString name, llvm::ArrayRef<llvm::StringRef> query_parent_names,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, replace llvm::ArrayRef<llvm::StringRef> parent_names with llvm::ArrayRef<CompilerContext>

// so revert it for easy matching.
std::reverse(type_context.begin(), type_context.end());

// type_context includes the name of the current DIE while query_parent_names
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The .debug_names search is what returns results to TypeQuery::ContextMatches, but we should have functions that can do the same matching. The difference here in .debug_names is we are trying to live off only the contents of .debug_names without using and DIEs. TypeQuery::ContextMatches uses DIEs. So we might be able to unify some of the context matching once the context has been extracted from the .debug_names or from the DWARF, but that is about as far as we should go.

@jeffreytan81
Copy link
Contributor Author

I'm also not sure where's the "<pool_entry_range => NameEntry> mapping table" you mention in the description. Could you point me to it?

@labath, that is the first prototype I tried, but benchmark shows it did not improve any performance because the time to build this mapping offsets any query gain later so this PR did not use it and uses the second idea by searching parent entry's matching names which is implemented in this PR.

@labath
Copy link
Collaborator

labath commented Sep 20, 2024

Is it worth splitting this into two PRs (one for the type lookup and one for namespaces)?

Maybe, but I think it may even be better to split it along the optimization-type axis (the die offset thingy vs. the parent chain).

@labath
Copy link
Collaborator

labath commented Sep 20, 2024

I'm also not sure where's the "<pool_entry_range => NameEntry> mapping table" you mention in the description. Could you point me to it?

@labath, that is the first prototype I tried, but benchmark shows it did not improve any performance because the time to build this mapping offsets any query gain later so this PR did not use it and uses the second idea by searching parent entry's matching names which is implemented in this PR.

Okay, that explains why I couldn't find it, but then I guess the patch description is out of date, as it still contains this:

The new implementation minimizes the need to touch DWO files by employing two strategies:

  • Build a <pool_entry_range => NameEntry> mapping table (built lazily as needed), allowing us to check the name of the parent entry.

Can you update it to reflect the current state of the patch (you may choose to keep this in some "alternatives considered" section, but I wouldn't say that's necessary)?

@jeffreytan81
Copy link
Contributor Author

I'm also not sure where's the "<pool_entry_range => NameEntry> mapping table" you mention in the description. Could you point me to it?

@labath, that is the first prototype I tried, but benchmark shows it did not improve any performance because the time to build this mapping offsets any query gain later so this PR did not use it and uses the second idea by searching parent entry's matching names which is implemented in this PR.

Okay, that explains why I couldn't find it, but then I guess the patch description is out of date, as it still contains this:

The new implementation minimizes the need to touch DWO files by employing two strategies:

  • Build a <pool_entry_range => NameEntry> mapping table (built lazily as needed), allowing us to check the name of the parent entry.

Can you update it to reflect the current state of the patch (you may choose to keep this in some "alternatives considered" section, but I wouldn't say that's necessary)?

That's fair. I listed the other option in case someone is asking but it did not clearly mention that the option is not implemented in this PR.

I will update the PR with all the feedback including splitting PR.

@jeffreytan81 jeffreytan81 changed the title Improve type and namespace lookup using parent chain Improve type lookup using .debug_names parent chain Sep 24, 2024
@jeffreytan81
Copy link
Contributor Author

This PR has been updated to only contain the type lookup using .debug_names parent chain.
Later PRs will add namespace changes and SameAsATName algorithm change.

@jeffreytan81
Copy link
Contributor Author

I have no idea how to do stack PR review well in github. Here is the second PR with namespace lookup diffing from my repo:
jeffreytan81#2 if anyone wants to see. Unfortunately, if I am diffing from master llvm-project, it is showing both combined changes from both PRs instead of only the second changes.

Copy link
Collaborator

@labath labath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sort of makes sense to me, though I'd definitely feel better with another set of eyes looking it over.

@@ -301,7 +301,8 @@ using Entry = llvm::DWARFDebugNames::Entry;
/// If any parent does not have an `IDX_parent`, or the Entry data is corrupted,
/// nullopt is returned.
std::optional<llvm::SmallVector<Entry, 4>>
getParentChain(Entry entry, uint32_t max_parents) {
getParentChain(Entry entry,
uint32_t max_parents = std::numeric_limits<uint32_t>::max()) {
llvm::SmallVector<Entry, 4> parent_entries;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really a job for this patch, but I'm noticed that this function will return nothing if the parent chain was cut short (e.g. because one of the intermediate types is in a type unit). It'd probably be better to return a partial list (which is marked as partial), as that can still be useful for some filtering.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know what branch in this function will cut short during "intermediate types is in a type unit" situation? Is it "entry.hasParentInformation()" returning false, or "entry.getParentDIEEntry()" return failure?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first one. Basically, if you try something like clang++ -o - -c -g -gpubnames -fdebug-types-section -x c++ - <<<"struct A { struct B {}; }; A a; A::B b;", you'll see that the entry for B has no DW_IDX_parent attribute (I think it could have it, but we'd have to figure out what exactly it should refer to).

size_t query_idx = 0, chain_idx = 0;
while (query_idx < query_contexts.size() && chain_idx < parent_chain.size()) {
if (query_contexts.size() - query_idx > parent_chain.size() - chain_idx) {
// Parent chain has not enough entries, we can't possibly have a match.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Parent chain has not enough entries, we can't possibly have a match.
// Parent chain does not have enough entries, we can't possibly have a match.


size_t query_idx = 0, chain_idx = 0;
while (query_idx < query_contexts.size() && chain_idx < parent_chain.size()) {
if (query_contexts.size() - query_idx > parent_chain.size() - chain_idx) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you could ditch the index variables and replace ++idx with ref = ref.drop_front() ?
That way this check will just be ref1.size() > ref2.size() ?

Comment on lines 508 to 522
std::vector<lldb_private::CompilerContext> &query_decl_context =
query.GetContextRef();
llvm::SmallVector<CompilerContext> parent_contexts;
if (!query_decl_context.empty()) {
auto rbegin = query_decl_context.rbegin(), rend = query_decl_context.rend();
// Reverse the query decl context to match parent chain.
// Skip the last entry, it is the type we are looking for.
for (auto rit = rbegin + 1; rit != rend; ++rit)
// Skip any context without name because .debug_names might not encode
// them. (e.g. annonymous namespace)
if ((rit->kind & CompilerContextKind::AnyType) !=
CompilerContextKind::Invalid &&
!rit->name.IsEmpty())
parent_contexts.push_back(*rit);
}
return parent_contexts;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::vector<lldb_private::CompilerContext> &query_decl_context =
query.GetContextRef();
llvm::SmallVector<CompilerContext> parent_contexts;
if (!query_decl_context.empty()) {
auto rbegin = query_decl_context.rbegin(), rend = query_decl_context.rend();
// Reverse the query decl context to match parent chain.
// Skip the last entry, it is the type we are looking for.
for (auto rit = rbegin + 1; rit != rend; ++rit)
// Skip any context without name because .debug_names might not encode
// them. (e.g. annonymous namespace)
if ((rit->kind & CompilerContextKind::AnyType) !=
CompilerContextKind::Invalid &&
!rit->name.IsEmpty())
parent_contexts.push_back(*rit);
}
return parent_contexts;
}
llvm::ArrayRef<lldb_private::CompilerContext> query_decl_context =
query.GetContextRef();
llvm::SmallVector<CompilerContext> parent_contexts;
if (!query_decl_context.empty()) {
// Reverse the query decl context to match parent chain.
// Skip the last entry, it is the type we are looking for.
for (CompilerContext ctx: llvm::reverse(query_decl_context.drop_back())) {
// Skip any context without name because .debug_names might not encode
// them. (e.g. annonymous namespace)
if ((ctx.kind & CompilerContextKind::AnyType) !=
CompilerContextKind() &&
!ctx.name.IsEmpty())
parent_contexts.push_back(ctx);
}
return parent_contexts;
}

Comment on lines 560 to 561
if (parent_chain->size() < parent_contexts.size())
continue;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already checked inside WithinParentChain. I doubt that checking it here again improves performace in any way

@labath
Copy link
Collaborator

labath commented Sep 25, 2024

I have no idea how to do stack PR review well in github. Here is the second PR with namespace lookup diffing from my repo: jeffreytan81#2 if anyone wants to see. Unfortunately, if I am diffing from master llvm-project, it is showing both combined changes from both PRs instead of only the second changes.

The best thing I've seen so far is to create another PR with exactly two commits (one of them is the dependent PR, and another is the actual change) and then tell the reviewers to only look at the second part.

@jeffreytan81 jeffreytan81 force-pushed the improve_type_namespace_lookup branch 2 times, most recently from bb42917 to 3f92956 Compare September 25, 2024 22:39
Copy link
Contributor

@kusmour kusmour left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some nits


/// Check if the type \a die can meet the requirements of \a query.
bool
ProcessTypeDieMatchQuery(TypeQuery &query, DWARFDIE die,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like other function names use all cap for DIE

Suggested change
ProcessTypeDieMatchQuery(TypeQuery &query, DWARFDIE die,
ProcessTypeDIEMatchQuery(TypeQuery &query, DWARFDIE die,

@@ -118,6 +122,36 @@ class DebugNamesDWARFIndex : public DWARFIndex {
bool SameParentChain(llvm::ArrayRef<llvm::StringRef> parent_names,
llvm::ArrayRef<DebugNames::Entry> parent_entries) const;

bool SameParentChain(llvm::ArrayRef<CompilerContext> parent_names,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bool SameParentChain(llvm::ArrayRef<CompilerContext> parent_names,
bool SameParentChain(llvm::ArrayRef<CompilerContext> parent_contexts,

Copy link
Contributor

@Jlalond Jlalond left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some Minor comments, but I think Pavel pointed out everything important. LGTM

bool DebugNamesDWARFIndex::SameAsEntryContext(
const CompilerContext &query_context,
const DebugNames::Entry &entry) const {
// TODO: check dwarf tag matches.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is a long term todo, do we have a github issue that reflects this (and if we do, can we add the number in teh comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it is not a long term todo. My 3rd PR in the stack will implement it.

query.GetContextRef();
llvm::SmallVector<CompilerContext> parent_contexts;
if (!query_decl_context.empty()) {
// Skip the last entry, it is the type we are looking for.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is confusing

the last element contains what we have, but then we reverse the 0..N-1 part of the list and capture this data is the check if it's invalid and that we have a non-invalid/null name.

I can figure out what you're doing but I think we can rework some of the comments.

// Skip the last entry as it's the type we're finding parents for.

Also, why are we reversing it? That is not very apparent to me

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The vector returned from TypeQuery::GetContextRef() stores from outmost to innermost namespace which is reversed from what's we wanted (and .debug_names encoded in parent chain).

@jeffreytan81 jeffreytan81 merged commit 61a4678 into llvm:main Oct 9, 2024
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants