Skip the cache of DWARFInfo and CU.
Add DWARFInfo.skip_cache() and DWARFInfo.enable_cache() to let users
control caching.

When parsing the DWARF of a large binary, we may want to
skip the cache to release memory as soon as possible and to avoid
spending extra CPU cycles on maintaining a cache.

One of my use cases is extracting types, functions, and call sites
from the DWARF of a Linux kernel image. With caches, it takes about
573 seconds to go through all DIEs. Skipping caches reduces that to
448 seconds, about 22% less time (roughly 28% faster). When going
through every DIE sequentially, the cache doesn't help us at all.
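
The sequential walk described above might look roughly like the sketch below. It assumes a pyelftools build that includes this commit (skip_cache() and enable_cache() do not exist before it); count_dies and count_dies_in_file are hypothetical helper names used only for illustration, and re-enabling the cache afterwards is a choice made here, not something the commit requires.

```python
def count_dies(dwarfinfo, skip_cache=True):
    """Walk every DIE of every CU once and return how many were seen.

    `dwarfinfo` is expected to behave like elftools' DWARFInfo with the
    skip_cache()/enable_cache() methods added by this commit.
    """
    if skip_cache:
        dwarfinfo.skip_cache()
    try:
        return sum(1 for cu in dwarfinfo.iter_CUs() for _ in cu.iter_DIEs())
    finally:
        # Restore the default behaviour for any later random-access lookups.
        if skip_cache:
            dwarfinfo.enable_cache()


def count_dies_in_file(path):
    """Hypothetical convenience wrapper around count_dies for an ELF file."""
    # Imported lazily so count_dies stays usable without pyelftools installed.
    from elftools.elf.elffile import ELFFile

    with open(path, "rb") as f:
        elf = ELFFile(f)
        if not elf.has_dwarf_info():
            return 0
        return count_dies(elf.get_dwarf_info())
```

Passing skip_cache=False keeps the default cached behaviour, which makes it easy to time both modes on the same binary.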
ThinkerYzu1 committed Feb 12, 2024
1 parent c359508 commit 16fe507
Showing 2 changed files with 16 additions and 0 deletions.
3 changes: 3 additions & 0 deletions elftools/dwarf/compileunit.py
@@ -226,6 +226,9 @@ def _get_cached_DIE(self, offset):
        # the top DIE and obtain a reference to its stream.
        top_die_stream = self.get_top_DIE().stream

        if self.dwarfinfo._skip_cache:
            return DIE(cu=self, stream=top_die_stream, offset=offset)

        # `offset` is the offset in the stream of the DIE we want to return.
        # The map is maintained as a parallel array to the list. We call
        # bisect each time to ensure new DIEs are inserted in the correct
13 changes: 13 additions & 0 deletions elftools/dwarf/dwarfinfo.py
@@ -134,6 +134,8 @@ def __init__(self,
        self._cu_cache = []
        self._cu_offsets_map = []

        self._skip_cache = False

    @property
    def has_debug_info(self):
        """ Return whether this contains debug information.
@@ -430,6 +432,10 @@ def _cached_CU_at_offset(self, offset):
            See get_CU_at().
        """
        if self._skip_cache:
            cu = self._parse_CU_at_offset(offset)
            return cu

        # Find the insert point for the requested offset. With bisect_right,
        # if this entry is present in the cache it will be the prior entry.
        i = bisect_right(self._cu_offsets_map, offset)
@@ -575,3 +581,10 @@ def parse_debugsupinfo(self):
                return suplink.sup_filename
        return None

    def skip_cache(self):
        self._skip_cache = True
        pass

    def enable_cache(self):
        self._skip_cache = False
        pass
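
To make the effect of the flag concrete, here is a self-contained toy model of the pattern the hunks above follow: a sorted list with a parallel offsets array searched via bisect_right, plus a switch that bypasses it. OffsetCache, its parse callable, and the parse_calls counter are hypothetical names for illustration only; this is a sketch of the idea, not pyelftools code.

```python
from bisect import bisect_right


class OffsetCache:
    """Toy stand-in for an offset-keyed cache with a bypass switch."""

    def __init__(self, parse):
        self._parse = parse        # hypothetical callable: offset -> object
        self._items = []           # cached objects, kept sorted by offset
        self._offsets = []         # parallel array of the items' offsets
        self._skip_cache = False
        self.parse_calls = 0       # counts how often we had to parse

    def skip_cache(self):
        self._skip_cache = True

    def enable_cache(self):
        self._skip_cache = False

    def get(self, offset):
        if self._skip_cache:
            # Bypass: parse a fresh object and remember nothing.
            self.parse_calls += 1
            return self._parse(offset)

        # With bisect_right, a cached entry sits just before the insert point.
        i = bisect_right(self._offsets, offset)
        if i > 0 and self._offsets[i - 1] == offset:
            return self._items[i - 1]

        # Miss: parse, then insert into both arrays at the same index.
        self.parse_calls += 1
        item = self._parse(offset)
        self._offsets.insert(i, offset)
        self._items.insert(i, item)
        return item
```

With the flag set, every lookup parses again and nothing is retained, which is the trade-off the commit message describes: cheaper for a single sequential pass, more expensive if the same offset is requested repeatedly.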
