It seems like we should be able to do better if we modeled the data as a tree. IIUC the current approach is super-linear in runtime complexity.
import collections.abc

tree = {}
for segment in sub_segments:
    sub_tree = tree
    for nibble in segment:
        sub_tree.setdefault(nibble, {})
        if not isinstance(sub_tree[nibble], collections.abc.Mapping):
            ...  # we've detected that `segment` is a superset of `sub_tree[nibble]`
        sub_tree = sub_tree[nibble]
    # once the segment terminates, we store the full segment at that position
    # so that we can later detect segments that are prefixes of other segments.
    sub_tree[segment[-1]] = segment
I think the untested pseudocode above has O(n) complexity, where n is the total number of nibbles across all sub-segments.
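To make the idea concrete, here is a runnable sketch of the same tree walk. It is not code from the library: the function name `has_overlapping_segment` and the `_END` sentinel are made up for illustration, and the sentinel replaces the `sub_tree[segment[-1]] = segment` trick above so that both directions of overlap (a segment extending an earlier one, or being a prefix of one) are caught.

_END = object()  # hypothetical sentinel marking "a segment terminates at this node"

def has_overlapping_segment(sub_segments):
    """Return True if any segment is a prefix of (or equal to) another.

    Walks each segment nibble by nibble through a nested dict, so the total
    work is proportional to the total number of nibbles, i.e. O(n).
    """
    tree = {}
    for segment in sub_segments:
        sub_tree = tree
        for nibble in segment:
            if _END in sub_tree:
                # an earlier segment terminated here, so it is a prefix of `segment`
                return True
            sub_tree = sub_tree.setdefault(nibble, {})
        if sub_tree:
            # `segment` ends at a node that already has descendants (or a
            # duplicate terminator), so another segment extends or equals it
            return True
        sub_tree[_END] = segment  # mark the end so later walks can detect it
    return False

For example, `has_overlapping_segment([(1, 2), (1, 2, 3)])` would return True, while `has_overlapping_segment([(1, 2), (1, 3)])` would return False.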
Because this whole verification is for an unknown use case (exploring more than one node at a time, in parallel), it's hard to optimize it for speed: are there a huge number of sub_segments? Or only a few, each with a really long segment? etc.
The whole validation check is skipped for the known use cases, because the number of distinct sub-segment lengths is always 1 for branches, nodes, and extensions:
all_lengths = set(len(segment) for segment in sub_segments)
if len(all_lengths) > 1:
    # verification here
So the runtime of all known use cases is basically unaffected by how fast this verification is.
If someone does decide to speed this up, I'd want to see a Hypothesis (property-based) test to make sure all the cases are covered. But really I think it's fine to wait until we have a reason to speed it up (and hence, a way to test that it's actually faster).
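For what it's worth, such a property test might look roughly like the sketch below. It assumes the hypothetical `has_overlapping_segment` from the sketch above and compares it against a naive quadratic prefix check as the oracle; the strategy bounds are arbitrary.

from hypothesis import given, strategies as st

# Segments drawn as non-empty tuples of nibbles (integers 0-15); bounds are arbitrary.
segment_strategy = st.lists(
    st.integers(min_value=0, max_value=15), min_size=1, max_size=32
).map(tuple)

def naive_has_overlap(sub_segments):
    # Quadratic reference check: is any segment a prefix of a different one?
    return any(
        i != j and other[:len(seg)] == seg
        for i, seg in enumerate(sub_segments)
        for j, other in enumerate(sub_segments)
    )

@given(st.lists(segment_strategy, min_size=1, max_size=16))
def test_tree_check_agrees_with_naive_check(sub_segments):
    assert has_overlapping_segment(sub_segments) == naive_has_overlap(sub_segments)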
Originally posted by @pipermerriam in #95