Fix memory usage of bbolt block header index #213
Conversation
As a preparation for memory behavior optimizations, we first add a set of benchmark tests to establish a baseline against. The benchmarks determine the speed and memory usage of writing different sized batches to the bbolt index as well as the random access latency for retrieving entries from the index.
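The batched-write benchmarks described above can be sketched with Go's `testing.Benchmark` helper. This is a hypothetical, self-contained illustration: a map stands in for the bbolt-backed index, and the `index`/`writeBatch` names are invented for this sketch; the real benchmarks in the PR write to the on-disk bbolt bucket.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"testing"
)

// index is a stand-in for the bbolt-backed header index (hypothetical;
// the real benchmarks exercise the on-disk bucket).
type index map[[32]byte]uint32

// writeBatch inserts a batch of hash->height entries, mimicking the
// batched index writes the benchmarks measure.
func writeBatch(idx index, start uint32, n int) {
	for i := 0; i < n; i++ {
		var hash [32]byte
		binary.BigEndian.PutUint32(hash[:4], start+uint32(i))
		idx[hash] = start + uint32(i)
	}
}

func main() {
	// Measure batched writes of 2000 entries, roughly the 2k-block
	// batch size the PR discussion mentions.
	res := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			idx := make(index)
			writeBatch(idx, 0, 2000)
		}
	})
	fmt.Println(res.N > 0)
}
```

The same loop shape, pointed at a real bbolt database file, is what establishes the memory and latency baseline the PR compares against.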
If a neutrino node has a database file that is partially synced, this PR breaks it, so would need a migration. Alternatively they could remove their filter databases.
I think it would pay dividends to implement parallel header download (#71) instead. It's a more lengthy refactor but would be worth it in the end and also allow us to perform bigger batch inserts. WDYT?
I like the approach! Very clever and simple fix that'll probably help a lot.
I do agree with @Crypt-iQ though, we must make sure existing users' databases keep working. A possible solution would be to have these sub-buckets be separate from the root bucket, then still fall back to fetching from the root bucket if an entry is not found.
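The fallback @halseth describes can be sketched as a two-step lookup: try the new prefix sub-bucket first, then fall back to the legacy root bucket for entries written before the upgrade. This is a hypothetical illustration with maps standing in for bbolt buckets; the `bucket` type and `heightFromHash` helper are invented for the sketch.

```go
package main

import "fmt"

// bucket is a stand-in for a bbolt bucket mapping hash -> height.
type bucket map[string]uint32

// heightFromHash tries the prefix sub-bucket first, then falls back to
// the root bucket so databases written before the upgrade keep working.
func heightFromHash(root bucket, subs map[string]bucket, hash []byte) (uint32, bool) {
	// New layout: sub-bucket named after the first two hash bytes.
	if sub, ok := subs[string(hash[:2])]; ok {
		if h, ok := sub[string(hash)]; ok {
			return h, true
		}
	}
	// Old layout: entry lives directly in the root bucket.
	h, ok := root[string(hash)]
	return h, ok
}

func main() {
	hash := []byte{0x6f, 0xe2, 0x01}
	root := bucket{string(hash): 100} // legacy entry, pre-upgrade
	subs := map[string]bucket{}
	h, ok := heightFromHash(root, subs, hash)
	fmt.Println(h, ok)
}
```

Because new writes only go to the sub-buckets, the root bucket acts as a read-only legacy store and no migration pass is needed.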
We want to isolate the code that reads from/writes to the index bucket within the headerIndex type. We prepare to do so by extracting re-usable code into methods.
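The shape of that refactor can be sketched as follows. This is a hypothetical illustration, not the PR's actual code: a map stands in for the bbolt bucket, and the method names `addHeaders` and `heightFromHash` are assumptions for the sketch.

```go
package main

import "fmt"

// headerIndex hides the index bucket behind methods so callers no
// longer manipulate the DB directly (map stands in for bbolt here).
type headerIndex struct {
	entries map[string]uint32
}

// addHeaders writes a batch of hash->height entries to the index.
func (h *headerIndex) addHeaders(batch map[string]uint32) {
	for hash, height := range batch {
		h.entries[hash] = height
	}
}

// heightFromHash looks up the height stored for a block hash.
func (h *headerIndex) heightFromHash(hash string) (uint32, bool) {
	height, ok := h.entries[hash]
	return height, ok
}

func main() {
	idx := &headerIndex{entries: make(map[string]uint32)}
	idx.addHeaders(map[string]uint32{"ab": 1, "cd": 2})
	height, ok := idx.heightFromHash("cd")
	fmt.Println(height, ok)
}
```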
Force-pushed from aa2c1b0 to 60380ea.
Good point. I added a fallback to the root bucket like @halseth suggested and also added a test for it.
Do you mean instead of merging this PR? I think we can still implement the parallel header download and get benefits from both improvements.
Both improvements make sense; disregard my earlier comment, will re-review.
Wow, that turned out to be very clean! Great PR 👍
LGTM 💯
LGTM after nits addressed
Now that we have methods for accessing the index buckets, we use those in the blockHeaderStore instead of manipulating the DB directly.
With this commit we store the index keys (hash->height) in sub- buckets with the first two bytes of the hash as the bucket name. Storing a large number of keys in the same bucket has a large impact on memory usage in bbolt if small-ish batch sizes are used (the b+ tree needs to be copied with every resize operation). Using sub buckets is a compromise between memory usage and access time. 2 bytes (=max 65535 sub buckets) seems to be the sweet spot (-50% memory usage, +30% access time). We take the bytes from the beginning of the byte-serialized hash since all Bitcoin hashes are reverse-serialized when displayed as strings. That means the leading zeroes of a block hash are actually at the end of the byte slice.
Force-pushed from 60380ea to 9ea609b.
50% is very nice
Fixes #196.
With this PR we store the block index keys (hash->height) in sub-buckets with the first two bytes of the hash as the bucket name. Storing a large number of keys in the same bucket has a large impact on memory usage in bbolt if small-ish batch sizes are used (the b+ tree needs to be copied with every resize operation). Using sub buckets is a compromise between memory usage and access time. 2 bytes (=max 65535 sub buckets) seems to be the sweet spot (-50% memory usage, +30% access time). We take the bytes from the beginning of the byte-serialized hash since all Bitcoin hashes are reverse-serialized when displayed as strings. That means the leading zeroes of a block hash are actually at the end of the byte slice.
As the benchmarks below show, the 2 byte prefix seems to be the sweet spot between memory usage, access speed and DB file size.
I also looked at other ways of reducing the memory footprint of the bbolt-based index. The main culprit seems to be the relatively small batch size of 2k blocks per update. That number is limited by how many block headers a peer serves in one message and cannot easily be increased. The only other way to increase the DB write batch size would be to add a second-level cache. But that would require more refactoring and could lead to possible de-synchronization of the index and the actual block file.
Benchmarks
All results are retrieved by running:
System:
Test 1: control, no changes
Benchmark output
Max DB file size: ~63 MB
Memory usage (go tool pprof mem.out -> top):

Test 2: 1 byte sub bucket
Benchmark output
Max DB file size: ~64 MB
Memory usage (go tool pprof mem.out -> top):

Test 3: 2 byte sub bucket
Benchmark output
Max DB file size: ~69 MB
Memory usage (go tool pprof mem.out -> top):

Test 4: 3 byte sub bucket
Benchmark output
Max DB file size: ~106 MB
Memory usage (go tool pprof mem.out -> top):