update: nvm representation #68
Open: fia0 wants to merge 138 commits into parcio:main from fia0:nvm_tree
…eing, an alternative approach is used to verify the node size and structure.
Introduced by rebase.
Internal fragmentation made this necessary with the smaller cache size for key-value store tests.
They are meant to allow nodes to do their own integrity checks, such as internal checksumming of individual entries. Analogously, this can be done for compression.
The other one was just silly.
For quite some time, sequential insertion constructed a tree which did not really adhere to the B^epsilon-tree rules. This was due to the nodes-in-cache optimization in the insertion code, which skips insertion into a node when its child node is in cache. On sequential insertion this led to many leaves being created, with all of their pivots inserted into the parent of the last node in cache. This was never checked because we only call rebalance on the final node, which was the last node in cache. As a result these parents grew without checks and pivots were essentially just glued together. First, this slows down searching in the node. Second, all access guarantees and buffer space normally provided by the B^epsilon-tree are gone, and with only pivots our tree essentially behaved like a B-tree in these scenarios. Why this was never caught before I don't know, but this commit fixes the behavior by doing two things:

1. The `is_too_large` of the node objects now includes this space division of at most B^epsilon space for pivots. As soon as a node oversteps this boundary it is split to adhere to B^epsilon-tree construction, so it might be smaller than 4m, 1m, whatever. This has implications for performance (positive and negative) but is the correct thing to do.
2. Before we check whether the child of the current node is in cache and can be modified, we check whether the current node is already too large. If this is the case we DO NOT SKIP THE CURRENT NODE but instead insert the message into the current internal node. This causes more operations on insertion but also makes future updates as cheap as they are actually expected to be given the complexity of the B^epsilon-tree.

In the context of this, another bug was fixed which highlights how problematic this behavior was: the `get_with_info` code of the node was not able to fetch an entry when it was not present in the leaves. Because of the bug when constructing the tree sequentially, this was somehow not caught before. It is fixed now.
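The two changes in this commit message can be illustrated with a minimal sketch. This is not haura's actual code: `Limits`, `InternalNode`, `InsertTarget` and `choose_target` are hypothetical names, sizes are plain byte counts, and the pivot budget is assumed to be simply `node_size^epsilon`. Only the `is_too_large` name is taken from the commit message.

```rust
/// Hypothetical size limits of a node (not haura's real types).
struct Limits {
    /// Maximum node size B, in bytes.
    node_size: usize,
    /// The epsilon of the B^epsilon-tree, with 0 < epsilon <= 1.
    epsilon: f64,
}

impl Limits {
    /// Space reserved for pivots: assumed here to be B^epsilon bytes.
    fn pivot_budget(&self) -> usize {
        (self.node_size as f64).powf(self.epsilon).round() as usize
    }
}

/// Hypothetical internal node, reduced to the sizes that matter here.
struct InternalNode {
    pivot_bytes: usize,
    buffer_bytes: usize,
}

impl InternalNode {
    /// Change 1: a node is too large if its pivots exceed the B^epsilon
    /// budget, even when the node as a whole is still below B.
    fn is_too_large(&self, limits: &Limits) -> bool {
        self.pivot_bytes > limits.pivot_budget()
            || self.pivot_bytes + self.buffer_bytes > limits.node_size
    }
}

/// Where an incoming message should go.
#[derive(Debug, PartialEq)]
enum InsertTarget {
    /// Buffer the message in the current internal node.
    BufferHere,
    /// Skip the current node and continue in the cached child.
    DescendToCachedChild,
}

/// Change 2: check the current node's size *before* applying the
/// nodes-in-cache shortcut, so an oversized node is never skipped.
fn choose_target(node: &InternalNode, child_in_cache: bool, limits: &Limits) -> InsertTarget {
    if node.is_too_large(limits) {
        // Do not skip: the message lands here so that the node gets
        // rebalanced (split) instead of growing unchecked.
        InsertTarget::BufferHere
    } else if child_in_cache {
        InsertTarget::DescendToCachedChild
    } else {
        InsertTarget::BufferHere
    }
}
```

Under these assumptions, a 4 MiB node with `epsilon = 0.5` has a pivot budget of 2048 bytes, so a parent whose pivots were "glued together" past that point is reported as too large and is no longer bypassed, even if its child is in cache.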
used the absolute storage size instead of cache size
I've worked a bit on the previous draft (#63) of the NVM tree variant of haura's nodes to improve performance. This is the current version of this endeavour. Performance is improved on the new variant of the tree as well as the old "block" variant, due to various performance fixes across the entire storage engine. Additionally, `StorageKind`s are introduced to map tiers to storage media types. Oh, and we allow for YAML configs because I don't want to `foo: {{{{{[[{ option: null }]]}}}}}` anymore.

Also, some crucial bugs related to node balancing in haura are fixed. This PR arguably fixes some problems, which have existed for some time, related to the `epsilon` of our `b^epsilon`.

Compression still needs to be integrated with this variant of the tree, as well as some safety checks.