Add some per key optimization for UDT in memtable only feature #13031
@jowlyzhang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Thanks a lot for improving this @jowlyzhang !
db/dbformat.cc
void IterKey::EnlargeSecondaryBufferIfNeeded(size_t key_size) {
  // If size is smaller than buffer size, continue using current buffer,
  // or the static allocated one, as default
Very minor, but it seems to me the buffers are not actually statically allocated; maybe call them something like "fixed-size" or "inline".
@@ -562,18 +562,25 @@ inline uint64_t GetInternalKeySeqno(const Slice& internal_key) {
// allocation for smaller keys.
// 3. It tracks user key or internal key, and allow conversion between them.
class IterKey {
static constexpr char kTsMin[] = "\x00\x00\x00\x00\x00\x00\x00\x00"; |
I think it would be nice to add the usual comment here about only 64-bit timestamps being supported currently.
db/dbformat.h
char* secondary_buf_;
char space_for_secondary_buf_[39];  // Avoid allocation for short keys
This probably wouldn't cause any issues in practice, but since `secondary_buf_` can potentially point to `space_for_secondary_buf_`, it would be nice to have these two ordered the other way around. (Technically, `secondary_buf_` currently gets constructed before and destroyed after `space_for_secondary_buf_`.) Also, we could introduce a named constant for the size of the inline buffers (39).
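To illustrate the two suggestions in this comment, here is a minimal sketch, not the actual `IterKey` code. The class name `InlineBufferSketch`, the constant `kInlineBufferSize`, and the member names are hypothetical. It declares the inline storage before the pointer that may refer to it, so the storage is constructed first and destroyed last, and it replaces the literal 39 with a named constant.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the relevant part of IterKey; not RocksDB code.
class InlineBufferSketch {
 public:
  // Named constant instead of the magic number 39 used in the diff.
  static constexpr std::size_t kInlineBufferSize = 39;

  InlineBufferSketch() : buf_(space_), buf_size_(kInlineBufferSize) {}
  ~InlineBufferSketch() {
    if (buf_ != space_) delete[] buf_;
  }
  InlineBufferSketch(const InlineBufferSketch&) = delete;
  InlineBufferSketch& operator=(const InlineBufferSketch&) = delete;

  // Switch to a heap allocation only when the key does not fit in the
  // current buffer; existing contents are not preserved in this sketch.
  void EnlargeIfNeeded(std::size_t key_size) {
    assert(buf_size_ >= kInlineBufferSize);  // capacity never shrinks
    if (key_size <= buf_size_) return;
    if (buf_ != space_) delete[] buf_;
    buf_ = new char[key_size];
    buf_size_ = key_size;
  }

  bool UsesInlineStorage() const { return buf_ == space_; }
  std::size_t capacity() const { return buf_size_; }

 private:
  // Declared first: members are constructed in declaration order and
  // destroyed in reverse, so the storage outlives the pointer into it.
  char space_[kInlineBufferSize];
  char* buf_;
  std::size_t buf_size_;
};
```

For trivially destructible members like these the order has no practical effect, as the comment itself notes; the reordering mainly makes the dependency between the two members explicit.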
That's a good point, thank you for the suggestion!
db/dbformat.h
// Use to track the pieces that together make the whole key. We then copy
// these pieces in order either into buf_ or secondary_buf_ depending on where
// the previous key is held.
Slice key_slices_[5];
We could consider using `std::array` instead of a C-style array.
db/dbformat.h
secondary_buf_ = space_for_secondary_buf_;
}
secondary_buf_size_ = sizeof(space_for_secondary_buf_);
key_size_ = 0;
Would it make sense to clear `key_size_` iff `key_` points to the secondary buffer?
Good catch! This is only supposed to be called when `key_` points to the secondary buffer, or during destruction. It's good to add a check for this.
db/dbformat.h
size_t actual_total_bytes = 0;
#endif  // NDEBUG
for (size_t i = 0; i < num_key_slices; i++) {
  size_t key_size = key_slices_[i].size();
`key_size` might not be the best name for this variable; how about something like `key_slice_size` or `slice_size`?
Good catch, the name is indeed confusing.
key_parts.emplace_back(slice_data, left_sz);
key_parts.emplace_back(min_timestamp);
key_parts.emplace_back(slice_data + left_sz, slice_sz - left_sz);
key_slices_[(*next_key_slice_idx)++] = Slice(slice_data, left_sz);
We could assert that `next_key_slice_idx` is not null and that we don't overrun the `key_slices_` buffer (i.e. that we don't end up with more than 5 parts).
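The suggested assertions could look roughly like the sketch below. This is an illustration under assumptions, not the code from the PR: `SliceSketch` stands in for `Slice`, and `kMaxKeySlices`, `KeySlices`, and `AppendWithMinTimestamp` are hypothetical names. It also uses `std::array`, as proposed in an earlier comment, so the capacity is available through `size()`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for rocksdb::Slice: a non-owning view of bytes.
struct SliceSketch {
  const char* data;
  std::size_t size;
};

// Hypothetical named constant for the maximum number of key pieces (5).
constexpr std::size_t kMaxKeySlices = 5;
using KeySlices = std::array<SliceSketch, kMaxKeySlices>;

// Records three pieces: the left part of `slice`, the min timestamp, and
// the right part of `slice`, advancing *next_key_slice_idx as it goes.
inline void AppendWithMinTimestamp(KeySlices& key_slices,
                                   std::size_t* next_key_slice_idx,
                                   SliceSketch slice, std::size_t left_sz,
                                   SliceSketch min_timestamp) {
  // The index pointer must be valid before it is dereferenced.
  assert(next_key_slice_idx != nullptr);
  // The split point must lie inside the slice.
  assert(left_sz <= slice.size);
  // Writing three more entries must not overrun the fixed-size array.
  assert(*next_key_slice_idx + 3 <= key_slices.size());

  key_slices[(*next_key_slice_idx)++] = SliceSketch{slice.data, left_sz};
  key_slices[(*next_key_slice_idx)++] = min_timestamp;
  key_slices[(*next_key_slice_idx)++] =
      SliceSketch{slice.data + left_sz, slice.size - left_sz};
}
```

Checking the bound once before the three writes keeps the assertion in one place; in a release build (`NDEBUG`) the checks compile away.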
@jowlyzhang has updated the pull request. You must reimport the pull request before landing.
LGTM, thanks @jowlyzhang !
if (key_ == secondary_buf_) {
  key_size_ = 0;
}
Should we have a similar check in `ResetBuffer` too (with `buf_`)?
@jowlyzhang merged this pull request in 32dd657.
Summary:
This PR adds some optimizations to the per-key handling of SST files for the user-defined timestamps in memtable only feature. CPU profiling shows that this part is a major contributor to the regression. The optimization saves some string construction/destruction/appending/copying, as well as vector operations like reserve/emplace_back.

When iterating keys in a block, we need to copy the shared bytes from the previous key, put them together with the non-shared bytes, and find the right location to pad the min timestamp. Previously, we created a temporary local string buffer to first construct the key from its pieces, and then copied this local string's content into `IterKey`'s buffer. To avoid this local string and the extra copy, instead of piecing together the key in a local string first, we now track all the pieces that make up the key in a reused Slice array, and then copy the pieces in order into `IterKey`'s buffer. Since the previous key must be kept intact while we are copying shared bytes from it, we added a secondary buffer in `IterKey` and alternate between the primary buffer and the secondary buffer.

Pull Request resolved: #13031

Test Plan: Existing tests.

Reviewed By: ltamasi

Differential Revision: D63416531

Pulled By: jowlyzhang

fbshipit-source-id: 9819b0e02301a2dbc90621b2fe4f651bc912113c
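The alternating-buffer idea from the description can be sketched as follows. This is a simplified illustration, not the `IterKey` implementation: `Piece`, `TwoBufferKey`, and `SetNextKey` are hypothetical names, `std::string` replaces the inline/heap buffers, and the min timestamp is simply appended as the last piece, whereas the real code has to find the correct position for it inside the key.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for rocksdb::Slice: a non-owning view of bytes.
struct Piece {
  const char* data;
  std::size_t size;
};

// Hypothetical simplification of IterKey's primary/secondary buffers.
class TwoBufferKey {
 public:
  const std::string& key() const {
    return current_is_secondary_ ? secondary_ : primary_;
  }

  // Copies the pieces, in order, into the buffer that does NOT hold the
  // current key, then makes that buffer current. Pieces may point into the
  // current key, which stays intact while it is being read from.
  void SetFromPieces(const Piece* pieces, std::size_t num_pieces) {
    std::string& target = current_is_secondary_ ? primary_ : secondary_;
    target.clear();  // typically keeps the capacity for reuse
    for (std::size_t i = 0; i < num_pieces; ++i) {
      target.append(pieces[i].data, pieces[i].size);
    }
    current_is_secondary_ = !current_is_secondary_;
  }

 private:
  std::string primary_;
  std::string secondary_;
  bool current_is_secondary_ = false;
};

// Builds the next key from `shared` leading bytes of the previous key, the
// non-shared bytes, and the min timestamp, without a temporary string.
inline void SetNextKey(TwoBufferKey& key, std::size_t shared,
                       Piece non_shared, Piece min_ts) {
  assert(shared <= key.key().size());
  const std::array<Piece, 3> pieces = {
      {Piece{key.key().data(), shared}, non_shared, min_ts}};
  key.SetFromPieces(pieces.data(), pieces.size());
}
```

Each key is written exactly once into its final buffer, which is the copy the description says the change saves; the cost is keeping two buffers alive instead of one.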