-
Notifications
You must be signed in to change notification settings - Fork 409
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KVStore: Record table-wise memory usage and warn when it goes beyond the threshold #9835
base: master
Are you sure you want to change the base?
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Signed-off-by: Calvin Neo <calvinneo1995@gmail.com>
Signed-off-by: Calvin Neo <calvinneo1995@gmail.com>
Signed-off-by: Calvin Neo <calvinneo1995@gmail.com>
Signed-off-by: Calvin Neo <calvinneo1995@gmail.com>
Signed-off-by: Calvin Neo <calvinneo1995@gmail.com>
Signed-off-by: Calvin Neo <calvinneo1995@gmail.com>
Signed-off-by: Calvin Neo <calvinneo1995@gmail.com>
Signed-off-by: Calvin Neo <calvinneo1995@gmail.com>
Signed-off-by: Calvin Neo <calvinneo1995@gmail.com>
Signed-off-by: Calvin Neo <calvinneo1995@gmail.com>
Co-authored-by: JaySon <tshent@qq.com>
Co-authored-by: JaySon <tshent@qq.com>
RegionTable::InternalRegion & RegionTable::insertRegion(Table & table, const Region & region) | ||
{ | ||
const auto range = region.getRange(); | ||
return insertRegion(table, *range, region.id()); | ||
return insertRegion(table, *range, region); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make line 108 in RegionTable::getOrInsertRegion
as return insertRegion(table, *region.getRange(), region);
and remove this redundant function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
|
||
Type total() const { return payload + decoded; } | ||
|
||
size_t totalSize() const { return payload + decoded; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we need two method for the same result? Maybe just adding explicit cast when want to assign the result to differnet type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
total()
needs no casting, but is not used very much,
totalSize()
needs casting, but is widely used in test, so it is we just write it here to eliminate some static_cast
ing
std::atomic_int64_t table_size; | ||
std::atomic_bool warned; | ||
}; | ||
using RegionTableCtx = std::shared_ptr<RegionTableCtxInner>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe rename to RegionTableCtxPtr
? Most alias of shared_ptr are xxxPtr
in TiFlash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just keep those function name to make them shorter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just keep those function name to make them shorter?
I think its OK.
Co-authored-by: jinhelin <linjinhe33@gmail.com>
Co-authored-by: JaySon <tshent@qq.com>
Signed-off-by: Calvin Neo <calvinneo1995@gmail.com>
Signed-off-by: Calvin Neo <calvinneo1995@gmail.com>
Signed-off-by: JaySon-Huang <tshent@qq.com>
// If there are data flow in, we will check if the memory is exhaused. | ||
auto limit = tmt.getKVStore()->getKVStoreMemoryLimit(); | ||
auto current = real_rss.load(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- If KVStoreMemoryLimit (max_memory_usage_for_all_queries) is explicitly set to 0, now it will not print any logging. IMO, for this case we should use the server memory capacity * 80% as the warning limit.
- Only print log when the region_table_size > 50% * current is too large. I think we may easily miss the logging for track. I recommend to lower the table level memory threshold to min(20% * current, 10GiB) to increase the recall rate for logging down info.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- OK
- We'd better substract "start_time" memory which could be significant if we set the memory limit to kind of a small value like 4GiB.
/hold I want to re-perform a test over the new pr. |
When rss is about 3.63 GiB, the KVStore memory consumption is about 619MB, which is about 16.7% of the rss. The reported point is at 2629MB. |
Another test that prints the table size
So the table size is 143347752 bytes(137MiB), while the current is 2866593792 bytes(2.66G), about 5%. It shows the KVStore size is 630MB, while the usage is 3.74GiB, about 16%. Consider the base memory usage is 2.37 GiB. It shows a byte cost in KVStore will result in 2 bytes cost in memory usage.
Suppose the memory starts from X, and grows to Y, which is our "warning limit". And if it is all due to one table in KVStore, the table's "table_size" should be |
New method
|
What problem does this PR solve?
Issue Number: ref #9722
Problem Summary:
RegionTableSize
which is shared by aTable
and all its relatedRegion
s. Once registered, a region will update its memory change to this struct, so the table would learn.maybeWarnMemoryLimitByTable
andresetWarnMemoryLimitByTable
, to log when the memory of a table goes beyond threshold.Both of the systems works under the path that:
RegionTableSize
.RegionTableSize
.Also:
Region
to reflect what they actually do.What is changed and how it works?
Check List
Tests
Side effects
Documentation
Release note