-
Notifications
You must be signed in to change notification settings - Fork 747
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: continue vacuum drop table on per-table cleanup failures #16424
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 6 of 10 files at r1, all commit messages.
Reviewable status: 6 of 10 files reviewed, 2 unresolved discussions (waiting on @dantengsky and @SkyFan2002)
src/meta/api/src/schema_api_impl.rs
line 2883 at r1 (raw file):
// if limit is Some, append DroppedId::Db only when table_infos is empty if drop_db && table_infos.is_empty() {
Why append Db
only when table_infos
is empty? The comment should explain the reason for it.
And IMHO, this method should append a Db
if drop_ids
contains all of the table_infos
belonging to this DB?
Therefore table_infos.is_empty()
should be replaced with num >= table_infos.len()
, which meaning the table_infos
are exhausted.
My suggestion is replacing if let Some(left_num) = left_num
with let num = left_num.unwrap_or(u64::MAX)
and unifying all of the following logic.
Code quote:
// if limit is Some, append DroppedId::Db only when table_infos is empty
if drop_db && table_infos.is_empty() {
src/meta/app/src/schema/table.rs
line 926 at r1 (raw file):
pub enum DroppedId { // db id, db name, (table id, table name)s Db(u64, String, Arc<Vec<(u64, String)>>),
Using a tuple can be a bit complicated. Consider converting it into a struct-like variant instead:
Db {
database_id: u64,
database_name: String,
...
}
And why does it have to be an Arc<Vec<_>>
? Does it need to be shared between thread?
The number of tables would be small(like several thousand?), just using Vec<_>
would be all right for this small data set.
@drmingdrmer This is not introduced in this PR, but introduced in #13779. I think it means when a db need to be vacuumed, but no tables in this db, just drop this db from meta. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 10 files at r1, 2 of 4 files at r2, all commit messages.
Reviewable status: 8 of 10 files reviewed, 2 unresolved discussions (waiting on @dantengsky and @SkyFan2002)
src/meta/api/src/schema_api_impl.rs
line 2888 at r2 (raw file):
*db_id, table_info.ident.table_id, table_info.name.clone(),
this_db_drop_table_infos
should always be pushed into drop_ids
, no matter to drop the db or not.
Code quote:
for (table_info, db_id) in this_db_drop_table_infos {
drop_ids.push(DroppedId::Table(
*db_id,
table_info.ident.table_id,
table_info.name.clone(),
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 4 files at r2, all commit messages.
Reviewable status: 9 of 10 files reviewed, 3 unresolved discussions (waiting on @dantengsky and @SkyFan2002)
src/meta/api/src/schema_api_impl.rs
line 2870 at r3 (raw file):
let take_num = left_num.unwrap_or(usize::MAX); if drop_db && take_num > table_infos.len() {
Add some comment to explain this condition:
Suggestion:
// A DB can be removed only when all its tables are removed.
if drop_db && take_num > table_infos.len() {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed all commit messages.
Reviewable status: 9 of 10 files reviewed, 2 unresolved discussions (waiting on @dantengsky and @SkyFan2002)
I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/
Summary
If the cleanup process fails for an individual dropped table, the operation should not immediately terminate. Instead, it should continue to process and clean up other tables that can be handled normally.
If the table's files are not successfully cleaned up, the metadata for both the table and the database will be retained.
Tests
Type of change
This change is