-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Consolidate statistics merging code #15645
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consolidate statistics merging code #15645
Conversation
}) | ||
} | ||
|
||
/// Merge this Statistics value with another Statistics value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the core logic for merging one Statistics
instance with another
@@ -433,6 +480,79 @@ fn check_num_rows(value: Option<usize>, is_exact: bool) -> Precision<usize> { | |||
} | |||
} | |||
|
|||
pub fn add_row_stats( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These methods had to remain pub
for now as they are also used in the listing table statistics summary code. I will look into that and see if it makes sense to do as a individual PR
@@ -798,4 +920,173 @@ mod tests { | |||
distinct_count: Precision::Exact(100), | |||
} | |||
} | |||
|
|||
#[test] | |||
fn test_merge_iter_basic() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these are just tests moved from datasource
@@ -409,62 +409,6 @@ pub async fn get_statistics_with_limit( | |||
Ok((result_files, statistics)) | |||
} | |||
|
|||
/// Generic function to compute statistics across multiple items that have statistics | |||
fn compute_summary_statistics<T, I>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code was moved into Statistics
Just noticed the PR. FYI, I've reviewed it in the PR: xudong963#5 |
Thanks @xudong963 -- I will work on this in a follow on PR |
Which issue does this PR close?
statistics_by_partition API
to ExecutionPlan #15503statistics_by_partition
API toExecutionPlan
#15495Rationale for this change
While reviewing #15503 from @xudong963 I noticed he had to move
compute_summary_statistics
anywaysWhat changes are included in this PR?
Statistics::merge_iter
andStatistics::merge
Are these changes tested?
Yes (by existing tests)
Are there any user-facing changes?