-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent Early Ordering Pushdown to Enable Aggregation Pushdown to MySQL #16278
Conversation
Review ChecklistHello reviewers! 👋 Please follow this checklist when reviewing this Pull Request. General
Tests
Documentation
New flags
If a workflow is added or modified:
Backward compatibility
|
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #16278 +/- ##
==========================================
- Coverage 68.71% 68.71% -0.01%
==========================================
Files 1547 1547
Lines 198290 198286 -4
==========================================
- Hits 136248 136244 -4
Misses 62042 62042 ☔ View full report in Codecov by Sentry. |
Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
@@ -369,14 +369,14 @@ func pushAggregationThroughApplyJoin(ctx *plancontext.PlanningContext, rootAggr | |||
|
|||
columns := &applyJoinColumns{} | |||
output, err := splitAggrColumnsToLeftAndRight(ctx, rootAggr, join, !join.JoinType.IsInner(), columns, lhs, rhs) | |||
join.JoinColumns = columns |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This bug was not related to the rest of the PR, but it caused tests to fail, so it had to go.
} | ||
|
||
col.GroupBy = addToGroupBy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is another bug that was not related to the issue being found, but it caused some plans to be missing group by expressions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -2108,9 +2108,9 @@ | |||
"Name": "user", | |||
"Sharded": true | |||
}, | |||
"FieldQuery": "select min(a.id), a.tcol1, weight_string(a.tcol1), weight_string(a.id) from `user` as a where 1 != 1 group by a.tcol1, weight_string(a.tcol1), weight_string(a.id)", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not just an optimization, the previous plan also looks wrong. It should have been selecting weight_string(min(a.id))
and not weight_string(a.id)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you publish some benchmarks as the PR description claims the impact is of improved performance?
"FieldQuery": "select foo, col, weight_string(foo) from `user` where 1 != 1 group by col, foo, weight_string(foo)", | ||
"FieldQuery": "select foo, col, weight_string(foo) from `user` where 1 != 1 group by foo, col, weight_string(foo)", | ||
"OrderBy": "1 ASC, (0|2) ASC", | ||
"Query": "select foo, col, weight_string(foo) from `user` where id between :vtg1 and :vtg2 group by col, foo, weight_string(foo) order by `user`.col asc, foo asc", | ||
"Query": "select foo, col, weight_string(foo) from `user` where id between :vtg1 and :vtg2 group by foo, col, weight_string(foo) order by `user`.col asc, foo asc", | ||
"Table": "`user`" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is no impact on the output by this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add some e2e tests for the fix and any new cases added in the plan tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could not find how we improved the planning unless the improvements are in tpch cases.
I added a plan to the bottom of That query produces a very different plan on {
"QueryType": "SELECT",
"Original": "select sum(user.type) from user join user_extra on user.team_id = user_extra.id group by user_extra.col order by user_extra.col",
"Instructions": {
"OperatorType": "Aggregate",
"Variant": "Ordered",
"Aggregates": "sum(0) AS sum(`user`.type)",
"GroupBy": "1",
"ResultColumns": 1,
"Inputs": [
{
"OperatorType": "Sort",
"Variant": "Memory",
"OrderBy": "1 ASC",
"Inputs": [
{
"OperatorType": "Join",
"Variant": "Join",
"JoinColumnIndexes": "L:0,R:0",
"JoinVars": {
"user_team_id": 1
},
"TableName": "`user`_user_extra",
"Inputs": [
{
"OperatorType": "Route",
"Variant": "Scatter",
"Keyspace": {
"Name": "user",
"Sharded": true
},
"FieldQuery": "select `user`.type, `user`.team_id from `user` where 1 != 1",
"Query": "select `user`.type, `user`.team_id from `user`",
"Table": "`user`"
},
{
"OperatorType": "Route",
"Variant": "Scatter",
"Keyspace": {
"Name": "user",
"Sharded": true
},
"FieldQuery": "select user_extra.col from user_extra where 1 != 1",
"Query": "select user_extra.col from user_extra where user_extra.id = :user_team_id",
"Table": "user_extra"
}
]
}
]
}
]
},
"TablesUsed": [
"user.user",
"user.user_extra"
]
} As you can see here, MySQL is not doing any aggregation at all. |
Signed-off-by: Andres Taylor <andres@planetscale.com>
|
@@ -5324,8 +5324,8 @@ | |||
"Name": "user", | |||
"Sharded": true | |||
}, | |||
"FieldQuery": "select count(*), :user_id + user_extra.id, weight_string(:user_id + user_extra.id) from user_extra where 1 != 1 group by :user_id + user_extra.id", | |||
"Query": "select count(*), :user_id + user_extra.id, weight_string(:user_id + user_extra.id) from user_extra group by :user_id + user_extra.id", | |||
"FieldQuery": "select count(*), :user_id + user_extra.id, weight_string(:user_id + user_extra.id) from user_extra where 1 != 1 group by :user_id + user_extra.id, weight_string(:user_id + user_extra.id)", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the additional grouping by weight string make sense? 🤔 If so, can you explain why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, they don't really make sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are needed for some specific cases, we have CI tests which are failing due to removal of weight_string changes.
{ | ||
"OperatorType": "Projection", | ||
"Expressions": [ | ||
"sum(volume) * count(*) as revenue", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you explain why we are not passing volume
as an index, like we do for the other columns?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we are, the printing of the plan is just not showing it
Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
This reverts commit f7c0ef7. Turns out we do need them. Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Description
This PR introduces a crucial check in the
pushOrderingUnderAgg
r method to prevent the premature pushdown of theORDER BY
clause during the query planning process. By ensuring that the ordering is not pushed down too early, we allow the aggregation to be effectively pushed down to MySQL, optimizing query execution.Details:
Added a Check for Planning Phase:
Rationale:
Optimization:
Impact:
Related Issue(s)
Issue: #16279
Checklist