Add DATE_TRUNC Optimizer #14385
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #14385 +/- ##
=============================================
- Coverage 61.75% 34.12% -27.63%
- Complexity 207 778 +571
=============================================
Files 2436 2661 +225
Lines 133233 146117 +12884
Branches 20636 22379 +1743
=============================================
- Hits 82274 49866 -32408
- Misses 44911 92223 +47312
+ Partials 6048 4028 -2020
@jadami10 Made that optimizer enhancement here. Let me know if anything looks off!
First, thank you for doing this!
I took an initial look and left some nit comments, some testing ideas, and some notes on things that look like they might break.
I haven't reviewed the actual algorithm yet.
// Check if date trunc function is being applied on a literal value
if (dateTruncOperands.get(1).isSetLiteral()) {
- I think you want a `testInvalidFilterOptimizer` unit test for this.
- nit: this comment is identical to the code below it. A better comment would say why we can't/don't optimize literal values.
Since `DATETRUNC` can be applied on literals, my thinking was that if we returned at this point, some other optimizer would precompute this value. I need to look into the other optimizers to figure out if this is the case. If this is not applied elsewhere, I'll add the computation for the `DATETRUNC` of the literal and new query creation here.
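For illustration, if no other optimizer ends up folding the literal, the precomputation mentioned above could look roughly like the sketch below. It uses plain `java.time` rather than Pinot's own date-trunc code, and `LiteralDateTrunc` / `truncToDayMillis` are hypothetical names. It assumes DAY granularity and an epoch-millis literal only.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.temporal.ChronoUnit;

final class LiteralDateTrunc {
  // Hypothetical helper (not part of Pinot): truncates an epoch-millis
  // literal to the start of its day in the given time zone.
  static long truncToDayMillis(long epochMillis, ZoneId zone) {
    return Instant.ofEpochMilli(epochMillis)
        .atZone(zone)
        .truncatedTo(ChronoUnit.DAYS)
        .toInstant()
        .toEpochMilli();
  }

  public static void main(String[] args) {
    ZoneId utc = ZoneId.of("UTC");
    // 1620777600000 is already aligned to a UTC day boundary, so it is unchanged.
    System.out.println(truncToDayMillis(1620777600000L, utc));
    // One hour later truncates back to the same day start.
    System.out.println(truncToDayMillis(1620777600000L + 3_600_000L, utc));
  }
}
```

Once the literal is folded to a constant like this, the predicate no longer contains a function call on the literal side, so there is nothing left for this optimizer to rewrite.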
@Test
public void testDateTruncOptimizer() {
testDateTrunc("datetrunc('DAY', col) < 1620777600000", new Range("0", true, "1620777600000", false));
- do we need a test with an INT instead of long? I believe that's also a valid time type in Pinot
- can we have a test with week or month truncation
- can we have a test case where it's not a supported function, as in `IN`, to make sure nothing is erroring
- also a test case where the time granularity is unsupported? I'm not sure if Calcite will catch that before the optimizer, but we do use `DAY` as an input
Hey @jadami10. Thanks for this insight! I've been working on this for the past couple days (specifically on the time zone test). I've introduced several time zone tests and my implementation seems to work only for some time zone usages.
If you're willing, would you be able to write up a quick draft of what the algorithm should look like to convert the date_trunc function with time zones to a range query (essentially, the floor and ceiling inverse of date trunc)? I think it would be beneficial to hear it from another perspective to find what I'm missing.
I think your approach of converting the date trunc predicates to floor/ceiling sounds good. I left some comments where I'm confused about why the implementation drifts for similar functions.
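To make the floor/ceiling idea concrete, here is a minimal sketch of one way the mapping could work for DAY truncation with a time zone. It is not the PR's implementation: it uses `java.time` instead of Pinot's utilities, assumes both the column and the literal are epoch millis, and all names (`DateTruncRangeSketch`, `floor`, `next`, `ceil`, `toRange`) are made up for illustration. DST transitions that land exactly on midnight are not specially handled.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.temporal.ChronoUnit;

final class DateTruncRangeSketch {
  // Sentinels meaning "no bound on this side".
  static final long UNBOUNDED_LOW = Long.MIN_VALUE;
  static final long UNBOUNDED_HIGH = Long.MAX_VALUE;

  // Largest day start in `zone` that is <= millis (what DAY truncation returns).
  static long floor(long millis, ZoneId zone) {
    return Instant.ofEpochMilli(millis).atZone(zone)
        .truncatedTo(ChronoUnit.DAYS).toInstant().toEpochMilli();
  }

  // Start of the day bucket that follows the one containing millis.
  static long next(long millis, ZoneId zone) {
    return Instant.ofEpochMilli(millis).atZone(zone)
        .truncatedTo(ChronoUnit.DAYS).plusDays(1).toInstant().toEpochMilli();
  }

  // Smallest day start in `zone` that is >= millis.
  static long ceil(long millis, ZoneId zone) {
    return floor(millis, zone) == millis ? millis : next(millis, zone);
  }

  // Returns {lowerInclusive, upperExclusive} on col such that
  // trunc(col) <op> v holds, or null when no col can satisfy it.
  static long[] toRange(String op, long v, ZoneId zone) {
    switch (op) {
      case "=":
        // trunc(col) only ever produces bucket starts, so a non-aligned v never matches.
        return floor(v, zone) == v ? new long[] {v, next(v, zone)} : null;
      case "<":
        return new long[] {UNBOUNDED_LOW, ceil(v, zone)};
      case "<=":
        return new long[] {UNBOUNDED_LOW, next(v, zone)};
      case ">":
        return new long[] {next(v, zone), UNBOUNDED_HIGH};
      case ">=":
        return new long[] {ceil(v, zone), UNBOUNDED_HIGH};
      default:
        throw new IllegalArgumentException("unsupported operator: " + op);
    }
  }
}
```

With these definitions, `toRange("<", 1620777600000L, ZoneId.of("UTC"))` gives an exclusive upper bound of 1620777600000, which agrees with the upper bound expected by the existing `datetrunc('DAY', col) < 1620777600000` test. Using a half-open range for every operator avoids the separate inclusive/exclusive flags, at the cost of a `- 1` if the final range must be closed.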
upperMillis = dateTruncFloor(operands);
if (upperMillis != TimeUnit.MILLISECONDS.convert(getLongValue(filterOperands.get(1)), TimeUnit.valueOf(outputTimeUnit.toUpperCase()))) {
  upperInclusive = true;
  upperMillis = dateTruncCeil(operands);
why is this recomputed here?
lowerMillis = Long.MIN_VALUE;
upperInclusive = false;
upperMillis = dateTruncFloor(operands);
if (upperMillis != TimeUnit.MILLISECONDS.convert(getLongValue(filterOperands.get(1)), TimeUnit.valueOf(outputTimeUnit.toUpperCase()))) {
why are we checking this here but not in `GREATER_THAN`?
break;
case GREATER_THAN_OR_EQUAL:
operands.set(1, getExpression(getLongValue(filterOperands.get(1)), new DateTimeFormatSpec("TIMESTAMP")));
lowerMillis = dateTruncFloor(operands);
should this be ceil?
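A small counterexample shows why the question matters. The sketch below uses plain UTC day arithmetic with hypothetical helpers (`floorDay`, `ceilDay`), and it assumes `dateTruncFloor` rounds the literal down to a bucket start, which the snippet above does not show. When the literal is not aligned to a day boundary, a floor-based lower bound admits values whose truncation is actually below the literal.

```java
final class GreaterOrEqualBound {
  static final long DAY_MILLIS = 86_400_000L;

  // Hypothetical stand-ins for floor/ceil to a UTC day boundary.
  static long floorDay(long ms) {
    return Math.floorDiv(ms, DAY_MILLIS) * DAY_MILLIS;
  }

  static long ceilDay(long ms) {
    return -Math.floorDiv(-ms, DAY_MILLIS) * DAY_MILLIS;
  }

  public static void main(String[] args) {
    long dayStart = 1620777600000L;        // aligned to a UTC day
    long v = dayStart + 3_600_000L;        // literal at 01:00, not aligned
    long x = dayStart + 7_200_000L;        // column value at 02:00 on the same day

    // The original predicate: datetrunc('DAY', x) >= v
    boolean original = floorDay(x) >= v;
    // Candidate rewrites of the lower bound on x
    boolean withFloor = x >= floorDay(v);
    boolean withCeil = x >= ceilDay(v);

    System.out.println("original=" + original
        + " floorBound=" + withFloor + " ceilBound=" + withCeil);
  }
}
```

For an aligned literal the two bounds coincide, so the difference only shows up in tests that use a literal in the middle of a bucket.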
// testDateTrunc("datetrunc('DAY', col, 'DAYS', 'CET', 'MILLISECONDS') = 39193714800000",
//     new Range("453631", true, "453631", true));
testDateTrunc("datetrunc('DAY', col, 'MILLISECONDS', 'UTC', 'DAYS') = 453630",
    new Range("39193632000000", true, "39193718399999", true));
this looks strange. `39193632000000` is like 1000 years from now?
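A quick check of the arithmetic, as a sketch with hypothetical helper names: the expected range is consistent with the literal `453630` interpreted as days since the epoch, and that day count itself lands more than a thousand years in the future. That suggests the oddity may be in the chosen test value rather than in the days-to-millis conversion.

```java
import java.time.LocalDate;
import java.util.concurrent.TimeUnit;

final class EpochDaySanityCheck {
  // First millisecond of the given epoch day (UTC).
  static long lowerMillis(long epochDay) {
    return TimeUnit.DAYS.toMillis(epochDay);
  }

  // Last millisecond of the given epoch day (UTC).
  static long upperMillis(long epochDay) {
    return TimeUnit.DAYS.toMillis(epochDay + 1) - 1;
  }

  public static void main(String[] args) {
    long days = 453630L;
    System.out.println(lowerMillis(days));          // lower bound of the range
    System.out.println(upperMillis(days));          // upper bound of the range
    System.out.println(LocalDate.ofEpochDay(days)); // calendar date for that day count
  }
}
```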
dateTrunc
function