[AMORO-3239] Fix stack overflow caused by reading too many partitions in the filter #3240
Conversation
I think we should solve this problem, rather than setting a maximum number of partitions to bypass it, right? It's quite common for our online operations to have several hundred partitions.
When too many partitions should be optimized, the partition filter seems to be useless.
I think it's feasible. @7hong What do you think?
I think it's feasible. I will add a parameter.
I am okay with that. But the configuration should be added to the AMS configuration rather than the table configuration.
For how to add AMS configuration, you can refer to: https://github.com/apache/amoro/pull/3193/files @7hong
OK, I'll change the code. Thanks to both of you.
Is this a bad use case of Iceberg expressions? I mean, a partition filter is what we need in this case, especially with so many partitions and a large amount of data per partition. What if we do not use Iceberg expressions here and use a set to filter, or something else? Can we solve the problem? On the other hand, if we do not filter partitions, the evaluation stage is somewhat insignificant; we could eliminate pending partitions in pendingInput to save DB storage. @zhoujinsong @7hong @lintingbin
Yes, it is a bad case to construct an Iceberg expression with too many conditions. We can improve this case in another PR.
Yes, we may drop the partition set in the pending state to save our DB storage in the current implementation.
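The stack overflow arises because reducing one predicate per partition with `Expressions::or` produces a deeply nested, left-deep expression tree that is evaluated recursively. The following plain-Java sketch is illustrative only (it uses `java.util.function.Predicate`, not the Amoro/Iceberg API) to mimic that shape:

```java
import java.util.function.Predicate;

public class OrChainDemo {
    // Build a left-deep OR chain of n per-partition predicates, analogous to
    // reduce(Expressions::or) over the pending partition set.
    static Predicate<String> buildOrChain(int n) {
        Predicate<String> filter = p -> false;
        for (int i = 0; i < n; i++) {
            String partition = "pt=" + i;
            Predicate<String> prev = filter;
            // Each iteration wraps the previous chain one level deeper.
            filter = p -> prev.test(p) || p.equals(partition);
        }
        return filter;
    }

    public static void main(String[] args) {
        // Evaluating the chain recurses once per partition.
        System.out.println(buildOrChain(1000).test("pt=500")); // prints "true"
    }
}
```

At a depth of a few hundred this is harmless, but with tens of thousands of pending partitions each nested `test` call adds a stack frame, and the recursion can exceed the default JVM stack size.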
'optimizer.ignore-filter-partition-count'
@@ -54,6 +54,7 @@ ams:
  task-ack-timeout: 30000 # 30s
  polling-timeout: 3000 # 3s
  max-planning-parallelism: 1 # default 1
+ ignore-filter-partition-count: 100 # default 100
I think this param is not suitable for this place.
It's more suitable for self-optimizing groups.
And I think a more proper name is 'self-optimizing.skip-evaluating-for-partition-count'. WDYT @zhoujinsong
I think it's appropriate to put it in self-optimizing groups.
But I think skip-evaluating is inappropriate, because this parameter only controls whether to use the filter, not whether to skip the Evaluator.
Yes, that's correct. I think skip is inappropriate too.
Originally there was no evaluating or filtering. But it was proved that directly planning would cause OOM because all partition files would be cached in memory, which led to designing an evaluating phase before planning (stream planning and store a partition set to avoid memory usage in the planning phase). If we do not filter here, that means evaluating is not used, or skipped.
Evaluating is a sensible concept and may even be revealed on the dashboard in the future; however, filter is quite a detailed implementation and could point to any evolved implementation, or nothing. For example, the partition filter is a fast implementation for evaluating, which has several reported drawbacks:
- it cannot filter anything for non-partitioned tables
- it stores too much in sysdb
- the issue you have encountered
From my view, this PR has done a temporary optimization, and the evaluating logic should evolve to resolve the issues above; I'm concerned a parameter pointed at the filter could easily become outdated (some of this relates to work I am pushing in #2596).
What do you think about using a set to filter to resolve the issue directly? Or you could maintain this evolution when related issues are raised. That would help a lot.
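A set-based filter, as suggested above, avoids the nested expression tree entirely: membership in a `HashSet` is an O(1), non-recursive check. A minimal sketch follows; the class and method names are hypothetical, not from the Amoro codebase:

```java
import java.util.HashSet;
import java.util.Set;

public class PartitionSetFilter {
    private final Set<String> pendingPartitions;

    PartitionSetFilter(Set<String> pendingPartitions) {
        this.pendingPartitions = new HashSet<>(pendingPartitions);
    }

    // O(1) membership check with flat memory usage; no recursion, so the
    // number of pending partitions cannot overflow the stack.
    boolean shouldEvaluate(String partitionPath) {
        // An empty pending set means nothing was recorded; fall back to
        // evaluating everything, mirroring Expressions.alwaysTrue().
        return pendingPartitions.isEmpty() || pendingPartitions.contains(partitionPath);
    }

    public static void main(String[] args) {
        PartitionSetFilter filter =
            new PartitionSetFilter(Set.of("pt=2024-01-01", "pt=2024-01-02"));
        System.out.println(filter.shouldEvaluate("pt=2024-01-01")); // true
        System.out.println(filter.shouldEvaluate("pt=2023-12-31")); // false
    }
}
```

The trade-off is that a plain set only supports exact partition matches, whereas Iceberg expressions can be pushed down into file scan planning.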
I carefully read everyone's discussion and summarized the current situation as follows:
So I am thinking we may add an AMS property for this limit. I think it solves the current issue and limits the memory usage for the optimizing plan.
How does it work in the evaluating stage and the planning stage?
Similar to what I submitted the first time? Solve this problem by limiting the number of partitions added?

this.partitionFilter =
    tableRuntime.getPendingInput() == null
        ? Expressions.alwaysTrue()
        : tableRuntime.getPendingInput().getPartitions().entrySet().stream()
            .map(
                entry ->
                    ExpressionUtil.convertPartitionDataToDataFilter(
                        table,
                        entry.getKey(),
                        entry.getValue().stream()
                            .limit(maxPartitionCount) // here
                            .collect(Collectors.toSet())))
            .reduce(Expressions::or)
            .orElse(Expressions.alwaysTrue());
That's true, but there is a small difference: during the evaluate phase, we limit the set of partitions in the pending input to only 100 (default), rather than truncating it when using it. This can help reduce storage overhead.
Yes, you are right. It is a better solution to truncate during the pending stage. |
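Truncating during the pending stage could look roughly like the sketch below (class and method names are hypothetical; the default of 100 mirrors the value discussed in this thread). The key point is also recording a truncated flag, so the planner knows the set is incomplete and can fall back to an always-true filter:

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class PendingPartitionRecorder {
    // Hypothetical default mirroring the discussed
    // refresh-tables.max-pending-partition-count = 100
    static final int MAX_PENDING_PARTITION_COUNT = 100;

    private final Set<String> pendingPartitions = new LinkedHashSet<>();
    private boolean truncated = false;

    // Record a partition during the evaluate phase; once the cap is hit we
    // stop storing partitions and remember that the set is incomplete.
    void addPendingPartition(String partitionPath) {
        if (pendingPartitions.size() >= MAX_PENDING_PARTITION_COUNT) {
            truncated = true;
            return;
        }
        pendingPartitions.add(partitionPath);
    }

    // When true, downstream planning should not trust the set as exhaustive
    // and should behave as if no partition filter were present.
    boolean isTruncated() { return truncated; }

    Set<String> partitions() { return pendingPartitions; }
}
```

Compared to truncating at read time, this caps both the size of the pendingInput persisted to sysdb and the size of the OR expression built from it.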
…esh-tables.max-pending-partition-count"
I added a parameter.
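Based on the final parameter name in the squash commit ("refresh-tables.max-pending-partition-count"), the AMS configuration might look like the fragment below; the exact YAML nesting is an assumption, not taken from the Amoro documentation:

```yaml
ams:
  refresh-tables:
    max-pending-partition-count: 100 # default 100; partitions beyond this limit are dropped from pendingInput
```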
LGTM.
Thanks for the contribution!
LGTM
Thanks for this contribution.
It looks fine.
… in the filter (#3240)
* [AMORO-3239] Fix stack overflow caused by reading too many partitions in the filter
* [AMORO-3239] Add the "ignore-filter-partition-count" parameter
* move parameter "optimizer.ignore-filter-partition-count" to "self-optimizing.skip-filter-partition-count"
* move parameter "self-optimizing.skip-filter-partition-count" to "refresh-tables.max-pending-partition-count"
… in the filter
Why are the changes needed?
Close #3239.
Brief change log
How was this patch tested?
Add some test cases that check the changes thoroughly including negative and positive cases if possible
Add screenshots for manual tests if appropriate
Run test locally before making a pull request
Documentation