
[AMORO-3239] Fix stack overflow caused by reading too many partitions in the filter #3240

Merged: 12 commits into apache:master on Oct 16, 2024

Conversation

7hong
Contributor

@7hong 7hong commented Oct 11, 2024

… in the filter

Why are the changes needed?

Close #3239 .

Brief change log

How was this patch tested?

  • Add some test cases that check the changes thoroughly including negative and positive cases if possible

  • Add screenshots for manual tests if appropriate

  • Run test locally before making a pull request

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@lintingbin
Contributor

I think we should solve this problem rather than setting a maximum number of partitions to bypass it, right? It's quite common for our online operations to have several hundred partitions.

@zhoujinsong
Contributor

When too many partitions need to be optimized, the partition filter seems to be useless.
We could use the alwaysTrue filter in that case, HDYT? @7hong @lintingbin
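A minimal sketch of that fallback, assuming a hypothetical maxFilterPartitionCount threshold and a hypothetical helper class (the real field and class names in Amoro may differ):

    import java.util.Map;
    import java.util.Set;
    import org.apache.iceberg.expressions.Expression;
    import org.apache.iceberg.expressions.Expressions;

    // Illustrative sketch only, not the actual Amoro code.
    class PartitionFilterFallback {
      static Expression buildFilter(
          Map<Integer, Set<String>> pendingPartitions, int maxFilterPartitionCount) {
        long partitionCount =
            pendingPartitions.values().stream().mapToLong(Set::size).sum();
        if (partitionCount > maxFilterPartitionCount) {
          // Building an OR condition per partition would create a deeply nested
          // expression tree; fall back to scanning everything instead.
          return Expressions.alwaysTrue();
        }
        // ... otherwise build the usual OR of per-partition conditions (omitted)
        return Expressions.alwaysTrue();
      }
    }

The point is only the guard: when the pending partition count is large, no per-partition OR conditions are built at all.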

@lintingbin
Contributor

> When too many partitions need to be optimized, the partition filter seems to be useless. We could use the alwaysTrue filter in that case, HDYT? @7hong @lintingbin

I think it's feasible. @7hong, what do you think?

@7hong
Contributor Author

7hong commented Oct 11, 2024

> When too many partitions need to be optimized, the partition filter seems to be useless. We could use the alwaysTrue filter in that case, HDYT? @7hong @lintingbin

> I think it's feasible. @7hong, what do you think?

I think it's feasible. I will add a parameter self-optimizing.ignore-filter-partition-count, which defaults to 100. The filter will not be used when the number of partitions exceeds self-optimizing.ignore-filter-partition-count. What do you think? @zhoujinsong @lintingbin

@zhoujinsong
Contributor

> I will add a parameter self-optimizing.ignore-filter-partition-count, which defaults to 100. The filter will not be used when the number of partitions exceeds self-optimizing.ignore-filter-partition-count. What do you think? @zhoujinsong @lintingbin

I am okay with that, but the configuration should be added to the AMS configuration rather than the table configuration.

@lintingbin
Contributor

For how to add an AMS configuration, you can refer to https://github.com/apache/amoro/pull/3193/files @7hong

@7hong
Contributor Author

7hong commented Oct 11, 2024

OK, I'll change the code. Thanks to both of you.

@majin1102
Contributor

majin1102 commented Oct 12, 2024

Is this a bad use case of Iceberg expressions?

I mean, the partition filter may be exactly what we need in this case, especially with so many partitions and a large amount of data per partition. What if we do not use Iceberg expressions here and instead use a set to filter, or something else? Could that solve the problem?

On the other hand, if we do not filter partitions, the evaluation stage becomes somewhat insignificant; we could eliminate the pending partitions in pendingInput to save DB storage. @zhoujinsong @7hong @lintingbin
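A rough sketch of the set-based idea, with illustrative (not actual Amoro) class and method names; it assumes the pending partitions are kept as partition-path strings:

    import java.util.HashSet;
    import java.util.Set;
    import org.apache.iceberg.DataFile;
    import org.apache.iceberg.PartitionSpec;

    // Keep the pending partitions in a HashSet and test each file's partition
    // against it, instead of composing one big OR expression for the Iceberg scan.
    class PartitionSetFilter {
      private final PartitionSpec spec;
      private final Set<String> pendingPartitionPaths;

      PartitionSetFilter(PartitionSpec spec, Set<String> pendingPartitionPaths) {
        this.spec = spec;
        this.pendingPartitionPaths = new HashSet<>(pendingPartitionPaths);
      }

      // Applied to each file after an unfiltered scan; no expression tree is built,
      // and the lookup stays O(1) per file regardless of the partition count.
      boolean shouldPlan(DataFile file) {
        return pendingPartitionPaths.contains(spec.partitionToPath(file.partition()));
      }
    }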

@zhoujinsong
Contributor

> Is this a bad use case of Iceberg expressions?
>
> I mean, the partition filter may be exactly what we need in this case, especially with so many partitions and a large amount of data per partition. What if we do not use Iceberg expressions here and instead use a set to filter, or something else? Could that solve the problem?

Yes, it is a bad case to construct an Iceberg expression with too many conditions.
We can filter the data files ourselves rather than passing the filter to the Iceberg scan; we would not get better plan performance, but it would still save some memory for our optimizing plan process.

We can improve this case in another PR.

> On the other hand, if we do not filter partitions, the evaluation stage becomes somewhat insignificant; we could eliminate the pending partitions in pendingInput to save DB storage.

Yes, we may drop the partition set in the pending state to save DB storage in the current implementation.

@majin1102
Contributor

majin1102 commented Oct 12, 2024

> Is this a bad use case of Iceberg expressions?
> I mean, the partition filter may be exactly what we need in this case, especially with so many partitions and a large amount of data per partition. What if we do not use Iceberg expressions here and instead use a set to filter, or something else? Could that solve the problem?

> Yes, it is a bad case to construct an Iceberg expression with too many conditions. We can filter the data files ourselves rather than passing the filter to the Iceberg scan; we would not get better plan performance, but it would still save some memory for our optimizing plan process.
>
> We can improve this case in another PR.

> On the other hand, if we do not filter partitions, the evaluation stage becomes somewhat insignificant; we could eliminate the pending partitions in pendingInput to save DB storage.

> Yes, we may drop the partition set in the pending state to save DB storage in the current implementation.

Regarding 'optimizer.ignore-filter-partition-count': I think this parameter is hard to describe in the documentation, since it points to a temporary solution and is not very general.
I suggest two solutions:

  1. No parameter: make the filter a hash set; the performance is good enough.
  2. Name the parameter 'self-optimizing.skip-evaluating-for-partition-count' or something like that; the parameter is meant for the evaluating stage and for skipping cases. If we do that in another PR, we could leave a TODO and keep the proper parameter name.

@@ -54,6 +54,7 @@ ams:
task-ack-timeout: 30000 # 30s
polling-timeout: 3000 # 3s
max-planning-parallelism: 1 # default 1
ignore-filter-partition-count: 100 # default 100
Contributor

I think this param is not suitable for this place; it's more suitable for self-optimizing groups.

And I think a more proper name is 'self-optimizing.skip-evaluating-for-partition-count'. WDYT @zhoujinsong?

Contributor Author

I think it's appropriate to put it in self-optimizing groups.

But I think skip-evaluating is inappropriate, because this parameter only controls whether the filter is used, not whether the Evaluator is skipped.

Contributor

@majin1102 majin1102 Oct 14, 2024

Yes, that's correct. I think skip is inappropriate too.

Originally there were no evaluating and filtering phases. But it was proved that planning directly would cause OOM, because all partition files would be cached in memory; that led to designing an evaluating phase before planning (streaming the plan and storing a partition set to avoid memory usage in the planning phase). If we do not filter here, that means evaluating is not used, or is skipped.

Evaluating is a sensible concept that may even be revealed on the dashboard in the future; the filter, however, is quite a detailed implementation and could point to anything that evolves, or to nothing. For example, the partition filter is a fast implementation of evaluating, which has several drawbacks that have been fed back:

  1. it cannot filter anything for a non-partitioned table
  2. it stores too much in sysdb
  3. the issue you have encountered

From my view, this PR has made a temporary optimization, and the evaluating logic should evolve to resolve the issues above; I'm concerned that a parameter pointing to the filter will easily become outdated (this relates to work I am pushing in #2596).

What do you think about using a set to filter to resolve the issue directly? Or you could maintain this evolution when related issues are raised. That would help a lot.

@zhoujinsong
Contributor

zhoujinsong commented Oct 14, 2024

I carefully read everyone's discussion and summarized the current situation as follows:

  • Adding too many conditions to an Iceberg expression is not a good practice.
  • Removing the partition filter for tables with too many partitions that need to be optimized is dangerous, and may lead to memory leak issues.

So I am thinking we may add an AMS property named self-optimizing.max-partition-count to limit the partition count an optimizing process may add.

I think it solves the current issue and limits the memory usage of the optimizing plan.
HDYT? @7hong @lintingbin @majin1102
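A hedged sketch of how such a cap might look during evaluation; the collector class and method names here are hypothetical, not the actual Amoro implementation:

    import java.util.LinkedHashMap;
    import java.util.LinkedHashSet;
    import java.util.Map;
    import java.util.Set;

    // Stop recording new pending partitions once a configured maximum is reached,
    // so both the stored pending input and the filter built from it stay bounded.
    class PendingPartitionCollector {
      private final int maxPartitionCount; // e.g. an AMS-level default of 100
      private final Map<Integer, Set<String>> partitionsBySpec = new LinkedHashMap<>();
      private int count = 0;

      PendingPartitionCollector(int maxPartitionCount) {
        this.maxPartitionCount = maxPartitionCount;
      }

      // Returns false once the cap is reached, so the caller can stop collecting.
      boolean add(int specId, String partitionPath) {
        if (count >= maxPartitionCount) {
          return false;
        }
        if (partitionsBySpec
            .computeIfAbsent(specId, k -> new LinkedHashSet<>())
            .add(partitionPath)) {
          count++;
        }
        return true;
      }

      Map<Integer, Set<String>> partitions() {
        return partitionsBySpec;
      }
    }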

@majin1102
Contributor

> I carefully read everyone's discussion and summarized the current situation as follows:
>
>   • Adding too many conditions to an Iceberg expression is not a good practice.
>   • Removing the partition filter for tables with too many partitions that need to be optimized is dangerous, and may lead to memory leak issues.
>
> So I am thinking we may add an AMS property named self-optimizing.max-partition-count to limit the partition count an optimizing process may add.
>
> I think it solves the current issue and limits the memory usage of the optimizing plan. HDYT? @7hong @lintingbin @majin1102

How would it work in the evaluating stage and the planning stage?

@7hong
Contributor Author

7hong commented Oct 14, 2024

> I carefully read everyone's discussion and summarized the current situation as follows:
>
>   • Adding too many conditions to an Iceberg expression is not a good practice.
>   • Removing the partition filter for tables with too many partitions that need to be optimized is dangerous, and may lead to memory leak issues.
>
> So I am thinking we may add an AMS property named self-optimizing.max-partition-count to limit the partition count an optimizing process may add.
>
> I think it solves the current issue and limits the memory usage of the optimizing plan. HDYT? @7hong @lintingbin @majin1102

Is that similar to what I submitted the first time, solving this problem by limiting the number of partitions added?

    this.partitionFilter =
        tableRuntime.getPendingInput() == null
            ? Expressions.alwaysTrue()
            : tableRuntime.getPendingInput().getPartitions().entrySet().stream()
                .map(
                    entry ->
                        ExpressionUtil.convertPartitionDataToDataFilter(
                            table,
                            entry.getKey(),
                            entry.getValue().stream()
                                .limit(maxPartitionCount) // here: keep at most maxPartitionCount partitions
                                .collect(Collectors.toSet())))
                .reduce(Expressions::or)
                .orElse(Expressions.alwaysTrue());

@zhoujinsong
Contributor

That's true, but there is a small difference: during the evaluate phase, we limit the set of partitions in the pending input to only 100 (default), rather than truncating it when using it. This can help reduce storage overhead.

@7hong
Contributor Author

7hong commented Oct 14, 2024

> That's true, but there is a small difference: during the evaluate phase, we limit the set of partitions in the pending input to only 100 (default), rather than truncating it when using it. This can help reduce storage overhead.

Yes, you are right. It is a better solution to truncate during the pending stage.
But there is a small problem: the File Count displayed on the dashboard will be inaccurate.

@7hong
Contributor Author

7hong commented Oct 14, 2024

I added a parameter refresh-tables.max-pending-partition-count, used to limit the maximum size of pendingInput.
Therefore, the semantics of File Count and File Size on the dashboard will change (for tables in the pending state).
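For illustration only, the setting might sit in the AMS configuration roughly like the snippet below, mirroring the earlier diff; the exact nesting under refresh-tables is an assumption derived from the parameter name, not confirmed by this PR:

    ams:
      refresh-tables:
        max-pending-partition-count: 100 # assumed placement; default 100, caps partitions recorded in pendingInput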

Contributor

@zhoujinsong zhoujinsong left a comment

LGTM.

Thanks for the contribution!

@7hong 7hong requested a review from majin1102 October 16, 2024 02:01
Contributor

@majin1102 majin1102 left a comment

LGTM

Thanks for this contribution.

@majin1102 majin1102 merged commit cc29688 into apache:master Oct 16, 2024
4 checks passed
@majin1102
Contributor

> I added a parameter refresh-tables.max-pending-partition-count, used to limit the maximum size of pendingInput. Therefore, the semantics of File Count and File Size on the dashboard will change (for tables in the pending state).

It looks fine.
The 'File Count' and 'File Size' are pending metrics; they are not meant to reflect what will actually be planned in the next round.

zhoujinsong pushed a commit that referenced this pull request Oct 30, 2024
… in the filter (#3240)

* [AMORO-3239] Fix stack overflow caused by reading too many partitions in the filter

* [AMORO-3239] Add the "ignore-filter-partition-count" parameter

* move parameter "optimizer.ignore-filter-partition-count" to "self-optimizing.skip-filter-partition-count"

* move parameter "self-optimizing.skip-filter-partition-count" to "refresh-tables.max-pending-partition-count"

Successfully merging this pull request may close these issues.

[Bug]: Planning table failed, java.lang.StackOverflowError: null