Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AMORO-3317]: Move optimizing planer and scan to iceberg/mixed-format module #3314

Merged
merged 15 commits into from
Nov 13, 2024

Conversation

baiyangtx
Copy link
Contributor

@baiyangtx baiyangtx commented Oct 31, 2024

Why are the changes needed?

close #3317

Currently, the Optimizing Plan has become a bottleneck of AMS. A feasible solution is to delegate the Optimizing Plan to the Optimizer for execution.

For this, some prerequisite refactoring work is required.

This PR moves the code of the Optimizing Plan from the AMS module to the iceberg/mixed-format module, and the relevant logic can be called by the optimizer in the future.

Brief change log

  • Move Optimizing Plan related codes to iceberg/mixed-format module
  • Add a mixed-iceberg rewrite executor factory

How was this patch tested?

  • Add some test cases that check the changes thoroughly including negative and positive cases if possible

  • Add screenshots for manual tests if appropriate

  • Run test locally before making a pull request

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@github-actions github-actions bot added module:ams-server Ams server module module:mixed-hive Hive moduel for Mixed Format module:common labels Oct 31, 2024
@baiyangtx baiyangtx force-pushed the remove-table-runtime-from-planner branch from 611c39a to 4499ec1 Compare November 4, 2024 11:58
@baiyangtx baiyangtx changed the title [Improvement]: Move optimizing planer and scan to iceberg/mixed-format module [AMORO-3317]: Move optimizing planer and scan to iceberg/mixed-format module Nov 4, 2024
@baiyangtx baiyangtx marked this pull request as ready for review November 4, 2024 12:00
@baiyangtx baiyangtx force-pushed the remove-table-runtime-from-planner branch from 83fe53e to 097e958 Compare November 5, 2024 04:14
@baiyangtx baiyangtx requested review from XBaith and zhoujinsong and removed request for XBaith November 5, 2024 06:16
@baiyangtx baiyangtx requested a review from Aireed November 7, 2024 06:55
# Conflicts:
#	amoro-ams/src/main/java/org/apache/amoro/server/optimizing/OptimizingProcessMeta.java
#	amoro-ams/src/main/java/org/apache/amoro/server/persistence/TableRuntimeMeta.java
#	amoro-ams/src/test/java/org/apache/amoro/server/optimizing/BaseOptimizingChecker.java
Copy link
Contributor

@zhoujinsong zhoujinsong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@baiyangtx Thanks for the contribution!

It is very important to the new process API feature.

The PR look good to me in general, I left some minor suggestion, PTAL.

long lastMinor = tableRuntime.getLastMinorOptimizingTime();
long lastFull = tableRuntime.getLastFullOptimizingTime();
TableSnapshot snapshot = IcebergTableUtil.getSnapshot(table, tableRuntime);
if (TableFormat.ICEBERG.in(table.format())) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (TableFormat.ICEBERG.in(table.format())) {
if (TableFormat.ICEBERG.equals(table.format())) {

lastMinor,
lastFull);
}
throw new IllegalStateException("Un-supported table-format:" + table.format().toString());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
throw new IllegalStateException("Un-supported table-format:" + table.format().toString());
throw new IllegalStateException("Unsupported table format:" + table.format().toString());

Copy link
Contributor

@Aireed Aireed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
Question: After this refactoring, can Spark customize procedures to trigger optimization? (only depend on the plan and scan logical)

Copy link
Contributor

@zhoujinsong zhoujinsong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@zhoujinsong zhoujinsong merged commit 933c1ed into apache:master Nov 13, 2024
4 checks passed
@baiyangtx
Copy link
Contributor Author

LGTM. Question: After this refactoring, can Spark customize procedures to trigger optimization? (only depend on the plan and scan logical)

  1. Committer still in ams-module
  2. These refactorings only involve core-level code, and the engine still needs to adapt.

@baiyangtx baiyangtx deleted the remove-table-runtime-from-planner branch November 13, 2024 03:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module:ams-server Ams server module module:common module:mixed-hive Hive moduel for Mixed Format
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Subtask]: Move optimizing planer and scan to iceberg/mixed-format module
3 participants