docs: update miscellaneous.md
skyzh authored Jan 20, 2024
1 parent db485ce commit c6ab0d7
Showing 1 changed file with 12 additions and 2 deletions.
14 changes: 12 additions & 2 deletions docs/src/miscellaneous.md
@@ -14,7 +14,7 @@ optd supports exploring SQL expressions in the optimization process. However, th

We do not yet have a binding iterator like the one in the Cascades paper. Before applying a rule, we generate all bindings of a group up front, which can take a lot of memory. This should be fixed in the future.
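As a minimal sketch of what a lazy binding iterator could look like, the code below enumerates one binding at a time instead of materializing every combination. `BindingIter` and its fields are hypothetical names, not optd APIs, and the sketch assumes each child group has already been reduced to a plain list of alternatives.

```rust
/// Hypothetical lazy binding iterator (not part of optd).
/// `choices[i]` holds the alternatives for the i-th child; a binding picks
/// exactly one alternative per child.
struct BindingIter<'a, T> {
    choices: &'a [Vec<T>],
    /// Odometer of indices into `choices`; `None` once exhausted.
    cursor: Option<Vec<usize>>,
}

impl<'a, T> BindingIter<'a, T> {
    fn new(choices: &'a [Vec<T>]) -> Self {
        // If any child has no alternative, no binding exists at all.
        let cursor = if choices.iter().any(|c| c.is_empty()) {
            None
        } else {
            Some(vec![0; choices.len()])
        };
        Self { choices, cursor }
    }
}

impl<'a, T> Iterator for BindingIter<'a, T> {
    type Item = Vec<&'a T>;

    fn next(&mut self) -> Option<Self::Item> {
        let choices: &'a [Vec<T>] = self.choices;
        let cursor = self.cursor.as_mut()?;
        // Build the current binding from the odometer positions.
        let item: Vec<&'a T> = cursor
            .iter()
            .zip(choices)
            .map(|(&i, c)| &c[i])
            .collect();
        // Advance the odometer, rightmost position first.
        let mut exhausted = true;
        for pos in (0..cursor.len()).rev() {
            cursor[pos] += 1;
            if cursor[pos] < choices[pos].len() {
                exhausted = false;
                break;
            }
            cursor[pos] = 0;
        }
        if exhausted {
            self.cursor = None;
        }
        Some(item)
    }
}
```

Only the odometer and the binding currently being yielded are held in memory, so a rule could stop early without paying for the remaining combinations.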

## Cycle Detection
## Cycle Detection + DAG

Consider the case for join commute rule.

@@ -26,10 +26,20 @@ Consider the case for join commute rule.

After applying the rule twice, the memo table will have self-referential groups. Currently, we detect such self-referential groups in the optimize group task. There are probably better ways to do that.

The same applies to DAGs / recursive CTEs. We have not tested whether the framework works with DAGs, but in theory it should support them. We just need to ensure that a node in the DAG does not get searched twice.
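The two concerns above (self-referential groups and shared DAG nodes) can be illustrated with one traversal that tracks groups currently on the stack and groups already finished. This is a sketch under assumptions, not optd's actual optimize group task: `Memo`, `Explorer`, and all field names are hypothetical, and each expression is reduced to the list of child groups it references.

```rust
use std::collections::{HashMap, HashSet};

type GroupId = usize;

/// Toy memo (hypothetical): group id -> expressions, where each expression
/// is just the list of child group ids it references.
struct Memo {
    groups: HashMap<GroupId, Vec<Vec<GroupId>>>,
}

#[derive(Default)]
struct Explorer {
    /// Groups currently being optimized; re-entering one means a cycle.
    in_progress: HashSet<GroupId>,
    /// Groups fully optimized; shared DAG nodes are skipped on revisit.
    done: HashSet<GroupId>,
    visit_order: Vec<GroupId>,
    cycles_skipped: usize,
}

impl Explorer {
    fn optimize_group(&mut self, memo: &Memo, group: GroupId) {
        if self.done.contains(&group) {
            return; // already searched via another parent
        }
        if !self.in_progress.insert(group) {
            self.cycles_skipped += 1; // self-referential group
            return;
        }
        self.visit_order.push(group);
        if let Some(exprs) = memo.groups.get(&group) {
            for children in exprs {
                for &child in children {
                    self.optimize_group(memo, child);
                }
            }
        }
        self.in_progress.remove(&group);
        self.done.insert(group);
    }
}
```

With a group that contains both a join and its commuted form plus an expression pointing back at the group itself, each group is visited once and the back-reference is counted as a skipped cycle.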

## DAG

For DAGs, another challenge is recovering the reusable fragments from the optimizer output. The optimizer can give you a DAG as output, but by iterating through the plan alone you cannot know which parts can be reused or materialized. Therefore, we might need to produce some extra information along with the plan node output, i.e., a graph representation with metadata for each node, instead of a `RelNode`. This would also help with inserting the physical collector plan nodes, which is currently a little hacky in the implementation.
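One way such a graph representation might expose reusable fragments is to store nodes in an arena and count how many parents reference each node. The sketch below assumes this layout; `PlanGraph`, `PlanNode`, and `reusable_nodes` are hypothetical names and do not correspond to existing optd types.

```rust
/// Hypothetical plan node stored in an arena; children are arena indices,
/// so two parents can point at the same child.
struct PlanNode {
    op: String,
    children: Vec<usize>,
}

/// Hypothetical graph-shaped optimizer output.
struct PlanGraph {
    nodes: Vec<PlanNode>,
    root: usize,
}

impl PlanGraph {
    /// Returns the ids of nodes referenced by more than one parent edge,
    /// i.e. candidates for reuse or materialization.
    fn reusable_nodes(&self) -> Vec<usize> {
        let n = self.nodes.len();
        let mut parents = vec![0usize; n];
        let mut seen = vec![false; n];
        let mut stack = vec![self.root];
        seen[self.root] = true;
        // Walk every reachable node once and count its outgoing edges.
        while let Some(id) = stack.pop() {
            for &child in &self.nodes[id].children {
                parents[child] += 1;
                if !seen[child] {
                    seen[child] = true;
                    stack.push(child);
                }
            }
        }
        (0..n).filter(|&i| parents[i] > 1).collect()
    }

    /// Operator name of a node, for display purposes.
    fn op(&self, id: usize) -> &str {
        &self.nodes[id].op
    }
}
```

A tree-shaped output cannot carry this information because a shared child simply appears twice; with node ids, the parent count is the per-node metadata.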

## Memo Table

Obviously, it is not efficient to simply store a mapping from `RelNode` to the expression id. It is hard to imagine how many levels of depth it would take to compute the hash of a tree structure.
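A common alternative, sketched here under assumptions rather than taken from optd, is to key the memo on an operator plus the group ids of its children, so hashing an expression touches one level instead of the whole tree. `MemoExpr`, `Memo`, `Tree`, and `add` are hypothetical names, and for simplicity every distinct expression gets its own group (no merging of equivalent expressions).

```rust
use std::collections::HashMap;

type GroupId = usize;

/// Shallow memo key: an operator plus the group ids of its children.
#[derive(Clone, PartialEq, Eq, Hash, Debug)]
struct MemoExpr {
    op: String,
    children: Vec<GroupId>,
}

/// Plain tree-shaped input, standing in for a `RelNode`-like structure.
struct Tree {
    op: String,
    children: Vec<Tree>,
}

#[derive(Default)]
struct Memo {
    index: HashMap<MemoExpr, GroupId>,
    next_group: GroupId,
}

impl Memo {
    /// Inserts a tree bottom-up and returns the group id of its root.
    /// Identical subtrees collapse to the same group.
    fn add(&mut self, tree: &Tree) -> GroupId {
        let children: Vec<GroupId> =
            tree.children.iter().map(|c| self.add(c)).collect();
        let key = MemoExpr { op: tree.op.clone(), children };
        if let Some(&group) = self.index.get(&key) {
            return group;
        }
        let group = self.next_group;
        self.next_group += 1;
        self.index.insert(key, group);
        group
    }
}
```

Each lookup hashes only the operator and a short list of integers, regardless of how deep the original tree is.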

## Partial Exploration

Each iteration only gets slower, because we have to invoke the optimize group tasks again before we can find a group to apply a rule to. We can probably keep the task stack across runs to make this faster.

## Physical Property
## Physical Property + Enforcer Rules

This is a major missing feature in the optimizer. We need it to support shuffling and sort optimizations.
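To illustrate the general idea of an enforcer for a sort property, the sketch below wraps a plan in a `Sort` node only when the plan does not already provide the required order. Everything here is an assumption for illustration: `Plan`, `provided_order`, and `enforce_order` are hypothetical and say nothing about how optd will eventually model physical properties.

```rust
/// Hypothetical physical plan with a single property: sort order by one key.
#[derive(Clone, Debug, PartialEq, Eq)]
enum Plan {
    Scan { table: String, sorted_by: Option<String> },
    Sort { key: String, input: Box<Plan> },
}

impl Plan {
    /// The sort key this plan's output is ordered by, if any.
    fn provided_order(&self) -> Option<&str> {
        match self {
            Plan::Scan { sorted_by, .. } => sorted_by.as_deref(),
            Plan::Sort { key, .. } => Some(key.as_str()),
        }
    }
}

/// Enforcer: adds a `Sort` only if the required order is not already provided.
fn enforce_order(plan: Plan, required: &str) -> Plan {
    if plan.provided_order() == Some(required) {
        plan
    } else {
        Plan::Sort { key: required.to_string(), input: Box::new(plan) }
    }
}
```

A shuffle enforcer would follow the same shape with a distribution property in place of the sort key.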
