docs: update miscellaneous.md

cmu-db · Jan 20, 2024 · c6ab0d7
1 parent db485ce
Showing 1 changed file with 12 additions and 2 deletions.
diff --git a/docs/src/miscellaneous.md b/docs/src/miscellaneous.md
@@ -14,7 +14,7 @@ optd supports exploring SQL expressions in the optimization process. However, th
 
 We do not have something like a binding iterator as in the Cascades paper. Before applying a rule, we will generate all bindings of a group, which might take a lot of memory. This should be fixed in the future.
 
-## Cycle Detection
+## Cycle Detection + DAG
 
 Consider the case for join commute rule.
 
@@ -26,10 +26,20 @@ Consider the case for join commute rule.
 
 After applying the rule twice, the memo table will have self-referential groups. Currently, we detect such self-referential groups in the optimize group task. There are probably better ways to do that.
 
+The same applies to DAGs / recursive CTEs: we have not tested whether the framework works with DAGs, but in theory it should support them. We just need to ensure that a node in the DAG does not get searched twice.
+
+## DAG
+
+For DAGs, another challenge is recovering the reusable fragments from the optimizer output. The optimizer can produce a DAG, but by iterating through the plan you cannot tell which parts can be reused/materialized. Therefore, we might need to emit some extra information along with the plan node output, i.e., a graph representation with metadata for each node instead of a plain `RelNode`. This would also help with inserting the physical collector plan nodes, which is currently a little hacky in the implementation.
+
+## Memo Table
+
+Obviously, it is not efficient to simply store a mapping from `RelNode` to the expression id. It is hard to imagine how many levels of depth computing the hash of a tree structure would have to traverse.
+
 ## Partial Exploration
 
 Each iteration only gets slower, because we have to invoke the optimize group tasks again before we can find a group to apply a rule to. We could probably keep the task stack across runs to make it faster.
 
-## Physical Property
+## Physical Property + Enforcer Rules
 
 This is a major missing feature in the optimizer. We need it to support shuffling and sort optimizations.
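
Note on the Memo Table section added in this diff: one common way to avoid hashing a whole tree is to store expressions whose children are group ids rather than nested nodes, so each hash only touches one level. The sketch below illustrates that idea under stated assumptions; `GroupId`, `MemoExpr`, and `Memo` are hypothetical names for illustration and are not optd's actual types or API.

```rust
use std::collections::HashMap;

/// Hypothetical group identifier (an assumption, not an optd type).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct GroupId(usize);

/// A memo expression: an operator plus the *groups* of its children.
/// Because children are ids, hashing never descends into subtrees.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct MemoExpr {
    op: String,
    children: Vec<GroupId>,
}

/// Minimal memo table: deduplicates expressions by shallow hash.
#[derive(Default)]
struct Memo {
    expr_to_group: HashMap<MemoExpr, GroupId>,
    next_group: usize,
}

impl Memo {
    /// Returns the existing group for an identical expression,
    /// or allocates a new group if the expression is unseen.
    fn add(&mut self, op: &str, children: &[GroupId]) -> GroupId {
        let expr = MemoExpr {
            op: op.to_string(),
            children: children.to_vec(),
        };
        if let Some(&group) = self.expr_to_group.get(&expr) {
            return group;
        }
        let group = GroupId(self.next_group);
        self.next_group += 1;
        self.expr_to_group.insert(expr, group);
        group
    }
}
```

With this layout, adding `Join` over the same two child groups twice yields the same `GroupId`, so a subplan shared by several parents in a DAG is registered only once. This simplified version puts every new expression in its own group; merging logically equivalent expressions into one group is left out.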