Skip to content

2. Implementing custom optimization rules

wyfunique edited this page Apr 16, 2022 · 1 revision

Overview

The directory of optimization rules in DBSim is dbsim/planners/rules/.

  1. rule_operand.py: defines RuleOperand which represents a rule as a recursive operand tree.

    (1) Each RuleOperand includes a target class of a logical plan node (Expr) and a list of child RuleOperand.

    (2) The target class is a subclass of Expr, representing the type of logical plan node that matches with this RuleOperand.

    (3) When one RuleOperand matches a logical plan node, children of this RuleOperand and children of the logical plan node will be inspected recursively. If all the nodes between them are matched until the RuleOperand tree is exhausted, the rule matches the logical plan.

    (4) Two special types of RuleOperand: NoneOperand for matching nothing (i.e., only matches None) and AnyMatchOperand for matching any Expr subclass.

  2. rule.py: defines the abstract class Rule for any concrete rule. Rule includes the root operand of its RuleOperand tree. Matching a Rule with a logical plan is actually matching between its RuleOperand tree and the plan.

  3. Concrete rules: like filter_merge_rule.py, filter_push_down_rule.py, etc.

Implementing a custom rule

To implement a custom rule, you need to

  1. create a subclass of Rule

  2. initialize the custom rule with its RuleOperand tree in __init__

  3. implement two interfaces for the query plan transformation: transformImpl and transformImplInplace

    (1) transformImpl: the major interface to transform a matched logical plan according to the rule, defined as follows. Note that the implementation of this interface should not modify the input Expr but work on a deep copy of it (using the deepCopyAST method provided by dbsim/ast.py).

    def transformImpl(self, ast_root: Expr) -> List[Expr]:
        """
        The actual implementation of the transformation by the current rule.
        This method is left to the concrete rule classes to implement.
    
        Note that some rules may generate multiple equivalent plans, 
        so this method returns a list of expression trees.
        """
        raise NotImplementedError

    (2) transformImplInplace: an auxiliary interface to transform a matched logical plan in-place, i.e., directly modifying the original plan, defined as follows. Therefore, if a rule may generate multiple transformed results, do not implement this interface.

    def transformImplInplace(self, ast_root: Expr) -> Expr:
        """
        The actual implementation of the inplace transformation by the current rule.
        This method is left to the concrete rule classes to implement.
    
        Note that this method transforms the expression tree in-place, 
        so the rules implementing this method must generate only one equivalent plan, instead of a list of plans.
        So for the rules which generate multiple equivalent plans, do not implement this method. 
        """
        raise NotImplementedError

See filter_merge_rule.py filter_push_down_rule.py selection_simselection_swap_rule.py in dbsim/planners/rules/ for examples.