QueryPlanner transforms a LogicalPlan through a chain of GenericStrategy objects to produce a PhysicalPlan, e.g. SparkPlan for SparkPlanner or the custom SparkPlanner for HiveSessionState.
QueryPlanner contract defines three operations:
- strategies that returns a collection of GenericStrategy objects.
- planLater(plan: LogicalPlan): PhysicalPlan that skips the current plan.
- plan(plan: LogicalPlan) that returns an Iterator[PhysicalPlan] with elements being the result of applying each GenericStrategy object from strategies collection to plan input parameter.
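The contract above can be illustrated with a minimal, self-contained sketch. It does not use Spark's classes: LogicalPlan, PhysicalPlan, Scan, Filter, TableScanExec, FilterExec and ToyPlanner are hypothetical stand-ins invented for this example. It also assumes that planLater simply plans the child right away and takes the first candidate, which may differ from what a given Spark version does.

```scala
// A toy model of the QueryPlanner contract. None of these types are Spark's own.
object QueryPlannerSketch {
  // Hypothetical logical plan nodes.
  sealed trait LogicalPlan
  final case class Scan(table: String) extends LogicalPlan
  final case class Filter(condition: String, child: LogicalPlan) extends LogicalPlan

  // Hypothetical physical plan nodes.
  sealed trait PhysicalPlan
  final case class TableScanExec(table: String) extends PhysicalPlan
  final case class FilterExec(condition: String, child: PhysicalPlan) extends PhysicalPlan

  // A strategy returns zero or more candidate physical plans for a logical plan.
  abstract class GenericStrategy {
    def apply(plan: LogicalPlan): Seq[PhysicalPlan]
  }

  abstract class QueryPlanner {
    // The collection of strategies to try, in order.
    def strategies: Seq[GenericStrategy]

    // Assumption for this sketch: plan the child eagerly and take the first candidate.
    protected def planLater(plan: LogicalPlan): PhysicalPlan = this.plan(plan).next()

    // Apply every strategy to the input plan and collect all candidates lazily.
    def plan(plan: LogicalPlan): Iterator[PhysicalPlan] =
      strategies.iterator.flatMap(strategy => strategy(plan))
  }

  // A concrete planner; its strategies are nested so they can call planLater.
  object ToyPlanner extends QueryPlanner {
    object ScanStrategy extends GenericStrategy {
      def apply(plan: LogicalPlan): Seq[PhysicalPlan] = plan match {
        case Scan(table) => Seq(TableScanExec(table))
        case _           => Nil
      }
    }

    object FilterStrategy extends GenericStrategy {
      def apply(plan: LogicalPlan): Seq[PhysicalPlan] = plan match {
        case Filter(condition, child) => Seq(FilterExec(condition, planLater(child)))
        case _                        => Nil
      }
    }

    def strategies: Seq[GenericStrategy] = Seq(ScanStrategy, FilterStrategy)
  }

  def main(args: Array[String]): Unit = {
    val logical = Filter("age > 21", Scan("people"))
    // Take the first candidate physical plan.
    println(ToyPlanner.plan(logical).next())
  }
}
```

A strategy that does not recognize a plan node returns an empty collection, so plan only yields candidates from the strategies that match.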
SparkStrategies is an abstract QueryPlanner for SparkPlan.
It serves as a source of concrete Strategy objects.
Among the available SparkStrategies implementations is SparkPlanner.
SparkPlanner is a concrete QueryPlanner (extending SparkStrategies) that uses a SparkContext, a SQLConf, and a collection of Strategy objects (as extraStrategies).
It defines the numPartitions method, which returns the value of spark.sql.shuffle.partitions, i.e. the number of partitions to use for joins and aggregations.
The strategies collection uses predefined Strategy objects as well as the constructor’s extraStrategies.
Among the Strategy objects is JoinSelection.
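How SparkPlanner puts these pieces together can be sketched as follows. This is a simplified model, not Spark's implementation: Strategy and SQLConf here are hypothetical stand-ins, the SparkContext argument is left out, and the order in which extraStrategies and the predefined strategies are combined (extras first) is an assumption made for illustration. The fallback of 200 is the documented default of spark.sql.shuffle.partitions.

```scala
// A toy model of SparkPlanner. None of these types are Spark's own.
object SparkPlannerSketch {
  // Hypothetical stand-in for Spark's Strategy; only a name is modelled.
  final case class Strategy(name: String)

  // Hypothetical stand-in for SQLConf, reduced to a key-value lookup.
  final class SQLConf(settings: Map[String, String]) {
    def numShufflePartitions: Int =
      settings.getOrElse("spark.sql.shuffle.partitions", "200").toInt
  }

  class SparkPlanner(conf: SQLConf, extraStrategies: Seq[Strategy]) {
    // Number of partitions to use for joins and aggregations.
    def numPartitions: Int = conf.numShufflePartitions

    // Assumption: extra strategies are consulted before the predefined ones.
    def strategies: Seq[Strategy] =
      extraStrategies ++ Seq(Strategy("JoinSelection"))
  }

  def main(args: Array[String]): Unit = {
    val conf = new SQLConf(Map("spark.sql.shuffle.partitions" -> "8"))
    val planner = new SparkPlanner(conf, Seq(Strategy("MyStrategy")))
    println(planner.numPartitions)
    println(planner.strategies.map(_.name).mkString(", "))
  }
}
```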
The HiveSessionState class uses a custom anonymous SparkPlanner for the planner method (part of the SessionState contract).
The custom anonymous SparkPlanner uses Strategy objects defined in HiveStrategies.