SessionState

SessionState is the default separation layer for isolating state across sessions, including SQL configuration, tables, functions, UDFs, the SQL parser, and everything else that depends on a SQLConf.

Caution
FIXME Elaborate please.

It requires a SparkSession and manages its own SQLConf.
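
A quick way to see this isolation from the public API is to create a second session with SparkSession.newSession, which shares the SparkContext but gets its own SessionState (and so its own SQLConf). A minimal sketch, assuming a local spark-shell-like setup:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("sessionstate-demo").getOrCreate()

// A second session: same SparkContext, but a separate SessionState (SQLConf, temp views, UDFs)
val otherSession = spark.newSession()

spark.conf.set("spark.sql.shuffle.partitions", "10")

spark.conf.get("spark.sql.shuffle.partitions")        // 10
otherSession.conf.get("spark.sql.shuffle.partitions") // 200 (the default), unless set in SparkConf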

Note
Given that SessionState belongs to the org.apache.spark.sql.internal package, it is truly internal. You’ve been warned.
Note
SessionState is a private[sql] class.

SessionState offers the following services:

catalog Attribute

catalog: SessionCatalog

The catalog attribute points at the shared, internal SessionCatalog for managing tables and databases.

It is used to create the analyzer and the optimizer of the session.

SessionCatalog

SessionCatalog is a proxy between SparkSession and the underlying metastore (e.g. HiveSessionCatalog when Hive support is enabled).
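
The session-scoped catalog is available through the public Catalog interface (SparkSession.catalog), which delegates to the underlying SessionCatalog. A minimal sketch, assuming the spark session from above (the view name is illustrative):

spark.range(5).createOrReplaceTempView("numbers")

// listTables (and the rest of the Catalog API) is backed by the session's SessionCatalog
spark.catalog.listTables().show()
spark.catalog.listDatabases().show()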

analyzer Attribute

analyzer: Analyzer

analyzer is…​

optimizer Attribute

optimizer: Optimizer

optimizer is an optimizer for logical query plans.

It is (lazily) set to SparkOptimizer, a specialization of the Catalyst Optimizer, created for the session-owned SessionCatalog, SQLConf, and ExperimentalMethods.
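
The optimizer’s output can be inspected through the (public) QueryExecution of a Dataset. A minimal sketch:

import org.apache.spark.sql.functions.col

val q = spark.range(10).where(col("id") > 5).select((col("id") * 2).as("doubled"))

// optimizedPlan is the logical plan after the session's Optimizer (SparkOptimizer) has run
println(q.queryExecution.optimizedPlan.numberedTreeString)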

experimentalMethods

experimentalMethods is…​
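
ExperimentalMethods is exposed publicly as SparkSession.experimental and, among other things, holds extra optimization rules that the session’s SparkOptimizer picks up. A sketch of registering a do-nothing rule (the rule itself is purely illustrative):

import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.catalyst.rules.Rule

// A no-op rule, just to show where custom logical optimizations plug in
object NoopRule extends Rule[LogicalPlan] {
  override def apply(plan: LogicalPlan): LogicalPlan = plan
}

spark.experimental.extraOptimizations = Seq(NoopRule)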

sqlParser Attribute

sqlParser is…​
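
One way to watch the parser at work from spark-shell (this assumes spark.sessionState is accessible, as in recent Spark versions where it is exposed as an unstable field; the table name is illustrative):

// parsePlan turns SQL text into an (unresolved) LogicalPlan
val parsed = spark.sessionState.sqlParser.parsePlan("SELECT id FROM t WHERE id > 1")
println(parsed.numberedTreeString)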

planner method

planner is…​

executePlan method

executePlan(plan: LogicalPlan): QueryExecution

executePlan executes the input LogicalPlan to produce a QueryExecution in the current SparkSession.
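
The resulting QueryExecution is the same object that Dataset.queryExecution exposes, so its phases can be inspected from the public API. A minimal sketch:

val qe = spark.range(3).queryExecution

qe.logical       // the input logical plan
qe.analyzed      // after the analyzer
qe.optimizedPlan // after the optimizer
qe.executedPlan  // the physical plan that will actually run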

refreshTable method

refreshTable is…​
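
Its public counterpart is Catalog.refreshTable. A sketch (the table name is hypothetical):

// Invalidates cached metadata for the table (and refreshes its cached data, if any)
spark.catalog.refreshTable("my_database.my_table")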

addJar method

addJar is…​
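
Adding a jar to a session is usually done with the ADD JAR SQL command, which presumably ends up here. A sketch (the path is hypothetical):

// Makes the jar available to the session, e.g. for Hive UDFs
spark.sql("ADD JAR /tmp/my-udfs.jar")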

analyze method

analyze is…​
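
This appears to be related to computing table statistics, as triggered by the ANALYZE TABLE SQL command. A sketch (the table name is hypothetical):

// Computes table-level statistics that the optimizer can use (e.g. for join planning)
spark.sql("ANALYZE TABLE my_table COMPUTE STATISTICS")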

streamingQueryManager Attribute

streamingQueryManager: StreamingQueryManager

The streamingQueryManager attribute points at the shared StreamingQueryManager (used, for example, to start streaming queries in DataStreamWriter).
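
Its public counterpart is SparkSession.streams. A minimal sketch that starts and then stops a query using the built-in rate source (available in newer Spark versions):

val query = spark.readStream
  .format("rate")       // test source generating rows at a fixed rate
  .load()
  .writeStream
  .format("console")
  .start()

// The same manager that DataStreamWriter registered the query with
spark.streams.active.foreach(q => println(s"${q.id}: ${q.status}"))

query.stop()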

udf Attribute

udf: UDFRegistration

The udf attribute points at the shared UDFRegistration for the given Spark session.
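
Its public counterpart is SparkSession.udf. A minimal sketch:

// Register a session-scoped UDF and use it from SQL
spark.udf.register("shout", (s: String) => s.toUpperCase + "!")

spark.sql("SELECT shout('hello')").show() // HELLO!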

Creating New Hadoop Configuration (newHadoopConf method)

newHadoopConf(): Configuration

newHadoopConf returns a Hadoop Configuration that it builds from SparkContext.hadoopConfiguration (through SparkSession) with all SQL configuration settings added.
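
A sketch of how SQL settings end up in the returned Configuration (this assumes spark.sessionState is accessible, as in recent Spark versions; the property key is made up for illustration):

spark.conf.set("spark.sql.myCustomSetting", "42") // illustrative key

val hadoopConf = spark.sessionState.newHadoopConf()

// Contains SparkContext.hadoopConfiguration entries plus all SQL configuration settings
hadoopConf.get("spark.sql.myCustomSetting") // 42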

Note
newHadoopConf is used by HiveSessionState (for HiveSessionCatalog), ScriptTransformation, ParquetRelation, StateStoreRDD, SessionState itself, and a few other places.
Caution
FIXME What is ScriptTransformation? StateStoreRDD?