ParserInterface is the parser contract for extracting LogicalPlan, Expressions, and TableIdentifiers from a given SQL string.
package org.apache.spark.sql.catalyst.parser

trait ParserInterface {
  def parsePlan(sqlText: String): LogicalPlan
  def parseExpression(sqlText: String): Expression
  def parseTableIdentifier(sqlText: String): TableIdentifier
}
Its only direct implementation is the abstract class AbstractSqlParser.
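The three methods of the contract can be tried out directly. The following is a minimal sketch, assuming the Spark Catalyst classes are on the classpath (e.g. in spark-shell) and using the CatalystSqlParser object as the ParserInterface. The SQL strings and the names t, db and a are made up for illustration.

```scala
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser

// Parse a complete SQL statement into an (unresolved) LogicalPlan
val plan = CatalystSqlParser.parsePlan("SELECT a FROM t WHERE a > 1")

// Parse a single SQL expression into a Catalyst Expression
val expression = CatalystSqlParser.parseExpression("a + 1")

// Parse a (possibly database-qualified) table name into a TableIdentifier
val tableId = CatalystSqlParser.parseTableIdentifier("db.t")

println(plan)
println(expression)
println(tableId)
```

Parsing alone does not need a running SparkSession, since no table is looked up at this stage and the resulting plan is still unresolved.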
AbstractSqlParser is an abstract ParserInterface that provides the foundation for the SQL parsing infrastructure in Spark SQL, with two concrete implementations: SparkSqlParser and CatalystSqlParser.
AbstractSqlParser expects that subclasses provide a custom AstBuilder (as astBuilder) that converts a ParseTree (from ANTLR) into an AST.
SparkSqlParser is the default parser of the SQL statements supported in Spark SQL. It is available as a ParserInterface object in SessionState (as sqlParser).
It uses its own specialized astBuilder, i.e. SparkSqlAstBuilder, that extends CatalystSqlParser's AstBuilder.
It is used for the expr function.
scala> expr("token = 'hello'")
16/07/07 18:32:53 INFO SparkSqlParser: Parsing command: token = 'hello'
res0: org.apache.spark.sql.Column = (token = hello)
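The parser that SessionState exposes as sqlParser can also be called directly. The sketch below assumes a Spark version in which SparkSession.sessionState is publicly accessible (in some releases it is private to the org.apache.spark.sql package). The local master, the application name and the table name t are illustrative only.

```scala
import org.apache.spark.sql.SparkSession

// A local session just for the demo (the settings are assumptions)
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("parser-demo")
  .getOrCreate()

// The session-wide parser, typed as ParserInterface
val parser = spark.sessionState.sqlParser

// Which concrete parser backs the session
println(parser.getClass.getName)

// Parsing only: the table t does not have to exist
val plan = parser.parsePlan("SELECT * FROM t")
println(plan)
```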
Tip
|
Enable INFO logging level for the SparkSqlParser logger to see what happens inside.
Refer to Logging. |
Caution
|
FIXME Review parse method.
|
CatalystSqlParser is an AbstractSqlParser that comes with its own specialized astBuilder (i.e. AstBuilder).
CatalystSqlParser is used to parse data types (using their canonical string representation), e.g. when creating a StructType or executing the cast function (on a Column).
import org.apache.spark.sql.types._
scala> val struct = (new StructType).add("a", "int")
16/07/07 18:50:52 INFO CatalystSqlParser: Parsing command: int
struct: org.apache.spark.sql.types.StructType = StructType(StructField(a,IntegerType,true))
scala> expr("token = 'hello'").cast("int")
16/07/07 19:00:26 INFO SparkSqlParser: Parsing command: token = 'hello'
16/07/07 19:00:26 INFO CatalystSqlParser: Parsing command: int
res0: org.apache.spark.sql.Column = CAST((token = hello) AS INT)
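The data type parsing that happens behind add and cast can also be invoked explicitly through parseDataType. This is a minimal sketch, assuming the Spark Catalyst classes are on the classpath. The type strings are examples chosen for illustration.

```scala
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser
import org.apache.spark.sql.types._

// A simple type given by its canonical name
val intType = CatalystSqlParser.parseDataType("int")

// Complex types use the same string representation
val arrayType = CatalystSqlParser.parseDataType("array<string>")
val structType = CatalystSqlParser.parseDataType("struct<a:int,b:string>")

println(intType)
println(arrayType)
println(structType)
```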
It is also used in SimpleCatalogRelation, MetastoreRelation, and OrcFileOperator.
Tip
|
Enable INFO logging level for the CatalystSqlParser logger to see what happens inside.
Refer to Logging. |