diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json
index ab51dfd6..8bee1882 100644
--- a/dev/.documenter-siteinfo.json
+++ b/dev/.documenter-siteinfo.json
@@ -1 +1 @@
-{"documenter":{"julia_version":"1.10.5","generation_timestamp":"2024-09-07T14:52:44","documenter_version":"1.7.0"}}
\ No newline at end of file
+{"documenter":{"julia_version":"1.10.5","generation_timestamp":"2024-09-22T23:02:01","documenter_version":"1.7.0"}}
\ No newline at end of file
diff --git a/dev/examples/index.html b/dev/examples/index.html
index 9f732593..662a10bf 100644
--- a/dev/examples/index.html
+++ b/dev/examples/index.html
@@ -822,4 +822,4 @@
 10 │ 95538 2009-03-30 2009-09-02
 11 │ 107680 2009-06-07 2009-07-30
 12 │ 110862 2008-09-07 2010-06-07
-=#
+=#
diff --git a/dev/guide/index.html b/dev/guide/index.html
index e733590a..e39a9a87 100644
--- a/dev/guide/index.html
+++ b/dev/guide/index.html
@@ -758,4 +758,4 @@
 5 │ 438438 Acute myocardial infarction of a… Condition SNOMED ⋯
 6 │ 444406 Acute subendocardial infarction Condition SNOMED
 6 columns omitted
-=#
+=#
diff --git a/dev/index.html b/dev/index.html
index 589108d5..f8c3c77a 100644
--- a/dev/index.html
+++ b/dev/index.html
@@ -1,2 +1,2 @@
-Home · FunSQL.jl

FunSQL.jl

FunSQL is a Julia library for compositional construction of SQL queries.

Table of Contents

+Home · FunSQL.jl

FunSQL.jl

FunSQL is a Julia library for compositional construction of SQL queries.

Table of Contents

diff --git a/dev/reference/index.html b/dev/reference/index.html
index 96ac49b6..829e4757 100644
--- a/dev/reference/index.html
+++ b/dev/reference/index.html
@@ -1,7 +1,7 @@
-API Reference · FunSQL.jl

API Reference

render()

FunSQL.renderMethod
render(node; tables = Dict{Symbol, SQLTable}(),
+API Reference · FunSQL.jl

API Reference

render()

FunSQL.renderMethod
render(node; tables = Dict{Symbol, SQLTable}(),
              dialect = :default,
-             cache = nothing)::SQLString

Create a SQLCatalog object and serialize the query node.

source
FunSQL.renderMethod
render(catalog::Union{SQLConnection, SQLCatalog}, node::SQLNode)::SQLString

Serialize the query node as a SQL statement.

Parameter catalog of SQLCatalog type encapsulates available database tables and the target SQL dialect. A SQLConnection object is also accepted.

Parameter node is a composite SQLNode object.

The function returns a SQLString value. The result is also cached (with the identity of node serving as the key) in the catalog cache.

Examples

julia> catalog = SQLCatalog(
+             cache = nothing)::SQLString

Create a SQLCatalog object and serialize the query node.

source
FunSQL.renderMethod
render(catalog::Union{SQLConnection, SQLCatalog}, node::SQLNode)::SQLString

Serialize the query node as a SQL statement.

Parameter catalog of SQLCatalog type encapsulates available database tables and the target SQL dialect. A SQLConnection object is also accepted.

Parameter node is a composite SQLNode object.

The function returns a SQLString value. The result is also cached (with the identity of node serving as the key) in the catalog cache.

Examples

julia> catalog = SQLCatalog(
            :person => SQLTable(:person, columns = [:person_id, :year_of_birth]),
            dialect = :postgresql);
 
@@ -13,19 +13,19 @@
   "person_1"."person_id",
   "person_1"."year_of_birth"
 FROM "person" AS "person_1"
-WHERE ("person_1"."year_of_birth" >= 1950)
source
FunSQL.renderMethod
render(dialect::Union{SQLConnection, SQLCatalog, SQLDialect},
-       clause::SQLClause)::SQLString

Serialize the syntax tree of a SQL query.

source

reflect()

FunSQL.renderMethod
render(dialect::Union{SQLConnection, SQLCatalog, SQLDialect},
+       clause::SQLClause)::SQLString

Serialize the syntax tree of a SQL query.

source

reflect()

FunSQL.reflectMethod
reflect(conn;
         schema = nothing,
         dialect = nothing,
-        cache = 256)::SQLCatalog

Retrieve the information about available database tables.

The function returns a SQLCatalog object. The catalog will be populated with the tables from the given database schema, or, if parameter schema is not set, from the default database schema (e.g., schema public for PostgreSQL).

Parameter dialect specifies the target SQLDialect. If not set, dialect will be inferred from the type of the connection object.

source

SQLConnection and SQLStatement

FunSQL.SQLConnectionType
SQLConnection(conn; catalog)

Wrap a raw database connection object together with a SQLCatalog object containing information about database tables.

source
DBInterface.connectMethod
DBInterface.connect(DB{RawConnType},
+        cache = 256)::SQLCatalog

Retrieve the information about available database tables.

The function returns a SQLCatalog object. The catalog will be populated with the tables from the given database schema, or, if parameter schema is not set, from the default database schema (e.g., schema public for PostgreSQL).

Parameter dialect specifies the target SQLDialect. If not set, dialect will be inferred from the type of the connection object.

source

SQLConnection and SQLStatement

FunSQL.SQLConnectionType
SQLConnection(conn; catalog)

Wrap a raw database connection object together with a SQLCatalog object containing information about database tables.

source
DBInterface.connectMethod
DBInterface.connect(DB{RawConnType},
                     args...;
                     schema = nothing,
                     dialect = nothing,
                     cache = 256,
-                    kws...)

Connect to the database server, call reflect to retrieve the information about available tables and return a SQLConnection object.

Extra parameters args and kws are passed to the call:

DBInterface.connect(RawConnType, args...; kws...)
source
DBInterface.executeMethod
DBInterface.execute(conn::SQLConnection, sql::SQLNode, params)
-DBInterface.execute(conn::SQLConnection, sql::SQLClause, params)

Serialize and execute the query node.

source
DBInterface.executeMethod
DBInterface.execute(conn::SQLConnection, sql::SQLNode; params...)
-DBInterface.execute(conn::SQLConnection, sql::SQLClause; params...)

Serialize and execute the query node.

source
DBInterface.prepareMethod
DBInterface.prepare(conn::SQLConnection, str::SQLString)::SQLStatement

Generate a prepared SQL statement.

source
DBInterface.prepareMethod
DBInterface.prepare(conn::SQLConnection, sql::SQLNode)::SQLStatement
-DBInterface.prepare(conn::SQLConnection, sql::SQLClause)::SQLStatement

Serialize the query node and return a prepared SQL statement.

source

SQLCatalog, SQLTable, and SQLColumn

FunSQL.SQLCatalogType
SQLCatalog(; tables = Dict{Symbol, SQLTable}(),
+                    kws...)

Connect to the database server, call reflect to retrieve the information about available tables and return a SQLConnection object.

Extra parameters args and kws are passed to the call:

DBInterface.connect(RawConnType, args...; kws...)
source
DBInterface.executeMethod
DBInterface.execute(conn::SQLConnection, sql::SQLNode, params)
+DBInterface.execute(conn::SQLConnection, sql::SQLClause, params)

Serialize and execute the query node.

source
DBInterface.executeMethod
DBInterface.execute(conn::SQLConnection, sql::SQLNode; params...)
+DBInterface.execute(conn::SQLConnection, sql::SQLClause; params...)

Serialize and execute the query node.

source
DBInterface.prepareMethod
DBInterface.prepare(conn::SQLConnection, str::SQLString)::SQLStatement

Generate a prepared SQL statement.

source
DBInterface.prepareMethod
DBInterface.prepare(conn::SQLConnection, sql::SQLNode)::SQLStatement
+DBInterface.prepare(conn::SQLConnection, sql::SQLClause)::SQLStatement

Serialize the query node and return a prepared SQL statement.

source

SQLCatalog, SQLTable, and SQLColumn

FunSQL.SQLCatalogType
SQLCatalog(; tables = Dict{Symbol, SQLTable}(),
              dialect = :default,
              cache = 256,
              metadata = nothing)
@@ -40,8 +40,8 @@
                     SQLColumn(:person_id),
                     SQLColumn(:year_of_birth),
                     SQLColumn(:location_id)),
-           dialect = SQLDialect(:postgresql))
source
FunSQL.SQLColumnType
SQLColumn(; name, metadata = nothing)
-SQLColumn(name; metadata = nothing)

SQLColumn represents a column with the given name and optional metadata.

source
FunSQL.SQLTableType
SQLTable(; qualifiers = [], name, columns, metadata = nothing)
+           dialect = SQLDialect(:postgresql))
source
FunSQL.SQLColumnType
SQLColumn(; name, metadata = nothing)
+SQLColumn(name; metadata = nothing)

SQLColumn represents a column with the given name and optional metadata.

source
FunSQL.SQLTableType
SQLTable(; qualifiers = [], name, columns, metadata = nothing)
 SQLTable(name; qualifiers = [], columns, metadata = nothing)
 SQLTable(name, columns...; qualifiers = [], metadata = nothing)

The structure of a SQL table or a table-like entity (TEMP TABLE, VIEW, etc.) for use as a reference in assembling SQL queries.

The SQLTable constructor expects the table name, an optional vector containing the table schema and other qualifiers, an ordered dictionary columns that maps names to columns, and optional metadata.

Examples

julia> person = SQLTable(qualifiers = ["public"],
                          name = "person",
@@ -51,7 +51,7 @@
          :person,
          SQLColumn(:person_id),
          SQLColumn(:year_of_birth),
-         metadata = [:is_view => false])
source

SQLDialect

FunSQL.SQLDialectType
SQLDialect(; name = :default, kws...)
 SQLDialect(template::SQLDialect; kws...)
 SQLDialect(name::Symbol, kws...)
 SQLDialect(ConnType::Type)

Properties and capabilities of a particular SQL dialect.

Use SQLDialect(name::Symbol) to create one of the known dialects. The following names are recognized:

  • :duckdb
  • :mysql
  • :postgresql
  • :redshift
  • :spark
  • :sqlite
  • :sqlserver

Keyword parameters override individual properties of a dialect. For details, check the source code.

Use SQLDialect(ConnType::Type) to detect the dialect based on the type of the database connection object. The following types are recognized:

  • DuckDB.DB
  • MySQL.Connection
  • LibPQ.Connection
  • SQLite.DB

Examples

julia> postgresql_dialect = SQLDialect(:postgresql)
@@ -60,7 +60,7 @@
 julia> postgresql_odbc_dialect = SQLDialect(:postgresql,
                                             variable_prefix = '?',
                                             variable_style = :positional)
-SQLDialect(:postgresql, variable_prefix = '?', variable_style = :POSITIONAL)
source

SQLString

FunSQL.SQLStringType
SQLString(raw; columns = nothing, vars = Symbol[])

Serialized SQL query.

Parameter columns is a vector describing the output columns.

Parameter vars is a vector of query parameters (created with Var) in the order they are expected by the DBInterface.execute() function.

Examples

julia> person = SQLTable(:person, columns = [:person_id, :year_of_birth]);
+SQLDialect(:postgresql, variable_prefix = '?', variable_style = :POSITIONAL)
source

SQLString

FunSQL.SQLStringType
SQLString(raw; columns = nothing, vars = Symbol[])

Serialized SQL query.

Parameter columns is a vector describing the output columns.

Parameter vars is a vector of query parameters (created with Var) in the order they are expected by the DBInterface.execute() function.

Examples

julia> person = SQLTable(:person, columns = [:person_id, :year_of_birth]);
 
 julia> q = From(person);
 
@@ -97,7 +97,7 @@
             ("person_1"."year_of_birth" >= $1) AND
             ("person_1"."year_of_birth" < ($1 + 10))""",
           columns = [SQLColumn(:person_id), SQLColumn(:year_of_birth)],
-          vars = [:YEAR])
source
FunSQL.packFunction
pack(sql::SQLString, vars::Union{Dict, NamedTuple})::Vector{Any}

Convert a dictionary or a named tuple of query parameters to the positional form expected by DBInterface.execute().

julia> person = SQLTable(:person, columns = [:person_id, :year_of_birth]);
+          vars = [:YEAR])
source
FunSQL.packFunction
pack(sql::SQLString, vars::Union{Dict, NamedTuple})::Vector{Any}

Convert a dictionary or a named tuple of query parameters to the positional form expected by DBInterface.execute().

julia> person = SQLTable(:person, columns = [:person_id, :year_of_birth]);
 
 julia> q = From(person) |> Where(Fun.and(Get.year_of_birth .>= Var.YEAR,
                                          Get.year_of_birth .< Var.YEAR .+ 10));
@@ -113,7 +113,7 @@
 
 julia> pack(sql, (; YEAR = 1950))
 1-element Vector{Any}:
- 1950
source

SQLNode

Agg

FunSQL.AggMethod
Agg(; over = nothing, name, args = [], filter = nothing)
+ 1950
source

SQLNode

Agg

FunSQL.AggMethod
Agg(; over = nothing, name, args = [], filter = nothing)
 Agg(name; over = nothing, args = [], filter = nothing)
 Agg(name, args...; over = nothing, filter = nothing)
 Agg.name(args...; over = nothing, filter = nothing)

An application of an aggregate function.

An Agg node must be applied to the output of a Group or a Partition node. In a Group context, it is translated to a regular aggregate function, and in a Partition context, it is translated to a window function.
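The two contexts can be compared side by side. The following is a minimal sketch, assuming FunSQL is installed and using an illustrative person table; it applies the same Agg.count() once after Group and once after Partition.

```julia
using FunSQL: SQLTable, From, Group, Partition, Select, Get, Agg, render

# Illustrative table definition (an assumption made for this sketch).
person = SQLTable(:person, columns = [:person_id, :year_of_birth])

# Group context: Agg.count() is rendered as a regular aggregate with GROUP BY.
q_group = From(:person) |>
          Group(Get.year_of_birth) |>
          Select(Get.year_of_birth, Agg.count())

# Partition context: the same Agg.count() is rendered as a window function.
q_window = From(:person) |>
           Partition(Get.year_of_birth) |>
           Select(Get.person_id, Agg.count())

sql_group = render(q_group, tables = [person])
sql_window = render(q_window, tables = [person])
println(sql_group)
println(sql_window)
```

The first query should contain a GROUP BY clause, while the second should contain an OVER (PARTITION BY ...) expression.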

Examples

Number of patients per year of birth.

julia> person = SQLTable(:person, columns = [:person_id, :year_of_birth]);
@@ -174,7 +174,7 @@
   "visit_occurrence_1"."person_id",
   "visit_occurrence_1"."visit_start_date",
   ("visit_occurrence_1"."visit_start_date" - (lag("visit_occurrence_1"."visit_start_date") OVER (PARTITION BY "visit_occurrence_1"."person_id" ORDER BY "visit_occurrence_1"."visit_start_date"))) AS "gap"
-FROM "visit_occurrence" AS "visit_occurrence_1"
source

Append

FunSQL.AppendMethod
Append(; over = nothing, args)
+FROM "visit_occurrence" AS "visit_occurrence_1"
source

Append

FunSQL.AppendMethod
Append(; over = nothing, args)
 Append(args...; over = nothing)

Append concatenates input datasets.

Only the columns that are present in every input dataset will be included in the output of Append.
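As a sketch of this column rule, assuming FunSQL is installed and using two illustrative tables, the query below appends datasets that share only person_id and a derived date column. The table-specific ID and date columns should not appear in the rendered output.

```julia
using FunSQL: SQLTable, From, Define, Append, Get, render

# Illustrative tables (assumptions for this sketch); they share only person_id.
measurement = SQLTable(:measurement,
                       columns = [:measurement_id, :person_id, :measurement_date])
observation = SQLTable(:observation,
                       columns = [:observation_id, :person_id, :observation_date])

# Give both inputs a common "date" column, then concatenate them.
q = From(:measurement) |>
    Define(:date => Get.measurement_date) |>
    Append(From(:observation) |>
           Define(:date => Get.observation_date))

sql = render(q, tables = [measurement, observation])
println(sql)
```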

An Append node is translated to a UNION ALL query:

SELECT ...
 FROM $over
 UNION ALL
@@ -199,7 +199,7 @@
 SELECT
   "observation_1"."person_id",
   "observation_1"."observation_date" AS "date"
-FROM "observation" AS "observation_1"
source

As

FunSQL.AsMethod
As(; over = nothing, name)
+FROM "observation" AS "observation_1"
source

As

FunSQL.AsMethod
As(; over = nothing, name)
 As(name; over = nothing)
 name => over

In a scalar context, As specifies the name of the output column. When applied to tabular data, As wraps the data in a nested record.

The arrow operator (=>) is a shorthand notation for As.
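A small sketch of the equivalence, assuming FunSQL is installed and using an illustrative person table: both spellings below are expected to render to the same SQL.

```julia
using FunSQL: SQLTable, From, Select, Get, As, render

# Illustrative table definition (an assumption made for this sketch).
person = SQLTable(:person, columns = [:person_id, :year_of_birth])

# Explicit As node.
q_as = From(:person) |> Select(Get.person_id |> As(:id))

# Arrow shorthand for the same alias.
q_arrow = From(:person) |> Select(:id => Get.person_id)

sql_as = render(q_as, tables = [person])
sql_arrow = render(q_arrow, tables = [person])
println(sql_as)
```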

Examples

Show all patient IDs.

julia> person = SQLTable(:person, columns = [:person_id, :year_of_birth]);
 
@@ -221,7 +221,7 @@
   "person_1"."person_id",
   "location_1"."state"
 FROM "person" AS "person_1"
-JOIN "location" AS "location_1" ON ("person_1"."location_id" = "location_1"."location_id")
source

Bind

FunSQL.BindMethod
Bind(; over = nothing, args)
+JOIN "location" AS "location_1" ON ("person_1"."location_id" = "location_1"."location_id")
source

Bind

FunSQL.BindMethod
Bind(; over = nothing, args)
 Bind(args...; over = nothing)

The Bind node evaluates a query with parameters. Specifically, Bind provides the values for Var parameters contained in the over node.

In a scalar context, the Bind node is translated to a correlated subquery. When Bind is applied to the joinee branch of a Join node, it is translated to a JOIN LATERAL query.

Examples

Show patients with at least one visit to a healthcare provider.

julia> person = SQLTable(:person, columns = [:person_id]);
 
 julia> visit_occurrence = SQLTable(:visit_occurrence, columns = [:visit_occurrence_id, :person_id]);
@@ -264,7 +264,7 @@
   WHERE ("visit_occurrence_1"."person_id" = "person_1"."person_id")
   ORDER BY "visit_occurrence_1"."visit_start_date" DESC
   FETCH FIRST 1 ROW ONLY
-) AS "visit_1" ON TRUE
source

Define

FunSQL.DefineMethod
Define(; over, args = [], before = nothing, after = nothing)
+) AS "visit_1" ON TRUE
source

Define

FunSQL.DefineMethod
Define(; over, args = [], before = nothing, after = nothing)
 Define(args...; over, before = nothing, after = nothing)

The Define node adds or replaces output columns.

By default, new columns are added at the end of the column list while replaced columns retain their position. Set after = true (after = <column>) to add both new and replaced columns at the end (after a specified column). Alternatively, set before = true (before = <column>) to add both new and replaced columns at the front (before the specified column).
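The placement options can be tried with a sketch like the following. It assumes FunSQL is installed; the person table, the age column, and the reference year 2024 are illustrative choices, not part of the library.

```julia
using FunSQL: SQLTable, From, Define, Get, Fun, render

# Illustrative table definition (an assumption made for this sketch).
person = SQLTable(:person, columns = [:person_id, :year_of_birth])

# Default placement: the new column "age" is added at the end of the column list.
q_default = From(:person) |>
            Define(:age => Fun("-", 2024, Get.year_of_birth))

# before = true: the new column is added at the front instead.
q_front = From(:person) |>
          Define(:age => Fun("-", 2024, Get.year_of_birth), before = true)

sql_default = render(q_default, tables = [person])
sql_front = render(q_front, tables = [person])
println(sql_default)
println(sql_front)
```

Comparing the two printed SELECT lists shows where the age column lands in each case.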

Examples

Show patients who are at least 16 years old.

julia> person = SQLTable(:person, columns = [:person_id, :birth_datetime]);
 
 julia> q = From(:person) |>
@@ -294,7 +294,7 @@
 SELECT
   "person_1"."person_id",
   (CASE WHEN ("person_1"."year_of_birth" >= 1930) THEN "person_1"."year_of_birth" ELSE NULL END) AS "year_of_birth"
-FROM "person" AS "person_1"
source

From

FunSQL.FromMethod
From(; source)
 From(tbl::SQLTable)
 From(name::Symbol)
 From(^)
@@ -374,7 +374,7 @@
 
 julia> print(render(q, dialect = :postgresql))
 SELECT CAST("regexp_matches_1"."captures"[1] AS INTEGER) AS "_"
-FROM regexp_matches('2,3,5,7,11', '(\d+)', 'g') AS "regexp_matches_1" ("captures")
source

Fun

FunSQL.FunMethod
Fun(; name, args = [])
+FROM regexp_matches('2,3,5,7,11', '(\d+)', 'g') AS "regexp_matches_1" ("captures")
source

Fun

FunSQL.FunMethod
Fun(; name, args = [])
 Fun(name; args = [])
 Fun(name, args...)
 Fun.name(args...)

Application of a SQL function or a SQL operator.

A Fun node is also generated by broadcasting on SQLNode objects. Names of Julia operators (==, !=, &&, ||, !) are replaced with their SQL equivalents (=, <>, and, or, not).

If name contains only symbols, or if name starts or ends with a space, the Fun node is translated to a SQL operator.

If name contains one or more ? characters, it serves as a template of a SQL expression where ? symbols are replaced with the given arguments. Use ?? to represent a literal ? mark. Wrap the template in parentheses if this is necessary to make the SQL expression unambiguous.

Certain names have a customized translation in order to generate common SQL functions and operators with irregular syntax:

Fun node                    SQL syntax
Fun.and(p₁, p₂, …)          p₁ AND p₂ AND …
Fun.between(x, y, z)        x BETWEEN y AND z
Fun.case(p, x, …)           CASE WHEN p THEN x … END
Fun.cast(x, "TYPE")         CAST(x AS TYPE)
Fun.concat(s₁, s₂, …)       dialect-specific, e.g., (s₁ || s₂ || …)
Fun.current_date()          CURRENT_DATE
Fun.current_timestamp()     CURRENT_TIMESTAMP
Fun.exists(q)               EXISTS q
Fun.extract("FIELD", x)     EXTRACT(FIELD FROM x)
Fun.in(x, q)                x IN q
Fun.in(x, y₁, y₂, …)        x IN (y₁, y₂, …)
Fun.is_not_null(x)          x IS NOT NULL
Fun.is_null(x)              x IS NULL
Fun.like(x, y)              x LIKE y
Fun.not(p)                  NOT p
Fun.not_between(x, y, z)    x NOT BETWEEN y AND z
Fun.not_exists(q)           NOT EXISTS q
Fun.not_in(x, q)            x NOT IN q
Fun.not_in(x, y₁, y₂, …)    x NOT IN (y₁, y₂, …)
Fun.not_like(x, y)          x NOT LIKE y
Fun.or(p₁, p₂, …)           p₁ OR p₂ OR …
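The three ways of producing a Fun node described above (broadcasting, a customized name, and a ? template) can be sketched as follows. This assumes FunSQL is installed; the person and location tables are illustrative.

```julia
using FunSQL: SQLTable, From, Where, Select, Get, Fun, render

# Illustrative table definitions (assumptions made for this sketch).
person = SQLTable(:person, columns = [:person_id, :year_of_birth])
location = SQLTable(:location, columns = [:location_id, :zip])

# Broadcasting on a SQLNode generates a Fun node for the >= operator.
q_op = From(:person) |> Where(Get.year_of_birth .>= 1950)

# A name with a customized translation (irregular SQL syntax).
q_between = From(:person) |>
            Where(Fun.between(Get.year_of_birth, 1950, 1960))

# A name containing ? characters acts as a template for the arguments.
q_template = From(:location) |>
             Select(Fun("SUBSTRING(? FROM ? FOR ?)", Get.zip, 1, 3))

sql_op = render(q_op, tables = [person])
sql_between = render(q_between, tables = [person])
sql_template = render(q_template, tables = [location])
println(sql_op)
println(sql_between)
println(sql_template)
```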

Examples

Replace missing values with N/A.

julia> location = SQLTable(:location, columns = [:location_id, :city]);
@@ -415,7 +415,7 @@
 
 julia> print(render(q, tables = [location]))
 SELECT SUBSTRING("location_1"."zip" FROM 1 FOR 3) AS "_"
-FROM "location" AS "location_1"
source

Get

FunSQL.GetMethod
Get(; over, name)
 Get(name; over)
 Get.name        Get."name"      Get[name]       Get["name"]
 over.name       over."name"     over[name]      over["name"]
@@ -440,7 +440,7 @@
   "person_1"."person_id",
   "location_1"."state"
 FROM "person" AS "person_1"
-JOIN "location" AS "location_1" ON ("person_1"."location_id" = "location_1"."location_id")
source

Group

FunSQL.GroupMethod
Group(; over, by = [], sets = sets, name = nothing)
+JOIN "location" AS "location_1" ON ("person_1"."location_id" = "location_1"."location_id")
source

Group

FunSQL.GroupMethod
Group(; over, by = [], sets = sets, name = nothing)
 Group(by...; over, sets = sets, name = nothing)

The Group node summarizes the input dataset.

Specifically, Group outputs all unique values of the given grouping key. This key partitions the input rows into disjoint groups that are summarized by aggregate functions Agg applied to the output of Group. The parameter sets specifies the grouping sets, either with grouping mode indicators :cube or :rollup, or explicitly as Vector{Vector{Symbol}}. An optional parameter name specifies the field to hold the group.

The Group node is translated to a SQL query with a GROUP BY clause:

SELECT ...
 FROM $over
 GROUP BY $by...

Examples

Total number of patients.

julia> person = SQLTable(:person, columns = [:person_id, :year_of_birth]);
@@ -491,13 +491,13 @@
 
 julia> print(render(q, tables = [location]))
 SELECT DISTINCT "location_1"."state"
-FROM "location" AS "location_1"
source

Highlight

FunSQL.HighlightMethod
Highlight(; over = nothing, color)
 Highlight(color; over = nothing)

Highlight over with the given color.

The highlighted node is printed with the selected color when the query containing it is displayed.

Available colors can be found in Base.text_colors.

Examples

julia> q = From(:person) |>
            Select(Get.person_id |> Highlight(:bold))
 let q1 = From(:person),
     q2 = q1 |> Select(Get.person_id)
     q2
-end
source

Iterate

FunSQL.IterateMethod
Iterate(; over = nothing, iterator)
 Iterate(iterator; over = nothing)

Iterate generates the concatenated output of an iterated query.

The over query is evaluated first. Then the iterator query is repeatedly applied: to the output of over, then to the output of its previous run, and so on, until the iterator produces no data. All these outputs are concatenated to generate the output of Iterate.

The iterator query may explicitly refer to the output of the previous run using From(^) notation.

The Iterate node is translated to a recursive common table expression:

WITH RECURSIVE iterator AS (
   SELECT ...
   FROM $over
@@ -545,7 +545,7 @@
 SELECT
   "__3"."n",
   "__3"."f"
-FROM "__1" AS "__3"
source

Join

FunSQL.JoinMethod
Join(; over = nothing, joinee, on, left = false, right = false, optional = false)
+FROM "__1" AS "__3"
source

Join

FunSQL.JoinMethod
Join(; over = nothing, joinee, on, left = false, right = false, optional = false)
 Join(joinee; over = nothing, on, left = false, right = false, optional = false)
 Join(joinee, on; over = nothing, left = false, right = false, optional = false)

Join correlates two input datasets.

The Join node is translated to a query with a JOIN clause:

SELECT ...
 FROM $over
@@ -563,7 +563,7 @@
   "person_1"."person_id",
   "location_1"."state"
 FROM "person" AS "person_1"
-JOIN "location" AS "location_1" ON ("person_1"."location_id" = "location_1"."location_id")
source

Limit

FunSQL.LimitMethod
Limit(; over = nothing, offset = nothing, limit = nothing)
+JOIN "location" AS "location_1" ON ("person_1"."location_id" = "location_1"."location_id")
source

Limit

FunSQL.LimitMethod
Limit(; over = nothing, offset = nothing, limit = nothing)
 Limit(limit; over = nothing, offset = nothing)
 Limit(offset, limit; over = nothing)
 Limit(start:stop; over = nothing)

The Limit node skips the first offset rows and then emits the next limit rows.

To make the output deterministic, Limit must be applied directly after an Order node.
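A minimal sketch of the Order-then-Limit pattern, assuming FunSQL is installed and using an illustrative person table. The second query uses the Limit(offset, limit) form from the signatures above; its exact SQL depends on the dialect.

```julia
using FunSQL: SQLTable, From, Order, Limit, Get, render

# Illustrative table definition (an assumption made for this sketch).
person = SQLTable(:person, columns = [:person_id, :year_of_birth])

# Sort first, so that "the first row" is well defined, then keep one row.
q_first = From(:person) |>
          Order(Get.year_of_birth) |>
          Limit(1)

# Skip 10 rows and emit the next 5, again over a sorted input.
q_page = From(:person) |>
         Order(Get.person_id) |>
         Limit(10, 5)

sql_first = render(q_first, tables = [person])
sql_page = render(q_page, tables = [person])
println(sql_first)
println(sql_page)
```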

The Limit node is translated to a query with a LIMIT or a FETCH clause:

SELECT ...
@@ -581,7 +581,7 @@
   "person_1"."year_of_birth"
 FROM "person" AS "person_1"
 ORDER BY "person_1"."year_of_birth"
-FETCH FIRST 1 ROW ONLY
source

Lit

FunSQL.LitMethod
Lit(; val)
 Lit(val)

A SQL literal.

In a context where a SQL node is expected, missing, numbers, strings, and datetime values are automatically converted to SQL literals.

Examples

julia> q = Select(:null => missing,
                   :boolean => true,
                   :integer => 42,
@@ -594,7 +594,7 @@
   TRUE AS "boolean",
   42 AS "integer",
   'SQL is fun!' AS "text",
-  '2000-01-01' AS "date"
source

Order

FunSQL.OrderMethod
Order(; over = nothing, by)
 Order(by...; over = nothing)

Order sorts the input rows by the given key.

The Order node is translated to a query with an ORDER BY clause:

SELECT ...
 FROM $over
 ORDER BY $by...

Specify the sort order with Asc, Desc, or Sort.
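For instance, sort indicators can be attached to individual keys. The sketch below assumes FunSQL is installed and uses an illustrative person table; it sorts by year of birth in descending order with NULLs first and breaks ties by ascending person ID.

```julia
using FunSQL: SQLTable, From, Order, Get, Asc, Desc, render

# Illustrative table definition (an assumption made for this sketch).
person = SQLTable(:person, columns = [:person_id, :year_of_birth])

# Each sort key is piped into its own order indicator.
q = From(:person) |>
    Order(Get.year_of_birth |> Desc(nulls = :first),
          Get.person_id |> Asc())

sql = render(q, tables = [person])
println(sql)
```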

Examples

List patients ordered by their age.

julia> person = SQLTable(:person, columns = [:person_id, :year_of_birth]);
@@ -607,7 +607,7 @@
   "person_1"."person_id",
   "person_1"."year_of_birth"
 FROM "person" AS "person_1"
-ORDER BY "person_1"."year_of_birth"
source

Over

FunSQL.OverMethod
Over(; over = nothing, arg, materialized = nothing)
+ORDER BY "person_1"."year_of_birth"
source

Over

FunSQL.OverMethod
Over(; over = nothing, arg, materialized = nothing)
 Over(arg; over = nothing, materialized = nothing)

base |> Over(arg) is an alias for With(base, over = arg).

Examples

julia> person = SQLTable(:person, columns = [:person_id, :year_of_birth]);
 
 julia> condition_occurrence =
@@ -635,7 +635,7 @@
 WHERE ("person_1"."person_id" IN (
   SELECT "essential_hypertension_2"."person_id"
   FROM "essential_hypertension_1" AS "essential_hypertension_2"
-))
source

Partition

FunSQL.PartitionMethod
Partition(; over, by = [], order_by = [], frame = nothing, name = nothing)
+))
source

Partition

FunSQL.PartitionMethod
Partition(; over, by = [], order_by = [], frame = nothing, name = nothing)
 Partition(by...; over, order_by = [], frame = nothing, name = nothing)

The Partition node relates adjacent rows.

Specifically, Partition specifies how to relate each row to the adjacent rows in the same dataset. The rows are partitioned by the given key and ordered within each partition using order_by key. The parameter frame customizes the extent of related rows. These related rows are summarized by aggregate functions Agg applied to the output of Partition. An optional parameter name specifies the field to hold the partition.

The Partition node is translated to a query with a WINDOW clause:

SELECT ...
 FROM $over
 WINDOW w AS (PARTITION BY $by... ORDER BY $order_by...)

Examples

Enumerate patients' visits.

julia> visit_occurrence =
@@ -673,7 +673,7 @@
   "person_1"."year_of_birth",
   (avg(count(*)) OVER (ORDER BY "person_1"."year_of_birth" RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING)) AS "avg"
 FROM "person" AS "person_1"
-GROUP BY "person_1"."year_of_birth"
source

Select

FunSQL.SelectMethod
Select(; over, args)
 Select(args...; over)

The Select node specifies the output columns.

SELECT $args...
 FROM $over

Set the column labels with As.

Examples

List patient IDs and their age.

julia> person = SQLTable(:person, columns = [:person_id, :birth_datetime]);
 
@@ -685,7 +685,7 @@
 SELECT
   "person_1"."person_id",
   (now() - "person_1"."birth_datetime") AS "age"
-FROM "person" AS "person_1"
source

Sort, Asc, and Desc

FunSQL.AscMethod
Asc(; over = nothing, nulls = nothing)

Ascending order indicator.

source
FunSQL.DescMethod
Desc(; over = nothing, nulls = nothing)

Descending order indicator.

source
FunSQL.SortMethod
Sort(; over = nothing, value, nulls = nothing)
+FROM "person" AS "person_1"
source

Sort, Asc, and Desc

FunSQL.AscMethod
Asc(; over = nothing, nulls = nothing)

Ascending order indicator.

source
FunSQL.DescMethod
Desc(; over = nothing, nulls = nothing)

Descending order indicator.

source
FunSQL.SortMethod
Sort(; over = nothing, value, nulls = nothing)
 Sort(value; over = nothing, nulls = nothing)
 Asc(; over = nothing, nulls = nothing)
 Desc(; over = nothing, nulls = nothing)

Sort order indicator.

Use with Order or Partition nodes.

Examples

List patients ordered by their age.

julia> person = SQLTable(:person, columns = [:person_id, :year_of_birth]);
@@ -698,7 +698,7 @@
   "person_1"."person_id",
   "person_1"."year_of_birth"
 FROM "person" AS "person_1"
-ORDER BY "person_1"."year_of_birth" DESC NULLS FIRST
source

Var

FunSQL.VarMethod
Var(; name)
+ORDER BY "person_1"."year_of_birth" DESC NULLS FIRST
source

Var

FunSQL.VarMethod
Var(; name)
 Var(name)
 Var.name        Var."name"      Var[name]       Var["name"]

A reference to a query parameter.

Specify the value for the parameter with Bind to create a correlated subquery or a lateral join.

Examples

julia> person = SQLTable(:person, columns = [:person_id, :year_of_birth]);
 
@@ -710,7 +710,7 @@
   "person_1"."person_id",
   "person_1"."year_of_birth"
 FROM "person" AS "person_1"
-WHERE ("person_1"."year_of_birth" > :YEAR)
source

Where

FunSQL.WhereMethod
Where(; over = nothing, condition)
+WHERE ("person_1"."year_of_birth" > :YEAR)
source

Where

FunSQL.WhereMethod
Where(; over = nothing, condition)
 Where(condition; over = nothing)

The Where node filters the input rows by the given condition.

Where is translated to a SQL query with a WHERE clause:

SELECT ...
 FROM $over
 WHERE $condition

Examples

julia> person = SQLTable(:person, columns = [:person_id, :year_of_birth]);
@@ -723,7 +723,7 @@
   "person_1"."person_id",
   "person_1"."year_of_birth"
 FROM "person" AS "person_1"
-WHERE ("person_1"."year_of_birth" > 2000)
source

With

FunSQL.WithMethod
With(; over = nothing, args, materialized = nothing)
+WHERE ("person_1"."year_of_birth" > 2000)
source

With

FunSQL.WithMethod
With(; over = nothing, args, materialized = nothing)
 With(args...; over = nothing, materialized = nothing)

With assigns a name to a temporary dataset. The dataset content can be retrieved within the over query using the From node.

With is translated to a common table expression:

WITH $args...
 SELECT ...
 FROM $over

Examples

julia> person = SQLTable(:person, columns = [:person_id, :year_of_birth]);
@@ -753,7 +753,7 @@
 WHERE ("person_1"."person_id" IN (
   SELECT "essential_hypertension_2"."person_id"
   FROM "essential_hypertension_1" AS "essential_hypertension_2"
-))
source

WithExternal

FunSQL.WithExternalMethod
WithExternal(; over = nothing, args, qualifiers = [], handler = nothing)
 WithExternal(args...; over = nothing, qualifiers = [], handler = nothing)

WithExternal assigns a name to a temporary dataset. The dataset content can be retrieved within the over query using the From node.

The definition of the dataset is converted to a Pair{SQLTable, SQLClause} object and sent to handler, which can use it, for instance, to construct a SELECT INTO statement.

Examples

julia> person = SQLTable(:person, columns = [:person_id, :year_of_birth]);
 
 julia> condition_occurrence =
@@ -786,7 +786,7 @@
 WHERE ("person_1"."person_id" IN (
   SELECT "essential_hypertension_1"."person_id"
   FROM "essential_hypertension" AS "essential_hypertension_1"
-))
source

SQLClause

AGG

FunSQL.AGGMethod
AGG(; name, args = [], filter = nothing, over = nothing)
+))
source

SQLClause

AGG

FunSQL.AGGMethod
AGG(; name, args = [], filter = nothing, over = nothing)
 AGG(name; args = [], filter = nothing, over = nothing)
 AGG(name, args...; filter = nothing, over = nothing)

An application of an aggregate function.

Examples

julia> c = AGG(:max, :year_of_birth);
 
@@ -797,19 +797,19 @@
 (count(*) FILTER (WHERE ("year_of_birth" > 1970)))
julia> c = AGG(:row_number, over = PARTITION(:year_of_birth));
 
 julia> print(render(c))
-(row_number() OVER (PARTITION BY "year_of_birth"))
source

AS

FunSQL.ASMethod
AS(; over = nothing, name, columns = nothing)
+(row_number() OVER (PARTITION BY "year_of_birth"))
source

AS

FunSQL.ASMethod
AS(; over = nothing, name, columns = nothing)
 AS(name; over = nothing, columns = nothing)

An AS clause.

Examples

julia> c = ID(:person) |> AS(:p);
 
 julia> print(render(c))
 "person" AS "p"
julia> c = ID(:person) |> AS(:p, columns = [:person_id, :year_of_birth]);
 
 julia> print(render(c))
-"person" AS "p" ("person_id", "year_of_birth")
source

FROM

FunSQL.FROMMethod
FROM(; over = nothing)
+"person" AS "p" ("person_id", "year_of_birth")
source

FROM

FunSQL.FROMMethod
FROM(; over = nothing)
 FROM(over)

A FROM clause.

Examples

julia> c = ID(:person) |> AS(:p) |> FROM() |> SELECT((:p, :person_id));
 
 julia> print(render(c))
 SELECT "p"."person_id"
-FROM "person" AS "p"
source

FUN

FUN

FunSQL.FUNMethod
FUN(; name, args = [])
 FUN(name; args = [])
 FUN(name, args...)

An invocation of a SQL function or a SQL operator.

Examples

julia> c = FUN(:concat, :city, ", ", :state);
 
@@ -820,7 +820,7 @@
 ("city" || ', ' || "state")
julia> c = FUN("SUBSTRING(? FROM ? FOR ?)", :zip, 1, 3);
 
 julia> print(render(c))
-SUBSTRING("zip" FROM 1 FOR 3)
source

GROUP

FunSQL.GROUPMethod
GROUP(; over = nothing, by = [], sets = nothing)
+SUBSTRING("zip" FROM 1 FOR 3)
source

GROUP

FunSQL.GROUPMethod
GROUP(; over = nothing, by = [], sets = nothing)
 GROUP(by...; over = nothing, sets = nothing)

A GROUP BY clause.

Examples

julia> c = FROM(:person) |>
            GROUP(:year_of_birth) |>
            SELECT(:year_of_birth, AGG(:count));
@@ -830,7 +830,7 @@
   "year_of_birth",
   count(*)
 FROM "person"
-GROUP BY "year_of_birth"
source

HAVING

HAVING

FunSQL.HAVINGMethod
HAVING(; over = nothing, condition)
 HAVING(condition; over = nothing)

A HAVING clause.

Examples

julia> c = FROM(:person) |>
            GROUP(:year_of_birth) |>
            HAVING(FUN(">", AGG(:count), 10)) |>
@@ -840,7 +840,7 @@
 SELECT "person_id"
 FROM "person"
 GROUP BY "year_of_birth"
-HAVING (count(*) > 10)
source

ID

FunSQL.IDMethod
ID(; over = nothing, name)
+HAVING (count(*) > 10)
source

ID

FunSQL.IDMethod
ID(; over = nothing, name)
 ID(name; over = nothing)
 ID(qualifiers, name)

A SQL identifier. Specify over or use the |> operator to make a qualified identifier.

Examples

julia> c = ID(:person);
 
@@ -851,7 +851,7 @@
 "p"."birth_datetime"
julia> c = ID([:pg_catalog], :pg_database);
 
 julia> print(render(c))
-"pg_catalog"."pg_database"
source

JOIN

FunSQL.JOINMethod
JOIN(; over = nothing, joinee, on, left = false, right = false, lateral = false)
+"pg_catalog"."pg_database"
source

JOIN

FunSQL.JOINMethod
JOIN(; over = nothing, joinee, on, left = false, right = false, lateral = false)
 JOIN(joinee; over = nothing, on, left = false, right = false, lateral = false)
 JOIN(joinee, on; over = nothing, left = false, right = false, lateral = false)

A JOIN clause.

Examples

julia> c = FROM(:p => :person) |>
            JOIN(:l => :location,
@@ -864,7 +864,7 @@
   "p"."person_id",
   "l"."state"
 FROM "person" AS "p"
-LEFT JOIN "location" AS "l" ON ("p"."location_id" = "l"."location_id")
source

LIMIT

FunSQL.LIMITMethod
LIMIT(; over = nothing, offset = nothing, limit = nothing, with_ties = false)
+LEFT JOIN "location" AS "l" ON ("p"."location_id" = "l"."location_id")
source

LIMIT

FunSQL.LIMITMethod
LIMIT(; over = nothing, offset = nothing, limit = nothing, with_ties = false)
 LIMIT(limit; over = nothing, offset = nothing, with_ties = false)
 LIMIT(offset, limit; over = nothing, with_ties = false)
 LIMIT(start:stop; over = nothing, with_ties = false)

A LIMIT clause.

Examples

julia> c = FROM(:person) |>
@@ -874,21 +874,21 @@
 julia> print(render(c))
 SELECT "person_id"
 FROM "person"
-FETCH FIRST 1 ROW ONLY
source

LIT

LIT

FunSQL.LITMethod
LIT(; val)
 LIT(val)

A SQL literal.

In the context of a SQL clause, missing, numbers, strings and datetime values are automatically converted to SQL literals.

Examples

julia> c = LIT(missing);
 
 julia> print(render(c))
 NULL
julia> c = LIT("SQL is fun!");
 
 julia> print(render(c))
-'SQL is fun!'
source

NOTE

FunSQL.NOTEMethod
NOTE(; over = nothing, text, postfix = false)
+'SQL is fun!'
source

NOTE

FunSQL.NOTEMethod
NOTE(; over = nothing, text, postfix = false)
 NOTE(text; over = nothing, postfix = false)

A free-form prefix or postfix annotation.

Examples

julia> c = FROM(:p => :person) |>
            NOTE("TABLESAMPLE SYSTEM (50)", postfix = true) |>
            SELECT((:p, :person_id));
 
 julia> print(render(c))
 SELECT "p"."person_id"
-FROM "person" AS "p" TABLESAMPLE SYSTEM (50)
source

ORDER

FunSQL.ORDERMethod
ORDER(; over = nothing, by = [])
+FROM "person" AS "p" TABLESAMPLE SYSTEM (50)
source

ORDER

FunSQL.ORDERMethod
ORDER(; over = nothing, by = [])
 ORDER(by...; over = nothing)

An ORDER BY clause.

Examples

julia> c = FROM(:person) |>
            ORDER(:year_of_birth) |>
            SELECT(:person_id);
@@ -896,7 +896,7 @@
 julia> print(render(c))
 SELECT "person_id"
 FROM "person"
-ORDER BY "year_of_birth"
source

PARTITION

FunSQL.PARTITIONMethod
PARTITION(; over = nothing, by = [], order_by = [], frame = nothing)
+ORDER BY "year_of_birth"
source

PARTITION

FunSQL.PARTITIONMethod
PARTITION(; over = nothing, by = [], order_by = [], frame = nothing)
 PARTITION(by...; over = nothing, order_by = [], frame = nothing)

A window definition clause.

Examples

julia> c = FROM(:person) |>
            SELECT(:person_id,
                   AGG(:row_number, over = PARTITION(:year_of_birth)));
@@ -930,7 +930,7 @@
   "year_of_birth",
   (avg(count(*)) OVER (ORDER BY "year_of_birth" RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING))
 FROM "person"
-GROUP BY "year_of_birth"
source

SELECT

FunSQL.SELECTMethod
SELECT(; over = nothing, top = nothing, distinct = false, args)
+GROUP BY "year_of_birth"
source

SELECT

FunSQL.SELECTMethod
SELECT(; over = nothing, top = nothing, distinct = false, args)
 SELECT(args...; over = nothing, top = nothing, distinct = false)

A SELECT clause. Unlike raw SQL, SELECT() should be placed at the end of a clause chain.

Set distinct to true to add a DISTINCT modifier.

Examples

julia> c = SELECT(true, false);
 
 julia> print(render(c))
@@ -941,7 +941,7 @@
 
 julia> print(render(c))
 SELECT DISTINCT "zip"
-FROM "location"
source

SORT, ASC, and DESC

FunSQL.ASCMethod
ASC(; over = nothing, nulls = nothing)

Ascending order indicator.

source
FunSQL.DESCMethod
DESC(; over = nothing, nulls = nothing)

Descending order indicator.

source
FunSQL.SORTMethod
SORT(; over = nothing, value, nulls = nothing)
+FROM "location"
source

SORT, ASC, and DESC

FunSQL.ASCMethod
ASC(; over = nothing, nulls = nothing)

Ascending order indicator.

source
FunSQL.DESCMethod
DESC(; over = nothing, nulls = nothing)

Descending order indicator.

source
FunSQL.SORTMethod
SORT(; over = nothing, value, nulls = nothing)
 SORT(value; over = nothing, nulls = nothing)
 ASC(; over = nothing, nulls = nothing)
 DESC(; over = nothing, nulls = nothing)

Sort order options.

Examples

julia> c = FROM(:person) |>
@@ -951,7 +951,7 @@
 julia> print(render(c))
 SELECT "person_id"
 FROM "person"
-ORDER BY "year_of_birth" DESC
source

UNION

FunSQL.UNIONMethod
UNION(; over = nothing, all = false, args)
+ORDER BY "year_of_birth" DESC
source

UNION

FunSQL.UNIONMethod
UNION(; over = nothing, all = false, args)
 UNION(args...; over = nothing, all = false)

A UNION clause.

Examples

julia> c = FROM(:measurement) |>
            SELECT(:person_id, :date => :measurement_date) |>
            UNION(all = true,
@@ -967,18 +967,18 @@
 SELECT
   "person_id",
   "observation_date" AS "date"
-FROM "observation"
source

VALUES

VALUES

FunSQL.VALUESMethod
VALUES(; rows)
 VALUES(rows)

A VALUES clause.

Examples

julia> c = VALUES([("SQL", 1974), ("Julia", 2012), ("FunSQL", 2021)]);
 
 julia> print(render(c))
 VALUES
   ('SQL', 1974),
   ('Julia', 2012),
-  ('FunSQL', 2021)
source

VAR

VAR

FunSQL.VARMethod
VAR(; name)
 VAR(name)

A placeholder in a parameterized query.

Examples

julia> c = VAR(:year);
 
 julia> print(render(c))
-:year
source

WHERE

WHERE

FunSQL.WHEREMethod
WHERE(; over = nothing, condition)
 WHERE(condition; over = nothing)

A WHERE clause.

Examples

julia> c = FROM(:location) |>
            WHERE(FUN("=", :zip, "60614")) |>
            SELECT(:location_id);
@@ -986,7 +986,7 @@
 julia> print(render(c))
 SELECT "location_id"
 FROM "location"
-WHERE ("zip" = '60614')
source

WINDOW

WINDOW

FunSQL.WINDOWMethod
WINDOW(; over = nothing, args)
 WINDOW(args...; over = nothing)

A WINDOW clause.

Examples

julia> c = FROM(:person) |>
            WINDOW(:w1 => PARTITION(:year_of_birth),
                   :w2 => :w1 |> PARTITION(order_by = [:month_of_birth, :day_of_birth])) |>
@@ -999,7 +999,7 @@
 FROM "person"
 WINDOW
   "w1" AS (PARTITION BY "year_of_birth"),
-  "w2" AS ("w1" ORDER BY "month_of_birth", "day_of_birth")
source

WITH

FunSQL.WITHMethod
WITH(; over = nothing, recursive = false, args)
+  "w2" AS ("w1" ORDER BY "month_of_birth", "day_of_birth")
source

WITH

FunSQL.WITHMethod
WITH(; over = nothing, recursive = false, args)
 WITH(args...; over = nothing, recursive = false)

A WITH clause.

Examples

julia> c = FROM(:person) |>
            WHERE(FUN(:in, :person_id,
                           FROM(:essential_hypertension) |>
@@ -1056,4 +1056,4 @@
   WHERE ("cr"."relationship_id" = 'Subsumes')
 )
 SELECT *
-FROM "essential_hypertension"
source
+FROM "essential_hypertension"
source
diff --git a/dev/test/clauses/index.html b/dev/test/clauses/index.html index 5edb9d8c..cebc8f24 100644 --- a/dev/test/clauses/index.html +++ b/dev/test/clauses/index.html @@ -1224,4 +1224,4 @@ #=> SELECT * FROM "condition_occurrence" -=# +=# diff --git a/dev/test/index.html b/dev/test/index.html index 3bfe038b..b180a5c2 100644 --- a/dev/test/index.html +++ b/dev/test/index.html @@ -1,2 +1,2 @@ -Test Suite · FunSQL.jl

Test Suite

+Test Suite · FunSQL.jl

Test Suite

diff --git a/dev/test/nodes/index.html b/dev/test/nodes/index.html index ec4139b2..6eb7ef16 100644 --- a/dev/test/nodes/index.html +++ b/dev/test/nodes/index.html @@ -3359,4 +3359,4 @@ │ ) AS "visit_group_1" ON ("person_2"."person_id" = "visit_group_1"."person_id")""", │ columns = [SQLColumn(:person_id), SQLColumn(:max_visit_start_date)]) └ @ FunSQL … -=# +=# diff --git a/dev/test/other/index.html b/dev/test/other/index.html index fdd18678..9f8566c2 100644 --- a/dev/test/other/index.html +++ b/dev/test/other/index.html @@ -273,4 +273,4 @@ pack(sql, Dict("YEAR" => 1950)) #-> Any[1950]

pack can also be applied to a regular string, in which case it returns the parameters unchanged.

pack("SELECT * FROM person WHERE year_of_birth >= ?", (1950,))
-#-> (1950,)
+#-> (1950,) diff --git a/dev/two-kinds-of-sql-query-builders/index.html b/dev/two-kinds-of-sql-query-builders/index.html index 8ba806f9..8e2fcc9b 100644 --- a/dev/two-kinds-of-sql-query-builders/index.html +++ b/dev/two-kinds-of-sql-query-builders/index.html @@ -41,4 +41,4 @@ having orderby limit -end

Individual slots of this structure are populated by the corresponding pipeline nodes.

"Where" node acting on the syntax tree

This explains why the pipeline is insensitive to the order of the nodes. Indeed, as long as the content of the slots stays the same, it makes no difference in what order the slots are populated.

Pipeline is insensitive to the order of the nodes

This method of incrementally constructing a composite structure is known as the builder pattern. We can call the query builders that employ this pattern syntax-oriented.

Both data-oriented and syntax-oriented query builders are compositional: the difference is in the nature of the information processed by the units of composition. Data-oriented query builders incrementally refine the query output; syntax-oriented query builders incrementally assemble the SQL syntax tree. Their interfaces look almost identical, but their methods of operation are fundamentally different.

But which one is better? Syntax-oriented query builders have two definite advantages: they are easy to implement and they can support the full range of SQL features. Indeed, the interface of a syntax-oriented query builder is just a collection of builders for the SQL syntax tree. The completeness of the syntax tree representation determines how well various SQL features are supported.

On the other hand, syntax-oriented query builders are harder to use. As they directly represent the SQL grammar, they inherit all of its deficiencies. In particular, the rigid clause order makes it difficult to assemble complex data processing pipelines, especially when the arrangement of pipeline nodes is not predetermined.

A data-oriented query builder directly represents data processing nodes, which makes assembling data processing pipelines much more straightforward—as long as we can find the necessary nodes among those offered by the builder. But where does the builder get its collection of data processing nodes? And how can we tell if this collection is complete?

One way to implement a data-oriented query builder is to adapt a general-purpose query framework. Indeed, this is the origin of EF/LINQ, which is adapted from LINQ, and dbplyr, which is adapted from dplyr. The query framework determines what processing nodes are available and how they operate. In principle, any query framework could be adapted to SQL databases by introducing just one new node, a node that loads the content of a database table. If we place this node at the beginning of a pipeline and make the rest of it out of regular nodes, we obtain a pipeline that processes data from a SQL database. However, this pipeline will be very inefficient compared to a SQL engine, which can use indexes to avoid loading the entire table into memory and thus can process the same data much faster. This is why EF/LINQ and dbplyr generate a SQL query that replaces the pipeline as a whole. The pipeline itself no longer runs directly, but now serves as a specification, with the assumption that if it were to run, it would produce the same output as the SQL query. This method of transforming a general-purpose query framework to a SQL query builder is called SQL pushdown.

However, SQL pushdown has a serious limitation. A general-purpose query framework is not designed with SQL compatibility in mind. For this reason, some of the pipelines assembled within this framework cannot be converted to SQL. Even worse, many useful SQL queries have no equivalent pipelines and thus cannot be generated using SQL pushdown. Indeed, SQL has accumulated a wide range of features and capabilities since it first appeared in 1974. The first revision of the SQL standard, SQL-86, already supported Cartesian products, filtering, grouping, aggregation, and correlated subqueries. The next revision, SQL-92, added many join types and introduced query nesting. SQL:1999 greatly expanded its analytical capabilities by adding two types of queries: recursive queries, for processing hierarchical data, and data cube queries, which generalize histograms, cross-tabulations, roll-ups, drill-downs, and sub-totals. The follow-up revision, SQL:2003, added support for aggregate functions over a running window. Admittedly, SQL is a quintessential enterprise abomination, a hodgepodge of features added to support every imaginable use case, but with inadequate syntax, weird gaps in functionality, and no regard for internal consistency. Nevertheless, the breadth of SQL's capabilities has not been matched by any other query framework, including LINQ or dplyr. So when we generate SQL queries using EF/LINQ or dbplyr, a large subset of these capabilities remains inaccessible.

FunSQL is a data-oriented query builder created specifically to expose the full expressive power of SQL. Unlike EF/LINQ and dbplyr, FunSQL was not adapted from an existing query framework, but was carefully designed from scratch to match SQL's capabilities. These capabilities include, for example, support for correlated subqueries and lateral joins (with the Bind node), aggregate and window functions (using the Group and Partition nodes), as well as recursive queries (with the Iterate node). This comprehensive support for SQL capabilities makes FunSQL the only SQL query builder suitable for assembling complex data processing pipelines. Moreover, even though FunSQL pipelines cannot be run directly, every FunSQL node has well-defined data processing semantics, which means that, in principle, FunSQL could be developed into a full-blown query framework. This potentially opens a path for replacing SQL with an equally powerful but more coherent and expressive query language.

+end

Individual slots of this structure are populated by the corresponding pipeline nodes.

"Where" node acting on the syntax tree

This explains why the pipeline is insensitive to the order of the nodes. Indeed, as long as the content of the slots stays the same, it makes no difference in what order the slots are populated.

Pipeline is insensitive to the order of the nodes

This method of incrementally constructing a composite structure is known as the builder pattern. We can call the query builders that employ this pattern syntax-oriented.
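
To make the slot-filling idea concrete, here is a minimal sketch of a syntax-oriented builder in plain Julia. It is not how FunSQL or any particular library is implemented; the names `SelectQuery`, `From`, `Where`, `OrderBy`, and `render_sql` are hypothetical and exist only for this illustration. Each node copies the structure with one slot updated, so two pipelines that fill the same slots in a different order render the same SQL.

```julia
# Hypothetical syntax tree for a SELECT statement: one field per slot.
struct SelectQuery
    from::Union{Nothing, String}
    conditions::Vector{String}
    orderby::Vector{String}
end

# An empty query with every slot unpopulated.
SelectQuery() = SelectQuery(nothing, String[], String[])

# Each "node" returns a function that copies the query with one slot updated.
From(table) = q -> SelectQuery(table, q.conditions, q.orderby)
Where(cond) = q -> SelectQuery(q.from, [q.conditions; cond], q.orderby)
OrderBy(col) = q -> SelectQuery(q.from, q.conditions, [q.orderby; col])

# Serialization reads the slots in the fixed order required by SQL grammar.
function render_sql(q::SelectQuery)
    sql = "SELECT * FROM " * q.from
    isempty(q.conditions) || (sql *= " WHERE " * join(q.conditions, " AND "))
    isempty(q.orderby) || (sql *= " ORDER BY " * join(q.orderby, ", "))
    return sql
end

# The same three nodes, applied in two different orders.
a = SelectQuery() |> From("person") |> Where("year_of_birth > 1970") |> OrderBy("person_id")
b = SelectQuery() |> OrderBy("person_id") |> Where("year_of_birth > 1970") |> From("person")

println(render_sql(a))
println(render_sql(a) == render_sql(b))
```

Because `render_sql` only looks at the final content of the slots, the order in which the nodes were applied leaves no trace in the output.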

Both data-oriented and syntax-oriented query builders are compositional: the difference is in the nature of the information processed by the units of composition. Data-oriented query builders incrementally refine the query output; syntax-oriented query builders incrementally assemble the SQL syntax tree. Their interfaces look almost identical, but their methods of operation are fundamentally different.

But which one is better? Syntax-oriented query builders have two definite advantages: they are easy to implement and they can support the full range of SQL features. Indeed, the interface of a syntax-oriented query builder is just a collection of builders for the SQL syntax tree. The completeness of the syntax tree representation determines how well various SQL features are supported.

On the other hand, syntax-oriented query builders are harder to use. As they directly represent the SQL grammar, they inherit all of its deficiencies. In particular, the rigid clause order makes it difficult to assemble complex data processing pipelines, especially when the arrangement of pipeline nodes is not predetermined.

A data-oriented query builder directly represents data processing nodes, which makes assembling data processing pipelines much more straightforward—as long as we can find the necessary nodes among those offered by the builder. But where does the builder get its collection of data processing nodes? And how can we tell if this collection is complete?

One way to implement a data-oriented query builder is to adapt a general-purpose query framework. Indeed, this is the origin of EF/LINQ, which is adapted from LINQ, and dbplyr, which is adapted from dplyr. The query framework determines what processing nodes are available and how they operate. In principle, any query framework could be adapted to SQL databases by introducing just one new node, a node that loads the content of a database table. If we place this node at the beginning of a pipeline and make the rest of it out of regular nodes, we obtain a pipeline that processes data from a SQL database. However, this pipeline will be very inefficient compared to a SQL engine, which can use indexes to avoid loading the entire table into memory and thus can process the same data much faster. This is why EF/LINQ and dbplyr generate a SQL query that replaces the pipeline as a whole. The pipeline itself no longer runs directly, but now serves as a specification, with the assumption that if it were to run, it would produce the same output as the SQL query. This method of transforming a general-purpose query framework to a SQL query builder is called SQL pushdown.
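
The difference between running a pipeline directly and pushing it down to SQL can be sketched in plain Julia. This is only a toy illustration under stated assumptions: the `Table` and `AtLeast` node types, the `run_pipeline` and `pushdown` functions, and the in-memory `db` dictionary are all hypothetical, and neither EF/LINQ nor dbplyr is claimed to work this way internally.

```julia
# Hypothetical pipeline nodes: a table loader and a simple filter.
struct Table
    name::Symbol
end

struct AtLeast
    column::Symbol
    minimum::Int
end

# Direct execution: load every row of the table, then filter in memory.
run_pipeline(t::Table, f::AtLeast, db) =
    filter(row -> row[f.column] >= f.minimum, db[t.name])

# Pushdown: the same two nodes are translated into a single SQL query,
# so the pipeline acts as a specification rather than being executed.
pushdown(t::Table, f::AtLeast) =
    "SELECT * FROM \"$(t.name)\" WHERE \"$(f.column)\" >= $(f.minimum)"

# A stand-in for database content, used only by the direct execution path.
db = Dict(:person => [(person_id = 1, year_of_birth = 1940),
                      (person_id = 2, year_of_birth = 1975)])

in_memory = run_pipeline(Table(:person), AtLeast(:year_of_birth, 1950), db)
sql = pushdown(Table(:person), AtLeast(:year_of_birth, 1950))

println(in_memory)
println(sql)
```

The in-memory path has to touch every row of the table, while the generated query leaves the choice of access path, such as an index lookup, to the SQL engine.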

However, SQL pushdown has a serious limitation. A general-purpose query framework is not designed with SQL compatibility in mind. For this reason, some of the pipelines assembled within this framework cannot be converted to SQL. Even worse, many useful SQL queries have no equivalent pipelines and thus cannot be generated using SQL pushdown. Indeed, SQL has accumulated a wide range of features and capabilities since it first appeared in 1974. The first revision of the SQL standard, SQL-86, already supported Cartesian products, filtering, grouping, aggregation, and correlated subqueries. The next revision, SQL-92, added many join types and introduced query nesting. SQL:1999 greatly expanded its analytical capabilities by adding two types of queries: recursive queries, for processing hierarchical data, and data cube queries, which generalize histograms, cross-tabulations, roll-ups, drill-downs, and sub-totals. The follow-up revision, SQL:2003, added support for aggregate functions over a running window. Admittedly, SQL is a quintessential enterprise abomination, a hodgepodge of features added to support every imaginable use case, but with inadequate syntax, weird gaps in functionality, and no regard for internal consistency. Nevertheless, the breadth of SQL's capabilities has not been matched by any other query framework, including LINQ or dplyr. So when we generate SQL queries using EF/LINQ or dbplyr, a large subset of these capabilities remains inaccessible.

FunSQL is a data-oriented query builder created specifically to expose the full expressive power of SQL. Unlike EF/LINQ and dbplyr, FunSQL was not adapted from an existing query framework, but was carefully designed from scratch to match SQL's capabilities. These capabilities include, for example, support for correlated subqueries and lateral joins (with the Bind node), aggregate and window functions (using the Group and Partition nodes), as well as recursive queries (with the Iterate node). This comprehensive support for SQL capabilities makes FunSQL the only SQL query builder suitable for assembling complex data processing pipelines. Moreover, even though FunSQL pipelines cannot be run directly, every FunSQL node has well-defined data processing semantics, which means that, in principle, FunSQL could be developed into a full-blown query framework. This potentially opens a path for replacing SQL with an equally powerful but more coherent and expressive query language.