Add support for BigQuery table and view options #1061

iffyio · 2023-12-03T07:40:33Z

Extends the parser with BigQuery support for:

- Table/View level options
- Column level options
- Table creation configurations `CLUSTER BY`, `PARTITION BY`

coveralls · 2023-12-03T07:42:44Z

Pull Request Test Coverage Report for Build 7584334675

0 of 0 changed or added relevant lines in 0 files are covered.
No unchanged relevant lines lost coverage.
Overall coverage decreased (-0.3%) to 87.607%

Totals
Change from base Build 7575556197:	-0.3%
Covered Lines:	19121
Relevant Lines:	21826

💛 - Coveralls

tobyhede · 2023-12-09T23:18:32Z

src/ast/mod.rs

@@ -2713,6 +2752,25 @@ impl fmt::Display for Statement {
                if let Some(order_by) = order_by {
                    write!(f, " ORDER BY ({})", display_comma_separated(order_by))?;
                }
+                if let Some(bigquery_config) = big_query_config {


The names bigquery_config and big_query_config are different; is that intentional?

Ah right that wasn't intentional, updated!

tobyhede · 2023-12-09T23:22:08Z

src/ast/helpers/stmt_create_table.rs

@@ -72,6 +72,7 @@ pub struct CreateTableBuilder {
    pub on_commit: Option<OnCommit>,
    pub on_cluster: Option<String>,
    pub order_by: Option<Vec<Ident>>,
+    pub big_query_config: Option<Box<BigQueryCreateTableConfiguration>>,


Are there other dialects that will have similar variations on Create Table?
If yes, might be worth pulling this up to a more generic type - something like CreateTableConfiguration with variants for different dialects.

Might be a premature abstraction.

I think that makes sense to do! I've updated to use an enum as a result. There are a couple of fields on the struct, like the hive_* ones, that could use that format, so it seems reasonable to introduce.
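To make the enum idea from this exchange concrete, here is a minimal sketch of a dialect-tagged configuration type. It is not the sqlparser definition: the variant shapes, the plain `String` fields, and the `Hive` variant are assumptions chosen only to show how one enum could carry per-dialect CREATE TABLE clauses and render them.

```rust
use std::fmt;

/// Hypothetical, simplified stand-in for a dialect-specific
/// CREATE TABLE configuration (not the actual sqlparser type).
#[derive(Debug, Clone, PartialEq)]
pub enum CreateTableConfiguration {
    /// BigQuery: optional PARTITION BY expression and CLUSTER BY columns.
    BigQuery {
        partition_by: Option<String>,
        cluster_by: Option<Vec<String>>,
    },
    /// Hive-style partition columns (assumed shape, for illustration only).
    Hive { partitioned_by: Vec<String> },
}

impl fmt::Display for CreateTableConfiguration {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CreateTableConfiguration::BigQuery {
                partition_by,
                cluster_by,
            } => {
                // Collect only the clauses that are present, in a fixed order.
                let mut parts = Vec::new();
                if let Some(expr) = partition_by {
                    parts.push(format!("PARTITION BY {}", expr));
                }
                if let Some(cols) = cluster_by {
                    parts.push(format!("CLUSTER BY {}", cols.join(", ")));
                }
                write!(f, "{}", parts.join(" "))
            }
            CreateTableConfiguration::Hive { partitioned_by } => {
                write!(f, "PARTITIONED BY ({})", partitioned_by.join(", "))
            }
        }
    }
}
```

Each variant owns its own rendering order, which is the main appeal of this shape; the cost is an extra layer in the AST that does not correspond to any keyword in the SQL text.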

alamb

Thank you for the contribution @iffyio, and I apologize for the late review (I typically batch the reviews of sqlparser PRs prior to release to maximize efficiency).

I left some suggestions, please let me know what you think

alamb · 2023-12-20T21:27:40Z

src/ast/ddl.rs

@@ -527,6 +528,29 @@ impl fmt::Display for ColumnDef {
    }
 }

+/// Column definition for a view.


Could you please add some examples to this doc comment showing what is allowed (and not)?

Updated the description
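As an illustration of the kind of example such a doc comment might carry, the sketch below models a view column as a name plus an optional option list, with no data type. The types are simplified and hypothetical (plain strings instead of the parser's identifier and expression types), so treat it as a reading aid rather than the code in this PR.

```rust
use std::fmt;

/// Hypothetical key/value option, e.g. `description = "field age"`.
#[derive(Debug, Clone, PartialEq)]
pub struct SqlOption {
    pub name: String,
    /// Already-rendered SQL text of the option value (simplification).
    pub value: String,
}

/// Simplified sketch of a view column definition.
///
/// Allowed in this model:   `age OPTIONS(description = "field age")`, `name`
/// Not representable here:  `age INT64` (a view column has no data type)
#[derive(Debug, Clone, PartialEq)]
pub struct ViewColumnDef {
    pub name: String,
    pub options: Option<Vec<SqlOption>>,
}

impl fmt::Display for ViewColumnDef {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.name)?;
        // The OPTIONS clause is emitted only when an option list is present.
        if let Some(options) = &self.options {
            let rendered: Vec<String> = options
                .iter()
                .map(|o| format!("{} = {}", o.name, o.value))
                .collect();
            write!(f, " OPTIONS({})", rendered.join(", "))?;
        }
        Ok(())
    }
}
```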

alamb · 2023-12-20T21:28:37Z

src/ast/ddl.rs

+    /// ```
+    /// [1]: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#view_column_option_list
+    /// [2]: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#column_option_list
+    SqlOptions(Vec<SqlOption>),


Given that the SQL keyword is OPTIONS, what would you think about keeping the AST name similar?

Suggested change

-    SqlOptions(Vec<SqlOption>),
+    Options(Vec<SqlOption>),

alamb · 2023-12-20T21:35:49Z

src/ast/mod.rs

+#[derive(Debug, Clone, PartialEq, PartialOrd, Eq, Ord, Hash)]
+#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
+#[cfg_attr(feature = "visitor", derive(Visit, VisitMut))]
+pub enum CreateTableConfiguration {


I wonder if you considered inlining these into Statement::CreateTable rather than adding a new structure?

Like

    /// CREATE TABLE
    CreateTable {
        or_replace: bool,
        temporary: bool,
        external: bool,
        /// A partition expression for the table.
        /// <https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#partition_expression>
        partition_by: Option<Expr>,
        /// Table clustering column list.
        /// <https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#table_option_list>
        cluster_by: Option<Vec<Ident>>,
        /// Table options list.
        /// <https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#table_option_list>
        options: Option<Vec<SqlOption>>,
        ...

I think that might be more consistent with other dialect-specific fields, as well as order_by and strict.

Having an extra struct here may also be confusing, as it doesn't seem to mirror the structure of the SQL. For example, it seems to imply that there is some sort of "CONFIG" keyword / clause that appears in the SQL, whereas https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#create_table_statement seems to imply that PARTITION BY, CLUSTER BY, and OPTIONS are clauses of the CREATE TABLE statement itself.

Ah yes, I figured to have them inline initially; agree that makes it less confusing. The only issue was with serializing back to SQL text: since the fields in BigQuery's case have a defined order (e.g. PARTITION BY before CLUSTER BY), the inlined fields might end up not being reusable by other dialects if they show up at different parts of the CREATE statement. I'm not sure that's worth optimizing for though, so happy to just inline them if that's preferable.

I think an inlined version would be better to maintain consistency. If someone needs a different order when serializing for another dialect, we can perhaps add that as a follow-on PR.

If for some reason we need to keep the separate struct, I think that would be acceptable if there were some doc comments explaining the rationale (as that is not easy to understand from the code).

That sounds good! Will make updates to this
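The inlined layout agreed on above can be sketched as follows. This is a simplified, hypothetical node (string fields instead of the parser's `Expr`, `Ident`, and `SqlOption` types) whose `Display` impl hard-codes the BigQuery clause order mentioned in the thread: PARTITION BY, then CLUSTER BY, then OPTIONS.

```rust
use std::fmt;

/// Simplified, hypothetical CREATE TABLE node with the BigQuery clauses
/// stored as plain optional fields rather than in a separate config struct.
#[derive(Debug, Clone, PartialEq)]
pub struct CreateTable {
    pub name: String,
    /// Column definitions as pre-rendered SQL text (simplification).
    pub columns: Vec<String>,
    pub partition_by: Option<String>,
    pub cluster_by: Option<Vec<String>>,
    /// (name, value) pairs for the OPTIONS clause.
    pub options: Option<Vec<(String, String)>>,
}

impl fmt::Display for CreateTable {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "CREATE TABLE {} ({})", self.name, self.columns.join(", "))?;
        // The clause order is fixed here; another dialect that places these
        // clauses elsewhere would need its own branch or a follow-up change.
        if let Some(expr) = &self.partition_by {
            write!(f, " PARTITION BY {}", expr)?;
        }
        if let Some(cols) = &self.cluster_by {
            write!(f, " CLUSTER BY {}", cols.join(", "))?;
        }
        if let Some(opts) = &self.options {
            let rendered: Vec<String> = opts
                .iter()
                .map(|(k, v)| format!("{} = {}", k, v))
                .collect();
            write!(f, " OPTIONS({})", rendered.join(", "))?;
        }
        Ok(())
    }
}
```

The trade-off raised in the discussion is visible in the `fmt` body: the fields themselves are dialect-neutral, but the serialization order is decided in one place.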

iffyio · 2024-01-19T13:27:42Z

@alamb sorry for the delay in following up on this. I've addressed the comments and it should be good for another pass when you get the chance!

alamb

Thank you @iffyio -- this is looking great.

Also thanks again @tobyhede for your reviews

alamb · 2024-01-23T22:20:58Z

tests/sqlparser_bigquery.rs

+                    Some(vec![
+                        SqlOption {
+                            name: Ident::new("partition_expiration_days"),
+                            value: Expr::Value(number("1")),


👍 for testing that the value came through as an Expr
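To show why a typed option value is worth asserting on, here is a toy sketch in which an option value is classified into an expression variant instead of being kept as raw text. The `Expr` variants, the `parse_option` helper, and its classification rules are all hypothetical and far simpler than a real SQL expression parser.

```rust
/// Hypothetical, minimal expression type for option values.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Number(String),
    StringLiteral(String),
    Identifier(String),
}

/// Hypothetical option whose value is a typed expression, not a raw string.
#[derive(Debug, Clone, PartialEq)]
pub struct SqlOption {
    pub name: String,
    pub value: Expr,
}

/// Toy parser for `name = value` text; returns None when there is no `=`.
pub fn parse_option(input: &str) -> Option<SqlOption> {
    let (name, raw) = input.split_once('=')?;
    let raw = raw.trim();
    let value = if raw.len() >= 2 && raw.starts_with('"') && raw.ends_with('"') {
        // Strip the surrounding double quotes.
        Expr::StringLiteral(raw[1..raw.len() - 1].to_string())
    } else if raw.parse::<f64>().is_ok() {
        // Keep the original digits so no precision is lost.
        Expr::Number(raw.to_string())
    } else {
        Expr::Identifier(raw.to_string())
    };
    Some(SqlOption {
        name: name.trim().to_string(),
        value,
    })
}
```

With a typed value, a test can distinguish `partition_expiration_days = 1` (a number) from `partition_expiration_days = "1"` (a string), which a plain string field would not reveal.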

Add support for BigQuery table and view options

b374f0e

format option labels without leading space

3b2dec1

also replace custom trimming with concat

iffyio force-pushed the bigquery-create-table-view-options branch from a345ec1 to 3b2dec1 on December 7, 2023 08:06

tobyhede reviewed Dec 9, 2023

use enum to represent table config

f7a1c56

alamb mentioned this pull request Dec 19, 2023

DataFusion weekly project plan (Andrew Lamb) - Dec 18, 2023 apache/datafusion#8577

Closed

7 tasks

alamb reviewed Dec 20, 2023

iffyio added 2 commits January 19, 2024 14:12

Merge remote-tracking branch 'upstream/main' into tmp

a894a74

Inline table options, add coment to ViewColumnDef

87642c8

This was referenced Jan 19, 2024

DataFusion weekly project plan (Andrew Lamb) - Jan 15, 2024 apache/datafusion#8864

Closed

DataFusion weekly project plan (Andrew Lamb) - Jan 22, 2024 apache/datafusion#8933

Closed

alamb approved these changes Jan 23, 2024

alamb merged commit 3a6d3ec into apache:main Jan 23, 2024
10 checks passed

iffyio deleted the bigquery-create-table-view-options branch July 16, 2024 11:37
