-
Notifications
You must be signed in to change notification settings - Fork 569
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for BigQuery table and view options #1061
Add support for BigQuery table and view options #1061
Conversation
Extends the parser with BigQuery support for - Table/View level options - Column level options - Table creation configurations `CLUSTER BY`, `PARTITION BY`
Pull Request Test Coverage Report for Build 7584334675
💛 - Coveralls |
also replace custom trimming with concat
a345ec1
to
3b2dec1
Compare
src/ast/mod.rs
Outdated
@@ -2713,6 +2752,25 @@ impl fmt::Display for Statement { | |||
if let Some(order_by) = order_by { | |||
write!(f, " ORDER BY ({})", display_comma_separated(order_by))?; | |||
} | |||
if let Some(bigquery_config) = big_query_config { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The names bigquery_config
and big_query_config
are different, is that intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right that wasn't intentional, updated!
src/ast/helpers/stmt_create_table.rs
Outdated
@@ -72,6 +72,7 @@ pub struct CreateTableBuilder { | |||
pub on_commit: Option<OnCommit>, | |||
pub on_cluster: Option<String>, | |||
pub order_by: Option<Vec<Ident>>, | |||
pub big_query_config: Option<Box<BigQueryCreateTableConfiguration>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there other dialects that will have similar variations on Create Table?
If yes, might be worth pulling this up to a more generic type - something like CreateTableConfiguration
with variants for different dialects.
Might be a premature abstraction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that makes sense to do! I've updated to use an enum as a result. There are a couple fields on the struct like the hive_* ones that could use that format so it seems reasonable to introduce yeah
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the contribution @iffyio and I apologize for the late reivew (I typically batch the reviews of sqlparser PRs prior to release to maximize efficiency)
I left some suggestions, please let me know what you think
src/ast/ddl.rs
Outdated
@@ -527,6 +528,29 @@ impl fmt::Display for ColumnDef { | |||
} | |||
} | |||
|
|||
/// Column definition for a view. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add some examples to this doc comment showing what is allowed (and not)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the description
src/ast/ddl.rs
Outdated
/// ``` | ||
/// [1]: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#view_column_option_list | ||
/// [2]: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#column_option_list | ||
SqlOptions(Vec<SqlOption>), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the SQL is OPTION
s what would you think about keeping the AST similar?
SqlOptions(Vec<SqlOption>), | |
Options(Vec<SqlOption>), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
src/ast/mod.rs
Outdated
#[derive(Debug, Clone, PartialEq, PartialOrd, Eq, Ord, Hash)] | ||
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))] | ||
#[cfg_attr(feature = "visitor", derive(Visit, VisitMut))] | ||
pub enum CreateTableConfiguration { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if you considered inlining these into Statement::CreateTable
rather than adding a new structure?
Like
/// CREATE TABLE
CreateTable {
or_replace: bool,
temporary: bool,
external: bool,
/// A partition expression for the table.
/// <https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#partition_expression>
partition_by: Option<Expr>,
/// Table clustering column list.
/// <https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#table_option_list>
cluster_by: Option<Vec<Ident>>,
/// Table options list.
/// <https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#table_option_list>
options: Option<Vec<SqlOption>>,
...
I think that might be more consistent with other dialect specific fields as well as order_by
and strict
.
Having an extra struct here also may be confusing as it doesn't seem to mirror the structure of the SQL -- for example, it seems to imply to me that there is some sort of "CONFIG" keyword / clause that appears in the SQL when https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#create_table_statement seems to imply that PARTITION BY
CLUSTER BY
and OPTIONS
are clauses of the CREATE TABLE
statement itself
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes I figured to have them inline initially, agree that makes it less confusing. The only issue was with serializing back to SQL text - since the fields in bigquery's case have a defined order (e.g PARTITION BY
before CLUSTER BY
), the inlined fields might end up not being reusable by other dialects if they show up at different parts of the CREATE statement. I'm not so sure if that's worth optimizing for though, so happy to just inline them if that's preferrable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think an inlined version would be better to maintain consistency. If someone needs a different order when serializing for another dialect we can add that as a follow on PR perhaps.
If for some reason we need to keep the separate struct I think that would be acceptable if there were some doc comments explaining the rational (as that is not easy to understand from the code).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds good! Will make updates to this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inlined!
@alamb sorry for the delay in following up on this - I've addressed the comments and it should be good for another pass when you get the chance to! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some(vec![ | ||
SqlOption { | ||
name: Ident::new("partition_expiration_days"), | ||
value: Expr::Value(number("1")), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 for testing that the value came through as an Expr
Extends the parser with BigQuery support for
CLUSTER BY
,PARTITION BY