-
Notifications
You must be signed in to change notification settings - Fork 323
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Column-level lexically-scoped CTE expressions #10826
Conversation
This reverts commit b7be792.
|
||
rename self (name : Text) (binder : Text) = | ||
binders = self.binders_ref.get | ||
if binders.contains_key binder then binders.get binder else |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick as that's probably not a bottleneck, but what if we binders.get binder if_missing=Nothing
? We could avoid accessing the hashmap twice 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks really really great. I really appreciate the comprehensive documentation.
We should remember to run Snowflake tests before integrating.
new (existing_tables : Vector Text) -> Let_Binder_Renamer = | ||
Let_Binder_Renamer.Value (Hashset.from_vector existing_tables) (Ref.new Dictionary.empty) (Ref.new 0) | ||
|
||
rename self (name : Text) (binder : Text) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the consensus is all methods should have ##
comment, probably with PRIVATE
here.
Also I'd appreciate a short note on the logic here - I assume binder
is the unique identifier and when we first encounter it, we try to find a shorter name based on name
for it, and if we encounter the same identifier again we return the already assigned name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly. I added docs.
Just curious -- if the type is PRIVATE, do the methods have to be PRIVATE too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tbh I don't know, hence I add PRIVATE just to be sure.
I think they are not private, so if you somehow got a value of this private type, they probably would show up in CB on it. But that seems harmless as long as they do not show up in CB search 'in general' (so e.g. in the default mode where you have no value selected).
I think we just had this convention to mark methods as PRIVATE unless meant for users, but it was from time before we got private
types so it's a question of how we want to amend that convention for that case.
Also, I'm not sure how this plays with @jdunkerley's code linter/formatter. I assume it may 'warn' about these non-PRIVATE functions, but again I'm not sure.
@@ -212,6 +212,11 @@ type DB_Column | |||
## PRIVATE | |||
Column-level manual CTE factoring. | |||
|
|||
Arguments: | |||
- name: A prefix to use for the generated let binder. | |||
- callback: A functiont that receives the let reference and returns a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- callback: A functiont that receives the let reference and returns a | |
- callback: A function that receives the let reference and returns a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -707,15 +717,23 @@ default_fetch_types_query dialect expression context where_filter_always_false_l | |||
|
|||
## PRIVATE | |||
Helper class for shortening the binder names generated for WITH clauses. | |||
|
|||
Each time `rename` is called on an identifier it hasn't seen before, it | |||
generates a new, short identifier. The short identifier consits of the `name` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
generates a new, short identifier. The short identifier consits of the `name` | |
generates a new, short identifier. The short identifier consists of the `name` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, thanks for the comprehensive docs
_ : Float -> x.is_finite | ||
_ -> False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_ : Float -> x.is_finite | |
_ -> False | |
_ : Float -> x.is_finite | |
_ : Number -> True | |
_ : Decimal -> True | |
_ -> False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually changing to:
_ : Float -> x.is_finite
_ -> True
Because we're already using expect_numeric
, and only Float can be infinite.
group_builder.teardown <| | ||
connection_data.teardown | ||
|
||
if connection_data.connection.dialect.supports_nested_with_clause then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@radeusgd I'm using the dialect for this. It could also be a test flag, but then we would have this flag in both the test config and the dialect, which seems redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 that seems very reasonable
Snowflake CI run successful. |
Pull Request Description
This implements
DB_Column.with
, which usesWITH ... AS
SQL clauses to remove duplicates in the generated SQL.After a discussion with @radeusgd, we concluded that we will probably want a more complete CTE implementation, so this one is useful for now to deal with big queries (like
round
).Important Notes
Still to do in this PR:
with
tolet
(or something similar)State
hack by moving query generation into a class and using aRef
field for scopingResults on
round_float
:Compare the SQL:
without-ctes.sql.txt
with-ctes.sql.txt
Update, with name shortening:
without-cte.txt
with-cte.txt
Checklist
Please ensure that the following checklist has been satisfied before submitting the PR:
Scala,
Java,
TypeScript,
and
Rust
style guides. In case you are using a language not listed above, follow the Rust style guide.