Is your feature request related to a problem?
Originally discussed here.
When compiling expressions that use .distinct() to SQL, it seems like ibis always generates a subquery selection. E.g., an ibis program that calls .distinct() on a selection compiles to a query with a nested subquery, but ideally we'd get a single flat SELECT DISTINCT statement.
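For illustration only, here is a minimal sketch of the two query shapes being compared. The table t, its columns, and both SQL strings are assumptions made up for this example; they are not output copied from ibis. The sketch uses Python's built-in sqlite3 module to check that a SELECT DISTINCT * over a subquery and a flat SELECT DISTINCT return the same rows:

```python
import sqlite3

# Hypothetical table and data, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b TEXT, c REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "x", 0.5), (1, "x", 1.5), (2, "y", 2.5), (2, "y", 3.5), (3, "z", 4.5)],
)

# Assumed shape of the harder-to-read form: DISTINCT wrapped around a subquery.
nested_sql = """
SELECT DISTINCT *
FROM (
    SELECT t0.a, t0.b
    FROM t AS t0
) AS t1
"""

# Assumed shape of the requested form: one flat SELECT DISTINCT.
flat_sql = """
SELECT DISTINCT t0.a, t0.b
FROM t AS t0
"""

# Sort both results so the comparison does not depend on row order.
nested_rows = sorted(conn.execute(nested_sql).fetchall())
flat_rows = sorted(conn.execute(flat_sql).fetchall())

print(nested_rows == flat_rows)
```

Both statements describe the same result set, so the difference is about readability of the generated SQL rather than semantics.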
What is the motivation behind your request?
Analysts say the SQL generated by ibis is sometimes difficult to read, which can make it difficult to debug ibis logic and, importantly, also makes it difficult to communicate about what ibis is doing to audiences that primarily understand SQL.
This example with distinct() is one that came up frequently in my discussions. So, the goal of this feature is primarily to make the SQL generated by ibis more legible to humans. It could also improve performance for some backends, though I imagine most would optimize this away when generating their internal execution plan.
Describe the solution you'd like
per @cpcloud here: "This would probably have to be an optimization in our expression rewriting system."
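To make the idea of a rewrite rule concrete, here is a toy sketch in plain Python. It does not use ibis or reflect its internals: the Table, Select, and Distinct classes, the merge_distinct function, and the to_sql helper are all hypothetical names invented for this example. The rule folds a Distinct node sitting directly on top of a Select into a single Select flagged as distinct, which is safe in this toy because Select only projects columns.

```python
from dataclasses import dataclass, replace
from typing import Tuple, Union


# Hypothetical expression nodes; these are NOT ibis classes.
@dataclass(frozen=True)
class Table:
    name: str


@dataclass(frozen=True)
class Select:
    parent: "Node"
    columns: Tuple[str, ...]
    distinct: bool = False


@dataclass(frozen=True)
class Distinct:
    parent: "Node"


Node = Union[Table, Select, Distinct]


def merge_distinct(node: Node) -> Node:
    """Rewrite Distinct(Select(...)) into Select(..., distinct=True), bottom-up."""
    if isinstance(node, Table):
        return node
    parent = merge_distinct(node.parent)
    if isinstance(node, Distinct):
        if isinstance(parent, Select):
            # Fold the DISTINCT into the projection below it.
            return replace(parent, distinct=True)
        return Distinct(parent)
    return replace(node, parent=parent)


def to_sql(node: Node) -> str:
    """Render the toy tree as SQL, nesting a subquery for each non-table parent."""
    if isinstance(node, Table):
        return f"SELECT * FROM {node.name}"
    if isinstance(node.parent, Table):
        source = node.parent.name
    else:
        source = f"({to_sql(node.parent)})"
    if isinstance(node, Distinct):
        return f"SELECT DISTINCT * FROM {source}"
    keyword = "SELECT DISTINCT" if node.distinct else "SELECT"
    return f"{keyword} {', '.join(node.columns)} FROM {source}"


expr = Distinct(Select(Table("t"), ("a", "b")))
print(to_sql(expr))                  # nested form
print(to_sql(merge_distinct(expr)))  # flat form after the rewrite
```

A real rewrite would need to check that the node below the distinct is one it can be merged into without changing results; this sketch sidesteps that by only modeling plain column projections.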
Alternatively, if simplifying distinct expressions generically is too complex, allowing a .select() expression to take a distinct=True argument and directly generating a SELECT DISTINCT would also be an acceptable solution.
What version of ibis are you running?
9.3.0
What backend(s) are you using, if any?
this problem is backend-independent
Code of Conduct