-
Notifications
You must be signed in to change notification settings - Fork 43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug] --empty doesn't work with relation that are already named and might conflict #124
Comments
Yes, some do (e.g. Postgres) but some don't (e.g. Snowflake). This could be an attribute of the dialect's adapter, e.g. For those that do not required, aliasing shouldn't be used at all. Then the problem is solved arbitrarily, no other fixes required. For those that do require aliases, that is where it gets trickier. One of the issues I see with something like, So, based on that, and based on the fact that most (all?) OLAP systems (i.e. what typical users are using dbt with) don't require aliasing subqueries, I actually think there is a strong argument for not aliasing these subqueries at all: OLAP people don't need aliases; Postgres people do but the code isn't made simpler by forcing it into the Jinja instead of the SQL. (In that case, you also wouldn't even need a |
Just want to add +1 as we're having the same issue (we're using dbt-bigquery). |
+1 on this, using dbt-snowflake |
Working on a PR for this, approach is going to be to make the aliasing configurable for each adapter so we can keep it for postgres or mysql but drop for platforms that don't need it (i.e. bigquery / snowflake). This should at least fix this case for those platforms. Worth noting the workaround, in the meantime, is to move the ref into a CTE.
|
Users may still be using aliases for tables in Postgres, e.g. select a.*
from {{ ref("base_table") }} a
left join {{ ref("foreign_entity") }} b
on a.foreign_id = b.id The above is valid Postgres SQL, after rendering. So it seems like adding aliases can still end up breaking workflows for some Postgres users, unless I'm missing something. |
You're not missing something and we still need to come up with a better long term solution here (other than asking people to write their SQL differently if they want to use |
I guess my point is, for SQL for dialects requiring aliased subqueries, the path to making all queries working arbitrarily with
In both cases, SQL rewrites are required (one set of users has to make changes), and between the two I think people would rather write Furthermore, unless someone can chime in with expertise on the Postgres query planner, I'd not rule out the possibility that replacing |
Ah, it seems like it does not ever impact the query planner to replace One more note on Postgres, specifically, for what it's worth, since apparently my information is a little out of date: Postgres ≤13 requires aliases, but ≥14 does not. |
@dwreeves It may be true that most users have those preferences but changing this behavior in dbt-postgres would amount to a breaking change for the set of users for whom this is working as opposed to preserving the present behavior where it's always been broken for this case. Ultimately we need to find a better long term approach for supporting |
closing, tracking the long term fix here: #199 |
For posterity: aliases became optional in v16, not v14! See: postgres/postgres@bcedd8f |
<!-- Describe the contents of the PR briefly but completely. If you write detailed commit messages, it is acceptable to copy/paste them here, or write "see commit messages for details." If there is only one commit in the PR, GitHub will have already added its commit message above. --> ### Motivation As per this bug [here](dbt-labs/dbt-adapters#124), the `--empty` flag doesn't work with relation that are already named and fails. A quick repro: * Create a simple model with: `SELECT * FROM {{ ref('my_upstream_model')}} some_alias` * The generated SQL when using the `--empty` flag, looks like this: ```sql as ( SELECT * FROM ( select * from "my_upstream_model" where false limit 0) _dbt_limit_subq_my_upstream_model alias ); ``` <!-- Which of the following best describes the motivation behind this PR? * This PR fixes a recognized bug. [Ensure issue is linked somewhere.] * This PR adds a known-desirable feature. [Ensure issue is linked somewhere.] * This PR fixes a previously unreported bug. [Describe the bug in detail, as if you were filing a bug report.] * This PR adds a feature that has not yet been specified. [Write a brief specification for the feature, including justification for its inclusion in Materialize, as if you were writing the original feature specification.] * This PR refactors existing code. [Describe what was wrong with the existing code, if it is not obvious.] --> ### Tips for reviewer <!-- Leave some tips for your reviewer, like: * The diff is much smaller if viewed with whitespace hidden. * [Some function/module/file] deserves extra attention. * [Some function/module/file] is pure code movement and only needs a skim. Delete this section if no tips. --> ### Checklist - [ ] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [ ] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [ ] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](MaterializeInc/cloud#5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [ ] This PR includes the following [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note): - <!-- Add release notes here or explicitly state that there are no user-facing behavior changes. -->
Thanks! Seems I have poor reading comprehension, or confused myself clicking through the docs versions 😞 |
Hello everyone. I use the adapter dbt-vertica |
Is this a new bug?
Current Behavior
This a follow up from my comment on the original PR.
If any is using code like
FROM {{ source('schema', 'table') }} my_table_alias
while using the --empty alias.The resulting SQL is the following:
FROM (select * from `db`.`schema`.`table` where false limit 0) _dbt_limit_subq my_table_alias
The problem is linked to the alias that's already set up here:
dbt-adapters/dbt/adapters/base/relation.py
Lines 213 to 214 in 2a99e36
which prevents from aliasing by ourselves. Also if there are multiple references, I guess the tables will all be aliased to
_dbt_limit_subq
resulting in an error.Expected Behavior
The simplest expected would be that it results in
FROM (select * from `db`.`schema`.`table` where false limit 0) my_table_alias
(practically that no table name is suffixed automatically). However if the user didn't name its table, it will fail... which is another problem. Yet we could look into alternatives:Steps To Reproduce
my_model.sql
with an SQL like:SELECT * FROM {{ source('schema', 'table') }} my_table_alias
dbt run -s my_model --empty
Relevant log output
No response
Environment
Additional Context
No response
The text was updated successfully, but these errors were encountered: