Expect_Column_Values_To_Match_Regex Test is Failing with an argument error #274
Comments
Hi @brian-custer! Doesn't look like |
I've done that and it is still failing with the error I gave you. Any ideas how we can coax the test into working? |
Sorry, not sure I'm following. What exactly have you already done? |
I stated what I had done in the issue. I have configured the test as shown in the issue and it errors out with the error regarding too many arguments for the regexp_instr function. I've inspected the compiled test and sure enough it is putting two 1's in addition to the column expression and the regex. Databricks is throwing the exception.
Thanks,
Brian Custer
|
Right, your issue is that the dbt-sparkutils package does nothing to help you run dbt-expectations on Databricks, since it doesn't implement any shims for it. Unless you or someone else adds Spark support for regexp_instr to dbt-sparkutils, you're going to continue getting this error. Your other option is to implement the shim locally in your project. |
That's not the impression I got when I installed it in my project. It explicitly said to install the spark_utils package, which would shim the expectations package. It did not say anything about coding this myself. I've had good luck running other expectations tests, so I'm surprised that this one fails.
Thanks,
Brian Custer
|
We actually removed the reference to spark-utils in the README when we deprecated support for dbt-utils back in Nov '22 (#217 https://github.com/calogica/dbt-expectations/pull/217/files#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5L44) |
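Hi y'all
I had the same issue with the dbt_expectations.expect_column_values_to_match_regex test on Databricks. As @clausherther mentioned, the problem seems to be that the Databricks regexp_instr function only accepts two arguments, whereas the default macro passes in four.
As a quick fix, I added the following macro in my project:
-- myproject/macros/databricks__regexp_instr.sql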
{% macro databricks__regexp_instr(source_value, regexp, position, occurrence, is_raw, flags) %}
-- Put your Databricks-compatible regexp_instr call here
-- This is just an example; you'll need to modify it based on your needs and if your regexp is raw or not
-- https://docs.databricks.com/en/sql/language-manual/functions/regexp_instr.html
-- https://docs.databricks.com/en/sql/language-manual/data-types/string-type.html
regexp_instr({{ source_value }}, '{{ regexp }}')
{% endmacro %} |
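The macro above always emits a plain string literal, so the is_raw argument has no effect. A minimal sketch of a variant that honors it follows. It assumes your Databricks runtime supports the r'...' raw-literal prefix described on the string-type page linked above, and it still ignores position, occurrence, and flags, since the two-argument regexp_instr has no place for them. Treat it as an alternative to the macro above (define only one of them), not as the implementation the package maintainers would ship.

```sql
-- myproject/macros/databricks__regexp_instr.sql (hypothetical path, same as above)
{% macro databricks__regexp_instr(source_value, regexp, position, occurrence, is_raw, flags) %}
{# Databricks regexp_instr takes (str, regexp) only; position, occurrence and flags are dropped here. #}
{% if is_raw %}
{# Assumption: r'...' raw literals are available, so backslashes in the pattern are not treated as escapes. #}
regexp_instr({{ source_value }}, r'{{ regexp }}')
{% else %}
regexp_instr({{ source_value }}, '{{ regexp }}')
{% endif %}
{% endmacro %}
```

For the pattern in this issue, which contains no backslashes, both branches should compile to an equivalent call; the distinction only matters for patterns such as \d{2}.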
Thanks for the info. I'll do that and see if I can get it to work.
Thanks,
Brian Custer
|
FYI, support for Spark in dbt-date was released today, and I'm working on Spark support for dbt-expectations. See https://getdbt.slack.com/archives/CU4MRJ7QB/p1692723790034329. |
If anyone has experience with Regex parsing in dbt-spark, I'd appreciate the assist here: https://getdbt.slack.com/archives/CNGCW8HKL/p1692733472369839 |
Thanks, good to know. I'll keep an eye out for an update.
Thanks,
Brian Custer
|
Is this a new bug in dbt-expectations?
- I believe this is a new bug
I have defined the test using the following code in my models yaml file:
regex: "^[0-9]{2}/[0-9]{2}$"
is_raw: true
row_condition: "CreditCardExpirationDate is not null"
Expected Behavior
I expect the test to work.
Steps To Reproduce
Configure your test like the above in a models yaml file.
Relevant log output
The log output is: Error executing test: regexp_instr requires 2 arguments but 4 were given.
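The mismatch behind this error can be checked outside dbt with a literal value. This is only a sketch: the sample value '12/25' is made up for illustration, and the four-argument call is reconstructed from the description in the comments (column expression, regex, plus two 1s), not copied from the actual compiled test.

```sql
-- Two-argument form, which Databricks accepts; a result greater than 0 means the value matches.
select regexp_instr('12/25', '^[0-9]{2}/[0-9]{2}$') as match_position;

-- Four-argument form (position and occurrence appended), as described in the comments.
-- This is the call shape that triggers the argument-count error on Databricks,
-- so it is left commented out:
-- select regexp_instr('12/25', '^[0-9]{2}/[0-9]{2}$', 1, 1) as match_position;
```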
Environment
The environment is VS Code and dbt Core.
Which database adapter are you using with dbt?
dbt-databricks 1.5.5
Note: dbt-expectations currently does not support database adapters other than the ones listed below.
Additional Context
I am using the dbt-sparkutils shim to compensate for the fact that dbt-expectations doesn't run on Databricks.