You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Although all values are distinct in that model column (i.e. all column values are "the most common"), any value in value_set different from 'b' (the first result naturally ordered) will fail.
Relevant log output
with value_counts as (
select
col_string_b as value_field,
count(*) as value_count
fromdbt_expectations_integration_tests.data_testgroup by col_string_b
),
value_counts_ranked as (
select*,
row_number() over(order by value_count desc) as value_count_rank
from
value_counts
),
value_count_top_n as (
select
value_field
from
value_counts_ranked
where
value_count_rank =1
),
set_values as (
select'ab'as value_field
),
unique_set_values as (
select distinct value_field
from
set_values
),
validation_errors as (
-- values from the model that are not in the setselect
value_field
from
value_count_top_n
where
value_field not in (select value_field from unique_set_values)
)
select*from validation_errors
Note: dbt-expectations currently does not support database adapters other than the ones listed below.
Postgres
Snowflake
BigQuery
Additional Context
Great expectations has an additional ties_okay argument for partial matches. We could also add support for that.
So following tests would succeed
# Expect error if not all most common values are in the set
- dbt_expectations.expect_column_most_common_value_to_be_in_set:
value_set: ['b']top_n: 1config:
error_if: "=0"warn_if: "<3"# Expect success if not all most common values are in the set but ties_okay is set
- dbt_expectations.expect_column_most_common_value_to_be_in_set:
value_set: ['b']top_n: 1ties_okay: true# Expect error if none of the most common values are in the set and ties_okay is set
- dbt_expectations.expect_column_most_common_value_to_be_in_set:
value_set: ['invalid_value']top_n: 1ties_okay: trueconfig:
error_if: "=0"warn_if: "<4"# Expect success if not all most common values are in the set but ties_okay is set# and the set contains extra values
- dbt_expectations.expect_column_most_common_value_to_be_in_set:
value_set: ['b', 'invalid_value']top_n: 1ties_okay: true# Expect success if not all most common values are in the set but ties_okay is set# and value is not first one of the column naturally ordered
- dbt_expectations.expect_column_most_common_value_to_be_in_set:
value_set: ['ab']top_n: 1ties_okay: true# Expect success if all most common values are in the set
- dbt_expectations.expect_column_most_common_value_to_be_in_set:
value_set: ['b', 'ab', 'abc', 'abcd']top_n: 1# Expect success if all most common values are in the set # and the set contains extra values
- dbt_expectations.expect_column_most_common_value_to_be_in_set:
value_set: ['b', 'ab', 'abc', 'abcd', 'invalid_value']top_n: 1# Expect error if none of the most common values are in the set # and the set contains extra values
- dbt_expectations.expect_column_most_common_value_to_be_in_set:
value_set: ['invalid_value1', 'invalid_value2', 'invalid_value3', 'invalid_value4', 'invalid_value5']top_n: 1config:
error_if: "=0"warn_if: "<4"
The text was updated successfully, but these errors were encountered:
Is this a new bug in dbt-expectations?
Current Behavior
When most common values are non-unique, running expect_column_most_common_value_to_be_in_set only works for 1 of the most common values.
Expected Behavior
expect_column_most_common_value_to_be_in_set should match all most common values
Steps To Reproduce
in dbt-expectations/integration_tests/models/schema_tests/schema.yml, add following column test for model data_test column col_string_b.:
Although all values are distinct in that model column (i.e. all column values are "the most common"), any value in value_set different from 'b' (the first result naturally ordered) will fail.
Relevant log output
Environment
Which database adapter are you using with dbt?
Postgres
Note: dbt-expectations currently does not support database adapters other than the ones listed below.
Additional Context
Great expectations has an additional ties_okay argument for partial matches. We could also add support for that.
So following tests would succeed
The text was updated successfully, but these errors were encountered: