Feature/add union data #132

Open: wants to merge 18 commits into base: main

10 changes: 10 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,13 @@
# dbt_zendesk v0.14.0

## 🎉 Feature Update 🎉
- This release supports running the package on multiple Zendesk sources at once! See the [README](https://github.com/fivetran/dbt_zendesk?tab=readme-ov-file#step-3-define-database-and-schema-variables) for details on how to leverage this feature ([PR #132](https://github.com/fivetran/dbt_zendesk/pull/132)).

> Please note: This is a **🚨Breaking Change🚨** in that we have added a new field, `source_relation`, that points to the source connector from which the record originated. This field addition will require a `dbt run --full-refresh`.
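
One practical consequence of the new field for downstream queries is sketched here. This is an illustration rather than package code: it assumes two of the package's end models, `zendesk__ticket_metrics` and `zendesk__ticket_enriched`, assumes the selected columns exist, and assumes ticket IDs may repeat across connectors, so the join matches on `source_relation` in the same way the joins changed in this PR do.

```sql
-- Illustrative only: the model choice and selected columns are assumptions.
select
    metrics.ticket_id,
    metrics.source_relation,
    enriched.status
from {{ ref('zendesk__ticket_metrics') }} as metrics
left join {{ ref('zendesk__ticket_enriched') }} as enriched
    on metrics.ticket_id = enriched.ticket_id
    -- without this condition, tickets sharing an ID across connectors could match each other
    and metrics.source_relation = enriched.source_relation
```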

## 🐞 Bug Fixes 🐞
- Previous versions of this package determined the [default ticket schedule](https://github.com/fivetran/dbt_zendesk/blob/v0.13.1/models/intermediate/int_zendesk__ticket_schedules.sql#L19-L55) without considering whether schedules have been deleted or not. We now only consider non-deleted schedules in our default ticket schedule logic ([PR #132](https://github.com/fivetran/dbt_zendesk/pull/132)).
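
A minimal sketch of the kind of filter this fix implies, not the package's actual SQL. It assumes the schedule staging model is `stg_zendesk__schedule` and that deletions are flagged by a boolean `_fivetran_deleted` column; both names are assumptions for illustration.

```sql
-- Sketch only: the model and column names are assumptions, not taken from this PR.
select
    schedule_id,
    source_relation,
    created_at
from {{ ref('stg_zendesk__schedule') }}
-- treat a missing flag as "not deleted" so rows without the flag are kept
where not coalesce(_fivetran_deleted, false)
```

Only the rows that survive this filter would then be candidates for the default ticket schedule.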

# dbt_zendesk v0.13.1

[PR #128](https://github.com/fivetran/dbt_zendesk/pull/128) includes the following changes:
55 changes: 52 additions & 3 deletions README.md
@@ -59,21 +59,68 @@ Include the following zendesk package version in your `packages.yml` file:
```yml
packages:
- package: fivetran/zendesk
version: [">=0.13.0", "<0.14.0"]
version: [">=0.14.0", "<0.15.0"]

```
> **Note**: Do not include the Zendesk source package. The Zendesk transform package already has a dependency on the source in its own `packages.yml` file.

## Step 3: Define database and schema variables
### Option 1: Single connector 💃
By default, this package runs using your destination and the `zendesk` schema. If this is not where your zendesk data is (for example, if your zendesk schema is named `zendesk_fivetran`), update the following variables in your root `dbt_project.yml` file accordingly:

```yml
vars:
  zendesk_database: your_destination_name
  zendesk_schema: your_schema_name
```
> **Note**: If you are running the package on one source connector, each model will have a `source_relation` column that is just an empty string.

### Option 2: Union multiple connectors 👯
If you have multiple Zendesk connectors in Fivetran and would like to use this package on all of them simultaneously, we have provided functionality to do so. The package will union all of the data together and pass the unioned table into the transformations. You will be able to see which source it came from in the `source_relation` column of each model. To use this functionality, you will need to set either the `zendesk_union_schemas` OR `zendesk_union_databases` variables (cannot do both, though a more flexible approach is in the works...) in your root `dbt_project.yml` file:

```yml
# dbt_project.yml

vars:
  zendesk_union_schemas: ['zendesk_usa','zendesk_canada'] # use this if the data is in different schemas/datasets of the same database/project
  zendesk_union_databases: ['zendesk_usa','zendesk_canada'] # use this if the data is in different databases/projects but uses the same schema name
```
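
Conceptually, unioning two schemas amounts to something like the query below. This is a rough sketch and not the output of the package's `union_data` macro: the schema names are taken from the example above, the `ticket` table stands in for any source table, and the literal `source_relation` values are assumed (the real values may be formatted differently).

```sql
-- Rough equivalent of unioning two connectors; not the macro's actual output.
-- The source_relation values below are assumptions.
select
    'zendesk_usa' as source_relation,
    ticket.*
from zendesk_usa.ticket as ticket

union all

select
    'zendesk_canada' as source_relation,
    ticket.*
from zendesk_canada.ticket as ticket
```

This assumes both tables share the same columns in the same order. Every downstream model carries `source_relation` forward so that records from different connectors stay distinguishable.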

#### Recommended: Incorporate unioned sources into DAG
By default, this package defines one single-connector source, called `zendesk`, which will be disabled if you are unioning multiple connectors. This means that your DAG will not include your Zendesk sources, though the package will run successfully.

To properly incorporate all of your Zendesk connectors into your project's DAG:
1. Define each of your sources in a `.yml` file in your project. Utilize the following template for the `source`-level configurations, and, **most importantly**, copy and paste the table and column-level definitions from the package's `src_zendesk.yml` [file](https://github.com/fivetran/dbt_zendesk_source/blob/main/models/src_zendesk.yml#L15-L351).

```yml
# a .yml file in your root project
sources:
  - name: <name> # ex: zendesk_usa
    schema: <schema_name> # one of var('zendesk_union_schemas') if unioning schemas, otherwise just 'zendesk'
    database: <database_name> # one of var('zendesk_union_databases') if unioning databases, otherwise whatever DB your zendesk schemas all live in
    loader: fivetran
    loaded_at_field: _fivetran_synced

    freshness: # feel free to adjust to your liking
      warn_after: {count: 72, period: hour}
      error_after: {count: 168, period: hour}

    tables: # copy and paste from zendesk_source/models/src_zendesk.yml
```

> **Note**: If there are source tables you do not have (see [Step 4](https://github.com/fivetran/dbt_zendesk?tab=readme-ov-file#step-4-disable-models-for-non-existent-sources)), you may still include them, as long as you have set the right variables to `False`. Otherwise, you may remove them from your source definitions.

2. Set the `has_defined_sources` variable (scoped to the `zendesk_source` package) to `True`, as follows:
```yml
# dbt_project.yml
vars:
  zendesk_source:
    has_defined_sources: true
```

## Step 4: Disable models for non-existent sources
> _This step is unnecessary (but still available for use) if you are unioning multiple connectors together in the previous step. That is, the `union_data` macro we use will create completely empty staging models for sources that are not found in any of your Zendesk schemas/databases. However, you can still leverage the below variables if you would like to avoid this behavior._
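
As a rough picture of what a "completely empty staging model" can look like, the sketch below selects typed null columns and returns zero rows. It is an assumption-laden illustration: the column names and types are placeholders, and the SQL that `union_data` actually generates may differ.

```sql
-- Sketch of a zero-row relation; the columns are placeholders, not the macro's real output.
select
    cast(null as {{ dbt.type_int() }}) as schedule_id,
    cast(null as {{ dbt.type_string() }}) as source_relation
limit 0
```

A model built this way still exists for downstream references to resolve against, but contributes no records.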

This package takes into consideration that not every Zendesk account utilizes the `schedule`, `schedule_holiday`, `ticket_schedule`, `daylight_time`, `time_zone`, `domain_name`, `user_tag`, `organization_tag`, or `ticket_form_history` features, and allows you to disable the corresponding functionality. By default, all variables' values are assumed to be `true`. Add variables for only the tables you want to disable:
```yml
vars:
@@ -85,6 +132,7 @@ vars:
```

## (Optional) Step 5: Additional configurations
<details open><summary>Expand/collapse configurations</summary>

### Adding passthrough columns
This package includes all source columns defined in the staging models. However, the `stg_zendesk__ticket` model allows for additional columns to be added using a pass-through column variable. This is extremely useful if you'd like to include custom fields in the package.
@@ -164,7 +212,7 @@ models:
+schema: my_new_schema_name # leave blank for just the target_schema
```

### Change the source table references
### Change the source table references (only if using a single connector)
If an individual source table has a different name than the package expects, add the table name as it appears in your destination to the respective variable:

> IMPORTANT: See this project's [`dbt_project.yml`](https://github.com/fivetran/dbt_zendesk_source/blob/main/dbt_project.yml) variable declarations to see the expected names.
@@ -173,6 +221,7 @@ If an individual source table has a different name than the package expects, add
vars:
  zendesk_<default_source_table_name>_identifier: your_table_name
```
</details>

## (Optional) Step 6: Orchestrate your models with Fivetran Transformations for dbt Core™
<details><summary>Expand for details</summary>
@@ -188,7 +237,7 @@ This dbt package is dependent on the following dbt packages. Please be aware tha
```yml
packages:
- package: fivetran/zendesk_source
version: [">=0.10.0", "<0.11.0"]
version: [">=0.11.0", "<0.12.0"]

- package: fivetran/fivetran_utils
version: [">=0.4.0", "<0.5.0"]
3 changes: 1 addition & 2 deletions dbt_project.yml
@@ -1,6 +1,5 @@
name: 'zendesk'
version: '0.13.1'

version: '0.14.0'

config-version: 2
require-dbt-version: [">=1.3.0", "<2.0.0"]
2 changes: 1 addition & 1 deletion integration_tests/dbt_project.yml
@@ -1,7 +1,7 @@
config-version: 2

name: 'zendesk_integration_tests'
version: '0.13.1'
version: '0.14.0'

profile: 'integration_tests'

Original file line number Diff line number Diff line change
@@ -19,6 +19,7 @@ with ticket_historical_status as (

select
ticket_historical_status.ticket_id,
ticket_historical_status.source_relation,
ticket_historical_status.status as ticket_status,
ticket_schedules.schedule_id,

@@ -33,13 +34,15 @@ with ticket_historical_status as (
from ticket_historical_status
left join ticket_schedules
on ticket_historical_status.ticket_id = ticket_schedules.ticket_id
and ticket_historical_status.source_relation = ticket_schedules.source_relation
-- making sure there is indeed real overlap
where {{ dbt.datediff('greatest(valid_starting_at, schedule_created_at)', 'least(valid_ending_at, schedule_invalidated_at)', 'second') }} > 0

), ticket_full_solved_time as (

select
ticket_id,
source_relation,
ticket_status,
schedule_id,
status_schedule_start,
@@ -59,7 +62,7 @@ with ticket_historical_status as (
{{ dbt_date.week_start('ticket_status_crossed_with_schedule.status_schedule_start','UTC') }} as start_week_date

from ticket_status_crossed_with_schedule
{{ dbt_utils.group_by(n=7) }}
{{ dbt_utils.group_by(n=8) }}

), weeks as (

@@ -90,6 +93,7 @@ with ticket_historical_status as (

select
weekly_periods.ticket_id,
weekly_periods.source_relation,
weekly_periods.week_number,
weekly_periods.schedule_id,
weekly_periods.ticket_status,
@@ -103,6 +107,7 @@ with ticket_historical_status as (
ticket_week_start_time <= schedule.end_time_utc
and ticket_week_end_time >= schedule.start_time_utc
and weekly_periods.schedule_id = schedule.schedule_id
and weekly_periods.source_relation = schedule.source_relation
-- this chooses the Daylight Savings Time or Standard Time version of the schedule
-- We have everything calculated within a week, so take us to the appropriate week first by adding the week_number * minutes-in-a-week to the minute-mark where we start and stop counting for the week
and cast( {{ dbt.dateadd(datepart='minute', interval='week_number * (7*24*60) + ticket_week_end_time', from_date_or_timestamp='start_week_date') }} as {{ dbt.type_timestamp() }}) > cast(schedule.valid_from as {{ dbt.type_timestamp() }})
@@ -112,6 +117,7 @@ with ticket_historical_status as (

select
ticket_id,
source_relation,
ticket_status,
case when ticket_status in ('pending') then scheduled_minutes
else 0 end as agent_wait_time_in_minutes,
@@ -133,6 +139,7 @@ with ticket_historical_status as (

select
ticket_id,
source_relation,
sum(agent_wait_time_in_minutes) as agent_wait_time_in_business_minutes,
sum(requester_wait_time_in_minutes) as requester_wait_time_in_business_minutes,
sum(solve_time_in_minutes) as solve_time_in_business_minutes,
@@ -141,4 +148,4 @@ with ticket_historical_status as (
sum(new_status_duration_minutes) as new_status_duration_in_business_minutes,
sum(open_status_duration_minutes) as open_status_duration_in_business_minutes
from business_minutes
group by 1
group by 1,2
Original file line number Diff line number Diff line change
@@ -7,6 +7,7 @@ with ticket_historical_status as (

select
ticket_id,
source_relation,
status,
case when status in ('pending') then status_duration_calendar_minutes
else 0 end as agent_wait_time_in_minutes,
@@ -24,8 +25,8 @@ with ticket_historical_status as (
else 0 end as open_status_duration_minutes,
case when status = 'deleted' then 1
else 0 end as ticket_deleted,
first_value(valid_starting_at) over (partition by ticket_id order by valid_starting_at desc, ticket_id rows unbounded preceding) as last_status_assignment_date,
case when lag(status) over (partition by ticket_id order by valid_starting_at) = 'deleted' and status != 'deleted'
first_value(valid_starting_at) over (partition by ticket_id, source_relation order by valid_starting_at desc, ticket_id rows unbounded preceding) as last_status_assignment_date,
case when lag(status) over (partition by ticket_id, source_relation order by valid_starting_at) = 'deleted' and status != 'deleted'
then 1
else 0
end as ticket_recoveries
@@ -36,6 +37,7 @@ with ticket_historical_status as (

select
ticket_id,
source_relation,
last_status_assignment_date,
sum(ticket_deleted) as ticket_deleted_count,
sum(agent_wait_time_in_minutes) as agent_wait_time_in_calendar_minutes,
@@ -47,4 +49,4 @@ select
sum(open_status_duration_minutes) as open_status_duration_in_calendar_minutes,
sum(ticket_recoveries) as total_ticket_recoveries
from calendar_minutes
group by 1, 2
group by 1, 2, 3
5 changes: 4 additions & 1 deletion models/intermediate/int_zendesk__assignee_updates.sql
@@ -9,6 +9,7 @@ with ticket_updates as (
), ticket_requester as (
select
ticket.ticket_id,
ticket.source_relation,
ticket.assignee_id,
ticket_updates.valid_starting_at

@@ -17,16 +18,18 @@ with ticket_updates as (
left join ticket_updates
on ticket_updates.ticket_id = ticket.ticket_id
and ticket_updates.user_id = ticket.assignee_id
and ticket_updates.source_relation = ticket.source_relation

), final as (
select
ticket_id,
source_relation,
assignee_id,
max(valid_starting_at) as last_updated,
count(*) as total_updates
from ticket_requester

group by 1, 2
group by 1, 2, 3
)

select *
3 changes: 2 additions & 1 deletion models/intermediate/int_zendesk__comment_metrics.sql
@@ -7,6 +7,7 @@ with ticket_comments as (
comment_counts as (
select
ticket_id,
source_relation,
last_comment_added_at,
sum(case when commenter_role = 'internal_comment' and is_public = true
then 1
@@ -38,7 +39,7 @@ comment_counts as (
end) as count_agent_replies
from ticket_comments

group by 1, 2
group by 1, 2, 3
),

final as (
3 changes: 2 additions & 1 deletion models/intermediate/int_zendesk__latest_ticket_form.sql
@@ -9,13 +9,14 @@ with ticket_form_history as (
latest_ticket_form as (
select
*,
row_number() over(partition by ticket_form_id order by updated_at desc) as latest_form_index
row_number() over(partition by ticket_form_id, source_relation order by updated_at desc) as latest_form_index
from ticket_form_history
),

final as (
select
ticket_form_id,
source_relation,
created_at,
updated_at,
display_name,
18 changes: 12 additions & 6 deletions models/intermediate/int_zendesk__organization_aggregates.sql
@@ -11,13 +11,15 @@ with organizations as (
), tag_aggregates as (
select
organizations.organization_id,
organizations.source_relation,

Review comment (Contributor):

For some reason I am running into the following error when running on the zendesk_yashwanth schema. Do you have any ideas why this may be occurring?

[image]

Reply (Contributor Author):

Also not getting this error... could you rerun, @fivetran-joemarkiewicz?

{{ fivetran_utils.string_agg('organization_tags.tags', "', '" ) }} as organization_tags
from organizations

left join organization_tags
using (organization_id)
on organizations.organization_id = organization_tags.organization_id
and organizations.source_relation = organization_tags.source_relation

group by 1
group by 1, 2
{% endif %}

--If you use using_domain_names tags this will be included, if not it will be ignored.
@@ -30,13 +32,15 @@ with organizations as (
), domain_aggregates as (
select
organizations.organization_id,
organizations.source_relation,
{{ fivetran_utils.string_agg('domain_names.domain_name', "', '" ) }} as domain_names
from organizations

left join domain_names
using(organization_id)
on organizations.organization_id = domain_names.organization_id
and organizations.source_relation = domain_names.source_relation

group by 1
group by 1, 2
{% endif %}


@@ -59,13 +63,15 @@ with organizations as (
--If you use using_domain_names tags this will be included, if not it will be ignored.
{% if var('using_domain_names', True) %}
left join domain_aggregates
using(organization_id)
on organizations.organization_id = domain_aggregates.organization_id
and organizations.source_relation = domain_aggregates.source_relation
{% endif %}

--If you use organization tags this will be included, if not it will be ignored.
{% if var('using_organization_tags', True) %}
left join tag_aggregates
using(organization_id)
on organizations.organization_id = tag_aggregates.organization_id
and organizations.source_relation = tag_aggregates.source_relation
{% endif %}
)

5 changes: 4 additions & 1 deletion models/intermediate/int_zendesk__requester_updates.sql
@@ -9,6 +9,7 @@ with ticket_updates as (
), ticket_requester as (
select
ticket.ticket_id,
ticket.source_relation,
ticket.requester_id,
ticket_updates.valid_starting_at

@@ -17,16 +18,18 @@ with ticket_updates as (
left join ticket_updates
on ticket_updates.ticket_id = ticket.ticket_id
and ticket_updates.user_id = ticket.requester_id
and ticket_updates.source_relation = ticket.source_relation

), final as (
select
ticket_id,
source_relation,
requester_id,
max(valid_starting_at) as last_updated,
count(*) as total_updates
from ticket_requester

group by 1, 2
group by 1, 2, 3
)

select *