⚠️ ADDING THIS PACKAGE TO YOUR DBT PROJECT CAN SIGNIFICANTLY SLOW DOWN YOUR DBT RUNS. This is due to the number of insert statements executed by this package, especially as a post-hook. Please consider if this package is appropriate for your use case before using it.
Requires dbt >= 0.18.0
This package provides out-of-the-box functionality to log events for all dbt
invocations, including run start, run end, model start, and model end. It
outputs all data and models to schema [target.schema]_meta
. There are three
convenience models to make it easier to parse the event log data.
- Include this package in your
packages.yml
-- check here for installation instructions. - Include the following in your
dbt_project.yml
directly within yourmodels:
block (making sure to handle indenting appropriately):
# dbt_project.yml
...
models:
...
pre-hook: "{{ logging.log_model_start_event() }}"
post-hook: "{{ logging.log_model_end_event() }}"
That's it! You'll now have a stream of events for all dbt invocations in your warehouse.
It's possible to customise the audit schema for any project by adding a macro named: get_audit_schema
into your DBT project.
For example to always log into a specific schema, say analytics_meta
, regardless of DBT schema, you can include the following in your project:
-- your_dbt_project/macros/get_audit_schema.sql
{% macro get_audit_schema() %}
{{ return('analytics_meta') }}
{% endmacro %}
This package is currently compatible with dbt's BigQuery1, Snowflake, Redshift, and Postgres integrations.
1 BigQuery support may only work when 1 thread is set in your profiles.yml
file. Anything larger may result in "quota exceeded" errors.
New columns were added in v0.2.0:
- event_user as user -
varchar(512)
the user who ran the model - event_target as target -
varchar(512)
the target used when running DBT - event_is_full_refresh as is_full_refresh -
boolean
whether the DBT run was a full refresh
These will be added to your existing audit table automatically in the on-run-start
DBT hook, and added to the staging tables deployed by this table when they are ran. The existing event_schema
column will also be propagated into to stg_dbt_model_deployments
as schema
.
Additional contributions to this repo are very welcome! Check out this post on the best workflow for contributing to a package. All PRs should only include functionality that is contained within all Segment deployments; no implementation-specific details should be included.