Fixed formatting of recommendations sections in the best practices guide
okabak123 committed Oct 16, 2024
1 parent 3f79d08 commit 5ee1e4b
Showing 3 changed files with 43 additions and 14 deletions.
22 changes: 17 additions & 5 deletions docs/best-practices/detection-and-coverage.mdx
@@ -30,8 +30,12 @@ Elementary offers two types of anomaly detection monitors:
- **Automated Monitors** - Out-of-the-box volume and freshness monitors that are activated automatically and query metadata only.
- **Opt-in anomaly detection tests** - Monitors that query raw data and require configuration.

- [ ] Deploy the packages dbt-utils and dbt-expectations in your dbt projects, to enrich your available tests
- [ ] Refer to the [dbt test hub](https://www.elementary-data.com/dbt-test-hub) by Elementary, to explore available tests by use case
<Check>
### Recommendations

- Deploy the packages dbt-utils and dbt-expectations in your dbt projects, to enrich your available tests
- Refer to the [dbt test hub](https://www.elementary-data.com/dbt-test-hub) by Elementary, to explore available tests by use case
</Check>
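
As a minimal sketch of the first recommendation, both packages can be declared in `packages.yml` and installed with `dbt deps`. The version ranges below are illustrative assumptions, not values from this guide, and the package namespaces should be verified against the dbt package hub for your dbt version.

```yaml
# packages.yml (sketch; version ranges are assumptions)
packages:
  - package: dbt-labs/dbt_utils
    version: [">=1.0.0", "<2.0.0"]    # assumed range, adjust to your dbt version
  - package: calogica/dbt_expectations
    version: [">=0.10.0", "<0.11.0"]  # assumed range, verify namespace on the hub
```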

## Fine-tuning automated monitors

Expand Down Expand Up @@ -72,8 +76,12 @@ To detect issues in sources updates, you should monitor volume, freshness and sc
- Low cardinality columns / strict set of values - If a field is expected to contain only a specific set of values, use `accepted_values`. If you also expect the ratio between these values to stay consistent, use `dimension_anomalies` and group by this column.
- Business requirements - If you are aware of expectations specific to your business, enforce them early to detect issues at the source. Some examples: `expect_column_values_to_be_between`, `expect_column_values_to_be_increasing`, `expect_column_values_to_have_consistent_casing`

- [ ] Add data freshness and volume validations for relevant source tables, on top of the automated monitors (advanced)
- [ ] Add schema tests for source tables
<Check>
### Recommendations

- Add data freshness and volume validations for relevant source tables, on top of the automated monitors (advanced)
- Add schema tests for source tables
</Check>
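
The source checks above could be sketched roughly as follows. This is an illustration under assumptions: the source, table and column names (`raw_app`, `orders`, `_loaded_at`, `status`), the accepted values, and the freshness thresholds are all hypothetical placeholders.

```yaml
sources:
  - name: raw_app                  # hypothetical source name
    loaded_at_field: _loaded_at    # hypothetical load-timestamp column
    freshness:                     # dbt source freshness, thresholds are assumptions
      warn_after: {count: 12, period: hour}
      error_after: {count: 24, period: hour}
    tables:
      - name: orders               # hypothetical table
        tests:
          - elementary.volume_anomalies:
              timestamp_column: _loaded_at
          - elementary.schema_changes
        columns:
          - name: status           # hypothetical low cardinality column
            tests:
              - accepted_values:
                  values: ["placed", "shipped", "returned"]
```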

### Primary / foreign key columns in your transformation models

@@ -84,7 +92,11 @@ Tables should be covered with:

For incremental tables, it’s recommended to use a `where` clause in the tests, and only validate recent data. This avoids running the tests on large data sets, which is costly and slow.

- [ ] Add `unique` and `not_null` tests to key columns
<Check>
#### Recommendations

- Add `unique` and `not_null` tests to key columns
</Check>
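
A sketch of key-column tests limited to recent data, using dbt's `where` test config. The model name, column names, and the three-day window are assumptions, and the date expression is warehouse-specific (some warehouses need a different date function).

```yaml
models:
  - name: orders_incremental       # hypothetical incremental model
    columns:
      - name: order_id             # hypothetical key column
        tests:
          - unique:
              config:
                where: "created_at >= current_date - 3"   # assumed window and syntax
          - not_null:
              config:
                where: "created_at >= current_date - 3"   # assumed window and syntax
```

Note that with a `where` filter, `unique` only detects duplicates among the filtered rows, so a recent row that duplicates an older key would not be caught.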

### Public tables

20 changes: 16 additions & 4 deletions docs/best-practices/governance-for-observability.mdx
@@ -67,9 +67,15 @@ models:
- Make sure to have clear ownership defined for all your public-facing tables. We also recommend adding subscribers to the relevant public tables.
- Usually, the owners of these public tables are the analytics engineering team, and the subscribers are the relevant data analysts who rely on the data from these tables.

<Check>
### Recommendations

- Add business domain tags to public tables
- Define owners for public facing tables
- Add data consumers as subscribers to relevant public facing tables
</Check>



## Priorities (optional)

@@ -79,8 +85,12 @@ Decide how many levels of priority you wish to maintain, and implement by adding

This will enable you to filter the results in Elementary by priority, and establish workflows such as sending `critical` alerts to PagerDuty, and the rest to Slack.

- (Optional) Add priorities / critical tags to tables / tests
- (Optional) Add owners to all top priority tables / tests
<Check>
### Recommendations

- Add priorities / critical tags to tables / tests (Optional)
- Add owners to all top priority tables / tests (Optional)
</Check>
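
One possible way to express a priority level and an owner on a table, as a rough sketch. The model name and the owner handle are hypothetical, and the `critical` tag is just one example of a priority level.

```yaml
models:
  - name: revenue_daily                  # hypothetical top priority model
    config:
      tags: ["critical"]                 # priority expressed as a tag
    meta:
      owner: "@analytics.engineering"    # hypothetical owner handle
```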

## Data sources

@@ -106,17 +116,19 @@ sources:
subscribers: "@analytics.engineer"
```

<Check>
### Recommendations

- Add tags to source tables that describe the source system and / or ingestion method
- Add owners and subscribers to source tables
</Check>

## Recommendations

- Add business domain tags to public tables
- Define owners for public facing tables
- Add data consumers as subscribers to relevant public facing tables

- (Optional) Add priorities / critical tags to tables / tests
- (Optional) Add owners to all top priority tables / tests

- Add tags to source tables that describe the source system and / or ingestion method
- Add owners and subscribers to source tables
15 changes: 10 additions & 5 deletions docs/best-practices/triage-and-response.mdx
@@ -27,11 +27,15 @@ For every test or monitor you add, think about the following -

According to these answers, you should add configuration that will impact the alert, alert distribution, and triage:

- Add a test description that details what it means if this test fails, and context on resolving it. Descriptions can be added in UI or in code.
- Each failure should have an owner, that should look into the failure. It can be the owner of the data set or an owner of a specific test.
- If others need to be notified, add subscribers.
- Use the [severity of failures](https://docs.getdbt.com/reference/resource-configs/severity) intentionally, and even leverage conditional expressions (`error_if`, `warn_if`)
- Test failures and alerts include a sample of the failed results, and the test query. You can change the test query and / or add comments to it, that can provide triage context.
<Check>
### Recommendations

- Add a test description that details what it means if this test fails, and context on resolving it. Descriptions can be added in UI or in code.
- Each failure should have an owner, that should look into the failure. It can be the owner of the data set or an owner of a specific test.
- If others need to be notified, add subscribers.
- Use the [severity of failures](https://docs.getdbt.com/reference/resource-configs/severity) intentionally, and even leverage conditional expressions (`error_if`, `warn_if`)
- Test failures and alerts include a sample of the failed results, and the test query. You can change the test query and / or add comments to it, that can provide triage context.
</Check>

```yaml
tests:
@@ -131,6 +135,7 @@ These are the questions that should be asked, and product tips on how to answer

![Model runs portion of the dashboard](/pics/dashboard-model-runs.png)


- How important is the data asset?
- Check in the catalog or node info section in the lineage if it has a tag like `critical`, `public` or a data product tag. You can also look at the description of the data asset, whether it’s a table or a column.
- Does the failure impact important downstream assets? Did the issue propagate to downstream assets?