@ppiont ppiont commented Nov 26, 2025

Summary

Add support for ingesting DBT model contracts (introduced in dbt 1.5) as DataHub Data Contract entities with schema and data quality assertions.

Closes #11927

What problem does this solve?

DBT introduced model contracts in v1.5, allowing teams to define and enforce schema guarantees on models. However, DataHub had no way to ingest this contract metadata: it was dropped during ingestion, so users couldn't see which models had contractual guarantees.

This PR bridges DBT's data governance (contracts) with DataHub's data governance (Data Contracts, Assertions), giving teams a unified view of their data quality guarantees.

What changes are being made?

New Configuration Options

| Option | Default | Description |
| --- | --- | --- |
| ingest_contracts | false | Enable Data Contract creation from DBT contracts |
| contract_test_tag | "contract" | Tag for tests to include in contract |
| ingest_column_constraints_as_assertions | true | Create assertions from not_null, unique, primary_key constraints |
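
As a rough illustration of how these options might sit in an ingestion recipe, here is the source section written as a Python dict. The three contract options and their values come from the table above; the paths and the target platform are placeholders, and the surrounding recipe layout is assumed to follow the existing dbt source rather than anything this PR changes.

```python
# Sketch of a dbt source recipe with contract ingestion switched on.
# File paths and target_platform are placeholder values for illustration.
recipe = {
    "source": {
        "type": "dbt",
        "config": {
            "manifest_path": "./target/manifest.json",  # placeholder path
            "catalog_path": "./target/catalog.json",  # placeholder path
            "target_platform": "snowflake",  # placeholder platform
            # Options introduced by this PR (defaults noted in comments).
            "ingest_contracts": True,  # default: false
            "contract_test_tag": "contract",  # default: "contract"
            "ingest_column_constraints_as_assertions": True,  # default: true
        },
    },
}

# Since ingest_contracts defaults to false, it is the only option that
# has to be set explicitly to turn the feature on.
contract_config = recipe["source"]["config"]
```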

New Data Structures

  • DBTContract dataclass - captures enforced, alias_types, checksum from manifest
  • DBTConstraint dataclass - captures column/model-level constraints (not_null, unique, primary_key, etc.)
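
A minimal sketch of what these two dataclasses could look like, assuming the manifest node layout dbt uses for contracts and constraints. Only the class names and the enforced, alias_types and checksum fields come from this PR; the DBTConstraint fields and the parse_contract / parse_constraints helpers are hypothetical and may differ from the actual implementation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class DBTContract:
    enforced: bool = False
    alias_types: bool = True
    checksum: Optional[str] = None


@dataclass
class DBTConstraint:
    # Field names here are assumptions modelled on dbt constraint entries.
    type: str
    name: Optional[str] = None
    expression: Optional[str] = None
    columns: List[str] = field(default_factory=list)


def parse_contract(node: Dict[str, Any]) -> Optional[DBTContract]:
    """Hypothetical helper: build a DBTContract from a manifest node."""
    raw = node.get("contract")
    if not raw:
        return None
    return DBTContract(
        enforced=bool(raw.get("enforced", False)),
        alias_types=bool(raw.get("alias_types", True)),
        checksum=raw.get("checksum"),
    )


def parse_constraints(node: Dict[str, Any]) -> List[DBTConstraint]:
    """Hypothetical helper: collect column-level, then model-level constraints."""
    found: List[DBTConstraint] = []
    for col_name, col in (node.get("columns") or {}).items():
        for raw in col.get("constraints") or []:
            found.append(
                DBTConstraint(
                    type=raw["type"],
                    name=raw.get("name"),
                    expression=raw.get("expression"),
                    columns=[col_name],
                )
            )
    for raw in node.get("constraints") or []:
        found.append(
            DBTConstraint(
                type=raw["type"],
                name=raw.get("name"),
                expression=raw.get("expression"),
                columns=list(raw.get("columns") or []),
            )
        )
    return found


# Made-up manifest node used only to exercise the helpers.
sample_node = {
    "name": "orders",
    "contract": {"enforced": True, "alias_types": True, "checksum": "abc123"},
    "columns": {"order_id": {"constraints": [{"type": "not_null"}]}},
    "constraints": [{"type": "primary_key", "columns": ["order_id"]}],
}
sample_contract = parse_contract(sample_node)
sample_constraints = parse_constraints(sample_node)
```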

Contract Ingestion Flow

When ingest_contracts: true and a model has contract.enforced: true:

  1. Schema Assertion - Created from contracted model columns with exact match compatibility
  2. Constraint Assertions (optional) - Created for not_null, unique, primary_key constraints
  3. Tagged Test Assertions - Existing DBT tests tagged with contract_test_tag are linked
  4. Data Contract Entity - Bundles all assertions into a DataContractPropertiesClass
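
The four steps above can be sketched roughly as follows. This is a simplified stand-in that uses plain dicts instead of DataHub's assertion and contract classes; plan_contract, its return shape and the sample data are all hypothetical, and it assumes the tests passed in are already those attached to the model.

```python
from typing import Any, Dict, List, Optional

SUPPORTED_CONSTRAINTS = {"not_null", "unique", "primary_key"}


def plan_contract(
    model: Dict[str, Any],
    tests: List[Dict[str, Any]],
    ingest_contracts: bool = False,
    contract_test_tag: str = "contract",
    ingest_column_constraints_as_assertions: bool = True,
) -> Optional[Dict[str, Any]]:
    """Return a dict describing the contract, or None if nothing applies."""
    contract = model.get("contract") or {}
    if not ingest_contracts or not contract.get("enforced"):
        return None

    columns = model.get("columns") or {}
    assertions: List[Dict[str, Any]] = []

    # 1. Schema assertion over the contracted columns (exact match).
    assertions.append(
        {"kind": "schema", "compatibility": "exact_match", "columns": list(columns)}
    )

    # 2. Optional constraint assertions.
    if ingest_column_constraints_as_assertions:
        for col_name, col in columns.items():
            for constraint in col.get("constraints") or []:
                if constraint.get("type") in SUPPORTED_CONSTRAINTS:
                    assertions.append(
                        {"kind": "constraint", "type": constraint["type"], "column": col_name}
                    )

    # 3. Existing tests carrying the contract tag.
    for test in tests:
        if contract_test_tag in (test.get("tags") or []):
            assertions.append({"kind": "test", "name": test["name"]})

    # 4. Bundle everything into one contract description.
    return {"model": model["name"], "assertions": assertions}


# Made-up model and tests used only for illustration.
model = {
    "name": "orders",
    "contract": {"enforced": True},
    "columns": {
        "order_id": {"constraints": [{"type": "not_null"}, {"type": "unique"}]},
        "status": {"constraints": []},
    },
}
tests = [
    {"name": "orders_total_positive", "tags": ["contract"]},
    {"name": "orders_freshness", "tags": []},
]
plan = plan_contract(model, tests, ingest_contracts=True)
```

With the defaults, calling plan_contract(model, tests) returns None, mirroring the fact that ingest_contracts is off unless enabled.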

Platform Support

| Platform | Contract Support | Notes |
| --- | --- | --- |
| dbt Core | Full | Extracts from manifest.json |
| dbt Cloud | Best-effort | Reads from meta.contract or meta.datahub_contract (API doesn't expose contracts directly) |
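
For the dbt Cloud path, a best-effort lookup might look something like the sketch below. The two meta keys come from the table above; the lookup order, the accepted value shapes (a dict or a bare boolean) and the contract_from_meta name are assumptions made for illustration, not confirmed behaviour of this PR.

```python
from typing import Any, Dict, Optional


def contract_from_meta(meta: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Hypothetical helper: read contract info from a model's meta block."""
    # Assumed precedence: meta.contract first, then meta.datahub_contract.
    for key in ("contract", "datahub_contract"):
        raw = meta.get(key)
        if isinstance(raw, dict):
            return raw
        if isinstance(raw, bool):
            # Assumed shorthand: a bare boolean means "enforced".
            return {"enforced": raw}
    return None


# Made-up meta blocks used only for illustration.
from_contract_key = contract_from_meta({"contract": {"enforced": True}})
from_datahub_key = contract_from_meta({"datahub_contract": True})
missing = contract_from_meta({"owner": "data-team"})
```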

How was this tested?

  • Added unit tests for new dataclasses (DBTContract, DBTConstraint)
  • Added unit tests for configuration options
  • Added integration test for contract extraction from manifest
  • Tested locally with sample manifests containing contract.enforced: true

Screenshots/Demo

After ingestion with ingest_contracts: true, models with contract.enforced: true will have:

  • A Data Contract entity linked to the dataset
  • Schema assertions validating the contracted columns
  • (Optionally) Constraint assertions for not_null/unique/primary_key

…Contracts

Add support for extracting DBT model contracts (contract.enforced=true) and
creating DataHub Data Contract entities with schema assertions.

Key changes:
- Add DBTContract and DBTConstraint dataclasses for contract/constraint data
- Extract contract configuration and constraints from manifest files
- Add configuration options: ingest_contracts, contract_test_tag,
  ingest_column_constraints_as_assertions
- Create schema assertions from contracted model columns
- Create constraint assertions for not_null, unique, primary_key
- Support tagged tests as data quality assertions in contracts
- Add best-effort dbt Cloud support via meta field
- Add unit tests for contract functionality

Closes datahub-project#11927
@github-actions github-actions bot added the ingestion and community-contribution labels Nov 26, 2025
@datahub-cyborg datahub-cyborg bot added the needs-review label Nov 26, 2025
codecov bot commented Nov 26, 2025

Bundle Report

Changes will decrease total bundle size by 14.13kB (-0.05%) ⬇️. This is within the configured threshold ✅

Detailed changes
| Bundle name | Size | Change |
| --- | --- | --- |
| datahub-react-web-esm | 28.7MB | -14.13kB (-0.05%) ⬇️ |

Affected Assets, Files, and Routes:

view changes for bundle: datahub-react-web-esm

Assets Changed:

| Asset Name | Size Change | Total Size | Change (%) |
| --- | --- | --- | --- |
| assets/index-*.js | -14.13kB | 19.08MB | -0.07% |

codecov bot commented Nov 26, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

Development

Successfully merging this pull request may close these issues.

Feature: Ingest DBT Contract Information as a DataHub Data Contract
