3.0.0
TL;DR of What's Changed Since 2.9.0
dataform.json -> workflow_settings.yaml
workflow_settings.yaml has been introduced, which will gradually replace dataform.json in a later version; there is no immediate action to be taken, as dataform.json files are still valid in projects with Dataform Core 3.0.0.
dataform.json is being deprecated in favor of workflow_settings.yaml. This means that:
- Workflow settings are now strictly typed, in Protobuf format.
- The Dataform Core version can be specified directly in the workflow_settings.yaml file. Note: to have more than just @dataform/core as a dependency, a package.json must still be used.
Example workflow_settings.yaml:
defaultProject: dataform-demos
defaultLocation: us
defaultDataset: dataform
defaultAssertionDataset: dataform_assertions
dataformCoreVersion: 3.0.0
vars:
  environmentName: "development"
The above is equivalent to the dataform.json file:
{
  "warehouse": "bigquery",
  "defaultDatabase": "dataform-demos",
  "defaultLocation": "us",
  "defaultSchema": "dataform",
  "assertionSchema": "dataform_assertions",
  "vars": {
    "environmentName": "development"
  }
}
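For projects that want to migrate by script, the key renames visible in the two files above can be applied mechanically. The following is a minimal sketch, not an official migration tool: the helper names `convert_settings` and `to_yaml` are hypothetical, the rename table covers only the keys shown in the example, and any other key is reported back rather than guessed at.

```python
import json

# Key renames taken from the example above. Keys not listed here are
# reported as unmapped instead of being translated by guesswork.
KEY_RENAMES = {
    "defaultDatabase": "defaultProject",
    "defaultSchema": "defaultDataset",
    "assertionSchema": "defaultAssertionDataset",
    "defaultLocation": "defaultLocation",
    "vars": "vars",
}


def convert_settings(dataform_json_text, core_version):
    """Return (new settings dict, list of keys that were not mapped)."""
    old = json.loads(dataform_json_text)
    new = {}
    unmapped = []
    for key, value in old.items():
        if key == "warehouse":
            # The example workflow_settings.yaml has no warehouse entry.
            continue
        if key in KEY_RENAMES:
            new[KEY_RENAMES[key]] = value
        else:
            unmapped.append(key)
    new["dataformCoreVersion"] = core_version
    return new, unmapped


def to_yaml(settings):
    """Emit flat settings plus one nested level (such as vars) as YAML text.

    Scalars are written with json.dumps, so strings come out double-quoted.
    """
    lines = []
    for key, value in settings.items():
        if isinstance(value, dict):
            lines.append(f"{key}:")
            for sub_key, sub_value in value.items():
                lines.append(f"  {sub_key}: {json.dumps(sub_value)}")
        else:
            lines.append(f"{key}: {json.dumps(value)}")
    return "\n".join(lines) + "\n"
```

Calling `to_yaml(convert_settings(open("dataform.json").read(), "3.0.0")[0])` would produce text suitable for a workflow_settings.yaml file; the unmapped list is worth reviewing by hand before deleting the old file.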
Notebooks Actions and actions.yaml
Notebooks as Dataform actions are on their way - but not quite yet! They're part of the compiled graph, and soon they'll be executable.
A new way of configuring actions, through actions.yaml, has been implemented to support this.
An example of loading a notebook in Dataform can be seen at https://github.com/dataform-co/dataform/tree/main/examples/extreme_weather_programming.
Stateless Package Installation by @dataform/cli
Package installation by @dataform/cli is now stateless! The CLI will install NPM packages during compilation if the Dataform Core version (dataformCoreVersion) is defined in the workflow_settings.yaml file.
This means no node_modules folder has to be seen in the project, and Dataform users no longer need to be familiar with NPM.
Compilation Output is Now Warehouse Agnostic
Previously, compilation by @dataform/core would insert warehouse-specific SQL into the compiled graph. Where possible, this has been removed, transferring the responsibility for inserting warehouse-specific SQL to whichever execution engine is running Dataform.
Additionally, support for non-BigQuery warehouses has been dropped. We're in discussions with Datashell for them to provide a warehouse-agnostic CLI execution engine based on Dataform compiled graphs. In the meantime, however, if you need support for a non-BigQuery warehouse, please continue using the latest 2.x.x version!
dependOnDependencyAssertions
An easier way to add a dependency's assertions as dependencies has been introduced.
dependOnDependencyAssertions in config blocks can be used to add the assertions of all of the action's dependencies as dependencies.
config {
  type: "view",
  dependOnDependencyAssertions: true,
  dependencies: ["some_table"]
}
select test from ${ref("some_other_table")}
Additionally, the includeDependentAssertions parameter can be set on individual dependencies, either in config.dependencies or in ref(), to add the assertions of those dependencies as dependencies of the current action.
config {
  type: "view",
  dependencies: [{name: "some_table", includeDependentAssertions: true}]
}
select test from ${ref({name: "some_other_table", includeDependentAssertions: true})}
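To make the difference between the two options concrete, here is a toy model of how the dependency list of an action might be expanded. It is only an illustration of the behavior described above, not Dataform's implementation: the function `effective_dependencies`, the data layout, and the assertion names are all hypothetical.

```python
def effective_dependencies(action, assertions_by_target):
    """Toy model: list an action's dependencies, adding dependency assertions.

    action: dict with
      - "dependencies": list of (name, include_assertions) pairs covering
        both config.dependencies entries and ref() calls
      - "dependOnDependencyAssertions": optional bool applying to all of them
    assertions_by_target: maps a table name to the names of its assertions
    """
    include_all = action.get("dependOnDependencyAssertions", False)
    result = []
    for name, include_assertions in action["dependencies"]:
        if name not in result:
            result.append(name)
        # Add the dependency's assertions when either switch asks for them.
        if include_all or include_assertions:
            for assertion in assertions_by_target.get(name, []):
                if assertion not in result:
                    result.append(assertion)
    return result


# Hypothetical assertion names, for illustration only.
assertions = {
    "some_table": ["some_table_unique_key"],
    "some_other_table": ["some_other_table_not_null"],
}

# Action-wide switch: assertions of every dependency are added.
all_deps = effective_dependencies(
    {
        "dependOnDependencyAssertions": True,
        "dependencies": [("some_table", False), ("some_other_table", False)],
    },
    assertions,
)

# Per-dependency switch: only some_table's assertions are added.
one_dep = effective_dependencies(
    {"dependencies": [("some_table", True), ("some_other_table", False)]},
    assertions,
)
```

In this model the action-wide flag is simply a shorthand for setting the per-dependency flag on every entry, which matches how the two options are described above.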
Full Changelog from 2.9.0: 2.9.0...3.0.0