dbt-databricks 1.7.4 (TBD)

Fixes

Added python model specific connection handling to prevent using invalid sessions (547)
Allow schema to be specified in testing (thanks @case-k-git!) (538)
Fix dbt incremental_strategy behavior by fixing schema table existing check (thanks @case-k-git!) (530)
Fixed bug that was causing streaming tables to be dropped and recreated instead of refreshed. (552)
Fixed Hive performance regression by streamlining materialization type acquisition (557)
Fix: Python models authentication could be overridden by a .netrc file in the user's home directory (338)
Fix: MV/ST REST API authentication could be overridden by a .netrc file in the user's home directory (555)
Show details in connection errors (562)
Updated connection debug logging and now set the connection's last-used time on session open (565)

Under the Hood

Adding retries around API calls in python model submission (549)
Upgrade to databricks-sql-connector 3.0.0 (554)
Pinning pandas to < 2.2.0 to keep from breaking multiple tests (564)

dbt-databricks 1.7.3 (Dec 12, 2023)

Fixes

Fix for issue where we were invoking create schema if not exists when the schema already exists (leading to a permission issue) (529)
Fix for issue where we never reused connections (517)

Under the Hood

Refactor macro tests to be more usable (524)

dbt-databricks 1.7.2 (Nov 30, 2023)

Features

Adding capability to specify compute on a per model basis (488)
Selectively persist column docs that have changed between runs of incremental (513)
Enabling access control list for job runs (thanks @srggrs!) (518)
Allow persisting of column comments on views and retrieving comments for docs on Hive (519)

dbt-databricks 1.7.1 (Nov 13, 2023)

Under the Hood

Another attempt to improve catalog gathering performance (503)

dbt-databricks 1.7.0 (November 9, 2023)

Features

Added support for getting info only on specified relations to improve performance of gathering metadata (486), also (with generous help from @mikealfare) (499)
Added support for getting freshness from metadata (481)

Fixes

Node info now gets added to SQLQuery event (thanks @davidharting!) (494)
Compatibility with dbt-spark and dbt-core 1.7.1 (499)

Under the Hood

Added required adapter tests to ensure compatibility with 1.7.0 (487)
Improved large seed performance by not casting every value (thanks @nrichards17!) (493). Note: for file_format="parquet" we still need to cast.

dbt-databricks 1.7.0rc1 (October 13, 2023)

Fixes

Fixed a bug where setting a primary key constraint before a null constraint would fail by ensuring null constraints happen first (479)
Foreign key constraints now work with dbt's constraint structure (479)
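
The ordering fix above can be pictured with a small sketch. The dictionary shape and the function name `order_constraints` are hypothetical and chosen only for illustration; they are not taken from the adapter's source.

```python
def order_constraints(constraints):
    """Return constraints with not_null entries first.

    sorted() is stable, so the remaining constraints keep their
    original relative order. The dict layout here is an assumption.
    """
    return sorted(constraints, key=lambda c: c["type"] != "not_null")


constraints = [
    {"type": "primary_key", "columns": ["id"]},
    {"type": "not_null", "columns": ["id"]},
    {"type": "foreign_key", "columns": ["customer_id"]},
]

ordered = order_constraints(constraints)
print([c["type"] for c in ordered])
```

Applying the not-null constraint first matters because a primary key can only be declared on columns that are already non-nullable.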

Under the Hood

Compatibility with dbt-spark 1.7.0rc1 (479)

dbt-databricks 1.6.6 (October 9, 2023)

Fixes

Optimize now runs after creating / updating liquid clustering tables (463)
Fixing an issue where the new python library install from index behavior breaks users who were already customizing their installs (472)

Under the Hood

Fix Pylance import errors (thanks @dataders) (471)

dbt-databricks 1.6.5 (September 26, 2023)

Features

When installing python libraries onto clusters, you can now specify an index_url (Thanks @casperdamen123) (367)
Log job run information such as run_id when submitting Python jobs to databricks (Thanks @jeffrey-harrison) (#454)

Fixes

Node info now gets added to SQLQueryStatus (Thanks @colin-rogers-dbt) (453)
Fixing python model compatibility with newer DBRs (459)
Updated the Databricks SDK dependency so as to prevent reliance on an insecure version of requests (460)
Update logic around submitting python jobs so that if the cluster is already starting, just wait for it to start rather than failing (461)

dbt-databricks 1.6.4 (September 14, 2023)

Fixes

Fixed an issue with AWS OAuth M2M flow (#445)
Fixed an issue where every table in hive_metastore would get described (#446)

dbt-databricks 1.6.3 (September 8, 2023)

Fixes

Improved legibility of python stack traces (#434).
Add fetchmany, resolves #408 (Thanks @NodeJSmith) (#409)
Update our Databricks Workflow README to make clear that job clusters are not supported targets (#436)
Relaxed the constraint on databricks-sql-connector to allow newer versions (#436)
Streamlined sql connector output in dbt.log (#437)

Under the hood

Switch to running integration tests with OAuth (#436)

dbt-databricks 1.6.2 (August 29, 2023)

Features

Follow up: re-implement fix for issue where the show tables extended command is limited to 2048 characters. (#326). Set DBT_DESCRIBE_TABLE_2048_CHAR_BYPASS to true to enable this behaviour.
Add liquid_clustered_by config to enable Liquid Clustering for Delta-based dbt models (Thanks @ammarchalifah) (#398).

Under the hood

Dropping the databricks_sql_endpoint test profile, as it does not truly test different behavior from the databricks_uc_sql_endpoint profile (#417)
Improve testing of python model support so that we can package the new config options in this release (#421)

dbt-databricks 1.6.1 (August 2, 2023)

Fixes

Revert change from #326 as it breaks DESCRIBE table in cases where the dbt API key does not have access to all tables in the schema

dbt-databricks 1.6.0 (August 2, 2023)

Features

Support for dbt-core==1.6
Added support for materialized_view and streaming_table materializations
Support dbt clone operation
Support new dbt limit command-line flag

Fixes

Fix issue where the show tables extended command is limited to 2048 characters. (#326)
Extend python model support to cover the same config options as SQL (#379)

Other

Drop support for Python 3.7
Support for revamped dbt debug

dbt-databricks 1.5.5 (July 7, 2023)

Fixes

Fixed issue where starting a terminated cluster in the python path would never return

Features

Include log events from databricks-sql-connector in dbt logging output.
Adapter now populates the query_id field in run_results.json with Query History API query ID.

dbt-databricks 1.5.4 (June 9, 2023)

Features

Added support for model contracts (#336)

dbt-databricks 1.5.3 (June 8, 2023)

Fixes

Pins dependencies to minor versions
Sets default socket timeout to 180s

dbt-databricks 1.5.2 (May 17, 2023)

Fixes

Sets databricks sdk dependency to 0.1.6 to avoid SDK breaking changes

dbt-databricks 1.5.1 (May 9, 2023)

Fixes

Add explicit dependency to protobuf >4 to work around dbt-core issue

dbt-databricks 1.5.0 (May 2, 2023)

Features

Added support for OAuth (SSO and client credentials) (#327)

Fixes

Fix integration tests (#316)

Dependencies

Updated dbt-spark from >=1.4.1 to >= 1.5.0 (#316)

Under the hood

Throw an error if a model has an enforced contract. (#322)

dbt-databricks 1.4.3 (April 19, 2023)

Fixes

Fix database not found error matching (#281)
Auto start cluster for Python models (#306)
Upgrade databricks-sql-connector to 2.5.0 (#311)

Features

Adding replace_where incremental strategy (#293) (#310)
Support ZORDER as a model config (#292) (#297)

Dependencies

Added keyring>=23.13.0 for oauth token cache
Added databricks-sdk>=0.1.1 for oauth flows
Updated databricks-sql-connector from >=2.4.0 to >= 2.5.0

Under the hood

Throw an error if a model has an enforced contract. (#322)

dbt-databricks 1.4.2 (February 17, 2023)

Fixes

Fix test_grants to use the error class to check the error. (#273)
Raise exception on unexpected error of list relations (#270)

dbt-databricks 1.4.1 (January 31, 2023)

Fixes

Ignore case sensitivity in relation matches method. (#265)

dbt-databricks 1.4.0 (January 25, 2023)

Breaking changes

Raise an exception when schema contains '.'. (#222)
- Containing a catalog in schema is not allowed anymore.
- Need to explicitly use catalog instead.
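
To illustrate the breaking change above, here is a minimal sketch of such a check. The function name `validate_schema` and the error message are hypothetical; the adapter's actual implementation and wording may differ.

```python
def validate_schema(schema: str) -> str:
    """Reject schema names that contain '.', e.g. 'main.analytics'.

    Hypothetical helper: name and message are assumptions for illustration.
    """
    if "." in schema:
        raise ValueError(
            f"The schema should not contain '.': {schema!r}. "
            "Set the catalog separately instead."
        )
    return schema


# A plain schema passes through unchanged.
print(validate_schema("analytics"))

# A schema that embeds a catalog is rejected.
try:
    validate_schema("main.analytics")
except ValueError as exc:
    print(f"rejected: {exc}")
```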

Features

Support Python 3.11 (#233)
Support incremental_predicates (#161)
Apply connection retry refactor, add defaults with exponential backoff (#137)
Quote by Default (#241)
Avoid show table extended command. (#231)
Use show table extended with table name list for get_catalog. (#237)
Add support for a glob pattern in the databricks_copy_into macro (#259)
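
For readers unfamiliar with the retry behavior mentioned in #137, the following is a generic sketch of retrying with exponential backoff. All names (`backoff_delays`, `connect_with_retry`) and the default values are assumptions made for this example and do not reflect the adapter's actual settings.

```python
import time


def backoff_delays(retries, base=1.0, cap=60.0):
    """Delay before each retry: base * 2**attempt, capped at `cap` seconds."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(retries)]


def connect_with_retry(connect, retries=3, base=1.0, cap=60.0, sleep=time.sleep):
    """Call `connect`; on ConnectionError wait with backoff and try again."""
    delays = backoff_delays(retries, base, cap)
    for attempt in range(retries + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == retries:
                raise  # out of retries: surface the last error
            sleep(delays[attempt])


# Demo: a connection that fails twice, then succeeds.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporarily unavailable")
    return "connected"


waited = []  # record the delays instead of actually sleeping
result = connect_with_retry(flaky, retries=3, base=1.0, sleep=waited.append)
print(result, waited)
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting.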

dbt-databricks 1.3.2 (November 9, 2022)

Fixes

Fix copy into macro when passing expression_list. (#223)
Partially revert to fix the case where schema config contains uppercase letters. (#224)

dbt-databricks 1.3.1 (November 1, 2022)

Under the hood

Show and log a warning when schema contains '.'. (#221)

dbt-databricks 1.3.0 (October 14, 2022)

Features

Support python model through run command API, currently supported materializations are table and incremental. (dbt-labs/dbt-spark#377, #126)
Enable Pandas and Pandas-on-Spark DataFrames for dbt python models (dbt-labs/dbt-spark#469, #181)
Support job cluster in notebook submission method (dbt-labs/dbt-spark#467, #194)
- In all_purpose_cluster submission method, a config http_path can be specified in Python model config to switch the cluster where Python model runs.
```
def model(dbt, _):
    dbt.config(
        materialized='table',
        http_path='...'
    )
    ...
```
Use builtin timestampadd and timestampdiff functions for dateadd/datediff macros if available (#185)
Implement tests for various Python models (#189)
Implement testing for type_boolean in Databricks (dbt-labs/dbt-spark#471, #188)
Add a macro to support COPY INTO (#190)

Under the hood

Apply "Initial refactoring of incremental materialization" (#148)
- Now dbt-databricks uses adapter.get_incremental_strategy_macro instead of dbt_spark_get_incremental_sql macro to dispatch the incremental strategy macro. The overwritten dbt_spark_get_incremental_sql macro will not work anymore.
Better interface for python submission (dbt-labs/dbt-spark#452, #178)

dbt-databricks 1.2.3 (September 26, 2022)

Fixes

Fix cancellation (#173)
http_headers should be dict in the profile (#174)

dbt-databricks 1.2.2 (September 8, 2022)

Fixes

Data is duplicated on reloading seeds that are using an external table (#114, #149)

Under the hood

Explicitly close cursors (#163)
Upgrade databricks-sql-connector to 2.0.5 (#166)
Embed dbt-databricks and databricks-sql-connector versions to SQL comments (#167)

dbt-databricks 1.2.1 (August 24, 2022)

Features

Support Python 3.10 (#158)

dbt-databricks 1.2.0 (August 16, 2022)

Features

Add grants to materializations (dbt-labs/dbt-spark#366, dbt-labs/dbt-spark#381)
Add connection_parameters for databricks-sql-connector connection parameters (#135)
- This can be used to customize the connection by setting additional parameters.
- The full parameters are listed at Databricks SQL Connector for Python.
- Currently, the following parameters are reserved for dbt-databricks. Please use the normal credential settings instead.
  - server_hostname
  - http_path
  - access_token
  - session_configuration
  - catalog
  - schema
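
The reserved-parameter rule above could be enforced along these lines. This is a sketch only: the function name `check_connection_parameters` is hypothetical, and `example_option` is a placeholder key rather than a real connector parameter.

```python
# Keys reserved for dbt-databricks, as listed above.
RESERVED = frozenset(
    {
        "server_hostname",
        "http_path",
        "access_token",
        "session_configuration",
        "catalog",
        "schema",
    }
)


def check_connection_parameters(params):
    """Raise if connection_parameters tries to set a reserved key."""
    clashes = sorted(RESERVED.intersection(params))
    if clashes:
        raise ValueError(
            "Reserved connection parameters must use the normal "
            f"credential settings: {', '.join(clashes)}"
        )
    return params


# 'example_option' is a placeholder, not a real connector parameter.
print(check_connection_parameters({"example_option": True}))

try:
    check_connection_parameters({"http_path": "...", "catalog": "main"})
except ValueError as exc:
    print(exc)
```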

Fixes

Incremental materialization updated to not drop table first if full refresh for delta lake format, as it already runs create or replace table (dbt-labs/dbt-spark#286, dbt-labs/dbt-spark#287)

Under the hood

Update SparkColumn.numeric_type to return decimal instead of numeric, since SparkSQL exclusively supports the former (dbt-labs/dbt-spark#380)
Make minimal changes to support dbt Core incremental materialization refactor (dbt-labs/dbt-spark#402, dbt-labs/dbt-spark#394, #136)
Add new basic tests TestDocsGenerateDatabricks and TestDocsGenReferencesDatabricks (#134)
Set upper bound for databricks-sql-connector when Python 3.10 (#154)
- Note that databricks-sql-connector does not officially support Python 3.10 yet.

Contributors

@grindheim (dbt-labs/dbt-spark#287)

dbt-databricks 1.1.1 (July 19, 2022)

Features

Support for Databricks CATALOG as a DATABASE in DBT compilations (#95, #89, #94, #105)
- Setting an initial catalog with session_properties is deprecated and will not work in a future release. Please use catalog or database to set the initial catalog.
- When using catalog, spark_build_snapshot_staging_table macro will not be used. If trying to override the macro, databricks_build_snapshot_staging_table should be overridden instead.

Fixes

Block taking jinja2.runtime.Undefined into DatabricksAdapter (#98)
Avoid using Cursor.schema API when database is None (#100)

Under the hood

Drop databricks-sql-connector 1.0 (#108)

dbt-databricks 1.1.0 (May 11, 2022)

Features

Add support for Delta constraints (#71)

Under the hood

Port testing framework changes from dbt-labs/dbt-spark#299 and dbt-labs/dbt-spark#314 (#70)

dbt-databricks 1.0.3 (April 26, 2022)

Fixes

Make internal macros use macro dispatch pattern (#72)

dbt-databricks 1.0.2 (March 31, 2022)

Features

Support for setting table properties as part of a model configuration (#33, #49)
Get the session_properties map to work (#57)
Bump up databricks-sql-connector to 1.0.1 and use the Cursor APIs (#50)

dbt-databricks 1.0.1 (February 8, 2022)

Features

Inherit from dbt-spark for backward compatibility with spark-utils and other dbt packages (#32, #35)
Add SQL Endpoint specific integration tests (#45, #46)

Fixes

Close the connection properly (#34, #37)

dbt-databricks 1.0.0 (December 6, 2021)

Features

Make the connection use databricks-sql-connector (#3, #7)
Make the default file format 'delta' (#14, #16)
Make the default incremental strategy 'merge' (#23)
Remove unnecessary stack trace (#10)

dbt-spark 1.0.0 (December 3, 2021)

Fixes

Incremental materialization corrected to respect full_refresh config, by using should_full_refresh() macro (#260, #262)

Contributors

@grindheim (#262)

dbt-spark 1.0.0rc2 (November 24, 2021)

Features

Add support for Apache Hudi (hudi file format) which supports incremental merge strategies (#187, #210)

Under the hood

Refactor seed macros: remove duplicated code from dbt-core, and provide clearer logging of SQL parameters that differ by connection method (#249, #250)
Replace sample_profiles.yml with profile_template.yml, for use with new dbt init (#247)

Contributors

@vingov (#210)

dbt-spark 1.0.0rc1 (November 10, 2021)

Under the hood

Remove official support for python 3.6, which is reaching end of life on December 23, 2021 (dbt-core#4134, #253)
Add support for structured logging (#251)

dbt-spark 0.21.1 (Release TBD)

dbt-spark 0.21.1rc1 (November 3, 2021)

Fixes

Fix --store-failures for tests, by suppressing irrelevant error in comment_clause() macro (#232, #233)
Add support for on_schema_change config in incremental models: ignore, fail, append_new_columns. For sync_all_columns, removing columns is not supported by Apache Spark or Delta Lake (#198, #226, #229)
Add persist_docs call to incremental model (#224, #234)

Contributors

@binhnefits (#234)

dbt-spark 0.21.0 (October 4, 2021)

Fixes

Enhanced the get_columns_in_relation method to handle a bug in open source Delta Lake, which doesn't return schema details in the output of a show table extended in databasename like '*' query. This impacts dbt snapshots if the file format is open source Delta Lake (#207)
Properly parse columns when there are struct fields, to avoid treating inner fields as columns (issue #202)

Under the hood

Add unique_field to better understand adapter adoption in anonymous usage tracking (#211)

Contributors

@harryharanb (#207)
@SCouto (#204)

dbt-spark 0.21.0b2 (August 20, 2021)

Fixes

Add pyodbc import error message to dbt.exceptions.RuntimeException to get more detailed information when running dbt debug (#192)
Add support for ODBC Server Side Parameters, allowing options that need to be set with the SET statement to be used (#201)
Add retry_all configuration setting to retry all connection issues, not just when the _is_retryable_error function determines (#194)

Contributors

@JCZuurmond (#192)
@jethron (#201)
@gregingenii (#194)

dbt-spark 0.21.0b1 (August 3, 2021)

dbt-spark 0.20.1 (August 2, 2021)

dbt-spark 0.20.1rc1 (August 2, 2021)

Fixes

Fix get_columns_in_relation when called on models created in the same run (#196, #197)

Contributors

@ali-tny (#197)

dbt-spark 0.20.0 (July 12, 2021)

dbt-spark 0.20.0rc2 (July 7, 2021)

Features

Add support for merge_update_columns config in merge-strategy incremental models (#183, #184)

Fixes

Fix column-level persist_docs on Delta tables, add tests (#180)

dbt-spark 0.20.0rc1 (June 8, 2021)

Features

Allow user to specify use_ssl (#169)
Allow setting table OPTIONS using config (#171)
Add support for column-level persist_docs on Delta tables (#84, #170)

Fixes

Cast table_owner to string to avoid errors generating docs (#158, #159)
Explicitly cast column types when inserting seeds (#139, #166)

Under the hood

Parse information returned by list_relations_without_caching macro to speed up catalog generation (#93, #160)
More flexible host passing, https:// can be omitted (#153)
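
One way to accept a host with or without a scheme is sketched below. The function name `normalize_host` and the hostname are hypothetical, and this is not necessarily how #153 was implemented.

```python
def normalize_host(host):
    """Strip an optional http(s):// prefix and trailing slashes."""
    host = host.strip()
    for prefix in ("https://", "http://"):
        if host.lower().startswith(prefix):
            host = host[len(prefix):]
            break
    return host.rstrip("/")


# 'my-workspace.example.com' is a placeholder hostname.
print(normalize_host("https://my-workspace.example.com/"))
print(normalize_host("my-workspace.example.com"))
```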

Contributors

@friendofasquid (#159)
@franloza (#160)
@Fokko (#165)
@rahulgoyal2987 (#169)
@JCZuurmond (#171)
@cristianoperez (#170)

dbt-spark 0.19.1 (April 2, 2021)

dbt-spark 0.19.1b2 (February 26, 2021)

Under the hood

Update serialization calls to use new API in dbt-core 0.19.1b2 (#150)

dbt-spark 0.19.0.1 (February 26, 2021)

Fixes

Fix package distribution to include incremental model materializations (#151, #152)

dbt-spark 0.19.0 (February 21, 2021)

Breaking changes

Incremental models have incremental_strategy: append by default. This strategy adds new records without updating or overwriting existing records. To update or overwrite existing records, use merge or insert_overwrite instead, depending on the file format, connection method, and attributes of your underlying data. dbt will try to raise a helpful error if you configure a strategy that is not supported for a given file format or connection. (#140, #141)
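
The difference between append and merge can be shown with plain Python lists. This sketch only illustrates the semantics; dbt-spark generates SQL rather than running code like this, and the function names and the `id` key are assumptions for the example. insert_overwrite is not shown.

```python
def append(existing, new):
    """append: add new rows, never touching existing ones."""
    return existing + new


def merge(existing, new, key):
    """merge: update rows that share `key`, insert the rest."""
    merged = {row[key]: row for row in existing}
    for row in new:
        merged[row[key]] = row
    return list(merged.values())


existing = [{"id": 1, "status": "open"}, {"id": 2, "status": "open"}]
new = [{"id": 2, "status": "closed"}, {"id": 3, "status": "open"}]

appended = append(existing, new)
merged = merge(existing, new, key="id")

print(len(appended))  # id 2 now appears twice
print(merged)         # id 2 is updated in place
```

With append, a record that changes upstream ends up duplicated, which is why existing projects that relied on updates need to set a different strategy explicitly.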

Fixes

Capture hard-deleted records in snapshot merge, when invalidate_hard_deletes config is set (#109, #126)

dbt-spark 0.19.0rc1 (January 8, 2021)

Breaking changes

Users of the http and thrift connection methods need to install extra requirements: pip install dbt-spark[PyHive] (#109, #126)

Under the hood

Enable CREATE OR REPLACE support when using Delta. Instead of dropping and recreating the table, it will keep the existing table, and add a new version as supported by Delta. This will ensure that the table stays available when running the pipeline, and you can track the history.
Add changelog, issue templates (#119, #120)

Fixes

Handle case of 0 retries better for HTTP Spark Connections (#132)

Contributors

@danielvdende (#132)
@Fokko (#125)

dbt-spark 0.18.1.1 (November 13, 2020)

Fixes

Fix extras_require typo to enable pip install dbt-spark[ODBC] (#121, #122)

dbt-spark 0.18.1 (November 6, 2020)

Features

Allows users to specify auth and kerberos_service_name (#107)
Add support for ODBC driver connections to Databricks clusters and endpoints (#116)

Under the hood

Updated README links (#115)
Support complete atomic overwrite of non-partitioned incremental models (#117)
Update to support dbt-core 0.18.1 (#110, #118)

Contributors

@danielhstahl (#107)
@collinprather (#115)
@charlottevdscheun (#117)
@Fokko (#117)

dbt-spark 0.18.0 (September 18, 2020)

Under the hood

Make a number of changes to support dbt-adapter-tests (#103)
Update to support dbt-core 0.18.0. Run CI tests against local Spark, Databricks (#105)
