Python models: next steps #5742
Hey! I couldn't find this mentioned by anyone else, so I'll list it here. Currently we run dbt in Apache Airflow (with the dbt-bigquery adapter). Once dbt is done, Airflow makes some HTTP calls to a Python API containing statistical functions that we need to calculate some results. When reading up on Python models, our team was initially quite enthusiastic, as this has great potential for us to embed these statistical functions into the data pipeline. My points are:
Cross-posted from... I'm curious what the thinking is regarding dynamic dependency and non-deterministic I can think of a few solutions, like a two-pass approach, a separate codepath in the Python model itself, pausing/restarting execution if a requested input isn't ready, etc. Is this already handled and/or is perhaps already in scope for current/future iterations?
Not specific to Python, but I think it makes it even more likely that people will want to connect to external sources of data, so secret variables in models (hidden from the logs) become an even more wanted feature for securely providing connection details. I know that source nodes that actually import data are not the originally intended use case, but in my opinion they make a ton of sense to support, especially with Python support added.
Apologies if this has been asked elsewhere; is there a timeline for dbt-py to be enabled for Redshift?
Hello, I couldn't see an issue noted anywhere. I can see above that there is an open issue for Python models in dbt-redshift; is there a similar issue open for Postgres?
Hi, is there any news about adding a code reuse option to dbt Python models? My team wants to migrate some Python functions (currently run on Kubernetes) to dbt Python models and we would really appreciate this feature.
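For context, here is a minimal sketch of what is possible without cross-model reuse: a helper function defined in the same file as the model that calls it. Everything specific here is an assumption for illustration: the `summarize` helper, the upstream model name `measurements`, the column name `VALUE`, and a Snowpark-style `to_pandas()` on the object returned by `dbt.ref()` (PySpark uses `toPandas()` instead).

```python
import statistics


def summarize(values):
    """Hypothetical stand-in for a function migrated from an external service."""
    values = list(values)
    return {
        "n": len(values),
        "mean": statistics.fmean(values),
        # stdev needs at least two points; fall back to 0.0 otherwise
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
    }


def model(dbt, session):
    # Imported lazily; assumed to be available on the data platform.
    import pandas as pd

    dbt.config(materialized="table")
    upstream = dbt.ref("measurements")  # hypothetical upstream model
    # Assumption: Snowpark-style conversion; PySpark would use toPandas().
    pdf = upstream.to_pandas()
    # A Python model must return a single DataFrame.
    return pd.DataFrame([summarize(pdf["VALUE"])])
```

The helper only lives in this one file, which is exactly the limitation being asked about: sharing it between models would need something like a published package, or the feature requested here.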
Hi, I'm looking to run more complex SQL statements in my Python model referring to other dbt SQL models. I want to reference these models directly, similar to how it's done with Jinja, without creating DataFrames from them. Is there an existing way to do this? If not, I think this would be a valuable feature to consider for the next release.
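One possible workaround, sketched under the assumption of a PySpark-based adapter where `session` is a `SparkSession`: register each `dbt.ref()` result as a temporary view and run plain SQL against the views. This still goes through DataFrame handles, but they are lazy and only serve to name the inputs. The model names `orders` and `customers`, their columns, and the query are made up for illustration.

```python
# Hypothetical query over two upstream models exposed as temp views.
QUERY = """
select c.customer_id, count(o.order_id) as order_count
from customers as c
left join orders as o on o.customer_id = c.customer_id
group by c.customer_id
"""


def model(dbt, session):
    dbt.config(materialized="table")
    # dbt.ref() keeps the dependency in the DAG; the temp view makes the
    # relation addressable by name from SQL (PySpark DataFrame API).
    dbt.ref("orders").createOrReplaceTempView("orders")
    dbt.ref("customers").createOrReplaceTempView("customers")
    # SparkSession.sql returns a DataFrame, which becomes the model's result.
    return session.sql(QUERY)
```

On other platforms (for example Snowpark) the view registration call differs, so this is not a general answer to the Jinja-style reference being requested.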
At dbt Labs, we're very excited Python models are out in the 1.3 beta and on track for launch at Coalesce. The purpose of this discussion is to talk about the next steps beyond 1.3.
Before jumping into this discussion, first read:
Overview
Python models in dbt today are a first step. As noted in the original discussion, they open Pandora's box. One obvious domain we'll continue to investigate and increase support for is data science and machine learning lifecycle management -- dbt is well-positioned for this with Python models as a first step.
There are a number of open product decisions to work through. We hope that by opening these discussions here for your input, we can reach conclusions on those decisions together with the community.
Planned discovery work
We've opened two relevant, focused discussions so far:
In addition to those topics, we plan to investigate and likely open further discussions for:
If any of these are particularly interesting to you, or there's something we're missing, let us know! We plan to investigate these over the coming months and continue to provide updates on our roadmap.