@zechigan thanks for the question. Yes, there are a few ways to do this, and we have two open issues related to it, #922 and #701 (plus #1045, which aims to make the options clearer). Let me quickly recap a few ways you could approach this:
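```python
import pandas as pd
from hamilton.function_modifiers import pipe, step


def raw_regression(...) -> pd.DataFrame:
    # code to load
    return df


# Each step is applied, in order, to the first argument of `regression`.
@pipe(
    step(_winsorsized_observations),
    step(_normalized_winsorsized_observations),
)
def regression(raw_regression: pd.DataFrame) -> pd.DataFrame:
    return raw_regression
```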
Without knowing more about your context, it's hard to say what would work best for you. There are also more decorators that could help (e.g. …). My suggestion is to watch the YouTube video I did and see if that helps you. If not, let's chat / add more here to determine how we can make it better :)
-
Hello!
Suppose I have a DAG that does the following:
|observations| -> winsorsized_observations -> normalized_winsorsized_observations -> ...
Given an external input `observations`, the node `winsorsized_observations` winsorizes it, followed by `normalized_winsorsized_observations`, which performs normalization on top of that.

Something I thought could be good to have is the ability to rename an intermediate node back to the input name, i.e., I rename `normalized_winsorsized_observations` back to `observations`, and any child node now knows that `observations` no longer refers to the input `observations`.

Obviously, this could be achieved by having two DAGs executed separately. But I would like to know whether being able to somehow splice the two DAGs is a good idea and, if it isn't, which Hamilton principles it goes against.
Thank you!
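For reference, the "two DAGs executed separately" alternative mentioned above can be sketched in plain Python. This is only an illustration and not Hamilton code: `run_preprocessing`, `run_downstream`, `total`, the clipping bounds, and the max-scaling used as "normalization" are all hypothetical.

```python
from typing import Dict, List


def winsorsized_observations(observations: List[float]) -> List[float]:
    # Hypothetical bounds: clip into [0, 10].
    return [min(max(v, 0.0), 10.0) for v in observations]


def normalized_winsorsized_observations(winsorsized_observations: List[float]) -> List[float]:
    # Hypothetical normalization: divide by the maximum (assumes it is > 0).
    top = max(winsorsized_observations)
    return [v / top for v in winsorsized_observations]


def run_preprocessing(observations: List[float]) -> List[float]:
    """First stage: raw input -> winsorized -> normalized."""
    return normalized_winsorsized_observations(winsorsized_observations(observations))


def total(observations: List[float]) -> float:
    """A downstream node that simply depends on `observations`."""
    return sum(observations)


def run_downstream(observations: List[float]) -> Dict[str, float]:
    """Second stage: here `observations` means the cleaned values."""
    return {"total": total(observations)}


raw = [-5.0, 2.0, 4.0, 50.0]
cleaned = run_preprocessing(raw)
outputs = run_downstream(observations=cleaned)
```

The second stage never sees the raw input, so the name `observations` is unambiguous within each stage. The cost is that the two stages have to be wired together by hand.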