-
-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make sample_posterior_predictive
API less surprising
#7069
Comments
sample_posterior_predictive
API more intuitivesample_posterior_predictive
API less surprising
I don't think I also think there are coding conventions we could adopt that would make the functionality less surprising. For example, in the blog post, there is this model
And this forecast with pm.Model() as forecast_m:
mu = pm.Normal("mu")
# Flat sigma for illustration purposes
sigma = pm.Flat("sigma")
# init_dist now starts on last observed value of y
pm.GaussianRandomWalk(
"y_forecast",
init_dist=pm.DiracDelta.dist(y_obs[-1]),
mu=mu,
sigma=sigma,
steps=99,
)
pp = pm.sample_posterior_predictive(
idata,
var_names=["y_forecast"],
predictions=True,
random_seed=rng,
) I think the forecast should just drop with pm.Model() as forecast_m:
mu = pm.Deterministic("mu", mu)
sigma = pm.Deterministic("sigma", sigma)
# init_dist now starts on last observed value of y
pm.GaussianRandomWalk(
"y_forecast",
init_dist=pm.DiracDelta.dist(y_obs[-1]),
mu=mu,
sigma=sigma,
steps=99,
)
pp = pm.sample_posterior_predictive(
idata,
var_names=["y_forecast"],
predictions=True,
random_seed=rng,
) I know this convention is in conflict with the way I have been using |
I don't like that. Sharing variables across models is error-prone and would force users to create a model even in settings where they are just using a stored Regarding documentation there is an open PR just for that: #7014 |
It doesn't need to be passed in, and I'm equally happy with an explicit input variable. Especially if the error messages when forgetting to include a trace are helpful. But I don't like this business of saying we ignore the distribution so don't worry. |
But ignoring model variables is exactly what happens in normal uses of |
I think articulating what model variables are ignored is also something that can be better documented. Since my mental model right now is, all variables that were not observations present in the trace object. |
I'll first address the stuff that @zaxtax said. I agree with @ricardoV94. The idea of having a with pm.model.fgraph.clone_model(model) as forecast_model:
pm.GaussianRandomWalk(
"y_forecast",
init_dist=pm.DiracDelta.dist(y_obs[-1]),
mu=forecast_model["mu"],
sigma=forecast_model["sigma"],
steps=99,
)
pp = pm.sample_posterior_predictive(idata, var_names="y_forecast") The alternative that @ricardoV94 is suggesting is to use a completely new model instance but define with model.clone() as forecast_model:
mu = pt.scalar("mu")
sigma = pt.scalar("sigma")
pm.GaussianRandomWalk(
"y_forecast",
init_dist=pm.DiracDelta.dist(y_obs[-1]),
mu=mu,
sigma=sigma,
steps=99,
)
pp = pm.sample_posterior_predictive(idata, var_names="y_forecast") This second approach basically says that The original discussionLet me start off by saying that I don't want to change the meaning of the It's true that The only thing that I think that we might want to invest some time in, is to try to design some lower level functions (in a similar spirit to The main reason I can think of for wanting to have this is to be able to run distributed predictions from a same compiled model. |
Another thing I am inclining too, regardless of which kwarg we use, is to require users to explicitly enumerate the variables that exist in the trace but must be resampled due to volatility. I would guess 99.9% of the users don't really want to resample variables that are in the trace, due to the volatility logic, and when this happens it's usually a bug / conceptual mistake. For instance this user recently wrote a model where a variable shape depended on MutableData, while trying to figure out how to resize things for We could raise when a variable in a trace is found to be volatile but was not manually added to It makes The low-level function we use doesn't need to be opinionated/foolproof, just the user facing |
I like separating |
I have updated the top comment with the latest proposal |
Description
The current behavior of
sample_posterior_predictive
regarding "volatility" andvar_names
is very opaque and error-prone.If you include a model variable in
var_names
it will always be sampled, even if it was an unobserved variable with values in the trace. Furthermore, any variables that depend on variables being sampled ("volatile") will also be sampled even if they are not invar_names
, and become "volatile" themselves. With this, it's easy for users to end up resampling the whole model from the prior by accident when they only wanted to do predictions and copy some variables from the posterior to the predictions group.Most of the times I've seen users including unobserved variables in
var_names
, they seemed to only want to copy those over from the posterior to the posterior_predictive, to be together with the newly sampled observed variables. The volatile logic was a bug from their perspective.Furthermore if an unobserved variable depends on MutableData/Coords that have changed that will also be resampled, and the whole cascade of volatility propagation happens the same.
If this doesn't make sense, I tried to explain the current behavior in #7014
Why did we ever think this behavior was a good idea?
Both behaviors are sensible in that they provide internally consistent draws, and avoid reusing posteriors in case they are likely to make little theoretical sense (if you have an intercept for the cities ["Paris", "Lisbon"], and now changed the coords to be ["London", "Madrid"], there is no reason to think the old posteriors should apply to the new cities), or computational sense (the length of the intercepts may have changed, and reusing the old ones would likely cause shape errors or silent bugs).
Aside: Maybe in that example users set the new coords to be ["Paris, "Lisbon", "London"], or they changed the shape from 2 to 3, expecting the new dim would be sampled, but the original two reused. That would be nice but hard to do, and on the verge of mind reading if only the shape changed. Such behavior should not be the competence of
sample_posterior_predictive
, but helper model transformation function.sample_posterior_predictive
should only have to worry whether the variables with the same name as those in the posterior trace can be used as is, or must be resampled.The current flexibility is not only a burden! It allows to do many interesting things like https://www.pymc-labs.io/blog-posts/out-of-model-predictions-with-pymc/
Anyway, the original motivation for these changes was because of Deterministics, which should sometimes be resampled (because they depended on MutableData that has changed) and other times reused to avoid useless re-computations in
sample_posterior_predictive
.pm.set_data
does not affect posterior predictive shape whenpm.Deterministic
is used #5512We initially went overboard and resampled everything that depended on MutableData, regardless of whether that had changed
Which was fixed in #6147
Because the behavior still seemed too implicit and hard to debug, we tried to make logging explicit, although some environments like vscode/colab like to suppress those logs #5973
Latest proposal
IMO the issue with the current behavior, is that it's too silent/implicit and only rarely what users want. Instead of making it easy for power-users to do what they want, I would err on the side of caution and alerting users for those cases where reusing posterior draws may not be reasonable. My proposal:
Distinguish between
var_names
andsample_vars
. The first determines which variables are included in the trace, the second which variables are being sampled. By default, observed variables are included in both fields, and the function behaves as the current default. I think most users think ofvar_names
like this already from use in arviz functions.Instead of defaulting to sampling new variables that depend on trace constant_data / coords / dims that have changed, but which are not in
sample_vars
try to reuse those from the trace but issue a warning that the posterior draws may no longer be valid since the model structure has changed. If they are insample_vars
, there is no warning and it behaves as now. We could make it a hard error... but someone will certainly complain that's actually want they wanted to do, and how can they do it then.The warning should happen regardless of whether those constant_data / coords / dims are defined in the model as Mutable/ConstantData or coords/coords_mutable (brought up in
sample_posterior_predictive
does not consider changes in "ConstantData" and "Constant coords" #6876)Deterministics, which is the reason why much of the behavior was written as is (we want to update Deterministics if model data / coords have changed, but reuse from the trace otherwise), still follow the current implicit volatile logic. They cannot be included in
sample_vars
, but can be invar_names
in which case they are recomputed or copied, depending on whether there were any changes in the model structure leading to them (or if they depend on a variable that is being sampled).More extreme proposal (rejected in the comments)
Only allow sampling of observed variables insample_posterior_predictive
and create a new functionpredict
(or whatever, just not so verbose) that has all the flexibilitysample_posterior_predictive
currently has. Could also be used for prior when noidata
is passed.This could also allow the values ofObservedRV
s to be used in place of the variables when they are not being sampled, which is often useful in forecasting models.Recent examples where
var_names
did not do what the user expected:The text was updated successfully, but these errors were encountered: