Deterministics not resampled in posterior predictive #5302

Closed
ericmjl opened this issue Jan 2, 2022 · 10 comments · Fixed by #5759

@ericmjl (Member) commented Jan 2, 2022

Description of your problem

I noticed that when I use pm.set_data(), if my model has a pm.Deterministic transformation that gets passed into the likelihood, I get shape issues.

Without the use of pm.Deterministic in my model, the shapes of posterior predictive samples, when I pass in new data of length 1, are of shape (chain, draw, 1).

With the use of pm.Deterministic in my model, the shapes of posterior predictive samples, when I pass in new data, are of shape (chain, draw, n), where n == "length of data that I used to fit the model".

I believe this is a bug with pm.Deterministics interacting with pm.set_data, but I'm not quite sure how to pinpoint the problem.

Please provide a minimal, self-contained, and reproducible example.

I found it much easier to provide a reprex as a Jupyter notebook available here: https://gist.github.com/ericmjl/e82b40805805d9feeaff9f39182ff32c

Please provide the full traceback.

No traceback necessary.

Please provide any additional information below.

Versions and main components

  • PyMC/PyMC3 Version: 4.0.0b1
  • Aesara/Theano Version: 2.3.2
  • Python Version: 3.9.9
  • Operating system: Ubuntu Linux
  • How did you install PyMC/PyMC3: conda

@ricardoV94 (Member) commented Jan 2, 2022

Does this also happen without InferenceData?

@ericmjl (Member, Author) commented Jan 2, 2022

Yes, the same thing happens with MultiTrace objects. I updated the gist notebook: https://gist.github.com/ericmjl/e82b40805805d9feeaff9f39182ff32c

ricardoV94 added the bug label on Jan 2, 2022

@ricardoV94 (Member) commented Jan 3, 2022

Here is a minimal example:

import numpy as np
import pymc as pm

with pm.Model() as m:
    x = pm.Data('x', np.ones(5))
    y = pm.Data('y', np.ones(5))
    b = pm.Normal('b')
    # mu = b*x  # This works fine
    mu = pm.Deterministic('mu', b*x)
    like = pm.Normal('like', mu, observed=y)
    trace = pm.sample()

    pm.set_data({'x': np.ones(3)})
    ppc = pm.sample_posterior_predictive(trace)
    assert ppc.posterior_predictive['like'].shape == (4, 1000, 3)  # Fails

@ricardoV94 (Member) commented Jan 3, 2022

I think this has to do with the fact that posterior predictive sampling gives preference to variables that are already in the posterior trace, regardless of whether they are random or deterministic.

This also fails (now for a good reason, as we wouldn't know what the posterior of the new mu entries would look like):

import numpy as np
import pymc as pm

with pm.Model() as m:
    x = pm.Data('x', np.ones(5))
    y = pm.Data('y', np.ones(5))    
    mu = pm.Normal('mu', x)
    like = pm.Normal('like', mu, observed=y)
    trace = pm.sample()

    pm.set_data({'x': np.ones(3)})
    ppc = pm.sample_posterior_predictive(trace)
    assert ppc.posterior_predictive['like'].shape == (4, 1000, 3)  # Fails

@herve-91 (Contributor) commented

Hello

When trying to reproduce the minimal example above, I noticed that it is more than a shape problem: mu is not re-computed when sample_posterior_predictive is called.

Indeed, if we consider the following variant of it:

import pymc as pm

with pm.Model() as m:
    x = pm.Data('x', [1, 2, 3])
    y = pm.Data('y', [1, 2, 3])
    b = pm.Normal('b')
    
    mu = pm.Deterministic('mu', b*x) 
   
    like = pm.Normal('like', mu, observed=y)
    trace = pm.sample()

    pm.set_data({'x': [4, 5, 6]})
    ppc = pm.sample_posterior_predictive(trace)
       
    assert ppc.posterior_predictive['like'].shape == (2, 1000, 3)  
    
    print('predicted:', ppc.posterior_predictive['like'].mean(axis=(0, 1)))

we don't have the shape issue anymore (and the assertion passes), but the predicted value is around [1, 2, 3] instead of the expected [4, 5, 6].

This suggests a workaround, which is to force the computation of mu by adding 'mu' to the var_names parameter of sample_posterior_predictive (i.e. var_names=['mu', 'like']). I checked and it works.
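
For reference, this is what the workaround looks like on the variant above (a sketch based on the steps already shown; var_names is the existing parameter of sample_posterior_predictive):

with m:
    pm.set_data({'x': [4, 5, 6]})
    # Listing 'mu' in var_names forces it to be recomputed from the new
    # value of x instead of being read back from the posterior trace.
    ppc = pm.sample_posterior_predictive(trace, var_names=['mu', 'like'])
    print('predicted:', ppc.posterior_predictive['like'].mean(axis=(0, 1)))  # now ~[4, 5, 6]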

I think the reason why mu is not computed in sample_posterior_predictive is related to the following piece of code in this method, which adds mu to the list of inputs of the aesara function used to make the predictions (as long as it is not explicitly specified as a variable to sample), because it is neither a constant nor a shared variable:

    if not hasattr(_trace, "varnames"):
        inputs_and_names = [
            (rv, rv.name)
            for rv in walk_model(vars_to_sample, walk_past_rvs=True)
            if rv not in vars_to_sample
            and rv in model.named_vars.values()
            and not isinstance(rv, (Constant, SharedVariable))
        ]
        if inputs_and_names:
            inputs, input_names = zip(*inputs_and_names)
        else:
            inputs, input_names = [], []
    else:
        output_names = [v.name for v in vars_to_sample if v.name is not None]
        input_names = [
            n
            for n in _trace.varnames
            if n not in output_names and not isinstance(model[n], (Constant, SharedVariable))
        ]
        inputs = [model[n] for n in input_names]

I think it is because of this that the computations rely on the old values of mu rather than on new ones computed from the new values of x.

I am no expert in pymc or aesara, so I am not 100% sure of myself, but an argument in favor of this is that if we run the minimal example with a modified version of sample_posterior_predictive that calls the method compile_pymc without the parameter on_unused_input="ignore", we get an exception complaining that b is somewhat hidden by mu:

aesara.compile.function.types.UnusedInputError: aesara.function was asked to create a function computing outputs given certain inputs, but the provided input variable at index 1 is not part of the computational graph needed to compute the outputs: b.
To make this error into a warning, you can pass the parameter on_unused_input='warn' to aesara.function. To disable it completely, use on_unused_input='ignore'

Not very conclusive but I hope it helps...

Regards,

ricardoV94 changed the title from "pm.Deterministic causes shape issues with pm.set_data?" to "Deterministics not resampled in posterior predictive" on Mar 28, 2022

@ricardoV94 (Member) commented Mar 28, 2022

People seem to be finding this issue repeatedly. I think we could do the following:

  1. Automatically resample deterministics that are ancestors of likelihood terms (but not ancestors of other unobserved variables); see the sketch below. We can check if any shared variables are involved to avoid wasted resampling when they couldn't have changed.

  2. Check whether deterministics that are not direct ancestors of likelihood terms have changed shape (it would be useful to know if they changed value, but we don't have access to their old values) and warn that they won't be resampled in posterior predictive, because we don't know what the posterior traces of the intermediate unobserved variables would look like.

I don't think point 2 is as critical as 1, but I am afraid people will eventually try it and not realize it doesn't (and shouldn't) work.
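
A minimal sketch of how the check in point 1 could look (an assumption about one possible implementation, not the actual fix that landed in #5759, and it ignores the "not ancestors of other unobserved variables" refinement):

import pymc as pm
from aesara.compile.sharedvalue import SharedVariable
from aesara.graph.basic import ancestors

def deterministics_to_resample(model: pm.Model):
    # Flag a Deterministic for resampling only if it feeds a likelihood
    # term and depends on a shared variable (e.g. MutableData) whose
    # value may have changed since sampling.
    likelihood_ancestors = set(ancestors(model.observed_RVs))
    to_resample = []
    for det in model.deterministics:
        feeds_likelihood = det in likelihood_ancestors
        depends_on_shared = any(
            isinstance(var, SharedVariable) for var in ancestors([det])
        )
        if feeds_likelihood and depends_on_shared:
            to_resample.append(det)
    return to_resample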

@OriolAbril (Member) commented Mar 28, 2022

We can check if any shared variables are involved to avoid wasted resampling when they couldn't have changed.

I think this would be necessary (not sure it is possible, though). Deterministics could be expensive, and if this were done there wouldn't be a way to prevent resampling of deterministics that are expensive and don't change. Even if they sit between mutable data and an observed variable, we might want to sample straight away to perform posterior predictive checks (without resampling the deterministic) and only after that update the data to generate some predictions.
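
Concretely, the two-phase workflow described above might look like this on the minimal example from earlier in the thread (a sketch that reuses the var_names workaround):

with m:
    # Phase 1: posterior predictive check on the training data;
    # mu does not need to be recomputed here.
    ppc_check = pm.sample_posterior_predictive(trace)

    # Phase 2: swap in new data and generate predictions, forcing
    # mu to be recomputed from the new value of x.
    pm.set_data({'x': np.ones(3)})
    predictions = pm.sample_posterior_predictive(trace, var_names=['mu', 'like'])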


Not directly related to the issue itself, but I think the overuse of Deterministic in the documentation is partly to blame for the confusion on this topic. I remember that when I first started learning pymc it took me a couple of weeks to realize that it was possible to store intermediate values as "regular"/non-model variables, because all the examples I had seen stored them in a Deterministic even when they were only an intermediate result moved to its own line for convenience and had no use anywhere in the notebook.

@ricardoV94 (Member) commented Mar 29, 2022

I think this would be necessary (not sure it is possible, though). Deterministics could be expensive, and if this were done there wouldn't be a way to prevent resampling of deterministics that are expensive and don't change. Even if they sit between mutable data and an observed variable, we might want to sample straight away to perform posterior predictive checks (without resampling the deterministic) and only after that update the data to generate some predictions.

I would err on the side of assuming most people want such Deterministics to be resampled. It should be quite feasible to walk the graph to check that MutableData dependency.

Such variables would be automatically resampled if they were not part of a Deterministic, and I doubt many people are using Deterministics as a means to prevent resampling of costly variables. At least that idea had never occurred to me.

One can always make a more specialized posterior_predictive function, and hopefully our API is not terribly difficult if that's required.

I agree with your other point regarding abuse of Deterministics, but that's a separate question.

@zaxtax (Contributor) commented Mar 25, 2024

I was bitten by this today. We should also add that if a Deterministic's dims have changed, it should be resampled.

@ricardoV94 (Member) commented

I was bitten by this today. We should also add that if a Deterministic's dims have changed, it should be resampled.

My guess is that you are seeing a side effect of #6876.

I imagine you are building a distinct model with constant Data / coords for posterior predictive?
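
For context, the "distinct model" pattern referred to here, rebuilding a second model with the prediction-time data and reusing the fitted trace, might look roughly like this (a sketch assuming the variable names match the original model; predictions=True stores the result in a separate predictions group):

import numpy as np
import pymc as pm

x_new = np.ones(3)
with pm.Model() as pred_m:
    x = pm.Data('x', x_new)
    b = pm.Normal('b')  # same name as in the fitted model, so its posterior is reused
    mu = pm.Deterministic('mu', b * x)
    # Not observed here: 'like' is sampled forward from the posterior of b.
    like = pm.Normal('like', mu)
    ppc = pm.sample_posterior_predictive(
        trace, var_names=['mu', 'like'], predictions=True
    )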
