
BUG: Input dimension mismatch due to sampler proposing invalid values for discrete variable #7066

AlphaScorpii-dev opened this issue Dec 14, 2023 · 2 comments


@AlphaScorpii-dev

Describe the issue:

I wanted to test the detection of several change points in a time series, so I came up with the model below (cf. code example) to detect 3 switch points.
As long as I don't add any observed data, the model runs fine and generates all the variables at sampling time. The idx tensor variable seems to have the proper shape (1 dimension of size 1000).
However, adding the observations raises an error (cf. error message). The reported dimension mismatch also varies between runs: it can say input[1].shape[0] = 998 or input[1].shape[0] = 1004, seemingly at random.

The problem seems to lie in the idx index, but generating it in another way works. If I instead write, for instance,

# Build idx with nested switches over the cumulative counts instead of tile/concatenate
xxx = None
for i in range(K - 2, -1, -1):
    if xxx is None:
        xxx = pymc.math.switch(numpy.arange(N) < pymc.math.cumsum(p)[i], i, i + 1)
    else:
        xxx = pymc.math.switch(numpy.arange(N) < pymc.math.cumsum(p)[i], i, xxx)
idx = pymc.Deterministic("idx", xxx)

everything ends up being fine.

This index is basically used to map the proper mean value onto each interval.
Based on the output of the DirichletMultinomial, which is a tuple of K numbers ($a_0$, $a_1$, ..., $a_{K-1}$) such that $\sum_{i=0}^{K-1} a_i = N$,
I create an array that looks like [0,0,0,1,1,1,1,1,...,K-1,K-1,K-1]: 0 is repeated $a_0$ times, 1 is repeated $a_1$ times, and so on.
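
For illustration, here is the same expansion in plain numpy with a hypothetical small set of counts (the array a below is made up and only meant to show the mapping, it is not part of the model):

import numpy

# Hypothetical counts (a_0, a_1, a_2) summing to N = 10, for illustration only
a = numpy.array([3, 5, 2])
# Repeat each segment label i exactly a_i times
idx = numpy.repeat(numpy.arange(len(a)), a)
print(idx)  # [0 0 0 1 1 1 1 1 2 2]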

Reproducible code example:

import arviz, pymc, numpy, pytensor

data = numpy.concatenate([numpy.random.randn(300)/10 - 0.1,
                          numpy.random.randn(450)/10 + 0.2,
                          numpy.random.randn(250)/10 + 0.6])

with pymc.Model() as model:
    K, N = 3, 1000
    r = pymc.Normal("r", mu=0, sigma=1, shape=K)
    w = pymc.Dirichlet("w", numpy.ones(K))
    p = pymc.DirichletMultinomial("p", N, w)
    idx = pymc.Deterministic("idx", pymc.math.concatenate(
        [pytensor.tensor.tile(i, p[i]) for i in range(K)]))
    obs = pymc.Normal("obs", mu=r[idx], sigma=1e-3, observed=data)  # disabling this works
    trace = pymc.sample(1000)

Error message:

RemoteTraceback                           Traceback (most recent call last)
RemoteTraceback:
"""
Traceback (most recent call last):
  File "d:\Envs\jupyter\Lib\site-packages\pytensor\compile\function\types.py", line 970, in __call__
    self.vm()
ValueError: Input dimension mismatch: (input[0].shape[0] = 1000, input[1].shape[0] = 998)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "d:\Envs\jupyter\Lib\site-packages\pymc\sampling\parallel.py", line 122, in run
    self._start_loop()
  File "d:\Envs\jupyter\Lib\site-packages\pymc\sampling\parallel.py", line 174, in _start_loop
    point, stats = self._step_method.step(self._point)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "d:\Envs\jupyter\Lib\site-packages\pymc\step_methods\compound.py", line 231, in step
    point, sts = method.step(point)
                 ^^^^^^^^^^^^^^^^^^
  File "d:\Envs\jupyter\Lib\site-packages\pymc\step_methods\arraystep.py", line 100, in step
    apoint, stats = self.astep(q)
                    ^^^^^^^^^^^^^
  File "d:\Envs\jupyter\Lib\site-packages\pymc\step_methods\metropolis.py", line 274, in astep
    accept_rate = self.delta_logp(q, q0d)
                  ^^^^^^^^^^^^^^^^^^^^^^^
  File "d:\Envs\jupyter\Lib\site-packages\pytensor\compile\function\types.py", line 983, in __call__
    raise_with_op(
  File "d:\Envs\jupyter\Lib\site-packages\pytensor\link\utils.py", line 531, in raise_with_op
    raise exc_value.with_traceback(exc_trace)
  File "d:\Envs\jupyter\Lib\site-packages\pytensor\compile\function\types.py", line 970, in __call__
    self.vm()
ValueError: Input dimension mismatch: (input[0].shape[0] = 1000, input[1].shape[0] = 998)
Apply node that caused the error: Composite{(5.988816745777465 + (-0.5 * sqr((1000.0 * (i0 - i1)))))}(obs{[-2.072045 ... 05966e-01]}, AdvancedSubtensor1.0)
Toposort index: 56
Inputs types: [TensorType(float64, shape=(1000,)), TensorType(float64, shape=(None,))]
Inputs shapes: [(1000,), (998,)]
Inputs strides: [(8,), (8,)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[Sum{axes=None}(sigma > 0)]]

HINT: Re-running with most PyTensor optimizations disabled could provide a back-trace showing when this node was created. This can be done by setting the PyTensor flag 'optimizer=fast_compile'. If that does not work, PyTensor optimizations can be disabled with 'optimizer=None'.
HINT: Use the PyTensor flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.
"""

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
ValueError: Input dimension mismatch: (input[0].shape[0] = 1000, input[1].shape[0] = 998)
Apply node that caused the error: Composite{(5.988816745777465 + (-0.5 * sqr((1000.0 * (i0 - i1)))))}(obs{[-2.072045 ... 05966e-01]}, AdvancedSubtensor1.0)
Toposort index: 56
Inputs types: [TensorType(float64, shape=(1000,)), TensorType(float64, shape=(None,))]
Inputs shapes: [(1000,), (998,)]
Inputs strides: [(8,), (8,)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[Sum{axes=None}(sigma > 0)]]

HINT: Re-running with most PyTensor optimizations disabled could provide a back-trace showing when this node was created. This can be done by setting the PyTensor flag 'optimizer=fast_compile'. If that does not work, PyTensor optimizations can be disabled with 'optimizer=None'.
HINT: Use the PyTensor flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.

The above exception was the direct cause of the following exception:

ParallelSamplingError                     Traceback (most recent call last)
Cell In[140], line 9
      6 idx = pymc.Deterministic("idx", pymc.math.concatenate(
      7     [pytensor.tensor.tile(i,p[i]) for i in range(K)]))
      8 obs = pymc.Normal("obs", mu=r[idx], sigma=1e-3, observed=data)
----> 9 trace = pymc.sample(100000)

File d:\Envs\jupyter\Lib\site-packages\pymc\sampling\mcmc.py:802, in sample(draws, tune, chains, cores, random_seed, progressbar, step, nuts_sampler, initvals, init, jitter_max_retries, n_init, trace, discard_tuned_samples, compute_convergence_checks, keep_warning_stat, return_inferencedata, idata_kwargs, nuts_sampler_kwargs, callback, mp_ctx, model, **kwargs)
    800 _print_step_hierarchy(step)
    801 try:
--> 802     _mp_sample(**sample_args, **parallel_args)
    803 except pickle.PickleError:
    804     _log.warning("Could not pickle model, sampling singlethreaded.")

File d:\Envs\jupyter\Lib\site-packages\pymc\sampling\mcmc.py:1191, in _mp_sample(draws, tune, step, chains, cores, random_seed, start, progressbar, traces, model, callback, mp_ctx, **kwargs)
   1189 try:
   1190     with sampler:
-> 1191         for draw in sampler:
   1192             strace = traces[draw.chain]
   1193             strace.record(draw.point, draw.stats)

File d:\Envs\jupyter\Lib\site-packages\pymc\sampling\parallel.py:448, in ParallelSampler.__iter__(self)
    445     self._progress.update(self._total_draws)
    447 while self._active:
--> 448     draw = ProcessAdapter.recv_draw(self._active)
    449     proc, is_last, draw, tuning, stats = draw
    450     self._total_draws += 1

File d:\Envs\jupyter\Lib\site-packages\pymc\sampling\parallel.py:330, in ProcessAdapter.recv_draw(processes, timeout)
    328     else:
    329         error = RuntimeError(f"Chain {proc.chain} failed.")
--> 330     raise error from old_error
    331 elif msg[0] == "writing_done":
    332     proc._readable = True

ParallelSamplingError: Chain 0 failed with: Input dimension mismatch: (input[0].shape[0] = 1000, input[1].shape[0] = 998)
Apply node that caused the error: Composite{(5.988816745777465 + (-0.5 * sqr((1000.0 * (i0 - i1)))))}(obs{[-2.072045 ... 05966e-01]}, AdvancedSubtensor1.0)
Toposort index: 56
Inputs types: [TensorType(float64, shape=(1000,)), TensorType(float64, shape=(None,))]
Inputs shapes: [(1000,), (998,)]
Inputs strides: [(8,), (8,)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[Sum{axes=None}(sigma > 0)]]

HINT: Re-running with most PyTensor optimizations disabled could provide a back-trace showing when this node was created. This can be done by setting the PyTensor flag 'optimizer=fast_compile'. If that does not work, PyTensor optimizations can be disabled with 'optimizer=None'.
HINT: Use the PyTensor flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.

PyMC version information:

I am running a conda environment, set up from miniforge.

Name           Version  Build               Channel
arviz          0.16.1   pyhd8ed1ab_1        conda-forge
pymc           5.10.2   hd8ed1ab_0          conda-forge
pymc-base      5.10.2   pyhd8ed1ab_0        conda-forge
pytensor       2.18.3   py311hfb14a36_0     conda-forge
pytensor-base  2.18.3   py311hf63dbb6_0     conda-forge
python         3.11.6   h2628c8c_0_cpython  conda-forge

Context for the issue:

No response

@ricardoV94
Member

ricardoV94 commented Dec 15, 2023

The problem is that you're sampling a discrete variable with a generic MCMC sampler, and the sampler will sometimes propose values for p that sum to more or less than N. Those proposals would get rejected because they have zero probability, but the logp graph errors out before that happens. This is a known problem with using discrete variables structurally, such as in indexing operations.

One proposed solution was to have discrete transforms: #6102 (comment)
I guess we could also have the Metropolis sampler catch errors and treat them as 0 probability, but that could mask bugs in the user code.

For your case you can post-process p so that invalid values never make it into the indexing. It shouldn't bias your results, because those draws would never be accepted in the first place. Something like:

import arviz, pymc, numpy, pytensor

data = numpy.concatenate([numpy.random.randn(300)/10 - 0.1,
                          numpy.random.randn(450)/10 + 0.2,
                          numpy.random.randn(250)/10 + 0.6])

with pymc.Model() as model:
    K, N = 3, 1000
    r = pymc.Normal("r", mu=0, sigma=1, shape=K)
    w = pymc.Dirichlet("w", numpy.ones(K))
    p = pymc.DirichletMultinomial("p", N, w)

    # Avoid index errors: if p does not sum to N, replace it by [N, 0, 0, ...]
    p = pymc.math.switch(pymc.math.eq(p.sum(), N), p, [N] + [0]*(K-1))

    idx = pymc.Deterministic("idx", pymc.math.concatenate(
        [pytensor.tensor.tile(i, p[i]) for i in range(K)]))
    obs = pymc.Normal("obs", mu=r[idx], sigma=1e-3, observed=data)
    trace = pymc.sample(1000)

The model will probably sample horribly, because the DirichletMultinomial is very constrained and the Metropolis sampler doesn't know about these constraints. It may be better to transform the Dirichlet deterministically into a valid p.
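
A minimal, untested sketch of that idea (the names cuts and positions are mine): derive the segment boundaries deterministically from w, so every proposal partitions the N positions and no invalid counts can ever reach the indexing.

import numpy, pymc, pytensor.tensor as pt

data = numpy.concatenate([numpy.random.randn(300)/10 - 0.1,
                          numpy.random.randn(450)/10 + 0.2,
                          numpy.random.randn(250)/10 + 0.6])

with pymc.Model() as model:
    K, N = 3, 1000
    r = pymc.Normal("r", mu=0, sigma=1, shape=K)
    w = pymc.Dirichlet("w", numpy.ones(K))

    # Continuous breakpoints derived from the Dirichlet weights; they always
    # partition the N positions, so no invalid configuration can be proposed.
    cuts = pt.cumsum(w)[:-1] * N
    positions = numpy.arange(N)

    # idx[n] = number of breakpoints at or below position n, i.e. the segment label of n
    idx = pymc.Deterministic("idx", pt.sum(pt.ge(positions[:, None], cuts[None, :]), axis=1))

    obs = pymc.Normal("obs", mu=r[idx], sigma=1e-3, observed=data)
    trace = pymc.sample(1000)

Note that the indicator comparison makes the likelihood piecewise constant in w around the breakpoints, so samplers may still struggle; this is only a sketch of the deterministic-transform idea, not a drop-in fix.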

@ricardoV94
Member

ricardoV94 commented Dec 15, 2023

Can't your model be recast as a normal-mixture model? That would avoid the discrete variable altogether.
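
A minimal sketch of that recast using pymc.NormalMixture, assuming the contiguity of the segments in time does not need to be modelled (a plain mixture ignores the ordering of the observations):

import numpy, pymc

data = numpy.concatenate([numpy.random.randn(300)/10 - 0.1,
                          numpy.random.randn(450)/10 + 0.2,
                          numpy.random.randn(250)/10 + 0.6])

with pymc.Model() as model:
    K = 3
    r = pymc.Normal("r", mu=0, sigma=1, shape=K)
    w = pymc.Dirichlet("w", numpy.ones(K))
    # The discrete component assignment is marginalized out,
    # so NUTS can sample w and r directly without any discrete variable
    obs = pymc.NormalMixture("obs", w=w, mu=r, sigma=1e-3, observed=data)
    trace = pymc.sample(1000)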

@ricardoV94 ricardoV94 changed the title BUG: Input dimension mismatch when adding observation data BUG: Input dimension mismatch due to sampler proposing invalid values for discrete variable Dec 15, 2023