Skip to content

Conversation

@opherdonchin
Copy link

Summary

As discussed in [#602](#602), this PR adds an optional name parameter to PyMCStateSpace. When provided, it is used to prefix all variables and registered data created by that state-space model. This makes it possible to use multiple state-space models inside one PyMC model without name collisions.

What this changes

  • Add name: str | None = None to PyMCStateSpace.__init__.

  • Add _prefix_name(self, var_name: str) -> str, which:

    • returns var_name unchanged if self.name is None
    • otherwise returns "{name}_{var_name}".
  • Apply _prefix_name when:

    • creating pm.Deterministic variables,
    • registering Kalman filter and likelihood outputs,
    • constructing names in SequenceMvNormal,
    • naming H_jittered, x0_slice, P0_slice, forecast_*, irf, initial_shock,
    • building names for posterior predictive variables.
  • Update data helpers (add_data_to_active_model, register_data_with_pymc) so that state-space models can optionally pass prefixed data names.

Backwards compatibility

  • If name is not provided, everything behaves exactly as before.
  • Existing code that uses a single PyMCStateSpace should work without any changes.

Example

ssm1 = PyMCStateSpace(..., name="vision")
ssm2 = PyMCStateSpace(..., name="motor")

This results in variables such as "vision_obs", "motor_obs", "vision_data", etc., allowing both models to coexist cleanly in the same PyMC model.

Testing

  • Added tests/statespace/test_register_variable.py to check the basic naming and variable-registration behavior.
  • Attempted to run pytest tests/statespace in a clean pip install -e ".[dev]" environment.
  • Test collection currently stops with FutureWarning errors from pytensor.graph.basic.* imports (these have moved to pytensor.graph.traversal.*). This appears unrelated to this PR and originates from existing imports in the codebase and test suite.

@opherdonchin
Copy link
Author

Friendly ping for review when someone has a moment — this PR implements the naming support requested in #602. Thanks!

@jessegrabowski
Copy link
Member

jessegrabowski commented Nov 26, 2025

Thanks for the poke, this one slipped past me. Will review asap.

Looks like pre-commit is failing though, start by clearing that.

@opherdonchin
Copy link
Author

I’ve cleared the pre-commit.ci formatting issue (ruff-format is now clean).

The remaining required check (all_tests) is currently awaiting maintainer approval, since this PR comes from a fork. Please let me know when you’d like the workflows to be approved/run.

Sphinx/autodoc queries dunder attributes like __mro__ during documentation
build, which was triggering UnsupportedDistributionError. Now raises
AttributeError for dunder attributes to allow proper introspection.
@opherdonchin
Copy link
Author

Read the Docs failure is due to pymc_extras.prior.getattr attempting to resolve mro during Sphinx introspection. I pushed a small guard so dunder attributes raise AttributeError (no distribution lookup). RTD should now build cleanly.

all_tests is still awaiting maintainer approval because this PR is from a fork.

@opherdonchin
Copy link
Author

Bumping this.

Copy link
Member

@jessegrabowski jessegrabowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea a lot, but I think the implementation will be a lot cleaner after we merge #607 . I have two issues with this PR as it stands:

  1. I'm hesitant about the register_variable method. It adds another bookkeeping dictionary, and I'm not 100% sure what the intended use-case is. I suspect we can accomplish what you want to without it, for example using deterministics, as in:
with pm.Model() as m:
    x_for_model_1 = pm.Normal('x_for_model_1', ...)
    x_for_model_2 = pm.Deterministic('x_for_model_2')

If you really think it's necessary, let's talk about it in another PR.

  1. It's a heavy PR -- every single usage of every single named object (matrix, parameter, data, prediction, etc) needs to be wrapped with prefixed_name. What I would prefer is to intervene on the names themselves. I think this will be facilitated by #607. Maybe have a look at that PR and propose a way to work a model name into that framework, then we can revisit this in a few days once it's done?

"""
raise NotImplementedError("The add_default_priors property has not been implemented!")

def register_variable(self, name: str, variable: pt.TensorVariable) -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give an example of a workflow where you want to use this? I'm hesitant because it breaks the pattern assumed by statespace, in which the statespace model and the pymc model are totally independent.

Comment on lines +1579 to +1581
# Protect against Sphinx/RTD introspection of dunder attributes
if name.startswith("__") and name.endswith("__"):
raise AttributeError(f"module '{__name__}' has no attribute '{name}'")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert this, it's unrelated this to PR (cherry pick it if you want to fix the issue)

with pymc_model:
for param_name in self.param_names:
param = getattr(pymc_model, param_name, None)
# Look for the parameter with the prefixed name in the PyMC model
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clear away all these comments everywhere in the PR, they don't add helpful context to future devs.

@opherdonchin
Copy link
Author

opherdonchin commented Jan 15, 2026

Thanks! I agree waiting for #607 and building on the refactor seems like the cleanest path.

My current thinking is to keep all semantic metadata unscoped (parameters/states/data roles remain “base names”) and apply scoping only at the PyMC/PyTensor graph boundary, where collisions actually occur. Concretely I think that means adding an optional name/namespace to PyMCStateSpace plus a single graph_name(base: str) helper, and then using it in three places:

  • Symbolic placeholder creation/registration (e.g., make_and_register_variable / make_and_register_data): accept base names, but create/register placeholders under graph_name(base).
  • Hard-coded graph literals (e.g., "data", "x0_slice", "irf"): replace with graph_name("data"), etc.
  • PyMC model-namespace interactions (e.g., pm.set_data, membership checks, model[...], named_vars_to_dims): translate base → graph name at the boundary.

I don’t currently see a need for a reverse base_name(...) mapping in the statespace code (we don’t seem to parse arbitrary model vars back into semantics), but can add it later if a concrete use-case emerges.

This should let me drop the extra register_variable bookkeeping from the current PR and re-implement with a much smaller diff after #607. Functionally it’s orthogonal, but #607 should make it easier to keep the change localized and reviewable.

Do you think this could be a good way forward?

Opher

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants