-
Notifications
You must be signed in to change notification settings - Fork 78
Add optional name parameter to PyMCStateSpace for multiple models #611
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Friendly ping for review when someone has a moment — this PR implements the naming support requested in #602. Thanks! |
|
Thanks for the poke, this one slipped past me. Will review asap. Looks like pre-commit is failing though, start by clearing that. |
|
I’ve cleared the pre-commit.ci formatting issue (ruff-format is now clean). The remaining required check (all_tests) is currently awaiting maintainer approval, since this PR comes from a fork. Please let me know when you’d like the workflows to be approved/run. |
Sphinx/autodoc queries dunder attributes like __mro__ during documentation build, which was triggering UnsupportedDistributionError. Now raises AttributeError for dunder attributes to allow proper introspection.
|
Read the Docs failure is due to pymc_extras.prior.getattr attempting to resolve mro during Sphinx introspection. I pushed a small guard so dunder attributes raise AttributeError (no distribution lookup). RTD should now build cleanly. all_tests is still awaiting maintainer approval because this PR is from a fork. |
|
Bumping this. |
jessegrabowski
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this idea a lot, but I think the implementation will be a lot cleaner after we merge #607 . I have two issues with this PR as it stands:
- I'm hesitant about the
register_variablemethod. It adds another bookkeeping dictionary, and I'm not 100% sure what the intended use-case is. I suspect we can accomplish what you want to without it, for example using deterministics, as in:
with pm.Model() as m:
x_for_model_1 = pm.Normal('x_for_model_1', ...)
x_for_model_2 = pm.Deterministic('x_for_model_2')
If you really think it's necessary, let's talk about it in another PR.
- It's a heavy PR -- every single usage of every single named object (matrix, parameter, data, prediction, etc) needs to be wrapped with
prefixed_name. What I would prefer is to intervene on the names themselves. I think this will be facilitated by #607. Maybe have a look at that PR and propose a way to work a model name into that framework, then we can revisit this in a few days once it's done?
| """ | ||
| raise NotImplementedError("The add_default_priors property has not been implemented!") | ||
|
|
||
| def register_variable(self, name: str, variable: pt.TensorVariable) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you give an example of a workflow where you want to use this? I'm hesitant because it breaks the pattern assumed by statespace, in which the statespace model and the pymc model are totally independent.
| # Protect against Sphinx/RTD introspection of dunder attributes | ||
| if name.startswith("__") and name.endswith("__"): | ||
| raise AttributeError(f"module '{__name__}' has no attribute '{name}'") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
revert this, it's unrelated this to PR (cherry pick it if you want to fix the issue)
| with pymc_model: | ||
| for param_name in self.param_names: | ||
| param = getattr(pymc_model, param_name, None) | ||
| # Look for the parameter with the prefixed name in the PyMC model |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clear away all these comments everywhere in the PR, they don't add helpful context to future devs.
|
Thanks! I agree waiting for #607 and building on the refactor seems like the cleanest path. My current thinking is to keep all semantic metadata unscoped (parameters/states/data roles remain “base names”) and apply scoping only at the PyMC/PyTensor graph boundary, where collisions actually occur. Concretely I think that means adding an optional name/namespace to PyMCStateSpace plus a single graph_name(base: str) helper, and then using it in three places:
I don’t currently see a need for a reverse base_name(...) mapping in the statespace code (we don’t seem to parse arbitrary model vars back into semantics), but can add it later if a concrete use-case emerges. This should let me drop the extra register_variable bookkeeping from the current PR and re-implement with a much smaller diff after #607. Functionally it’s orthogonal, but #607 should make it easier to keep the change localized and reviewable. Do you think this could be a good way forward? Opher |
Summary
As discussed in [#602](#602), this PR adds an optional
nameparameter toPyMCStateSpace. When provided, it is used to prefix all variables and registered data created by that state-space model. This makes it possible to use multiple state-space models inside one PyMC model without name collisions.What this changes
Add
name: str | None = NonetoPyMCStateSpace.__init__.Add
_prefix_name(self, var_name: str) -> str, which:var_nameunchanged ifself.name is None"{name}_{var_name}".Apply
_prefix_namewhen:pm.Deterministicvariables,SequenceMvNormal,H_jittered,x0_slice,P0_slice,forecast_*,irf,initial_shock,Update data helpers (
add_data_to_active_model,register_data_with_pymc) so that state-space models can optionally pass prefixed data names.Backwards compatibility
nameis not provided, everything behaves exactly as before.PyMCStateSpaceshould work without any changes.Example
This results in variables such as
"vision_obs","motor_obs","vision_data", etc., allowing both models to coexist cleanly in the same PyMC model.Testing
tests/statespace/test_register_variable.pyto check the basic naming and variable-registration behavior.pytest tests/statespacein a cleanpip install -e ".[dev]"environment.FutureWarningerrors frompytensor.graph.basic.*imports (these have moved topytensor.graph.traversal.*). This appears unrelated to this PR and originates from existing imports in the codebase and test suite.