Skip to content

Conversation

aloctavodia
Copy link
Member

@aloctavodia aloctavodia commented Sep 22, 2025

Adding 'Normal' as the first example, so we have a concrete example to guide our discussions. I am copying the PreliZ code with minimal changes. For instance, in the parent class Distribution, I only removed the plotting methods. I'm not sure what else we would like to remove. I guess we only want to keep the following methods (or even a shorter list)

rvs, pdf, cdf, ppf, logpdf, logcdf, sf, logsf, isf, entropy, mean, median, mode, std, var, skewness, kurtosis.

and keep the rest in PreliZ. Also, not sure about the __repr__,

Also, a good opportunity to ensure return types/shapes are consistent. The main use case in PreliZ is to work with something like Normal(0, 1), not Normal(0, np.array([1, 2])). Thus, the support for the latter can be inconsistent. For example, the results for Normal(0, np.array([1, 2])).mean() is np.float64(0.0), instead of array([0., 0.])

@ricardoV94
Copy link
Member

ricardoV94 commented Sep 22, 2025

I don't think we need classes here, just the functions you listed as simple python function that take tensor (or tensor-like) inputs and return tensor outputs.

PyMC links these methods with singledispatch on the RV Ops, which looks like

@_logprob.register(NormalRV)
 def normal_logp(op, values, rng, size, mu, sigma, **kwargs):
    res = -0.5 * pt.pow((value - mu) / sigma, 2) - pt.log(pt.sqrt(2.0 * np.pi)) - pt.log(sigma)
    return check_parameters(
      res,
      sigma > 0,
      msg="sigma > 0",
    )

And could now use it like this:

from distributions.normal import normal_logpdf

NormalRV = ...

@_logprob.register(NormalRV)
def normal_logp(op, values, rng, size, mu, sigma, **kwargs): 
  [value] = values
  return normal_logpdf(value, mu, sigma)

The NormalRV could be implemented in this repo if that seems useful for other packages, but needs not to.


Preliz would presumably use it like this:

from distributions.normal import normal_logpdf

jit_normal_logpdf = pytensor_jit(normal_logpdf)

PrelizZeroSumNormal(PrelizDistribution):
  def __init__(self, ...):
      ...

  def logpdf(self, x):
    return jit_normal_logpdf(x, self.mu, self.sigma)

Some quick notes:

  • In PyMC we wrap the logpdf / logcdf expressions in a CheckParameterValue that raises an error for stuff like scale < 0. This check can be easily ignored / replaced by -inf in logp / etc.
  • The RVs can also be PyTensor functions. If you have NormalRV implemented it looks something like:
normal_rv = NormalRV()

def normal_rvs(loc, scale, rng=None, size=None, dtype=None):
  return normal_rv(loc, scale, rng=rng, size=size, dtype=dtype)

But can also be something that builds on top of other RVs like:

def normal_rvs(loc, scale, rng=None, size=None, dtype=None):
  if size is None:
    size = pt.broadcast_shapes(loc, scale)
  return loc + pt.random.normal(rng=rng, size=size) * scale

Copy link
Member

@ricardoV94 ricardoV94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty much what I had in mind. What do you think, this would be useful for preliz as is or did I miss some details that make it awkward to use?

return logcdf(-x, -mu, sigma)

def neg_logpdf(x, mu, sigma):
return -(logpdf(x, mu, sigma)).sum()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this sum? Maybe this one is only needed in preliz?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed

(Normal, stats.norm, (pt.constant(0), pt.constant(2)), (0, 2), (-np.inf, np.inf)),
],
)
def test_match_scipy(p_dist, sp_dist, p_params, sp_params, support):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably want to think about a framework for this. PyMC has something involved for logpdf/logcdf that maybe we can repurpose.

More importantly I suggest compiling these expressions with symbolic inputs (not constants) and providing them at eval time. That's more sure things are working correctly (and not for example assuming some values can be known at compile time).

Also important to test broadcasting is working correctly specially for distributions with vector inputs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I just copied some of the tests from preliz. Mostly to check that the mathematical expressions are correct. But, yes, we can improve these tests or add more.

@aloctavodia
Copy link
Member Author

This will be easy to use in preliz following the pattern you shared earlier. And the readjustments needed will be an immediate benefit , like an opportunity to fix inconsistencies (like the shape of moments I mentioned earlier). Also, more eyes checking on the code is always a good idea. Additionally, sharing effort between projects and reducing redundancy is also beneficial.


def kurtosis(mu, sigma):
shape = pt.broadcast_arrays(mu, sigma)[0]
return pt.full_like(shape, 0)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it ok that these are integers?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants