-
Notifications
You must be signed in to change notification settings - Fork 0
Add Normal #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add Normal #1
Conversation
I don't think we need classes here, just the functions you listed as simple python function that take tensor (or tensor-like) inputs and return tensor outputs. PyMC links these methods with singledispatch on the RV Ops, which looks like @_logprob.register(NormalRV)
def normal_logp(op, values, rng, size, mu, sigma, **kwargs):
res = -0.5 * pt.pow((value - mu) / sigma, 2) - pt.log(pt.sqrt(2.0 * np.pi)) - pt.log(sigma)
return check_parameters(
res,
sigma > 0,
msg="sigma > 0",
) And could now use it like this: from distributions.normal import normal_logpdf
NormalRV = ...
@_logprob.register(NormalRV)
def normal_logp(op, values, rng, size, mu, sigma, **kwargs):
[value] = values
return normal_logpdf(value, mu, sigma) The NormalRV could be implemented in this repo if that seems useful for other packages, but needs not to. Preliz would presumably use it like this: from distributions.normal import normal_logpdf
jit_normal_logpdf = pytensor_jit(normal_logpdf)
PrelizZeroSumNormal(PrelizDistribution):
def __init__(self, ...):
...
def logpdf(self, x):
return jit_normal_logpdf(x, self.mu, self.sigma) Some quick notes:
normal_rv = NormalRV()
def normal_rvs(loc, scale, rng=None, size=None, dtype=None):
return normal_rv(loc, scale, rng=rng, size=size, dtype=dtype) But can also be something that builds on top of other RVs like: def normal_rvs(loc, scale, rng=None, size=None, dtype=None):
if size is None:
size = pt.broadcast_shapes(loc, scale)
return loc + pt.random.normal(rng=rng, size=size) * scale |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pretty much what I had in mind. What do you think, this would be useful for preliz as is or did I miss some details that make it awkward to use?
distributions/normal.py
Outdated
return logcdf(-x, -mu, sigma) | ||
|
||
def neg_logpdf(x, mu, sigma): | ||
return -(logpdf(x, mu, sigma)).sum() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this sum? Maybe this one is only needed in preliz?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed
(Normal, stats.norm, (pt.constant(0), pt.constant(2)), (0, 2), (-np.inf, np.inf)), | ||
], | ||
) | ||
def test_match_scipy(p_dist, sp_dist, p_params, sp_params, support): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably want to think about a framework for this. PyMC has something involved for logpdf/logcdf that maybe we can repurpose.
More importantly I suggest compiling these expressions with symbolic inputs (not constants) and providing them at eval time. That's more sure things are working correctly (and not for example assuming some values can be known at compile time).
Also important to test broadcasting is working correctly specially for distributions with vector inputs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I just copied some of the tests from preliz. Mostly to check that the mathematical expressions are correct. But, yes, we can improve these tests or add more.
This will be easy to use in preliz following the pattern you shared earlier. And the readjustments needed will be an immediate benefit , like an opportunity to fix inconsistencies (like the shape of moments I mentioned earlier). Also, more eyes checking on the code is always a good idea. Additionally, sharing effort between projects and reducing redundancy is also beneficial. |
|
||
def kurtosis(mu, sigma): | ||
shape = pt.broadcast_arrays(mu, sigma)[0] | ||
return pt.full_like(shape, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it ok that these are integers?
Adding 'Normal' as the first example, so we have a concrete example to guide our discussions. I am copying the PreliZ code with minimal changes. For instance, in the parent class
Distribution
, I only removed the plotting methods. I'm not sure what else we would like to remove. I guess we only want to keep the following methods (or even a shorter list)rvs, pdf, cdf, ppf, logpdf, logcdf, sf, logsf, isf, entropy, mean, median, mode, std, var, skewness, kurtosis.
and keep the rest in PreliZ. Also, not sure about the
__repr__
,Also, a good opportunity to ensure return types/shapes are consistent. The main use case in PreliZ is to work with something like
Normal(0, 1)
, notNormal(0, np.array([1, 2]))
. Thus, the support for the latter can be inconsistent. For example, the results forNormal(0, np.array([1, 2])).mean()
isnp.float64(0.0)
, instead ofarray([0., 0.])