
PC prior distribution for Student T dof #252

Open · wants to merge 10 commits into main
Conversation

bwengals (Contributor)
Moving this over from here.

What is this PR about?

Adding a penalized complexity (PC) prior for the Student-T degrees-of-freedom parameter. Useful in models where the likelihood would be normal, but you need some robustness, so you switch to a Student-T likelihood. It's already implemented in INLA.

The reason this is useful for modeling is that you can "robustify" your Gaussian likelihood by making it a Student-T in a more principled way. You can use this prior to express, in a meaningful way, "I think there is a 50% (or 20%, or whatever) chance that the degrees of freedom is over 30 (~normal likelihood)". If you instead use a Gamma(2, 0.1) or, worse, fix the degrees of freedom to some value, you risk watering down the information coming from the data via the likelihood.
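For concreteness, here's a rough usage sketch. The distribution name and import path are placeholders (only `alpha` and `U` mirror the `get_lam` signature in this diff), so treat it as pseudocode for the intended workflow rather than the final API:

```python
import numpy as np
import pymc as pm
import pymc_experimental as pmx  # placeholder import; actual module layout may differ

with pm.Model():
    # Hypothetical PC prior with threshold U=30 and tail probability alpha=0.2,
    # encoding a prior belief about how likely nu is to exceed ~30
    # (where the Student-T is effectively normal).
    nu = pmx.distributions.PCPriorStudentT_dof("nu", alpha=0.2, U=30)
    mu = pm.Normal("mu", 0.0, 1.0)
    sigma = pm.HalfNormal("sigma", 1.0)
    pm.StudentT("y", nu=nu, mu=mu, sigma=sigma, observed=np.random.randn(100))
```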

Outstanding issues

  • Still need to add tests, but I think it'll be good enough for now to compare the logp to INLA.
  • Might be better to use BoundedContinuous instead of PositiveContinuous? Went down that rabbit hole some already; might need some help here. Seems to work fine now for normal use, but I haven't tested many weird edge cases.

@ricardoV94 added the enhancements (New feature or request) label on Oct 23, 2023
)


def tri_gamma_approx(x):
Member:
it is already implemented

Member:
This approximation will be much more performant

bwengals (Contributor, Author):
I saw you added trigamma recently; I'll give that a try. I used this approximation because, at the time, the gradient wasn't implemented yet, whereas the gradient of the approximation is easy. I wasn't concerned with performance at the time, but I'll take another look.
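For reference, the usual closed-form route here is the asymptotic (Bernoulli-number) series for the trigamma function. A sketch below, not necessarily the exact series this diff uses, but it differentiates cleanly through ordinary tensor ops:

```python
import pytensor.tensor as pt

def tri_gamma_approx(x):
    # Asymptotic series psi_1(x) ~ 1/x + 1/(2x^2) + sum_k B_{2k} / x^(2k+1),
    # accurate for moderately large x and trivially differentiable.
    return (
        1 / x
        + (1 / 2) / x**2
        + (1 / 6) / x**3
        - (1 / 30) / x**5
        + (1 / 42) / x**7
        - (1 / 30) / x**9
    )
```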

@@ -216,3 +226,62 @@ def moment(rv, size, mu, sigma, xi):
    if not rv_size_is_none(size):
        mode = pt.full(size, mode)
    return mode


class PCPriorStudentT_dof_RV(RandomVariable):
Member:
needs a docstring

Member:
Usually we don't document the RV, but the Distribution class, which doesn't have a docstring either


NU_MIN = 2.0 + 1e-6
nu = np.concatenate((np.linspace(NU_MIN, 2.4, 2000), np.linspace(2.4 + 1e-4, 4000, 10000)))
return UnivariateSpline(
studentt_kld_distance(nu).eval()[::-1],
ricardoV94 (Member):
Having an eval is a bit dangerous. If it comes up from an RV you're going to get a random value. The safe thing to do is to constant_fold and raise if it can't be done.

Or create a PyTensor Op that wraps UnivariateSpline
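A minimal sketch of the constant-fold pattern being suggested, assuming `pymc.pytensorf.constant_fold` (names taken from this diff; the exact call site is illustrative):

```python
from pymc.pytensorf import constant_fold

# Fold the graph down to a concrete value at build time; raises
# (rather than silently drawing a random value) if the expression
# depends on anything non-constant such as an RV.
[kld_values] = constant_fold([studentt_kld_distance(nu)])
```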

Member:
don't we have such an op?

bwengals (Contributor, Author) commented Dec 8, 2023:

@ricardoV94 It only comes from nu, which is passed in above as a fixed numpy array, so I think eval is safe here (unless I'm missing your point). I'm using this to get a spline approximation to the inverse of this function, which is what the [::-1] bit at the end of the inputs is about.

Thanks @ferrine, will look into that. I remember needing to use UnivariateSpline this way because I needed this particular behavior:

> if ext=3 or 'const', return the boundary value.

as nu goes to infinity.
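A minimal sketch of that inversion trick (hypothetical helper; assumes `f` is strictly decreasing over `xs`, as `studentt_kld_distance` is in `nu`):

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

def spline_inverse(f, xs):
    # Fit a spline to (f(x), x) pairs so that evaluating the spline at y
    # approximates f^{-1}(y). Since f is decreasing, both arrays are
    # reversed to give the spline an increasing abscissa. ext=3 ("const")
    # clamps queries outside the fitted range to the boundary value,
    # which yields the saturation behavior as nu goes to infinity.
    ys = f(xs)
    return UnivariateSpline(ys[::-1], xs[::-1], ext=3)
```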

ricardoV94 (Member):
Ah I missed the inputs were constant, nvm on my end

jessegrabowski (Member) commented Dec 19, 2023:

If it's always a known constant, could you use .data instead of .eval()?
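For context, both work on a `TensorConstant`, but `.data` just reads the stored array without compiling anything (a quick sketch):

```python
import numpy as np
import pytensor.tensor as pt

c = pt.constant(np.linspace(0.0, 1.0, 5))
c.eval()  # compiles and runs a (trivial) graph to produce the value
c.data    # returns the underlying numpy array directly
```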

@classmethod
def get_lam(cls, alpha=None, U=None, lam=None):
    if (alpha is not None) and (U is not None):
        return -np.log(alpha) / studentt_kld_distance(U)
Member:

Suggested change:

-        return -np.log(alpha) / studentt_kld_distance(U)
+        return -pt.log(alpha) / studentt_kld_distance(U)
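For the record, the formula follows the standard PC-prior calibration (Simpson et al. 2017): an exponential prior with rate λ is placed on the KLD-based distance d(ν) computed by `studentt_kld_distance`, and requiring P(d(ν) > d(U)) = α fixes λ:

```math
e^{-\lambda\, d(U)} = \alpha \quad\Longrightarrow\quad \lambda = -\frac{\log \alpha}{d(U)}.
```

Since d(ν) decreases toward 0 as ν → ∞, the event d(ν) > d(U) is exactly ν < U, so α is the prior mass below the threshold U (the INLA convention).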

bwengals (Contributor, Author) commented Dec 8, 2023:

To update where this is at: tests were passing except on Windows, and I couldn't figure out why. Not sure what here would depend on Windows, so I may need to dig into that in a VM or something. Definitely some room to improve how some of this is structured, but it was working well enough that I was comfortable using it.

ricardoV94 (Member) commented Dec 8, 2023:

@bwengals the biggest difference with Windows generally is that it sometimes defaults to int32 dtypes where Linux defaults to int64, IIRC. Maybe something in the np.linspace overflows on Windows? (I didn't look at which test was failing, so apologies if this is completely off target.)
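For illustration, a toy repro of the suggested failure mode (not the actual failing test):

```python
import numpy as np

# NumPy's default integer is 32-bit on (older) Windows builds but 64-bit
# on Linux, so identical integer arithmetic can overflow only on Windows.
x = np.array([100_000], dtype=np.int32)  # mimic the Windows default dtype
print(x * x)  # 10**10 doesn't fit in int32 and silently wraps
```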

bwengals (Contributor, Author):

> overflows on Windows?

Ah, I bet that's it! The test produces inf and it shouldn't. Well, thanks for giving me a starting place.

ricardoV94 (Member) commented Dec 11, 2023:

@bwengals interested in updating this one?
