Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

binomial_coefficient_log produces wrong results when N >> n #1592

Closed
martinmodrak opened this issue Jan 7, 2020 · 4 comments · Fixed by #1614
Closed

binomial_coefficient_log produces wrong results when N >> n #1592

martinmodrak opened this issue Jan 7, 2020 · 4 comments · Fixed by #1614
Milestone

Comments

@martinmodrak
Copy link
Contributor

martinmodrak commented Jan 7, 2020

Description

For some values, the results from binomial_coefficient_log don't match those computed by R or Mathematica.

In #239, it is noted that we don't know, where the approximation we use for the case when N >> n comes from. I also don't understand it. R uses a quite a different approximation - their implementation is

double attribute_hidden lfastchoose(double n, double k)
{
    return -log(n + 1.) - lbeta(n - k + 1., k + 1.);
}

Stirling approximation produce good results for some of the cases, but also terrible results for others... The R's implementation seems to work neatly, but depends on quite a bunch of other code to compute the lbeta.

Example

In Stan:

binomial_coefficient_log(1e20, 100) = 4141.4308104325282
binomial_coefficient_log(845000, 8.45e-10) =  1.1925051146203358e-008

Expected Output

Compare to Mathematica, Rs lchoose and Stirling approximation implemented in R
via

lchoose_stirling <- function(n, k) {
  -(0.5) * (log(2) + log(pi) + log(k)) + k * (log(n) + 1 - log(k))
}

In both cases, there is a mismatch in second digit.
https://www.wolframalpha.com/input/?i=log%2810%5E20+choose+100%29

Stan:  4141.4308104325282
Math:  4241.430810432527877842402915999072777157961556194658677330
R:     4241.4308104325273
Stir:  4241.4316437630832

https://www.wolframalpha.com/input/?i=log%28845000+choose+845*10%5E-12%29

Stan:  1.1925051146203358e-008
Math:  1.2019540397111151911538109607610600259536643522634943 × 10^-8
R:     0 #Not great
R-cor: 1.2019542694474694e-08
Stir:  9.5269037411112816 #Even worse

Here R-cor is computed using the lfastchoose formula directly, e.g.:

lchoose_lbeta <- function(n, k) {
  - log1p(n) -lbeta(n - k + 1, k + 1) 
}

for some reason, lchoose calls a different computation method for those parameters

Current Version:

v3.0.0

@martinmodrak
Copy link
Contributor Author

This also suggests that binomial and beta densities could be subject to imprecisions :-(

@bob-carpenter
Copy link
Member

bob-carpenter commented Jan 7, 2020 via email

@martinmodrak
Copy link
Contributor Author

So I made a mistake in the initial description - Stirling approximation works reasonably only for some of the problematic cases (description now updated). Searching online, the links I found either say "use lgamma" (which we know is not enough) or "steal code from R", which we can't do as it is GPL. And the idea in R's code is (for me) non-trivial to adapt. The main workhorse in R is a function called lgamma_cor which computes the difference between Stirling's approximation to lgamma and actual lgamma value. They then proceed by computing the Stirling's approximation for lbeta and adding the corrections separately. Unfortunately, the code for lgamma_cor is opaque to me, so I can't reimplement it to avoid licensing issues. It uses something called chebyshev_eval which is probably related to Chebyshev polynomials, which have their boost implementation but I am totally out of my depth here and don't understand the surrounding math at all :-(

@martinmodrak
Copy link
Contributor Author

Also, if I read the code correctly, Stan now relies on autodiff for derivatives of binomial_coefficient_log (I believe Stan's lchoose delegates to binomial_coefficient_log in math and there is no rev version of the function. But autodiffing through a complex approximation (as the one in R) might be undesirable, so analytic derivatives might need to be developed and tested as well. We do have analytic derivatives for lbeta implementation we have, but I would suspect they would fail where the current implementation fails.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants