Franzi2114
Collaborator

@Franzi2114 Franzi2114 commented Mar 29, 2024

Summary

This PR adds the CDF and the CCDF of the 7-parameter diffusion model.
See issue #2966
Relates to issue #2822

Tests

We implemented tests analogous to those for the PDF.

Side Effects

no

Release notes

CDF and CCDF for the 7-parameter diffusion model. Allows modeling truncated and censored data.

Checklist

  • Copyright holder: Franziska Henrich, Christoph Klauer

    The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:
    - Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
    - Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

  • the basic tests are passing

    • unit tests pass (to run, use: ./runTests.py test/unit)
    • header checks pass, (make test-headers)
    • dependencies checks pass, (make test-math-dependencies)
    • docs build, (make doxygen)
    • code passes the built in C++ standards checks (make cpplint)
  • the code is written in idiomatic C++ and changes are documented in the doxygen

  • the new changes are tested

@Franzi2114 Franzi2114 requested a review from SteveBronder March 29, 2024 18:41
@stan-buildbot

This comment was marked as outdated.

@Franzi2114
Collaborator Author

Hey Steve, now the errors from before should be fixed and the wildcards are deleted. What next?

@stan-buildbot

This comment was marked as off-topic.

@Franzi2114
Collaborator Author

Hey @SteveBronder, any news?

@Franzi2114
Collaborator Author

Hello everyone, I would kindly like to ask how we can proceed with this PR.

@Franzi2114
Collaborator Author

Hey @SteveBronder, any news on this PR?

@Franzi2114
Collaborator Author

Dear @SteveBronder, @andrjohns, @bob-carpenter,

I would kindly like to ask whether it would be possible to continue this PR?

@bob-carpenter
Member

I'm sorry this got hung up without a response. There's no excuse for us leaving PRs hanging. If they're impossible or too much work, we need to make that clear earlier rather than later.

In the future, please feel free to email me about this kind of thing and I can talk to our devs and try to figure out what's going on: bcarpenter@flatironinstitute.org

@SteveBronder
Collaborator

Hey! I'm terribly sorry. We've been doing a bunch of stuff to get Laplace working in Stan and I kind of got tunnel vision. Honestly, I felt like some of the math and robustness checks were above my pay grade, so I pinged @andrjohns or @bob-carpenter to have a look at some things here. I'm not totally sure how to continue because I don't think I'm the right reviewer for this at this point.

@bob-carpenter
Member

The C++ in Stan is way too complicated for me, which I find sad because I wrote around half of the first release (@syclik wrote most of the rest). I'm also not enough of a statistician or applied mathematician to even understand what this function is supposed to be doing.

The only two candidates among active Stan developers would be:

Though he didn't come out and say this directly, I think you should interpret @SteveBronder's message above as saying he is not going to do it. I don't know how much time @andrjohns has to work on Stan these days. We can see if he responds. Our other C++ developers have all departed for industry.

I wish I could help myself, but I gave up trying to understand or code Stan's C++ code years ago when I couldn't finish a simple PR of my own. I think we've dug ourselves into a deep hole of complexity and I don't see any way out of it other than starting over. I've personally moved to developing samplers outside the context of Stan because integrating anything with Stan is such a headache these days.

I feel terrible that we left you hanging for so long, but I can't think of a way we can review this. In retrospect, we should've realized this was going to be too complex for us due to the form of the density (none of our testing is set up for this many arguments) and the lack of understanding of Wiener processes among our active developers. In the future, we're going to try to do better at telling people their issue isn't one we can support.

@bob-carpenter
Member

Hi, @Franzi2114 --- could you clean up the conflicts in these files? It looks like they're not just superficial formatting. I'm diving into the code review now.

@bob-carpenter
Member

For reference, here's a paper defining the partial derivatives of the pdf and cdf:

Hartmann, R. and Klauer, K.C., 2021. Partial derivatives for the first-passage time distribution in Wiener diffusion models. Journal of Mathematical Psychology.

They released an R package with code, WienR.

@bob-carpenter
Member

Thanks for fixing the conflict.

@stan-buildbot

This comment was marked as off-topic.

Member

@bob-carpenter bob-carpenter left a comment

This should be ready to merge once you've made the changes I requested here. It's a lot of changes, but they're all fairly localized and minor. Some of them are generic and the change should be applied everywhere, like not defining a variable as a scalar and using negation rather than multiplying by -1.

Some of these are genuinely questions---usually some doc right there can help.

Some of the changes are marked optional---mostly ones that are about efficiency or style that's a matter of taste.

Comment on lines +33 to +37
if (exponent < 0) {
  return ret_t(log1m_exp(exponent) - log_diff_exp(2 * v * a * w, exponent));
} else {
  return ret_t(log1m_exp(-exponent) - log1m_exp(2 * v * a));
}
Member

Yes, please add a comment here that indicates the branch is for numerical stability. This is the kind of thing that isn't clear without getting out pencil and paper.
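
For readers without pencil and paper at hand, the reason for the branch can be sketched as follows. This assumes the quantity being computed is the log of (1 - exp(E)) / (exp(2vaw) - exp(E)) with E = -2va(1 - w), which is inferred from the snippet rather than stated in the PR. The helpers `log1m_exp` and `log_diff_exp` below are simplified stand-ins for the Stan functions of the same name, and `log_prob_stable` / `log_prob_naive` are hypothetical names. The point is that each branch only ever exponentiates a non-positive number, so nothing overflows.

```cpp
#include <cassert>
#include <cmath>

// Simplified stand-in for Stan's log1m_exp: log(1 - exp(x)) for x < 0.
inline double log1m_exp(double x) {
  assert(x < 0.0);
  // Switch formulas at -log(2) to limit cancellation on either side.
  return (x > -0.6931471805599453) ? std::log(-std::expm1(x))
                                   : std::log1p(-std::exp(x));
}

// Simplified stand-in for Stan's log_diff_exp: log(exp(x) - exp(y)), x > y.
inline double log_diff_exp(double x, double y) {
  assert(x > y);
  return x + log1m_exp(y - x);
}

// Branching form, mirroring the snippet under review (assumes a > 0, 0 < w < 1).
inline double log_prob_stable(double v, double a, double w) {
  if (v == 0.0) {
    return std::log1p(-w);  // limit of the ratio as v -> 0
  }
  const double exponent = -2.0 * v * a * (1.0 - w);
  if (exponent < 0) {
    // v > 0: exp(exponent) is small, so use the ratio as written.
    return log1m_exp(exponent) - log_diff_exp(2 * v * a * w, exponent);
  }
  // v < 0: numerator and denominator are both scaled by exp(-exponent),
  // so only exp of negative numbers is ever evaluated.
  return log1m_exp(-exponent) - log1m_exp(2 * v * a);
}

// Direct evaluation of the same ratio, for comparison only.
inline double log_prob_naive(double v, double a, double w) {
  const double e = std::exp(-2.0 * v * a * (1.0 - w));
  return std::log((1.0 - e) / (std::exp(2.0 * v * a * w) - e));
}
```

For moderate arguments the two functions agree; for a strongly negative drift such as v = -500 with a = 2, the direct form evaluates inf / inf while the branching form stays finite.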

hcubature_err = log_error_absolute - log_error_derivative
                + log(fabs(density)) + LOG_TWO + 1;

// computation of derivatives and precision checks
Member

One of these comment lines looks redundant, so please remove one of them or combine into one statement.

if (fabs(v) == 0.0) {
  return ret_t(log1p(-w));
}
const auto exponent = -2.0 * v * a * (1.0 - w);
Member

I think we can replace -2.0 * v with v_value and (1 - w) with w_value.

What are the _value suffixes for? It feels inconsistent to have v = -v_value and w = 1 - w_value. I suggest renaming the arguments to v and w and then renaming v to neg_v and w to one_m_w. Or using some other naming scheme to indicate that v and v_value do not evaluate to the same number.

const auto v = -v_value;
const auto w = 1 - w_value;
int sign_v = v < 0 ? 1 : -1;
const auto exponent_with_1mw = sign_v * 2.0 * v * a * (1.0 - w);
Member

This is optional.

Same comment as above. Now that there's duplication here in the definition, it'd be nice to encapsulate some of these repeated right-hand sides with functions.

Given that the sign is always multiplied by 2, I'd suggest just defining two_sign_v.
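
One way the renaming and the `two_sign_v` suggestion could look together is sketched below. This is only an illustration: the function name `exponent_with_1mw` and its signature are hypothetical, and the body simply restates the four quoted lines with names that show the values are transformed.

```cpp
#include <cassert>

// Hypothetical helper, not code from the PR. The arguments are the caller's
// v and w; the transformed quantities get names that say what they hold.
inline double exponent_with_1mw(double v, double a, double w) {
  assert(a > 0.0);
  const double neg_v = -v;         // was: const auto v = -v_value;
  const double one_m_w = 1.0 - w;  // was: const auto w = 1 - w_value;
  // The sign of neg_v folded together with the constant 2.
  const double two_sign_v = (neg_v < 0) ? 2.0 : -2.0;
  // Same arithmetic as sign_v * 2.0 * v * a * (1.0 - w) in the original.
  return two_sign_v * neg_v * a * (1.0 - one_m_w);
}
```

With these names it is visible at a glance that the result is never positive, whatever the sign of the drift.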

const auto temp = (sv != 0) ? square(x_vec[0]) : 0;
const auto factor = (sv != 0) ? x_vec[0] / (1 - temp) : 0;
const auto new_v = (sv != 0) ? v + sv * factor : v;
const auto new_w = (sv != 0)
Member

You can distribute the ternary operator down and rearrange to make it all easier to read. I also removed the negation---I find the fewer negations in the conditions, the easier they are to parse in my head.

new_w = (sw == 0) ? w : w + sw * (x_vec[sv == 0 ? 0 : 1] - 0.5);

    ? ((sw != 0) ? ((st0 != 0) ? t0 + st0 * x_vec[2] : t0)
                 : ((st0 != 0) ? t0 + st0 * x_vec[1] : t0))
    : ((sw != 0) ? ((st0 != 0) ? t0 + st0 * x_vec[1] : t0)
                 : ((st0 != 0) ? t0 + st0 * x_vec[0] : t0));
Member

Same principle as last one.

int idx = (sv == 0 && sw == 0) ? 0 : (sv != 0 && sw != 0) ? 2 : 1;
new_t0 = (st0 == 0) ? t0 : t0 + st0 * x_vec[idx];
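
A quick way to convince yourself that the flattened form is a drop-in replacement is to compare it with the nested one over all zero/nonzero combinations. The sketch below does that with plain doubles. The function names are hypothetical, and the outermost `sv != 0` condition of the nested form is cut off in the quoted snippet, so it is an assumption here.

```cpp
#include <array>
#include <cassert>
#include <initializer_list>

// Nested form, as in the code under review. The outer `sv != 0` test is
// assumed, since the quoted snippet starts after it.
inline double new_t0_nested(double sv, double sw, double st0, double t0,
                            const std::array<double, 3>& x_vec) {
  return (sv != 0)
             ? ((sw != 0) ? ((st0 != 0) ? t0 + st0 * x_vec[2] : t0)
                          : ((st0 != 0) ? t0 + st0 * x_vec[1] : t0))
             : ((sw != 0) ? ((st0 != 0) ? t0 + st0 * x_vec[1] : t0)
                          : ((st0 != 0) ? t0 + st0 * x_vec[0] : t0));
}

// Flattened form: pick the index first, then apply a single ternary.
inline double new_t0_flat(double sv, double sw, double st0, double t0,
                          const std::array<double, 3>& x_vec) {
  const int idx = (sv == 0 && sw == 0) ? 0 : (sv != 0 && sw != 0) ? 2 : 1;
  return (st0 == 0) ? t0 : t0 + st0 * x_vec[idx];
}

// Returns true if both forms agree on all eight zero/nonzero combinations.
inline bool forms_agree() {
  const std::array<double, 3> x_vec{0.1, 0.2, 0.3};
  for (double sv : {0.0, 0.5}) {
    for (double sw : {0.0, 0.25}) {
      for (double st0 : {0.0, 0.75}) {
        if (new_t0_nested(sv, sw, st0, 1.0, x_vec)
            != new_t0_flat(sv, sw, st0, 1.0, x_vec)) {
          return false;
        }
      }
    }
  }
  return true;
}
```

Because both forms perform the same arithmetic on the same operands, exact floating-point equality is a fair comparison here.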

: ((st0 != 0) ? t0 + st0 * x_vec[0] : t0));
if (y - new_t0 <= 0) {
  return ret_t(0.0);
} else {
Member

Optional: remove the else block and unindent 2 after the return in the if.

T_st0>::value) {
return ret_t(0);
}
using T_y_ref = ref_type_if_t<!is_constant<T_y>::value, T_y>;
Member

I didn't completely work through this but wanted to make sure that these checks weren't being done redundantly in code being called by this code. The general principle is to validate arguments in client-facing code, then skip the checks in internal functions that clients never call directly.

hcubature_err
    = log_error_absolute - lerror_bound + log(fabs(cdf)) + LOG_TWO + 1;

// computation of derivatives and precision checks
Member

Like before, one of these lines feels redundant
