
Updated CmdStan to 2.37 and updated package version in Project.toml #30

Merged
bparbhu merged 3 commits into release/stanvar-v4_5_3 from ci/update_cmdstan_to_237
Mar 29, 2026

Conversation

Contributor

@bparbhu bparbhu commented Nov 20, 2025

Updating CmdStan to 2.37 and bumping the package version in Project.toml to 4.5.3.

@bparbhu bparbhu self-assigned this Nov 20, 2025
Contributor Author

bparbhu commented Nov 20, 2025

CI failed at the bernoulli run. I'll check to see what's going on there.

Contributor Author

bparbhu commented Nov 20, 2025

The old expected value in the tests is -8.2; we might need to shift it to -8.5 or -8.6.


@caesoma caesoma left a comment


It's not clear why atol would change; could you please clarify?

Contributor

JeffreyBroll commented Nov 22, 2025

I'm further confused about why these are the parameters we're checking. These correspond to checks of the 'log_p__' and 'log_g__', which I suspect mean much more to people who aren't me, and we aren't considering $\theta$, which we're actually trying to fit and can compare with the closed-form solution $\beta(4,8)$ (mean 1/3, stdev ~ 0.13ish). What's the reasoning behind the original and new tests?
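
Those closed-form figures can be verified directly; a minimal sketch in plain Julia (no package assumptions), just computing the Beta(4, 8) moments:

```julia
# Closed-form moments of the Beta(α, β) posterior mentioned above, with α = 4, β = 8
α, β = 4, 8

post_mean = α / (α + β)                              # = 1/3
post_std  = sqrt(α * β / ((α + β)^2 * (α + β + 1)))  # ≈ 0.1307
```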

Contributor Author

bparbhu commented Nov 22, 2025

Hey all, I'm not too sure why the atol values are set to those particular values. Maybe @goedman could help us understand the tests written here. I padded them to 0.7 because the values we were getting were deviating from -8.2, so I'm just checking to understand this a bit. I don't think there are any issues with CmdStan 2.37; I set the tolerances that way to get the tests to pass, based on the new values under CmdStan 2.37.

@JeffreyBroll
Contributor

I do want to hold off a little until we know what the idea behind these tests is. Maybe we should take notes from other interfaces...

Contributor Author

bparbhu commented Mar 8, 2026

Hey all, I've been sitting on this for a bit. I have not seen the criteria that we test for in CmdStanPy or CmdStanR.

Also something to keep in mind is this interface was made before the other two. So what they decided to test for might be different.

In CmdStanPy this is the variational test that is run https://github.com/stan-dev/cmdstanpy/blob/develop/test/test_variational.py .

In CmdStanR this is the test for variational here https://github.com/stan-dev/cmdstanr/blob/master/tests/testthat/test-model-variational.R

In the future we can adopt something similar to what is done in those interfaces, but for now I would like to push this along.

The values we had previously can be reverted if necessary, but we can also build toward tests that align with the other interfaces.

I looked through those tests to see if there were assertions for these values, and there were not any around atol values.

@test ms[1, 2, 1] ≈ -8.2 atol=0.7
@test ms[1, 2, 2] ≈ -8.2 atol=0.7
@test ms[1, 2, 3] ≈ -8.2 atol=0.7
@test ms[1, 2, 4] ≈ -8.2 atol=0.7

Again, we can revert back to these values if need be:

  @test ms[1, 2, 1] ≈ -8.2 atol=0.3
  @test ms[1, 2, 2] ≈ -8.2 atol=0.3
  @test ms[1, 2, 3] ≈ -8.2 atol=0.3
  @test ms[1, 2, 4] ≈ -8.2 atol=0.3
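
If we later align with a θ-based check against the closed-form Beta(4, 8) posterior (mean 1/3, std ≈ 0.1307), a minimal sketch of what that could look like — names like theta_draws are illustrative placeholders, not the actual API, and synthetic draws stand in for real variational output here:

```julia
using Test, Statistics

# Stand-in for the variational draws of θ; in a real test these would
# come from the fitted model, not from randn.
theta_draws = 1/3 .+ 0.1307 .* randn(1_000)

# Closed-form Beta(4, 8) posterior: mean 1/3, std ≈ 0.1307
@test mean(theta_draws) ≈ 1/3 atol = 0.05
@test std(theta_draws) ≈ 0.1307 atol = 0.05
```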

@JeffreyBroll
Contributor

I still just don't quite know that these are the right tests. The R tests make sure things run or fail as expected, and the Python tests include some approximate value checks for good step-size parameters and fits; I think that makes enough sense to imitate. It's not clear what this test is getting us or how we'd use it to diagnose anything.

@bparbhu bparbhu requested a review from caesoma March 9, 2026 02:09
Contributor Author

bparbhu commented Mar 9, 2026

Understood. Even if we want to redesign the tests, I still think we should push forward. All we really did was accommodate the newer version of CmdStan; we didn't do anything that would warrant serious concern, to be honest. I do agree that we should understand why we are using these tests and what they should represent, but given the circumstances we could bring this up on the Stan Discourse and maybe get all of us to centralize our tests and what we test for. That's something other Stan maintainers have been striving for anyway, in terms of consistency across the interfaces. As such, I think we should proceed, but follow up with others about consolidating the types of tests we're doing.

@JeffreyBroll
Contributor

Given that extra note on the to-do list I'm in favor.

Contributor Author

bparbhu commented Mar 9, 2026

> Given that extra note on the to-do list I'm in favor.

Awesome, thanks @JeffreyBroll, much appreciated!

Contributor Author

bparbhu commented Mar 29, 2026

Merging this in now.

@bparbhu bparbhu merged commit a05d772 into release/stanvar-v4_5_3 Mar 29, 2026
1 check passed