Skip to content

Stop rbmi writing to cache directory on CRAN#535

Merged
gowerc merged 23 commits intomainfrom
donttest
Dec 5, 2025
Merged

Stop rbmi writing to cache directory on CRAN#535
gowerc merged 23 commits intomainfrom
donttest

Conversation

@gravesti
Copy link
Collaborator

@gravesti gravesti commented Nov 26, 2025

Closes #537

@gravesti
Copy link
Collaborator Author

@gowerc I think this catches the NOTE that CRAN was finding. It's not actually related to dont-test at all. It's triggered by _R_CHECK_THINGS_IN_OTHER_DIRS_ = true. (That said it didn't show for me locally 🤷 )

We could eventually integrate this with another workflow.

@gravesti gravesti requested a review from gowerc November 26, 2025 09:51
Copy link
Collaborator Author

@gravesti gravesti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gowerc I approve these changes (but you'll have to do it because its my PR)

@gowerc
Copy link
Collaborator

gowerc commented Nov 28, 2025

@gravesti / @danielinteractive / @luwidmer

Some notes about the changes that I've made here:

  • @gravesti created a workflow that replicated the error, I extracted the core logic of this into a shell script which can in theory be run locally and then updated the workflow to just call that shell script. You can see it replicating the error here: https://github.com/insightsengineering/rbmi/actions/runs/19768301887/job/56646410987

  • I updated the cache directory to default to tempfile(). This means by default it is unique to just the current R session so you will always have to recompile across sessions. The thinking here is this should significantly reduce the risk of future issues as this directory is automatically cleaned up after R has finished and doesn't trigger the _R_CHECK_THINGS_IN_OTHER_DIRS_ test

Currently the directories monitored are the home directory, /tmp (excluding RtmpXXXXXX directories), /dav/shm, ~/.cache (recursively) and ~/.local/share (recursively) or their equivalents on Windows and macOS (the directories in which the default settings for tools::R_user_dir() use a R subdirectory).
https://cran.r-project.org/doc/manuals/r-release/R-ints.html#Tools

  • I updated the hashing function to consume the content of the file itself. This means if any line of code changes as well as the R version / R package versions change then a new hash is generated. This way there should be no risk of using an old cached file. As such I have removed the "clean_cache()" function as the logic was getting a bit awkward and I don't think its needed anymore.

  • I updated the code so that it constructs the model fully in memory and only ever writes to disk if getOption("rbmi.enable_cache") == TRUE

  • options("rbmi.enable_cache") is set to FALSE by default for all tests unless the global environment variable RBMI_TEST_LOCAL=TRUE

I will disable all Stan tests in a future PR

Copy link
Collaborator Author

@gravesti gravesti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gowerc Looks good, just one note

@luwidmer
Copy link
Collaborator

@gowerc you mentioned "I will disable all Stan tests in a future PR" - is this just on CRAN, or everywhere?

@gowerc
Copy link
Collaborator

gowerc commented Nov 29, 2025

@luwidmer - That would just be for CRAN. I need to adjust the code and GHA workflows a bit but I was thinking we'd have 3 tiers of tests:

  • Minimal (CRAN)
  • Core ( CI workflows + local execution of tests)
  • Full (cron job that runs twice a month)

Main justification here for the "minimal" on CRAN being that we need to get the run time down below 10 minutes which is difficult now we don't cache the model on CRAN thus easiest option is to disable all rstan tests. This is also justifiable as we rstan is only a suggests for the package anyway.

@luwidmer
Copy link
Collaborator

Works for me. For OncoBayes2 we still sample with 10 samples total and 2 warmup "fake sampling" on CRAN, but if your runtime is dominated by compilation, then I'd skip there too.

@gowerc gowerc changed the title add "donttest" workflow Stop rbmi writing to cache directory on CRAN Dec 1, 2025
@gowerc
Copy link
Collaborator

gowerc commented Dec 1, 2025

Hmmm when running this on my local machine I am sometimes getting errors such as:

* checking for new files in some other directories ... NOTE
Found the following files/directories:
  ‘/var/folders/6s/f6_p_03n6g182gs7f_fzf_5w0000gn/T/RtmpvPPIP6’
  ‘/var/folders/6s/f6_p_03n6g182gs7f_fzf_5w0000gn/T/RtmpxynGyO’
* DONE

My assumption is this is due to other background processes adding folders but I am not confident... in particular the fact the have Rtmp in the root file/folder name worries me a bit...

EDIT:

Definitely appears to be R related :(

~/Desktop/Work/rbmi donttest* 11s
> cd /var/folders/6s/f6_p_03n6g182gs7f_fzf_5w0000gn/T/RtmpvPPIP6     

/var/folders/6s/f6_p_03n6g182gs7f_fzf_5w0000gn/T/RtmpvPPIP6
> ls
libloc_228_cce4ed05d3cb9016.rds
image

EDIT2:
I'm 80% confident this is a false positive due to me running multiple tests in parallel. Looks like other R process writing to their temp directory seems to trigger the _R_CHECK_THINGS_IN_OTHER_DIRS_ test

@gowerc
Copy link
Collaborator

gowerc commented Dec 1, 2025

Ok apologies this PR has ballooned a bit. Unfortunately to get the unit tests to work in a more sensible and clear way required a minor overhaul to them.

  • Nearly all GHA workflow tests now just point to the tests/scripts/*.sh files
  • I've renamed the additional test tiers to be "core" and "extended". By default neither of these tiers will be run unless explicit environment variables have been set (which is what the *.sh files control)
  • I've renamed the main GHA workflows to be "core" / "extended" / "cran" to represent what they cover. In particular "cran" includes the _R_CHECK_THINGS_IN_OTHER_DIRS_ test to replicate the aforementioned error.
  • Had to update the docker image to put all of TinyTex bin on the $PATH so that the CRAN manual checks didn't fail (as well as implement a bunch of changes to get corporate certs working for local building of the image)
  • Deleted the "build-src" and "rcmdcheck" GHA as neither of these are needed anymore with the above "*.sh" script inclusion

@gravesti / @luwidmer / @danielinteractive - I think I am done with this now if any of you can spare any time to skim the code again. To be honest I'd ignore all the docker / unit test changes I think the important stuff is just the changes to R/utilities.R and R/zzz.R.

In particular I wasn't sure about the save_dso argument to rstan::stan_model() I set it to FALSE as a precaution but I'm honestly not sure if I should leave it to TRUE or not ?

Copy link
Collaborator Author

@gravesti gravesti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gowerc From my POV this looks good to merge. I can't actually approve since I opened the PR.

@gowerc gowerc merged commit 9bf273a into main Dec 5, 2025
5 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Dec 5, 2025
@gowerc gowerc deleted the donttest branch January 20, 2026 14:40
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

rbmi writing files to cache directory during CRAN tests

3 participants