Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

write bbi_config.json even on failure? #318

Open
kyleam opened this issue May 10, 2024 · 0 comments
Open

write bbi_config.json even on failure? #318

kyleam opened this issue May 10, 2024 · 0 comments

Comments

@kyleam
Copy link
Collaborator

kyleam commented May 10, 2024

[ related to metrumresearchgroup/bbr#685, +cc @seth127 @barrettk ]

For the various errors where RecordConcurrentError is invoked (including a non-zero exit of the nmfe* subprocess)

bbi/cmd/local.go

Lines 185 to 191 in 835953e

cerr := executeNonMemJob(executeLocalJob, l.Nonmem)
if cerr.Error != nil {
p := &l
p.BuildExecutionEnvironment(false, cerr.Error)
RecordConcurrentError(p.Nonmem.Model, cerr.Notes, cerr.Error, channels, p.Cancel, p)
}

... true is sent the cancel channel

bbi/cmd/root.go

Lines 177 to 178 in 835953e

func RecordConcurrentError(model string, notes string, err error, channels *turnstile.ChannelMap, cancel chan bool, executor PostWorkExecutor) {
cancel <- true

... which stops the execution before the turnstile cleanup phase.

As a result, bbi_config.json isn't written. As discussed at metrumresearchgroup/bbr#685, from bbr's perspective it would be useful to have bbi_config.json consistently written as an indication that a model run has completed (even if unsuccessfully).

When considering this change, we should go through local's Cleanup and decide which parts should run in case of a failure.


Here are three related issues mentioned in the bbr thread. (At this point I'll track them here, but if they don't end up getting handled by the same series, they should be split off into dedicated issues.)

  • if bbi aborts early due to the data file not existing, it doesn't write the config file or create the output directory. From a consistency perspective, it might be good if bbi_config.json was created. I'm not sure where to draw the line between a pure bbi early non-zero exit (e.g., unknown argument given) and a model run error. But if the data file doesn't exist, that's something nmfe would fail on, and bbi is just catching it early. So it seems reasonable for it to behave consistently with other nmfe failures.

    There are perhaps other cases that behave similarly to the non-existent data path.

  • if --overwrite is passed but bbi aborts to a non-existent data path (again, there may be other cases), the directory is not removed. That's confusing from bbr's perspective because it looks at bbi_config.json and will think it's from its latest model submission.

  • bbr is interested in knowing whether a model is currently running. To help, bbi could do the combination of 1) adding a file before launching nmfe* and removing it on cleanup and 2) ensuring cleanup still fires when the nmfe* subprocess exits with a non-zero status

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant