bbi sometimes errors when submitting with --overwrite while model is running #322
Comments
I think that's consistent with something like this:
In this case, process 2 is the
It's probably tricky to reliably time things to trigger any given error, but here's one example of an error that can be triggered by the script below (that uses
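script
#!/bin/sh
# Reproducer: start a run in the background, then resubmit the same model with --overwrite.
set -eu
# Work in a fresh temporary directory.
tdir=$(mktemp -d "${TMPDIR:-/tmp}"/bbi-322-XXXXXXX)
cd "$tdir"
# Fetch the example model and data from the bbi integration tests.
url_base=https://raw.githubusercontent.com/metrumresearchgroup/bbi/v3.3.0/integration/testdata/acop
curl -fSsL "$url_base/acop.mod" >1.mod
curl -fSsLO "$url_base/acop.csv"
bbi init --dir /opt/NONMEM
# The first run goes to the background...
bbi nonmem run local --nm_version nm75 1.mod >run1.out 2>run1.err &
# ...and the second run races against it with --overwrite.
bbi nonmem run local --nm_version nm75 --overwrite 1.mod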
If you have two bbi processes running and racing against each other, I don't think there's anything to do to avoid all the possible errors. I'd say the main thing to do is to prevent that in the first place by having bbi write some sort of lock file to the output directory that indicates that it is running and then have it remove the lock after it completes (related to last point here). Any subsequent bbi invocation would refuse to work on a directory with a lock file, giving a clear error. I don't think you're hoping for with

One risk is when bbi dies unexpectedly without a chance to clean that up. Then you have a stale lock file around that would cause a future bbi to refuse to work. bbi could mention in the error message that, if you know another bbi process isn't running, you can manually remove the lock file. (This is similar to what Git does with its index lock.)
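To make the lock file idea concrete, here is a rough shell sketch of the acquire/refuse/release flow. It is only an illustration of the approach, not how bbi works today: the file name bbi.lock and the helpers acquire_lock and release_lock are made up for this example, and a real implementation would live inside bbi itself.

```shell
#!/bin/sh
# Hypothetical sketch of a per-directory lock; bbi has no such helpers today.
# "bbi.lock" is an assumed file name chosen for illustration.

# acquire_lock DIR: create DIR/bbi.lock, failing if it already exists.
acquire_lock() {
    lock="$1/bbi.lock"
    # With noclobber (set -C), ">" refuses to overwrite an existing file,
    # so the existence check and the creation happen in a single step.
    if ( set -C; echo "$$" >"$lock" ) 2>/dev/null; then
        return 0
    fi
    echo "error: $lock exists; another process may be running in $1" >&2
    echo "if you know that none is, remove the lock file manually" >&2
    return 1
}

# release_lock DIR: remove the lock once the run has completed.
release_lock() {
    rm -f "$1/bbi.lock"
}

# Example use around a run (outdir is a placeholder):
#   acquire_lock "$outdir" || exit 1
#   trap 'release_lock "$outdir"' EXIT
#   ... do the run ...
```

The trap covers normal exits and most failures, but not a hard kill, which is exactly the stale lock case described above. That is why the error message points at manual removal.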
I'm not fully clear what's happening here, but I noticed a strange behavior where submitting a model with --overwrite while that same model is currently running will cause bbi to error out sometimes. Some examples are below, but this seems to be unreliable/flaky, because sometimes I was able to reproduce it and other times it just seemed to work as expected.

I'm not sure if there's anything we can really do here, at least on the bbi side, but I wanted to document it so we can look into it a little deeper at some point.

As a side comment: this may be an argument for making submit_models() check the bbr::submit_models(..., .overwrite) arg first and delete directories up front. There is some discussion related to this in comments on bbr#691.

truncated error output from bbr
This is where I originally saw it, while trying to re-run models that had failed or were stuck from a bbr bootstrap (working with unreleased bbr 1.10.0.8005).

It went on longer than this, but got cut off eventually. From here, I ran the printed bbi command directly in the terminal and posted the full output below.

full error output from bbi run in terminal
trying (and failing) to reproduce reliably with a single model