Stale file handle error #128
Comments
Original comment by Michael Hoffman (Bitbucket: hoffman, GitHub: michaelmhoffman). This likely stems from configuration issues on the cluster rather than from Segway's code, and there is probably nothing we can do about it by changing Segway.
Original comment by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86). I believe there may be a race condition here in the way observation files are managed in minibatch mode. Currently, observation filenames are not instance-specific. It seems possible, for example, that two instances could simultaneously be attempting to delete or open the same observation file for reading/writing. Section A.10 of the NFS FAQ also mentions ESTALE errors being reported when referring to items that may have been deleted. The |
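To make the suspected race concrete, here is a minimal sketch of what instance-specific observation filenames could look like. It is not Segway's actual code: the helper names instance_observation_path and remove_if_present, the example filename, and the choice of a UUID as the instance identifier are all assumptions made for illustration.

```python
import errno
import os
import uuid


def instance_observation_path(dirname, basename, instance_id=None):
    """Build an observation filename unique to one instance.

    Hypothetical helper: the instance id is inserted before the file
    extension so that two instances never share the same path.
    """
    if instance_id is None:
        instance_id = uuid.uuid1().hex
    root, ext = os.path.splitext(basename)
    return os.path.join(dirname, f"{root}.{instance_id}{ext}")


def remove_if_present(path):
    """Delete path, tolerating a file that is already gone.

    ENOENT means another process removed the file first. ESTALE is what
    NFS may report for a handle to a deleted file; it is looked up with
    getattr in case the platform does not define it.
    """
    ignorable = {errno.ENOENT, getattr(errno, "ESTALE", errno.ENOENT)}
    try:
        os.remove(path)
    except OSError as err:
        if err.errno not in ignorable:
            raise
```

With a scheme like this, each instance would only ever open or delete its own files, so a concurrent delete by another instance could not invalidate a handle it is still using. Whether this fits Segway's minibatch file management is an open question.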
Original comment by Michael Hoffman (Bitbucket: hoffman, GitHub: michaelmhoffman).
Original report (BitBucket issue) by Mickaël Mendez (Bitbucket: Mickael Mendez).
While running Segway 2.0.2 in reverse mode I ran into a Stale file handle error. Below are the logs and the command of the job that failed.
segway command
Segway output
EM training error
The source of the error seems to be the job:
emt0.19.1233.train.637ed75e7b0f11e8975fbd311cda90ee
____ PROGRAM ENDED SUCCESSFULLY WITH STATUS___
This job has two unexpected behaviors:
jobs.tab