Job containers should be killed when no inner job for some time #175

Open
bzizou opened this issue Mar 3, 2020 · 1 comment

bzizou commented Mar 3, 2020

We could implement an optional feature in Leon to automatically kill job containers that have not had any inner job for a defined timeout.

Containers may be used by users who want to schedule subjobs (for example complementary depopulated jobs, or LHC-like "pilot jobs"), with the side effect that the end of a container does not depend on the end of its inner jobs, so resources are wasted once the inner jobs have finished.

bzizou changed the title from "Job containers may be killed when no inner job for some time" to "Job containers should be killed when no inner job for some time" on Mar 3, 2020

npf commented Mar 3, 2020

This could be implemented quite easily in the meta_sched, I think (the same place where oarwalltime is handled): get all running jobs in the container and, if the latest end time is more than N minutes in the past, kill the container.
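
To make the check concrete, here is a minimal Python sketch of the idle test described above. It is not OAR code: the function name, its parameters, and the choice to start the idle clock at the container's own start time when no inner job has run yet are all assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional


def container_idle_too_long(
    container_start: datetime,
    inner_job_end_times: Iterable[Optional[datetime]],
    now: datetime,
    grace_minutes: int,
) -> bool:
    """Return True if the container has been without inner jobs for longer than the grace period.

    Hypothetical helper, not part of OAR. An end time of None stands for
    an inner job that has not finished yet.
    """
    # Assumption: with no finished inner job, idleness is measured
    # from the start of the container itself.
    latest_end = container_start
    for end in inner_job_end_times:
        if end is None:
            # An inner job is still active, so the container is in use.
            return False
        latest_end = max(latest_end, end)
    # Kill only when the latest end time is more than N minutes in the past.
    return now - latest_end > timedelta(minutes=grace_minutes)
```

A scheduler pass would call this once per container and request the kill when it returns True. The comparison is strict, so a container idle for exactly N minutes survives until the next pass.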

My suggestion would be to offer that with a job type container=autokill:N, where N is the number of minutes of grace before the container is to be killed.
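
As a rough illustration of the proposed syntax, the sketch below extracts N from a job type string. The `container=autokill:N` form is only the suggestion made in this comment, not an existing OAR job type, and the function name is hypothetical.

```python
import re
from typing import Optional

# Proposed (not yet existing) job type syntax: container=autokill:N
_AUTOKILL_RE = re.compile(r"container=autokill:(\d+)")


def parse_autokill_minutes(job_type: str) -> Optional[int]:
    """Return the grace period N in minutes, or None if the type does not request autokill."""
    match = _AUTOKILL_RE.fullmatch(job_type.strip())
    return int(match.group(1)) if match else None
```

A plain `container` type returns None here, which keeps the feature optional: only containers that explicitly ask for autokill would be affected.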
