Support for Containment #78
Comments
Support for containment plugins is described in #3, though that is probably not exactly the umbrella issue we'd want to tie to the multi-user project. I've added the flux-framework core team to this repo, so you should be able to modify it now.
The way I use "containment" is to indicate those container and resource restriction methods that require privilege and therefore cannot be escaped by the user job. Maybe we can call non-privileged resource containment "binding" instead, since it is completely under the user's control, can be undone, etc. Any binding plugins or features would likely be implemented as part of the job shell, or the user's own job.
As an example, if we want to make a containment plugin for CUDA devices, we might create an IMP plugin that uses a devices cgroup and adds only the GPUs assigned to the job to that cgroup. Within the job container, the job shell could then set CUDA_VISIBLE_DEVICES for each task individually to "bind" one or more GPUs to that task. Without the containment piece, jobs sharing nodes could access GPUs of other jobs simply by resetting CUDA_VISIBLE_DEVICES.
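To make the "binding" half of that example concrete, here is a minimal sketch of how a job shell might compute a per-task CUDA_VISIBLE_DEVICES value from the GPUs assigned to the job. The function name `task_gpu_env` and the block/round-robin distribution policy are hypothetical, chosen only for illustration; nothing here is an existing Flux or IMP interface, and the GPU ids are assumed to be whatever numbering CUDA sees inside the job.

```python
def task_gpu_env(job_gpus, ntasks):
    """Return one environment dict per task, binding GPUs to tasks.

    Hypothetical helper: job_gpus is the list of GPU ids assigned to
    the job, ntasks is the number of tasks the job shell will launch.
    """
    if not job_gpus:
        raise ValueError("job has no GPUs assigned")
    if ntasks < 1:
        raise ValueError("ntasks must be at least 1")

    # Assumed policy: give each task an equal contiguous block of GPUs;
    # if there are more tasks than GPUs, share GPUs round-robin.
    per_task = len(job_gpus) // ntasks
    envs = []
    for rank in range(ntasks):
        if per_task >= 1:
            chosen = job_gpus[rank * per_task:(rank + 1) * per_task]
        else:
            chosen = [job_gpus[rank % len(job_gpus)]]
        envs.append(
            {"CUDA_VISIBLE_DEVICES": ",".join(str(g) for g in chosen)}
        )
    return envs


if __name__ == "__main__":
    # Job was assigned GPUs 2 and 3 and runs two tasks.
    for rank, env in enumerate(task_gpu_env([2, 3], 2)):
        print(rank, env)
```

Because this only sets an environment variable, any task can override it, which is exactly why it counts as binding rather than containment: the privileged devices cgroup is what would actually keep the job away from GPUs it was not assigned.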
BTW, don't forget we will have to do this for AMD ROCm GPUs.
Not sure this is the right place for this issue, but on the timeline there is a post-it for "IMP Exec and Containment". The "IMP Exec" piece seems to be broken up across #45, #46, and #47. I do not see anything on containment in this repo.
Presumably some types of containment will require root/setuid and thus must be integrated into the IMP. Maybe there should be a corresponding issue in flux-core for all types of containment that are possible without root?