Support for Containment #78

Open
SteVwonder opened this issue Jul 10, 2018 · 5 comments

Comments

@SteVwonder
Member

Not sure this is the right place for this issue, but on the timeline there is a post-it for "IMP Exec and Containment". The "IMP Exec" piece seems to be broken up across #45, #46, and #47. I do not see anything on containment in this repo.

Presumably some types of containment will require root/setuid and thus must be integrated into the IMP. Maybe there should be a corresponding issue in flux-core for all types of containment that are possible without root?

@SteVwonder
Member Author

It looks like I cannot set this issue under projects/labels/milestones, probably due to permissions. If this issue is going to stick around, could @garlick or @grondo add it to the multi-user project?

@grondo
Contributor

grondo commented Jul 10, 2018

Support for containment plugins is described in #3, though that is probably not exactly the kind of umbrella issue we'd want to tie to the multi-user project.

I've added the flux-framework core team to this repo, so you should be able to modify it now.

@grondo
Contributor

grondo commented Jul 10, 2018

> Presumably some types of containment will require root/setuid and thus must be integrated into the IMP. Maybe there should be a corresponding issue in flux-core for all types of containment that are possible without root?

The way I use "containment" is to indicate those container and resource restriction methods that require privilege and are therefore inescapable by the user job.

Maybe we can call non-privileged resource containment "binding" instead, since it is completely under the user's control, can be undone, etc.

Any binding plugins or features would likely be implemented as part of the job shell, or the user's own job.

@grondo
Contributor

grondo commented Jul 10, 2018

As an example, if we want to make a containment plugin for CUDA devices, we might create an IMP plugin that uses a devices cgroup and adds only the GPUs assigned to the job to it.
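To make the containment half concrete, here is a rough sketch of the entries such a plugin might write. It is not an existing IMP interface: `device_rules` and `BASE_DEVICES` are hypothetical names, and it assumes the cgroup v1 devices controller (`devices.deny` / `devices.allow`) and the usual NVIDIA character-device numbering (major 195, `/dev/nvidiaN` at minor N, `/dev/nvidiactl` at minor 255). The sketch only builds the entries; writing them into the job's cgroup is the privileged step.

```python
# Hypothetical sketch, not an actual IMP plugin API.
# Assumes cgroup v1 devices controller and typical NVIDIA device numbering.

NVIDIA_MAJOR = 195      # assumed major number of the NVIDIA character devices
NVIDIACTL_MINOR = 255   # assumed minor number of /dev/nvidiactl

# Illustrative allowlist of basic devices most jobs need:
# /dev/null (1:3), /dev/zero (1:5), /dev/urandom (1:9).
BASE_DEVICES = ("c 1:3 rwm", "c 1:5 rwm", "c 1:9 rwm")


def device_rules(assigned_gpus, base_devices=BASE_DEVICES):
    """Return (filename, entry) pairs to write, in order, into the job's
    devices cgroup so that only the assigned GPUs are reachable."""
    # Start from "deny everything", then allow entries one by one.
    rules = [("devices.deny", "a")]
    for entry in base_devices:
        rules.append(("devices.allow", entry))
    # The control device is needed by any process that uses a GPU.
    rules.append(("devices.allow", f"c {NVIDIA_MAJOR}:{NVIDIACTL_MINOR} rw"))
    # Allow only the GPUs that were assigned to this job.
    for index in sorted(set(assigned_gpus)):
        rules.append(("devices.allow", f"c {NVIDIA_MAJOR}:{index} rw"))
    return rules


if __name__ == "__main__":
    for filename, entry in device_rules([1, 3]):
        print(f"{filename}: {entry}")
```

Because the entries are written by a privileged process and the job runs inside the cgroup, the job cannot undo them, which is the property that separates containment from binding here.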

Within the job container, the job shell could then set CUDA_VISIBLE_DEVICES for each task individually to "bind" one or more GPUs to that task. Without the containment piece, jobs sharing nodes could access GPUs of other jobs simply by resetting CUDA_VISIBLE_DEVICES.
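The binding half could look something like the sketch below. Again, this is only an illustration under assumptions: `bind_gpus` is a hypothetical helper rather than job shell code, and the GPU identifiers are taken to be whatever device indices CUDA reports inside the job's container. It distributes the job's GPUs across tasks and produces the per-task `CUDA_VISIBLE_DEVICES` value.

```python
# Hypothetical sketch of per-task GPU binding; not actual job shell code.

def bind_gpus(visible_gpus, ntasks):
    """Return one environment dict per task rank, each setting
    CUDA_VISIBLE_DEVICES to that task's share of the job's GPUs."""
    gpus = list(visible_gpus)
    if not gpus:
        raise ValueError("job has no GPUs to bind")
    if ntasks < 1:
        raise ValueError("ntasks must be at least 1")
    envs = []
    for rank in range(ntasks):
        # Deal GPUs out round-robin; if there are more tasks than GPUs,
        # the extra tasks share a GPU instead of getting none.
        mine = gpus[rank::ntasks] or [gpus[rank % len(gpus)]]
        envs.append({"CUDA_VISIBLE_DEVICES": ",".join(str(g) for g in mine)})
    return envs


if __name__ == "__main__":
    for rank, env in enumerate(bind_gpus([0, 1, 2, 3], 2)):
        print(rank, env)
```

Since this is just an environment variable, any task can overwrite it before initializing CUDA, so it only works as a restriction when the devices cgroup underneath limits what the job can open in the first place.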

@dongahn
Member

dongahn commented Jul 10, 2018

BTW, don't forget we will have to do this for AMD ROCm GPUs.
