DRA: support dynamic device provisioning #5075
Comments
/cc |
/cc |
Please put the PR in draft mode if you want reviews |
Thank you @sunya-ch, this is really interesting. As mentioned in the WG meeting, there are several KEPs that I think have some overlap, and I'd like to try to bring them together into a single solution, if possible.

In particular, this proposal has some similarities to a concept we have been discussing for a while, which I call "per-device allocatable resources" (see this doc, which discusses this concept among other things, though I am not sure I use that term in the doc). In that concept, the "capacities" of a given device can be allocatable, allowing sharing of the device in the same manner that node allocatable resources allow sharing of a node. In your case, you do the same thing, but you provision a new device to represent that set of shared capacity and reference the source device. That may be a useful construct; I need to think about it some more. With "per-device allocatable resources", we don't need an explicit provisioning limit: once all of a capacity is consumed, you can't allocate any more of it.

Another aspect I see is that the need to "provision" creates a need for a lifecycle of resource claim actuation. This is similar to what is needed for #5007, except in the networking case it does not (I don't think?) need to happen before the pod gets scheduled. So that may not be quite the same thing.

My next step is to look at #5007, #5075, #4815, and a few other ideas that don't yet have enhancement issues, and start a doc where we can work through a design together. cc @catblade |
/cc |
/cc |
@johnbelamaric Thank you so much for the pointer to related enhancement issues. I will walk through the list too. |
Thank you. I will create a PR in draft. |
Please also update the issue description to use the normal KEP template (checklists, etc.). Then you can use the 5075 issue number as the number in your KEP PR. |
@sunya-ch please create the draft PR and I will comment there. I was going to put together a doc but I think commenting is probably better. I think we might be able to solve this and @catblade's use cases with per-device allocatable resources. My current thinking is that this is distinct from the disaggregated device KEP, because your device creation is all still node local and therefore the lifecycle is the same as our existing device models. But once I have the PR I can provide a more thorough response. |
/cc |
@pohly Thank you for your advice. I have updated the issue description. @aojea @johnbelamaric I have created the draft PR #5104 |
Enhancement Description
(k/enhancements) update PR(s):
(k/k) update PR(s):
(k/website) update PR(s):
Please keep this description up to date. This will help the Enhancement Team to track the evolution of the enhancement efficiently.
Note:
Motivation:
The device driver can dynamically generate a new device and allocate it to the pod, based on a selected host device or profile.
The original use case is a CNI driver that can call macvlan or ipvlan to generate a virtual device on top of a given master network device.
This enhancement leverages the benefits of DRA over the conventional CNI approach, especially in a multi-network context, which I highlight in user story 1 of the KEP draft. Assume a node has 2 x 10 Gbps NICs. If one NIC has already been allocated to a pod, another pod can request its own 10 Gbps network without needing to check which NIC is still unallocated or hard-coding the master device name.
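The two-NIC scenario can be illustrated with a minimal sketch. This is not the DRA API or the KEP's design; `HostNIC`, `VirtualDevice`, and `Provisioner` are hypothetical names invented for illustration, and bandwidth is assumed to be the only capacity tracked. It shows a request being served from whichever master device still has capacity, so the requester never names the master itself.

```python
from dataclasses import dataclass


@dataclass
class HostNIC:
    """A physical NIC on the node (hypothetical model, not a DRA type)."""
    name: str
    bandwidth_gbps: int
    allocated_gbps: int = 0

    def free_gbps(self) -> int:
        return self.bandwidth_gbps - self.allocated_gbps


@dataclass
class VirtualDevice:
    """A device provisioned on demand on top of a master NIC."""
    name: str
    master: str
    mode: str  # e.g. "macvlan" or "ipvlan"
    bandwidth_gbps: int


class Provisioner:
    """Picks a master NIC with enough free capacity and provisions on it."""

    def __init__(self, nics):
        self.nics = list(nics)
        self._count = 0

    def provision(self, requested_gbps, mode="macvlan"):
        for nic in self.nics:
            if nic.free_gbps() >= requested_gbps:
                # Consume capacity on the chosen master device.
                nic.allocated_gbps += requested_gbps
                self._count += 1
                return VirtualDevice(
                    name=f"{mode}{self._count}",
                    master=nic.name,
                    mode=mode,
                    bandwidth_gbps=requested_gbps,
                )
        # No master device has enough capacity left.
        return None


provisioner = Provisioner([HostNIC("eth0", 10), HostNIC("eth1", 10)])
first = provisioner.provision(10)   # served from eth0
second = provisioner.provision(10)  # eth0 is full, so served from eth1
third = provisioner.provision(10)   # nothing left, returns None
```

In this sketch the limit falls out of capacity accounting alone: once both NICs are fully consumed, further requests simply fail, without a separate provisioning limit.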