[FEATURE] EKS NodeGroups scalability #197
Comments
@a13zen Feel free to add more context about the ask here
Testing by setting the desired/minimum capacity to 0 shows the ASG terminating its nodes but then deploying 2 nodes again. This could be due to the base workloads deployed by the EKS module.
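For reference, the scaling change described above might look like the sketch below when done through the EKS API with boto3; the cluster and node group names are hypothetical, not the module's actual values:

```python
import boto3

eks = boto3.client("eks")

# Scale the GPU node group down: min and desired must both be 0 for the
# managed node group to release all of its instances.
eks.update_nodegroup_config(
    clusterName="my-eks-cluster",      # hypothetical cluster name
    nodegroupName="gpu-nodegroup",     # hypothetical node group name
    scalingConfig={"minSize": 0, "desiredSize": 0, "maxSize": 4},
)
```

If nodes come back after this, it would point at pending pods (for example the base add-ons) being scheduled onto this group, which would be consistent with the base-workload hypothesis above.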
Hey @a13zen, I was able to test the workflow and here is an update:
Expectation: when a user launches a GPU pod/job (in this context), the CA will query the tags on the GPU NG and scale out appropriately, thereby running the GPU pod/job. When the GPU NG is launched, it is expected behavior that … Having said the above, I am thinking of a design where I would refactor the EKS module to launch a system NG with …
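For the scale-from-zero expectation above, one commonly documented approach with the Cluster Autoscaler is to advertise the GPU capacity and labels on the node group's backing ASG through `k8s.io/cluster-autoscaler/node-template/...` tags, so the CA can build a node template while the group sits at 0 nodes. A minimal boto3 sketch, where the ASG and cluster names are placeholders rather than anything from this repo:

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Tags read by the Cluster Autoscaler to build a node template for a group at 0 nodes.
asg_name = "eks-gpu-nodegroup-asg"  # hypothetical ASG behind the GPU NG

tags = {
    "k8s.io/cluster-autoscaler/enabled": "true",
    "k8s.io/cluster-autoscaler/my-eks-cluster": "owned",  # hypothetical cluster name
    "k8s.io/cluster-autoscaler/node-template/resources/nvidia.com/gpu": "1",
    "k8s.io/cluster-autoscaler/node-template/label/usage": "gpu",
}

autoscaling.create_or_update_tags(
    Tags=[
        {
            "ResourceId": asg_name,
            "ResourceType": "auto-scaling-group",
            "Key": key,
            "Value": value,
            "PropagateAtLaunch": True,
        }
        for key, value in tags.items()
    ]
)
```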
Yes, having a simple system NG with small instances could be a good middle ground for sure. Do we know if m5.large would be sufficient for the default services deployed by the EKS module?
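One rough way to answer the m5.large question is to sum the CPU/memory requests of the base services currently running and compare the totals with what an m5.large can allocate (about 2 vCPU and 8 GiB before system reservations). A sketch with the kubernetes Python client, assuming the module's base services live in kube-system, which may not cover everything the module deploys:

```python
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

def to_millicores(cpu: str) -> int:
    # "100m" -> 100, "0.5" -> 500, "1" -> 1000
    return int(cpu[:-1]) if cpu.endswith("m") else int(float(cpu) * 1000)

def to_mebibytes(mem: str) -> float:
    units = {"Ki": 1 / 1024, "Mi": 1, "Gi": 1024, "M": 1e6 / 2**20, "G": 1e9 / 2**20}
    for suffix, factor in units.items():
        if mem.endswith(suffix):
            return float(mem[: -len(suffix)]) * factor
    return float(mem) / 2**20  # plain bytes

cpu_m, mem_mi = 0, 0.0
for pod in v1.list_namespaced_pod("kube-system").items:
    for c in pod.spec.containers:
        requests = (c.resources.requests or {}) if c.resources else {}
        cpu_m += to_millicores(requests.get("cpu", "0m"))
        mem_mi += to_mebibytes(requests.get("memory", "0Mi"))

print(f"kube-system requests: {cpu_m}m CPU, {mem_mi:.0f}Mi memory")
```

Any other namespaces the module deploys base services into would need to be added to the loop before drawing a conclusion.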
From my understanding, …
Is your feature request related to a problem? Please describe.
Related to
modules/compute/eks
Describe the solution you'd like
The current manifests deploy EKS managed node groups with a desired count of at least 1. Test whether the workloads can scale with 0 as the starting capacity, so we can save $ for customers.
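As a sketch of what the zero-capacity variant could look like when the GPU node group is created through the EKS API directly; all names, subnets, ARNs, and instance types below are placeholders, not the module's actual values:

```python
import boto3

eks = boto3.client("eks")

# Create the GPU node group with zero starting capacity so instances (and cost)
# only appear once the Cluster Autoscaler scales the group out.
eks.create_nodegroup(
    clusterName="my-eks-cluster",                              # placeholder
    nodegroupName="gpu-nodegroup",                             # placeholder
    scalingConfig={"minSize": 0, "desiredSize": 0, "maxSize": 4},
    subnets=["subnet-aaaa1111", "subnet-bbbb2222"],            # placeholders
    instanceTypes=["g4dn.xlarge"],
    amiType="AL2_x86_64_GPU",
    nodeRole="arn:aws:iam::111122223333:role/eks-node-role",   # placeholder
    labels={"usage": "gpu"},
    taints=[{"key": "nvidia.com/gpu", "value": "true", "effect": "NO_SCHEDULE"}],
)
```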