Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

deployer: add info generating commands under resource-allocation #3337

Merged
merged 5 commits into from
Nov 4, 2023

Conversation

consideRatio
Copy link
Contributor

@consideRatio consideRatio commented Oct 28, 2023

This PR adds two commands of relevance towards refining the resource-allocation script to meet more needs. The added commands are updating their separate .yaml file's with collected information.

Script overview

  • daemonset-requests summarizes the requests from all daemonsets, and clarifies what daemonsets contributed to the requests and what daemonset didn't.

    The script makes us better understand and clarify differences observed. It helped me verify that the current overhead is mostly cloud provoider specific, to some extent cluster specific (feature and k8s version differences), and to a small extent instance type/node pool specific (gpu drivers).

    A caveat of this implementation strategy is that it doesn't intelligently rule out that a daemonset may not only be schedule on some nodes, for example core nodes only, or on nodes with gpus. I've filtered out the gpu related daemonsets manually, and not observed any core node pool only daemonsets.

  • instance-capacities scrapes the kubectl get node reported capacity and allocatable (allocatable capacity), and saves both the lowest and highest reported value to help us overview the spread of these values across clusters. It turns out that there is a slight spread, but its very low.

With update-node-info, why these commands?

These are iterations on update-node-info, but they are added separately as update-node-info directly hooks into the resource-allocation choices command, making an update to it break the existing behavior.

@consideRatio consideRatio requested a review from a team as a code owner October 28, 2023 16:19
@github-actions

This comment was marked as resolved.

@consideRatio consideRatio marked this pull request as draft October 28, 2023 16:21
@consideRatio consideRatio force-pushed the pr/update-node-info branch 2 times, most recently from d7fcda1 to 83279dd Compare October 28, 2023 16:29
@consideRatio consideRatio marked this pull request as ready for review October 28, 2023 16:58
Copy link
Member

@GeorgianaElena GeorgianaElena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀 Thanks @consideRatio

deployer/README.md Outdated Show resolved Hide resolved
deployer/README.md Outdated Show resolved Hide resolved
@consideRatio
Copy link
Contributor Author

Thank you for reviewing this @GeorgianaElena! I've not resolved all comments, but will get it done soonish!

@consideRatio
Copy link
Contributor Author

consideRatio commented Nov 4, 2023

Thank you for reviewing this @GeorgianaElena!!!

  • Rebased
  • renamed command instance-capacity to instance-capacities
  • renamed command daemonset-overhead to daemonset-requests
  • updated docstrings to reflect feedback from @GeorgianaElena

I'll go for a merge here at this point!

@consideRatio consideRatio merged commit 1be1355 into 2i2c-org:master Nov 4, 2023
Copy link

github-actions bot commented Nov 4, 2023

🎉🎉🎉🎉

Monitor the deployment of the hubs here 👉 https://github.com/2i2c-org/infrastructure/actions/runs/6754266612

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Status: Done 🎉
2 participants