deployer: add info generating commands under resource-allocation #3337

consideRatio · 2023-10-28T16:19:16Z

This PR adds two commands that help refine the resource-allocation script to meet more needs. Each command updates its own .yaml file with the information it collects.

Closes Refine script to collect information about instances capacity #3338
Closes Add script to collect information about daemonsets cpu/memory requests in various clusters #3330

Script overview

daemonset-requests summarizes the requests from all daemonsets, and clarifies which daemonsets contributed to the requests and which didn't.

The script helps us understand and explain the differences we observe. It helped me verify that the current overhead is mostly cloud provider specific, to some extent cluster specific (feature and k8s version differences), and to a small extent instance type/node pool specific (GPU drivers).

A caveat of this implementation strategy is that it doesn't account for daemonsets that are only scheduled on some nodes, for example only on core nodes or only on nodes with GPUs. I've filtered out the GPU-related daemonsets manually, and haven't observed any daemonsets that run only on the core node pool.
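
To make the summarizing step concrete, here is a minimal sketch of how requests could be summed across daemonsets. It is not the code added in this PR. It assumes input shaped like the `items` list from `kubectl get daemonsets --all-namespaces -o json`, and the function names and the summary layout are hypothetical. It also ignores node selectors and tolerations, so it has the same caveat as described above.

```python
# Illustrative sketch only; this is not the code added in this PR.
# Input is assumed to look like the "items" list from
# `kubectl get daemonsets --all-namespaces -o json`.

# Common Kubernetes memory suffixes (other notations are not handled here).
MEMORY_FACTORS = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30,
    "k": 10**3, "M": 10**6, "G": 10**9,
}


def cpu_to_millicores(quantity):
    """Convert a CPU quantity such as "100m" or "0.5" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def memory_to_bytes(quantity):
    """Convert a memory quantity such as "128Mi" to bytes."""
    for suffix, factor in MEMORY_FACTORS.items():
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * factor)
    return int(quantity)


def summarize_daemonset_requests(daemonsets):
    """Sum requests across daemonsets and record which ones contributed."""
    summary = {
        "cpu_millicores": 0,
        "memory_bytes": 0,
        "requesting": [],      # daemonsets with at least one request
        "non_requesting": [],  # daemonsets without any cpu/memory request
    }
    for ds in daemonsets:
        name = ds["metadata"]["name"]
        cpu = memory = 0
        for container in ds["spec"]["template"]["spec"]["containers"]:
            requests = container.get("resources", {}).get("requests", {})
            cpu += cpu_to_millicores(requests.get("cpu", "0"))
            memory += memory_to_bytes(requests.get("memory", "0"))
        summary["cpu_millicores"] += cpu
        summary["memory_bytes"] += memory
        key = "requesting" if cpu or memory else "non_requesting"
        summary[key].append(name)
    return summary
```

Keeping the requesting and non-requesting names in separate lists mirrors the goal of showing which daemonsets contributed to the total and which didn't.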

instance-capacities scrapes the capacity and allocatable (allocatable capacity) values reported by kubectl get node, and saves both the lowest and highest reported values to give us an overview of their spread across clusters. It turns out that there is a slight spread, but it's very low.
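
As a rough illustration of tracking the lowest and highest reported values, the sketch below processes input shaped like the `items` list from `kubectl get nodes -o json`. It is not the code added in this PR. Grouping by the `node.kubernetes.io/instance-type` label, the function names, and the output layout are all assumptions made for this example.

```python
# Illustrative sketch only; this is not the code added in this PR.
# Input is assumed to look like the "items" list from
# `kubectl get nodes -o json`.

# Common Kubernetes memory suffixes (other notations are not handled here).
MEMORY_FACTORS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def cpu_to_millicores(quantity):
    """Convert a CPU quantity such as "3920m" or "4" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def memory_to_bytes(quantity):
    """Convert a memory quantity such as "16393244Ki" to bytes."""
    for suffix, factor in MEMORY_FACTORS.items():
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * factor)
    return int(quantity)


def summarize_instance_capacities(nodes):
    """Record the lowest and highest capacity/allocatable per instance type."""
    summary = {}
    for node in nodes:
        labels = node["metadata"].get("labels", {})
        # Grouping by this well-known label is an assumption of the sketch.
        instance_type = labels.get("node.kubernetes.io/instance-type", "unknown")
        entry = summary.setdefault(instance_type, {})
        for kind in ("capacity", "allocatable"):
            reported = node["status"][kind]
            values = {
                "cpu_millicores": cpu_to_millicores(reported["cpu"]),
                "memory_bytes": memory_to_bytes(reported["memory"]),
            }
            for key, value in values.items():
                spread = entry.setdefault(
                    f"{kind}_{key}", {"low": value, "high": value}
                )
                spread["low"] = min(spread["low"], value)
                spread["high"] = max(spread["high"], value)
    return summary
```

Storing a low/high pair for each value keeps any spread between nodes of the same instance type visible without recording every node.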

With `update-node-info`, why these commands?

These commands are iterations on update-node-info. They are added separately because update-node-info hooks directly into the resource-allocation choices command, so changing it would break existing behavior.

GeorgianaElena

🚀 Thanks @consideRatio

deployer/README.md

deployer/commands/generate/resource_allocation/daemonset_overhead.py

deployer/commands/generate/resource_allocation/daemonset_overhead.yaml

consideRatio · 2023-10-30T13:20:24Z

Thank you for reviewing this @GeorgianaElena! I've not resolved all comments, but will get it done soonish!

Co-authored-by: Georgiana <georgiana.dolocan@gmail.com>

consideRatio · 2023-11-04T10:21:23Z

Thank you for reviewing this @GeorgianaElena!!!

- Rebased
- Renamed command instance-capacity to instance-capacities
- Renamed command daemonset-overhead to daemonset-requests
- Updated docstrings to reflect feedback from @GeorgianaElena

I'll go for a merge here at this point!

github-actions · 2023-11-04T10:21:58Z

🎉🎉🎉🎉

Monitor the deployment of the hubs here 👉 https://github.com/2i2c-org/infrastructure/actions/runs/6754266612

consideRatio requested a review from a team as a code owner October 28, 2023 16:19

github-actions bot assigned consideRatio Oct 28, 2023

consideRatio marked this pull request as draft October 28, 2023 16:21

consideRatio force-pushed the pr/update-node-info branch 2 times, most recently from d7fcda1 to 83279dd Compare October 28, 2023 16:29

consideRatio marked this pull request as ready for review October 28, 2023 16:58

consideRatio mentioned this pull request Oct 28, 2023

Q4 Reduced workload goal - Oct 18 Sprint 1 tracking issue #3318

Closed

GeorgianaElena approved these changes Oct 30, 2023

consideRatio and others added 5 commits November 4, 2023 11:00

deployer: add info generating commands under resource-allocation

b818a93

Apply suggestions from code review

c0317bc

Co-authored-by: Georgiana <georgiana.dolocan@gmail.com>

deployer: rename command daemonset-overhead to daemonset-requests

089f8f9

deployer: update docstring for info commands

1610669

deployer: rename command instance-capacity to instance-capacities

baaea25

consideRatio force-pushed the pr/update-node-info branch from 5854d47 to baaea25 Compare November 4, 2023 10:00

consideRatio merged commit 1be1355 into 2i2c-org:master Nov 4, 2023
