terraform: switch to pd-balanced, reduce size to save cost #2102
matches current auto-upgrade version, doesn't actually upgrade nodes
c975eea to ad3c6c4
@@ -71,6 +79,50 @@ resource "google_container_node_pool" "user" {
  node_locations = ["${local.location}-a"]
  version = local.gke_version

  autoscaling {
    min_node_count = 0
    max_node_count = 1
This should not trigger an immediate scale-down. The autoscaler doesn't force these bounds to be satisfied continuously, so a lower max should only prevent further scale-up.
should reduce operational costs by ~$1k/month. Sets autoscale-max on the existing user pool to 1 to avoid allocation of new nodes. Can delete the old user pool once it's drained.
ad3c6c4 to 3f9a0d9
Would this cost us performance in terms of docker pull throughput? That was why we made them big SSDs to begin with, as performance unfortunately scales linearly with size. The newer …
saves cost without losing as much space/performance
I have no idea. I don't think we've measured how much disk IO limits pulls vs. network IO. The gcloud docs suggest a 512GB SSD should have 384/768 MBps read/write performance. That seems very fast! But it's ~1/2 of what we should have now. I think we should probably switch to pd-balanced and see what happens. Switching to pd-balanced without losing size would save 40%. We could save the same 50% if we dropped to 800GB, so I did that. I didn't know about pd-balanced! I switched the core pool to that as well.
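For anyone wanting to sanity-check the percentages above, here is a rough sketch of the arithmetic as Terraform locals. The per-GB prices are assumptions (illustrative list prices, not taken from this PR), and "1T" is assumed to mean 1000 GB; substitute the current prices for the cluster's region before relying on the result.

```hcl
# Rough boot-disk cost comparison for a single node.
# ASSUMPTION: the prices below are illustrative, not from this PR;
# check current GCP pricing for the cluster's region.
locals {
  pd_ssd_usd_per_gb_month      = 0.17 # assumed price
  pd_balanced_usd_per_gb_month = 0.10 # assumed price

  current_gb = 1000 # the existing "1T" pd-ssd disk (assumed to mean 1000 GB)

  current_cost            = local.current_gb * local.pd_ssd_usd_per_gb_month
  ssd_512_cost            = 512 * local.pd_ssd_usd_per_gb_month
  balanced_same_size_cost = local.current_gb * local.pd_balanced_usd_per_gb_month
  balanced_800_cost       = 800 * local.pd_balanced_usd_per_gb_month
}

# Fraction of the current per-disk cost saved by each option.
output "disk_savings_fraction" {
  value = {
    pd_ssd_512gb          = 1 - local.ssd_512_cost / local.current_cost
    pd_balanced_same_size = 1 - local.balanced_same_size_cost / local.current_cost
    pd_balanced_800gb     = 1 - local.balanced_800_cost / local.current_cost
  }
}
```

With these assumed prices the three options come out at roughly 49%, 41% and 53% savings, which is in line with the "40%" and "same 50%" figures quoted above.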
OK, I'll give this a go.
from 1T. Should save ~$1k/month
Ignore the version bumps, which are only to make terraform match what's already been auto-upgraded to, so the new pool has the same version as what's already deployed.
This creates a new user pool to replace the old one, but leaves the existing one with an autoscale max of 1 so it will slowly drain (will need some help with cordoning). Once it's drained, another PR can actually delete the old pool.
I'll do the apply after this is merged. I've synced the versions, but not created the new pool.
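For readers following along without the full diff, a minimal sketch of what the replacement pool might look like. Only `disk_type`, the 800GB size, `node_locations`, `version` and the `autoscaling` block shape come from this thread; the resource name, pool name, cluster reference, machine type and the max node count are hypothetical placeholders, not the values in this PR.

```hcl
# Sketch only, not the actual resource from this PR.
resource "google_container_node_pool" "user_balanced" { # hypothetical resource name
  name     = "user-balanced"                             # hypothetical pool name
  cluster  = google_container_cluster.cluster.name       # assumed cluster resource
  location = local.location                              # assumed to match the cluster

  node_locations = ["${local.location}-a"]
  version        = local.gke_version

  autoscaling {
    min_node_count = 0
    max_node_count = 10 # assumed; the old pool is the one capped at 1
  }

  node_config {
    machine_type = "n1-highmem-8" # assumed machine type
    disk_type    = "pd-balanced"  # instead of pd-ssd
    disk_size_gb = 800            # down from 1T
  }
}
```

Keeping the old pool in place with `max_node_count = 1` while this one takes new nodes is what allows the gradual drain described above; the old resource can then be removed in a follow-up PR.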
xref: cost calculations