terraform: switch to pd-balanced, reduce size to save cost #2102

minrk · 2022-01-18T11:20:19Z

Reduces disk size from 1T. Should save ~$1k/month.

Ignore the version bumps, which are only to make terraform match what's already been auto-upgraded to, so the new pool has the same version as what's already deployed.

This creates a new user pool to replace the old one, but leaves the existing one with an autoscale max of 1 so it will slowly drain (will need some help with cordoning). Once it's drained, another PR can actually delete the old pool.
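A minimal sketch of the two-pool pattern described above, not the actual contents of `terraform/prod/main.tf`. `local.location` and `local.gke_version` appear in the diff; the `user_new` resource name, the pool names, the `google_container_cluster.cluster` reference, and the new pool's `max_node_count` are assumptions made for illustration.

```hcl
# Existing pool: kept for now, but capped so the autoscaler adds no new nodes.
resource "google_container_node_pool" "user" {
  name     = "user"                                # assumed pool name
  cluster  = google_container_cluster.cluster.name # hypothetical cluster reference
  location = local.location
  version  = local.gke_version

  autoscaling {
    min_node_count = 0
    max_node_count = 1 # blocks scale-up; existing nodes drain over time
  }
}

# Replacement pool: receives new user pods while the old pool drains.
resource "google_container_node_pool" "user_new" { # hypothetical resource name
  name     = "user-new"                            # assumed pool name
  cluster  = google_container_cluster.cluster.name # hypothetical cluster reference
  location = local.location
  version  = local.gke_version

  autoscaling {
    min_node_count = 0
    max_node_count = 10 # illustrative value, not taken from this PR
  }
}
```

Keeping both resources in one apply means nothing is deleted up front; the old pool's resource block can be removed in a follow-up once its nodes are empty.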

I'll do the apply after this is merged. I've synced the versions, but not created the new pool.

xref: cost calculations

minrk · 2022-01-18T11:26:05Z

terraform/prod/main.tf

@@ -71,6 +79,50 @@ resource "google_container_node_pool" "user" {
  node_locations = ["${local.location}-a"]
  version        = local.gke_version

+  autoscaling {
+    min_node_count = 0
+    max_node_count = 1


This should not trigger immediate scale-down. Autoscale doesn't force these bounds to be satisfied continuously. Instead, it should only prevent scale-up.

yuvipanda · 2022-01-18T15:28:24Z

Would this cost us performance in terms of docker pull throughput? That was why we made them big SSDs to begin with, as performance unfortunately scales linearly with size.

The newer pd-balanced type is also something to try to cut cost. That's what I now use on my berkeley clusters.

minrk · 2022-01-25T10:20:54Z

> Would this cost us performance in terms of docker pull throughput?

I have no idea. I don't think we've done any measuring on how much Disk IO limits pulls vs network IO. gcloud docs suggest a 512GB SSD should have 384/768 MBps read/write performance. That seems very fast! But it's ~1/2 what we should have now.

I didn't know about pd-balanced! I think we should probably switch to it and see what happens. Switching to pd-balanced without losing size would save 40%. We could save the same 50% as the original 500GB pd-ssd plan if we dropped to 800GB, so I did that. I switched the core pool to pd-balanced as well.
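For illustration, the disk change might look roughly like the following `node_config` block. This is a sketch under assumptions, not the merged code: the resource name, cluster reference, and machine type are hypothetical, and the prices in the comments are assumed list prices (about $0.17/GB-month for pd-ssd and $0.10/GB-month for pd-balanced) that should be checked against current pricing for the region.

```hcl
resource "google_container_node_pool" "example" { # hypothetical resource name
  cluster = google_container_cluster.cluster.name # hypothetical cluster reference

  node_config {
    machine_type = "n1-highmem-8" # illustrative, not taken from this PR

    # Per-node disk cost under the assumed prices:
    #   1000 GB pd-ssd      ~ $170/month (previous setup)
    #   1000 GB pd-balanced ~ $100/month (~40% less)
    #    800 GB pd-balanced ~  $80/month (~50% less)
    disk_type    = "pd-balanced"
    disk_size_gb = 800
  }
}
```

Disk throughput on these disk types scales with size, so the 800GB figure trades some pull performance for cost; measuring image pull times before and after would show whether the difference matters.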

minrk · 2022-01-25T11:37:51Z

OK, I'll give this a go.

minrk added 2 commits January 18, 2022 12:07

update terraform to 1.1

8bf7c4f

terraform: bump kubernetes to 1.19.14-gke.1900

20ef6c7

matches current auto-upgrade version, doesn't actually upgrade nodes

minrk force-pushed the reduce-disk-size branch from c975eea to ad3c6c4 Compare January 18, 2022 11:22

add user pool with reduced pd-ssd size

3f9a0d9

should reduce operational costs by ~1k/month. sets autoscale-max on existing user pool to 1 to avoid allocation of new nodes. Can delete old user pool once it's drained.

minrk force-pushed the reduce-disk-size branch from ad3c6c4 to 3f9a0d9 Compare January 18, 2022 14:44

switch storage pd-balanced

c0377e9

saves cost without losing as much space/performance

yuvipanda approved these changes Jan 25, 2022

minrk changed the title ~~terraform: Reduce PD-SSD disk size to 500GB~~ terraform: switch to pd-balanced, reduce size to save cost Jan 25, 2022

minrk merged commit b830a88 into jupyterhub:master Jan 25, 2022

minrk deleted the reduce-disk-size branch January 25, 2022 11:38

This was referenced Jan 25, 2022

Node update part 2 #2105

Merged

remove old user node pool #2107

Merged
