Resourcing Hubs

Abby Drury edited this page Apr 3, 2023 · 6 revisions

Previously deployed VMs

  • COMSC-243 (fall 2022): 26 users (15 students), m6g.4xlarge VM on AWS: 16 cores, 64GB RAM
  • DATA-113 (spring 2023): unknown number of users (37 students), m6g.2xlarge VM on AWS: (presumably) 8 cores, 32GB RAM. Upgraded on March 4 to an m6g.4xlarge VM on AWS: 16 cores, 64GB RAM.

These VMs are probably over-provisioned, especially now that we have added user-level resource limits to JupyterHub.

Setting user limits

We decided to set a per-user limit of 2 CPUs and 4GB of memory. This might still be too high, but it stabilized the COMSC-243 hub almost immediately. The TLJH documentation for setting user limits was very helpful.

Following a crash in DATA-113, we revisited our assumptions and lowered limits.memory to 1.5G.
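As a rough sanity check on these limits, the sketch below estimates how many users could sit at their memory cap at the same time before a VM runs out of RAM. The function names (to_mb, max_users_at_cap) and the 2G held back for the OS and the hub are our own assumptions for illustration, not TLJH settings. The limits are caps rather than reservations, so this is a worst-case figure; a hub can usually serve more users than this in practice.

```shell
#!/bin/sh
# Worst-case estimate: how many users can reach their memory cap at once
# before the VM runs out of RAM. All names here are illustrative.

# to_mb SIZE: convert a size such as 4G, 1.5G, or 512M to whole megabytes.
to_mb() {
    echo "$1" | awk '
        /[Gg]$/ { sub(/[Gg]$/, ""); printf "%d\n", $0 * 1024; next }
        /[Mm]$/ { sub(/[Mm]$/, ""); printf "%d\n", $0; next }
        { exit 1 }'
}

# max_users_at_cap VM_RAM PER_USER_LIMIT [RESERVED]
# RESERVED is RAM held back for the OS and the hub (assumed default: 2G).
max_users_at_cap() {
    vm_mb=$(to_mb "$1")
    user_mb=$(to_mb "$2")
    reserved_mb=$(to_mb "${3:-2G}")
    echo $(( (vm_mb - reserved_mb) / user_mb ))
}

max_users_at_cap 64G 4G      # 4G cap on a 64GB VM    -> 15
max_users_at_cap 32G 1.5G    # 1.5G cap on a 32GB VM  -> 20
max_users_at_cap 64G 1.5G    # 1.5G cap on a 64GB VM  -> 41
```

Passing a third argument changes the assumed reserve, e.g. max_users_at_cap 64G 4G 0M for no reserve at all.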

First, ssh to the server on AWS, then run:

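sudo tljh-config show                     # inspect the current configuration
sudo tljh-config set limits.cpu 2         # at most 2 CPUs per user
sudo tljh-config set limits.memory 1.5G   # at most 1.5GB of memory per user
sudo tljh-config reload                   # reload the hub so the changes are applied
sudo tljh-config show                     # confirm the new values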

Note: These changes won't take effect for users with running servers until each server is stopped and restarted (see #3 in the TLJH guide "Resize the resources available to your JupyterHub").
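To see who is still running under the old limits, something like the following might help. It is a sketch that assumes TLJH runs each user's server as a systemd unit named jupyter-<username>.service; the helper names are ours, and the unit naming should be checked on the hub before relying on it.

```shell
#!/bin/sh
# Assumption: each user server is a systemd unit named jupyter-<username>.service.

# unit_to_username UNIT: strip the assumed "jupyter-" prefix and ".service" suffix.
unit_to_username() {
    name=${1#jupyter-}
    echo "${name%.service}"
}

# list_running_user_servers: print the usernames whose servers are still
# running, and so are still using the limits they were started with.
list_running_user_servers() {
    systemctl list-units --type=service --state=running --plain --no-legend 'jupyter-*' |
        while read -r unit _rest; do
            unit_to_username "$unit"
        done
}

# On the hub VM, run: list_running_user_servers
```

Anyone listed would need to stop and start their server (for example from the hub control panel) to pick up the new limits.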

Setting a server timeout

By default, JupyterHub pings each user's notebook server every 60s to check its status, and culls (shuts down) any server found to be idle for more than 10 minutes. We raised the idle timeout to 5400s (1.5 hours) to accommodate the realities of teaching (pauses during instruction, etc.).

First, ssh to the server on AWS, then run:

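sudo tljh-config show                              # inspect the current configuration
sudo tljh-config set services.cull.timeout 5400    # idle timeout in seconds (1.5 hours)
sudo tljh-config reload                            # reload the hub so the change is applied
sudo tljh-config show                              # confirm the new value

The output of the final show should include:

services:
  cull:
    timeout: 5400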
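
The timeout is given in seconds, which is easy to get wrong when thinking in class periods. A small sketch for working it out from minutes follows; minutes_to_seconds is a hypothetical helper of ours, and the script only prints the command rather than running it.

```shell
#!/bin/sh
# minutes_to_seconds MINUTES: convert whole minutes to seconds.
minutes_to_seconds() {
    echo $(( $1 * 60 ))
}

# Print (rather than run) the command for a 90-minute idle timeout.
echo "sudo tljh-config set services.cull.timeout $(minutes_to_seconds 90)"
```

For comparison, the default 10-minute timeout is minutes_to_seconds 10, i.e. 600.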