| 1 | +# Set up a hub for an exam |
| 2 | + |
| 3 | +We provide support for using a JupyterHub in a *controlled* physical |
| 4 | +space for exams. This is an extra paid feature where we charge hourly |
| 5 | +for responsiveness. |
| 6 | + |
| 7 | +This page documents what we do to prep, based on our prior experiences. |
| 8 | + |
| 9 | +1. **Confirm exact dates and times, and have at least two engineers available**
| 10 | +
| 11 | + Make sure the exact dates and times of the exam are confirmed well in
| 12 | + advance, and that at least two engineers are available throughout that period.
| 13 | + |
| 14 | +2. **Check engineer access** |
| 15 | + |
| 16 | + Engineers should *test* their access to the infrastructure and the |
| 17 | + hub beforehand, to make sure they can fix issues if needed. |
| 18 | + |
| 19 | + Simple checklist: |
| 20 | + - 🔲 Access and log in to the hub admin page
| 21 | + - 🔲 Access and log in to the cluster Grafana
| 22 | + - 🔲 Access and log in to the cloud console
| 23 | + - 🔲 Test access to Logs Explorer for container logs if on GCP |
| 24 | + - 🔲 Test that running `deployer use-cluster-credentials $CLUSTER` and then `kubectl get pods -A` work |
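
The last checklist item can be scripted so that each engineer gets an explicit pass or fail line. This is a minimal sketch, not part of `deployer`: `check` is a hypothetical helper name, and the example invocation assumes it is run after `deployer use-cluster-credentials $CLUSTER` has made the cluster credentials available.

```bash
# Hypothetical helper: run a command quietly and report whether it succeeded.
# Usage: check "<label>" <command> [args...]
check() {
  label="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "ok: $label"
  else
    echo "FAIL: $label"
    return 1
  fi
}

# Example invocation (needs cluster access, so it is left commented out):
# check "kubectl can list pods in all namespaces" kubectl get pods -A
```

A `FAIL` line well before the exam is the signal to sort out access while there is still time.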
| 25 | + |
| 26 | +3. **Ensure user pods have a guaranteed quality of service class** |
| 27 | + |
| 28 | + For the duration of the exam, all user pods must have a |
| 29 | + [guaranteed quality of service class](https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/). |
| 30 | + |
| 31 | + In practice, this means memory and CPU limits are set equal to the
| 32 | + guarantees (requests). This ensures equity: no user gets more or fewer
| 33 | + resources than any other. It also improves reliability.
| 34 | +
| 35 | + This usually increases cost too, so it should be done **no more than 12h before**
| 36 | + the start of the exam and reverted soon after the exam
| 37 | + is done.
| 38 | + |
| 39 | + If the hub has a profile list enabled, you can find the new allocation options
| 40 | + for the instance types set up for the hub by running:
| 41 | + |
| 42 | + ```{bash} |
| 43 | + deployer generate resource-allocation choices <instance-type> |
| 44 | + ``` |
| 45 | + |
| 46 | + Running this command will output options where memory requests equal limits. |
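
Once the new allocation is deployed, it can be worth confirming that running user pods actually report the `Guaranteed` class. The sketch below is one possible check, under stated assumptions: `non_guaranteed` is a hypothetical helper, `$HUB_NAMESPACE` is a placeholder for the hub's namespace, and the `component=singleuser-server` label is assumed to be the one applied to user pods on this hub.

```bash
# Hypothetical helper: print the names of pods whose QoS class is not Guaranteed.
# Reads whitespace-separated "NAME QOS" pairs, one per line, from stdin.
non_guaranteed() {
  awk '$2 != "Guaranteed" { print $1 }'
}

# Example invocation (needs cluster access, so it is left commented out).
# The namespace variable and label selector are assumptions; adjust as needed.
# kubectl get pods -n "$HUB_NAMESPACE" -l component=singleuser-server \
#   --no-headers -o custom-columns=NAME:.metadata.name,QOS:.status.qosClass \
#   | non_guaranteed
```

Empty output means every listed pod is `Guaranteed`. Pods started before the change keep their old class until they are restarted.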
| 47 | + |
| 48 | +4. **Ensure instructor tests the hub before the exam** |
| 49 | + |
| 50 | + The instructor running the exam should test out their exam on the hub,
| 51 | + and make sure that it will complete within the amount of resources assigned
| 52 | + to it. They should also make sure that the environment (packages, Python
| 53 | + versions, etc.) is set up appropriately. From the time they test this until
| 54 | + the exam is over, new environment changes are put on hold.
| 55 | + |
| 56 | + Responsibilities: |
| 57 | + - the **community and partnerships** team makes sure that the community's |
| 58 | + **expectations** around exams are correctly set |
| 59 | + - the **engineer(s)** leading the exam should make sure **those expectations are respected**
| 60 | + |
| 61 | +5. **Pre-warm the cluster** |
| 62 | + |
| 63 | + We should pre-warm the cluster the hub is on before the start of the exam,
| 64 | + to make sure that all users can start a notebook without having to wait. This
| 65 | + is also for equity reasons, to make sure we don't disadvantage one user relative to
| 66 | + another.
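
How much capacity to pre-warm is not specified here. One rough way to estimate it, assuming every user pod has the same guaranteed size and a known number of such pods fit on one node, is a ceiling division. The function name and the example numbers below are hypothetical.

```bash
# Hypothetical sizing helper: how many nodes are needed so that every
# expected user can start a pod without waiting for a new node?
# Usage: nodes_needed <expected_users> <user_pods_per_node>
nodes_needed() {
  users="$1"
  per_node="$2"
  # Integer ceiling division: round up so the last partial node is counted.
  echo $(( (users + per_node - 1) / per_node ))
}

# Example: 100 students, 8 user pods fitting on each node.
nodes_needed 100 8
```

With these example numbers the result is 13 nodes, since 12 nodes would only hold 96 pods.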
| 67 | + |
| 68 | +6. **Follow Freshdesk for any questions/issues**
| 69 | +
| 70 | + Issues during the exam are communicated via Freshdesk. What we are paid
| 71 | + for is to respond immediately. There is no guarantee of fixes,
| 72 | + although we try very hard to make sure the infrastructure is stable during this
| 73 | + period.