Commit c0dfd97

Add L40S_BK to GPU_TO_SM and clarify env var docs
- Add L40S_BK SM arch (89 - Ada Lovelace) to GPU_TO_SM mapping
- Document env vars by location:
  - Heroku/Backend: BUILDKITE_API_TOKEN, BUILDKITE_ORG, BUILDKITE_PIPELINE
  - GPU Nodes: BUILDKITE_AGENT_TOKEN (set by admin), auto-set vars
  - Jobs: KERNELBOT_* vars passed via API
1 parent 0bcb540 commit c0dfd97

2 files changed: +36 -12 lines changed

SKILLS/buildkite.md

Lines changed: 35 additions & 12 deletions
@@ -233,26 +233,49 @@ Buildkite-managed GPUs are registered with `_BK` suffix:
 | `B200_BK` | `b200` | 100 |
 | `H100_BK` | `h100` | 90a |
 | `MI300_BK` | `mi300` | (AMD) |
+| `L40S_BK` | `test` | 89 (Ada Lovelace) |
 
 ## Environment Variables
 
-### For Kernelbot API/Backend
+### On Heroku/Backend (where the app runs)
 
-- `BUILDKITE_API_TOKEN`: API token for submitting jobs
+These are set in Heroku config vars or your `.env` file:
 
-### For Buildkite Agents (set by setup script)
+| Variable | Required | Description |
+|----------|----------|-------------|
+| `BUILDKITE_API_TOKEN` | Yes | API token for submitting jobs and downloading artifacts. Get from Buildkite → Personal Settings → API Access Tokens |
+| `BUILDKITE_ORG` | No | Organization slug (default: `gpu-mode`) |
+| `BUILDKITE_PIPELINE` | No | Pipeline slug (default: `kernelbot`) |
 
-- `NVIDIA_VISIBLE_DEVICES`: GPU index for isolation
-- `CUDA_VISIBLE_DEVICES`: Same as above
-- `KERNELBOT_GPU_INDEX`: GPU index (0, 1, 2, ...)
-- `KERNELBOT_CPUSET`: CPU cores for this agent
-- `KERNELBOT_MEMORY`: Memory limit
+**API Token Permissions Required:**
+- `read_builds` - Poll build status
+- `write_builds` - Create/trigger builds
+- `read_artifacts` - Download result.json artifact
+- `read_agents` (optional) - Check queue status
+
+### On GPU Runner Nodes
 
-### For Jobs (passed via pipeline)
+These are set during node setup:
 
-- `KERNELBOT_RUN_ID`: Unique run identifier
-- `KERNELBOT_PAYLOAD`: Base64+zlib compressed job config
-- `KERNELBOT_QUEUE`: Target queue name
+| Variable | Set By | Description |
+|----------|--------|-------------|
+| `BUILDKITE_AGENT_TOKEN` | Admin (setup script) | Agent token for connecting to Buildkite |
+| `NVIDIA_VISIBLE_DEVICES` | Environment hook | GPU index for isolation (auto-set per job) |
+| `CUDA_VISIBLE_DEVICES` | Environment hook | Same as above |
+| `KERNELBOT_GPU_INDEX` | Environment hook | GPU index (0, 1, 2, ...) |
+| `KERNELBOT_CPUSET` | Environment hook | CPU cores for this agent |
+| `KERNELBOT_MEMORY` | Environment hook | Memory limit for Docker |
+
+### Passed to Jobs (via Buildkite API)
+
+These are set automatically by the launcher:
+
+| Variable | Description |
+|----------|-------------|
+| `KERNELBOT_RUN_ID` | Unique run identifier |
+| `KERNELBOT_PAYLOAD` | Base64+zlib compressed job config |
+| `KERNELBOT_QUEUE` | Target queue name |
+| `KERNELBOT_IMAGE` | Docker image to use |
 
 ## Troubleshooting
 
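For readers following the documented variables, here is a minimal sketch of how a backend could resolve its settings and build the per-job variables. It is not the launcher's actual code: the helper names (`load_backend_config`, `encode_payload`, `decode_payload`, `build_job_env`) are hypothetical, and the payload layout (JSON, then zlib, then Base64) is an assumption based only on the "Base64+zlib compressed job config" description.

```python
import base64
import json
import os
import zlib


def load_backend_config(env=None):
    """Resolve backend settings, applying the defaults listed in the docs table."""
    env = os.environ if env is None else env
    token = env.get("BUILDKITE_API_TOKEN")
    if not token:
        # The only backend variable the table marks as required.
        raise RuntimeError("BUILDKITE_API_TOKEN is required")
    return {
        "api_token": token,
        "org": env.get("BUILDKITE_ORG", "gpu-mode"),
        "pipeline": env.get("BUILDKITE_PIPELINE", "kernelbot"),
    }


def encode_payload(config):
    """Assumed packing order: JSON text -> zlib -> Base64 (ASCII-safe string)."""
    raw = json.dumps(config).encode("utf-8")
    return base64.b64encode(zlib.compress(raw)).decode("ascii")


def decode_payload(payload):
    """Inverse of encode_payload, as a job-side runner might apply it."""
    raw = zlib.decompress(base64.b64decode(payload))
    return json.loads(raw.decode("utf-8"))


def build_job_env(run_id, queue, image, config):
    """Collect the four KERNELBOT_* variables documented as passed to jobs."""
    return {
        "KERNELBOT_RUN_ID": run_id,
        "KERNELBOT_QUEUE": queue,
        "KERNELBOT_IMAGE": image,
        "KERNELBOT_PAYLOAD": encode_payload(config),
    }
```

Passing a plain dict as `env` keeps the config helper testable without touching the real process environment. Base64 on top of zlib keeps the compressed bytes safe to carry in an environment variable.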
src/libkernelbot/consts.py

Lines changed: 1 addition & 0 deletions
@@ -133,6 +133,7 @@ class RankCriterion(Enum):
     "B200_BK": "100",
     "H100_BK": "90a",
     "MI300_BK": None,
+    "L40S_BK": "89",  # Ada Lovelace
 }
 
 
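To illustrate what the new entry enables, the sketch below turns a mapping value into an `nvcc` architecture flag. The dictionary is only an excerpt of the entries visible in this hunk, and `nvcc_arch_flag` is a hypothetical helper. How the project actually consumes `GPU_TO_SM` is not shown in this commit.

```python
from typing import Optional

# Excerpt: only the entries visible in this hunk; the real mapping has more.
GPU_TO_SM = {
    "B200_BK": "100",
    "H100_BK": "90a",
    "MI300_BK": None,  # AMD: no CUDA SM architecture
    "L40S_BK": "89",  # Ada Lovelace
}


def nvcc_arch_flag(gpu: str) -> Optional[str]:
    """Return an nvcc -arch flag for a GPU name, or None for non-CUDA GPUs."""
    sm = GPU_TO_SM[gpu]  # raises KeyError for GPU names missing from the mapping
    return None if sm is None else f"-arch=sm_{sm}"
```

With this excerpt, `nvcc_arch_flag("L40S_BK")` yields `-arch=sm_89`, while `MI300_BK` yields `None` because the mapping stores no SM value for AMD hardware.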