Conversation
Pull request overview
This PR adds concurrency control and dynamic runner selection to the NVIDIA workflow to distribute jobs across multiple GPU runners. The changes implement a round-robin distribution strategy based on the GitHub run number to select from 8 available GPU runners.
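The round-robin idea reduces the run number modulo the runner count; a quick sketch (the `linux-gpu-runner-N` label scheme is hypothetical, not taken from this PR):

```shell
# Round-robin sketch: consecutive run numbers cycle through 8 runner indices.
# "linux-gpu-runner-N" is an assumed label naming scheme for illustration.
run_number=42
index=$(( run_number % 8 ))
echo "linux-gpu-runner-$index"   # → linux-gpu-runner-2
```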
Changes:
- Added workflow-level concurrency configuration with a global group
- Introduced a new `select-runner` job that dynamically picks one of 8 GPU runners using round-robin selection
- Updated the `run` job to use the dynamically selected runner label
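A minimal sketch of how these jobs could fit together — the job names `select-runner` and `run` come from the PR, but the `gpu-runner-*` labels and step details are assumptions:

```yaml
jobs:
  select-runner:
    runs-on: ubuntu-latest
    outputs:
      runner: ${{ steps.pick.outputs.runner }}
    steps:
      - id: pick
        # Round-robin: map the monotonically increasing run number onto 8 labels.
        run: echo "runner=gpu-runner-$(( ${{ github.run_number }} % 8 ))" >> "$GITHUB_OUTPUT"

  run:
    needs: select-runner
    # Use the dynamically selected self-hosted runner label.
    runs-on: ${{ needs.select-runner.outputs.runner }}
    steps:
      - run: nvidia-smi
```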
```yaml
concurrency:
  group: nvidia-workflow-global
  cancel-in-progress: false
```
The workflow-level concurrency group `nvidia-workflow-global` queues all workflow runs globally, which may defeat the purpose of having 8 separate runners. Since the goal is to distribute work across multiple runners, consider using a per-runner concurrency group or removing workflow-level concurrency entirely. A better approach would be to set concurrency at the job level with a dynamic group name like `nvidia-workflow-${{ needs.select-runner.outputs.runner }}`, allowing parallel runs on different GPUs while preventing concurrent runs on the same GPU.
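The reviewer's suggested job-level concurrency would look roughly like this (a sketch, assuming the `select-runner` job exposes a `runner` output as the PR describes):

```yaml
run:
  needs: select-runner
  # Per-runner group: runs targeting different GPUs proceed in parallel,
  # while runs targeting the same GPU are serialized rather than cancelled.
  concurrency:
    group: nvidia-workflow-${{ needs.select-runner.outputs.runner }}
    cancel-in-progress: false
  runs-on: ${{ needs.select-runner.outputs.runner }}
```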
This reverts commit 56970e0.