Default to slurm based launchers when salloc/srun is available #26170

jabraham17 · 2024-10-30T15:21:41Z

I recently had some issues running on a slurm-based InfiniBand (IB) system. The problem was that I was using gasnetrun_ibv, which is the default launcher for IB when CHPL_COMM=gasnet. However, that launcher requires you to make your own slurm allocation using salloc (see https://chapel-lang.org/docs/main/usingchapel/launcher.html#using-any-ssh-based-launcher-with-slurm). The solution was either to make the salloc calls manually or to use slurm-gasnetrun_ibv, which handles that for you.

This is a simple solution, but why is it necessary? It seems like if we can detect a slurm-based system, we should default to a slurm-based launcher. This led me to investigate util/chplenv/chpl_launcher.py, where we do actually have that detection, but only on cray-cs and hpe-apollo.

I went looking through the history for this and found two PRs making this change, #17314 for gasnet and #17305 for other comm layers. Based on these PR messages, we only default to slurm-based launchers on cray/hpe systems because defaulting everywhere was messing with internal testing systems that have slurm but want to use a different launcher.

This feels like optimizing for the wrong case; we should default to what is common for users.

In my opinion this is a simple change: remove the checks for the target platform and adjust automated testing systems as needed. However, there may be other cases I am not thinking of where we would not want to default to a slurm-based launcher.
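
To make the proposal concrete, here is a minimal sketch of the two behaviors for the IB case. It is not the actual code in util/chplenv/chpl_launcher.py: the function names, the `gate_on_platform` flag, and the choice to treat either salloc or srun on PATH as "slurm detected" are assumptions made for illustration. Only the launcher names and the cray-cs/hpe-apollo platform values come from the discussion above.

```python
import shutil

# Platforms where the slurm-based launcher is currently the default,
# per the description above.
SLURM_PLATFORMS = ("cray-cs", "hpe-apollo")


def slurm_detected(which=shutil.which):
    """Assumption: slurm counts as present if salloc or srun is on PATH."""
    return any(which(tool) is not None for tool in ("salloc", "srun"))


def ibv_launcher(platform, gate_on_platform, which=shutil.which):
    """Pick a default launcher for CHPL_COMM=gasnet on IB (hypothetical helper).

    gate_on_platform=True models the current behavior (slurm is only
    considered on cray-cs/hpe-apollo); False models the proposed behavior
    (slurm is considered on every platform).
    """
    if gate_on_platform and platform not in SLURM_PLATFORMS:
        return "gasnetrun_ibv"
    if slurm_detected(which):
        return "slurm-gasnetrun_ibv"
    return "gasnetrun_ibv"


if __name__ == "__main__":
    # Compare current vs. proposed defaults on the machine running this.
    print("current: ", ibv_launcher("linux64", gate_on_platform=True))
    print("proposed:", ibv_launcher("linux64", gate_on_platform=False))
```

The `which` parameter exists only so the lookup can be stubbed out in tests. Testing systems that have slurm but want a different launcher would presumably need to set CHPL_LAUNCHER explicitly under the proposed behavior.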

bradcray · 2024-10-30T18:11:38Z

This sounds great to me, thanks for investigating, Jade!

jabraham17 added the area: Runtime label Oct 30, 2024
