# Generate Agent Data on a Cluster Using Slurm
## Prerequisites

- Clone the main Toybox repository.
- Install the required packages.
- Have a cluster account set up (on either swarm2 or gypsum).
Toybox environments can be stochastic, and some agents may be stochastic as well. Consequently, you may want to generate data over multiple runs. The script `make_scripts.py` provides an example of how to generate job scripts for a cluster that uses Slurm.
1. Clone this repository into your home directory on the head node.
2. From the top level of the repository on swarm2, run:

   ```
   ./scripts/make_scripts.py /mnt/nfs/work1/jensen/<USERNAME>/output
   ```
This will create 30 scripts per agent in the `scripts` directory. Feel free to modify `make_scripts.py`; it does not run anything itself, it merely creates scripts with the appropriate specifications for launching on the cluster. You should inspect the output scripts to ensure that they are what you want.
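For example, you might spot-check the generated scripts before submitting anything. The `.sh` extension here is an assumption; adjust to whatever `make_scripts.py` actually produced:

```
head -n 20 scripts/*.sh
```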
Note that in step 2 you can use whatever directory structure you like, but make sure that you are writing to the ZFS file system, because the output will quickly exceed the quota on your home directory.
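For instance, you might create the output directory on the shared file system before running step 2, using the same path as above:

```
mkdir -p /mnt/nfs/work1/jensen/<USERNAME>/output
```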
Each script can be launched with `sbatch <scriptname>`. DO NOT LAUNCH THE SCRIPTS FROM THE HEAD NODE.
We have written a script that will run all of the scripts generated by `make_scripts.py`.
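That script is not reproduced here; a minimal sketch of such a submit-all loop, assuming the generated scripts end in `.sh` and live in `scripts/` (both assumptions), might look like:

```
#!/bin/bash
# Submit every generated job script to Slurm.
# Assumes scripts/*.sh is where make_scripts.py wrote its output.
for script in scripts/*.sh; do
    sbatch "$script"
done
```

Note that `sbatch` only queues the jobs, so submitting this way is safe; the warning above is about executing the scripts directly on the head node, which bypasses the scheduler.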
Also note that none of the agents, as of this writing, require (or would benefit from) GPUs. If your agent uses GPUs, you will need to pass the `--gres` argument to `sbatch`. We have found that most deep agents only benefit from one GPU, so our command would be `sbatch --gres=gpu:1 <script>` (note that `sbatch` options must precede the script name). See the gypsum-specific documentation for other options.
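If you do need a GPU, the request can also live inside the generated script itself rather than on the command line. A sketch, where the `--gres` directive is standard Slurm but the surrounding script is illustrative only:

```
#!/bin/bash
#SBATCH --gres=gpu:1              # request one GPU for this job
#SBATCH --job-name=agent-data    # illustrative job name

# ... the rest of the generated script runs here unchanged.
```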
Once you have finished, if you want to move data, we recommend running any archiving or compression interactively on a compute node using the `srun` command. You may be generating a lot of data, and this will prevent the head node from becoming clogged. Recall that you can log into an arbitrary node with `srun --pty /bin/bash`. Save the archive to NFS, since it may still be quite large.
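A sketch of that workflow, reusing the output path from step 2 (the archive name is illustrative):

```
# Get an interactive shell on a compute node so the head node stays free.
srun --pty /bin/bash

# From that shell, compress the generated data and keep the archive on NFS.
tar -czf /mnt/nfs/work1/jensen/<USERNAME>/output.tar.gz \
    /mnt/nfs/work1/jensen/<USERNAME>/output
```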
- Only run `make_scripts.py` once! Each time it is run, it generates new seeds and will thus generate new scripts. If you want new scripts, delete all of the old ones first (see the sketch below).
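If you do need a fresh set, one way to clear the old scripts and regenerate (again assuming the `.sh` extension) is:

```
rm scripts/*.sh
./scripts/make_scripts.py /mnt/nfs/work1/jensen/<USERNAME>/output
```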