Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
WIP: MPI ensembles can use multiple ranks per node, up to the GPU count, distributing GPUs evenly (ish)

Only one rank per node sends the device string back for telemetry; the others send back an empty string (the assembleGPUsString method expects a message from each rank in the world).

Currently (i.e. still WIP bits):
+ Using more ranks than GPUs per node will error with a generic exception. The extra runners could potentially be made to do no work, by never requesting jobs / stating they are finished immediately.
+ If users specify devices via config / CLI, this will be applied to all MPI ranks per node. This may or may not be desirable.

Using a specialised communicator might be a better way to do this: build a communicator of the MPI ranks participating in the ensemble, and use that in place of MPI_COMM_WORLD. All ranks would need to return a message stating whether they are participating or not (0/1), which would then be used as the color in a call to MPI_Comm_split to create the new communicator(s).
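To illustrate the distribution and the 0/1 participation flag described above, here is a minimal sketch. It is not the code from this commit: the names devicesForLocalRank and participationColor are hypothetical, and the contiguous block split is only one assumed way of distributing GPUs evenly (ish). It also shows the alternative where surplus ranks get no devices rather than raising an exception.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper, not taken from the commit.
// local_rank:   0-based index of this rank among the ranks on the same node.
// local_size:   number of ranks running on this node.
// device_count: number of GPUs visible on this node.
// Returns the device indices this rank would use. The result is empty for
// invalid input, or when there are more ranks than GPUs and this rank is surplus.
std::vector<int> devicesForLocalRank(int local_rank, int local_size, int device_count) {
    std::vector<int> devices;
    if (local_rank < 0 || local_size <= 0 || local_rank >= local_size || device_count <= 0) {
        return devices;
    }
    // Contiguous block split (an assumption): every rank gets `base` devices,
    // and the first `extra` ranks get one more.
    const int base = device_count / local_size;
    const int extra = device_count % local_size;
    const int count = base + (local_rank < extra ? 1 : 0);
    const int first = local_rank * base + std::min(local_rank, extra);
    for (int i = 0; i < count; ++i) {
        devices.push_back(first + i);
    }
    return devices;
}

// Hypothetical helper: the 0/1 participation flag mentioned in the message,
// 1 if this rank has at least one device, 0 otherwise.
int participationColor(int local_rank, int local_size, int device_count) {
    return devicesForLocalRank(local_rank, local_size, device_count).empty() ? 0 : 1;
}
```

With 3 GPUs and 2 ranks on a node, rank 0 would get devices 0 and 1 and rank 1 would get device 2. With 2 GPUs and 3 ranks, the third rank gets no devices and a flag of 0. That flag could then be passed as the color argument to MPI_Comm_split; passing MPI_UNDEFINED instead would leave non-participating ranks with MPI_COMM_NULL rather than a second communicator.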
Showing 3 changed files with 38 additions and 5 deletions.