score.sc file has different scores for the same pdb while using mpi run 64 cores #222
The score.sc file has different scores for the same PDB depending on how many cores I use to run my command. Am I doing something wrong? And is the final output PDB file the one with the lowest score?
Replies: 1 comment 1 reply
If there's one line per core used, then it's likely that MPI isn't being launched properly, so each process thinks it's an independent serial process (and so generates its own output for everything).

There are a few common ways this occurs. One is inadvertently using a non-MPI build with the MPI launcher command: make sure you have the version with .mpi. in the filename (e.g. rosettascripts.mpi.linuxgccrelease). Alternatively, if you use an MPI version of Rosetta but forget to use the MPI launcher, then your (e.g.) SLURM script will launch N Rosetta processes, but there will be no way for them to communicate with each other, so each will think it's running in serial mode. It can also happen if the MPI library you compiled with doesn't match the MPI library your MPI launcher command is associated with (e.g. OpenMPI vs. MPICH). In that case the processes are running under an MPI environment, but the MPI setup functions in Rosetta can't talk to that environment, so Rosetta thinks it's running outside the MPI launcher.

Since the Rosetta processes are running serially, they have no idea that the other N-1 processes are also running, so they'll just blindly overwrite each other's files. The final PDB output you get will therefore be the result from whichever process last wrote the file to disk, with the results from the earlier writes irrevocably overwritten.
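A quick way to check for the "one line per core" symptom is to count how many score lines share the same structure tag. The snippet below is only a rough sketch: it assumes the usual score file layout where data rows start with `SCORE:` and a header row names a `description` column, and the function names `count_score_lines` and `duplicated_tags` are made up for illustration (they are not part of Rosetta).

```python
from collections import Counter


def count_score_lines(lines):
    """Count how many SCORE: data rows share each description tag.

    Assumes a header row containing a 'description' column; this is an
    assumption about the score file layout, not something Rosetta guarantees.
    """
    counts = Counter()
    desc_index = None
    for line in lines:
        fields = line.split()
        if not fields or fields[0] != "SCORE:":
            continue  # skip SEQUENCE: lines, blank lines, and anything else
        if "description" in fields:
            # Header row: remember which column holds the tag.
            # Headers may repeat if several processes wrote to the file.
            desc_index = fields.index("description")
            continue
        if desc_index is None or desc_index >= len(fields):
            continue  # data row before any header, or a truncated row
        counts[fields[desc_index]] += 1
    return counts


def duplicated_tags(lines):
    """Return only the tags that appear on more than one score line."""
    return {tag: n for tag, n in count_score_lines(lines).items() if n > 1}
```

For example, passing an open file handle for your score.sc to `duplicated_tags` gives a dict of tags and how often they appear. If every tag shows up roughly as many times as the number of cores you requested, the processes were most likely running as independent serial jobs rather than as one MPI job.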