Profile CoLoRe at NERSC #90
Comments
Also, runs by @jfarr03 take longer than runs by @fjaviersanchez; it would be good to understand why.
[Copied from Slack channel]

Computational requirements for Gaussian vs 2LPT runs (by Javier)

Gaussian without skewers
- 1024: Cori Haswell, 2 nodes (4 MPI tasks), 1 OpenMP thread per core, 1 MPI task per socket
- 2048: Cori Haswell, 2 nodes (4 MPI tasks), 1 OpenMP thread per core, 1 MPI task per socket

2LPT without skewers
- 1024: Cori Haswell, 2 nodes (4 MPI tasks), 1 OpenMP thread per core, 1 MPI task per socket
- 2048: Cori Haswell, 8 nodes (16 MPI tasks), 1 OpenMP thread per physical core, 1 MPI task per socket
- 4096: Cori Haswell, 32 nodes (64 MPI tasks), 1 OpenMP thread per physical core, 1 MPI task per socket

Computational requirements for Gaussian runs (by James)

Gaussian with skewers
- 2048, using 8 Cori Haswell nodes: 4 minutes, 169 GB memory (21.2 GB per node, [12.059 GB (Gaussian), 9.105 GB (srcs)]), 48 NERSC hours
- 4096, using 32 Cori Haswell nodes: 9 minutes, 918 GB memory (28.7 GB per node, [24.199 GB (Gaussian), 4.541 GB (srcs)]), 432 NERSC hours

Gaussian without skewers
- 2048 (~Javier's setup), using 2 Cori Haswell nodes: 172 seconds, 65 GB memory (32.288 GB per node, [32.094 GB (Gaussian), 0.194 GB (srcs)]), 8.6 NERSC hours
- 2048, using 8 Cori Haswell nodes: 55 seconds, 65 GB memory (8.088 GB per node, [8.039 GB (Gaussian), 0.049 GB (srcs)]), 11 NERSC hours
- 4096, using 32 Cori Haswell nodes: 101 seconds, 517 GB memory (16.145 GB per node, [16.133 GB (Gaussian), 0.012 GB (srcs)]), 81 NERSC hours
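As a rough cross-check of the figures James reports above, the sketch below converts each run into node-hours and divides the quoted NERSC hours by that number. The helper name `summarize` and the tuple layout are made up for illustration. The charge rate is not stated anywhere in the thread, so the value the script prints is inferred from the numbers, not an official figure.

```python
# Runs reported by James:
# (label, nodes, wall time in seconds, total memory in GB, NERSC hours)
runs = [
    ("Gaussian with skewers, 2048", 8, 4 * 60, 169, 48),
    ("Gaussian with skewers, 4096", 32, 9 * 60, 918, 432),
    ("Gaussian without skewers, 2048", 2, 172, 65, 8.6),
    ("Gaussian without skewers, 2048", 8, 55, 65, 11),
    ("Gaussian without skewers, 4096", 32, 101, 517, 81),
]


def summarize(nodes, seconds, total_gb, nersc_hours):
    """Derive node-hours, memory per node, and the implied charge rate."""
    node_hours = nodes * seconds / 3600.0
    return {
        "node_hours": node_hours,
        "gb_per_node": total_gb / nodes,
        # NERSC hours charged per node-hour, inferred from the quoted totals.
        "charge_per_node_hour": nersc_hours / node_hours,
    }


summaries = [summarize(*run[1:]) for run in runs]

for (label, nodes, *_), s in zip(runs, summaries):
    print(
        f"{label:32s} {nodes:3d} nodes  "
        f"{s['node_hours']:.3f} node-h  "
        f"{s['gb_per_node']:.1f} GB/node  "
        f"{s['charge_per_node_hour']:.1f} NERSC h per node-h"
    )
```

If the implied rate comes out the same for every run, the NERSC-hour totals are just nodes times wall time times a constant, so wall time per node count is the quantity worth comparing between James's and Javier's runs.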
Currently, running a 2048 Gaussian without skewers uses 65 GB of memory for both Javier and me. However, even when I use the same param and Slurm files as Javier, my run takes 2x as long. Potential differences are the exact version of CoLoRe that we're using, as well as the libraries and compilers that we've chosen. I'm going to try a separate installation of CoLoRe with the exact same setup as Javier and see whether that affects things.
How much CPU time / memory do we need to run Gaussian / 2LPT boxes, as a function of the number of cells? How do the numbers change when we also need to extract skewers?
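For the memory half of this question, a minimal sketch of one way to extrapolate is to assume that total memory scales with the number of cells, N^3 for a grid of size N, and to calibrate the cost per cell from a measured run. The two calibration points are the Gaussian runs without skewers reported in this thread (65 GB at 2048, 517 GB at 4096). The function names are hypothetical, and treating "GB" as 2^30 bytes is an assumption. The model ignores source catalogs, skewers, and per-task overhead, so treat its output as a lower-bound estimate at best.

```python
GIB = 2**30  # assumption: the reported "GB" are binary gigabytes


def bytes_per_cell(total_gb, n_grid):
    """Memory cost per grid cell implied by a measured run."""
    return total_gb * GIB / n_grid**3


def predicted_memory_gb(n_grid, per_cell_bytes):
    """Total memory predicted for an n_grid^3 box under pure N^3 scaling."""
    return per_cell_bytes * n_grid**3 / GIB


# Gaussian without skewers: grid size -> total memory in GB (from this thread)
measured = {2048: 65, 4096: 517}

per_cell = {n: bytes_per_cell(gb, n) for n, gb in measured.items()}
for n, b in per_cell.items():
    print(f"N = {n}: {b:.2f} bytes per cell")

# Extrapolate from the 2048 calibration point to other grid sizes.
for n in (1024, 2048, 4096, 8192):
    print(f"N = {n}: ~{predicted_memory_gb(n, per_cell[2048]):.0f} GB predicted")
```

If the two calibration points give nearly the same bytes per cell, the N^3 assumption holds reasonably well for the Gaussian field itself. The same fit could be repeated for the runs with skewers and for 2LPT once their memory totals are measured.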