
Profile CoLoRe at NERSC #90

Open
andreufont opened this issue Oct 4, 2019 · 3 comments

@andreufont (Collaborator)

How much CPU time / memory do we need to run Gaussian / 2LPT boxes, as a function of the number of cells? How do the numbers change when we also need to extract skewers?
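
As a zeroth-order expectation before any measurements: memory should scale with the number of cells. A minimal sketch of that scaling, assuming the footprint is dominated by a single double-precision field of n_grid³ cells (an assumption about CoLoRe's internals, not a measured profile):

```python
# Back-of-the-envelope memory for an n_grid^3 box, assuming the cost is
# dominated by one double-precision (8 bytes/cell) field. This is an
# assumption about CoLoRe's internals, not a measured profile.

def field_memory_gb(n_grid: int, bytes_per_cell: int = 8) -> float:
    """Memory in GB for a single n_grid^3 field."""
    return n_grid**3 * bytes_per_cell / 1e9

for n in (1024, 2048, 4096):
    print(f"{n}^3 cells: ~{field_memory_gb(n):.1f} GB per field")
# ~8.6, ~68.7 and ~549.8 GB -- the same order as the totals reported
# in the comments below, so the grid itself plausibly dominates.
```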

@andreufont (Collaborator, Author)

Also, runs by @jfarr03 take longer than runs by @fjaviersanchez; it would be good to understand why.

@jfarr03 (Collaborator) commented Oct 4, 2019

[Copied from Slack channel]

Computational requirements for Gaussian vs 2LPT runs (by Javier)

Gaussian without skewers

1024³ - Cori Haswell, 2 nodes (4 MPI tasks), 1 OpenMP thread per core, 1 MPI task per socket
10 seconds, 2.109 GB/task -> 8.4 GB total -> slurm-24945519.out -> 0.5 NERSC hours

2048³ - Cori Haswell, 2 nodes (4 MPI tasks), 1 OpenMP thread per core, 1 MPI task per socket
79 seconds, 16.144 GB/task -> 64.5 GB total -> 3.95 NERSC hours

2LPT without skewers

1024³ - Cori Haswell, 2 nodes (4 MPI tasks), 1 OpenMP thread per core, 1 MPI task per socket
43.7 seconds, 14.9 GB/task -> ~60 GB total -> slurm-249456078.out -> 2.19 NERSC hours

2048³ - Cori Haswell, 8 nodes (16 MPI tasks), 1 OpenMP thread per physical core, 1 MPI task per socket
115 seconds, 29.7 GB/task -> 475.2 GB total -> slurm-24956874.out -> 23 NERSC hours

4096³ - Cori Haswell, 32 nodes (64 MPI tasks), 1 OpenMP thread per physical core, 1 MPI task per socket
255.2 seconds, 59.36 GB/task -> 3.8 TB total -> slurm-24967363.out -> 204 NERSC hours
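
For what it's worth, the NERSC-hours figures above are mutually consistent with a flat charge of roughly 90 NERSC hours per Cori Haswell node-hour; the factor below is inferred from these entries themselves, not quoted from NERSC documentation:

```python
# Recompute the charges above as nodes * wall-time * charge factor.
# The factor (~90 NERSC hours per Haswell node-hour) is fitted to
# these runs, not taken from NERSC docs.

HASWELL_FACTOR = 90.0  # NERSC hours per node-hour, inferred from the data

def nersc_hours(nodes: int, wall_seconds: float) -> float:
    return nodes * (wall_seconds / 3600.0) * HASWELL_FACTOR

runs = [  # (label, nodes, wall-clock seconds) from the list above
    ("Gaussian 1024^3", 2, 10.0),
    ("Gaussian 2048^3", 2, 79.0),
    ("2LPT 1024^3", 2, 43.7),
    ("2LPT 2048^3", 8, 115.0),
    ("2LPT 4096^3", 32, 255.2),
]
for label, nodes, secs in runs:
    print(f"{label}: {nersc_hours(nodes, secs):.2f} NERSC hours")
# Reproduces the quoted 0.5 / 3.95 / 2.19 / 23 / 204 to rounding.
```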

Computational requirements for Gaussian runs (by James)

Gaussian with skewers

2048³ - Using 8 Cori Haswell nodes: 4 minutes, 169 GB memory (21.2 GB per node, [12.059 GB (Gaussian), 9.105 GB (srcs)]), 48 NERSC hours

4096³ - Using 32 Cori Haswell nodes: 9 minutes, 918 GB memory (28.7 GB per node, [24.199 GB (Gaussian), 4.541 GB (srcs)]), 432 NERSC hours

Gaussian without skewers

2048³ (~Javier's setup) - Using 2 Cori Haswell nodes: 172 seconds, 65 GB memory (32.288 GB per node, [32.094 GB (Gaussian), 0.194 GB (srcs)]), 8.6 NERSC hours

2048³ - Using 8 Cori Haswell nodes: 55 seconds, 65 GB memory (8.088 GB per node, [8.039 GB (Gaussian), 0.049 GB (srcs)]), 11 NERSC hours

4096³ - Using 32 Cori Haswell nodes: 101 seconds, 517 GB memory (16.145 GB per node, [16.133 GB (Gaussian), 0.012 GB (srcs)]), 81 NERSC hours
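
One way to read the matched pairs above is as the incremental cost of skewer extraction, derived from the with/without runs at the same node counts (a comparison of the numbers already quoted, not a separate measurement):

```python
# Incremental cost of skewer extraction, derived from the matched
# runs above (same node counts with and without skewers).

runs = {  # n_grid: (NERSC hours with skewers, without skewers)
    2048: (48.0, 11.0),
    4096: (432.0, 81.0),
}
for n, (with_sk, without_sk) in runs.items():
    extra = with_sk - without_sk
    ratio = with_sk / without_sk
    print(f"{n}^3: +{extra:.0f} NERSC hours for skewers ({ratio:.1f}x total cost)")
# 2048^3: +37 NERSC hours (4.4x); 4096^3: +351 NERSC hours (5.3x).
```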

@jfarr03 (Collaborator) commented Oct 7, 2019

Currently, running a 2048³ Gaussian box without skewers uses 65 GB of memory for both Javier and me.

However, even when I use the same param and slurm files as Javier, it takes twice as long to run.

Potential differences are in the exact version of CoLoRe that we're using, as well as the libraries and compilers that we've chosen. I'm going to try a separate installation of CoLoRe with exactly the same setup as Javier and see if that affects things.
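
One hypothetical way to pin the difference down would be to log build and environment provenance alongside each run, so the two setups can be diffed directly. A minimal sketch, assuming CoLoRe is built from a git checkout and jobs run under environment modules (which export LOADEDMODULES); the path is hypothetical:

```python
# Sketch: record enough provenance per run to compare setups.
# Assumptions: CoLoRe is built from a git checkout (path below is
# hypothetical) and the module system exports LOADEDMODULES.

import os
import subprocess

def log_provenance(colore_dir: str) -> None:
    commit = subprocess.run(
        ["git", "-C", colore_dir, "rev-parse", "HEAD"],
        capture_output=True, text=True,
    ).stdout.strip()
    print("CoLoRe commit:   ", commit)
    print("Loaded modules:  ", os.environ.get("LOADEDMODULES", "<not set>"))
    print("OMP_NUM_THREADS: ", os.environ.get("OMP_NUM_THREADS", "<not set>"))
    print("CC:              ", os.environ.get("CC", "<not set>"))

log_provenance(os.path.expanduser("~/CoLoRe"))  # hypothetical path
```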
