Upstream old fixes / features #218

Merged
sbryngelson merged 3 commits into MFlowCode:master from upstream-old-fixes
Oct 8, 2023

Conversation

henryleberre
Member

I realized that I forgot to upstream these fixes / features:

  • Enable testing on multiple GPUs (in parallel).
  • The above point and other changes make it easier to implement benchmarking in the future.
  • Small build.py refactor. This also fixes a bug relating to targets not being built in a deterministic order. I understood this would happen when I pushed those changes but underestimated how confusing that might be.
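One way to picture the multi-GPU point is a round-robin assignment of test cases to device ids. This is only a sketch under stated assumptions, not the code in this PR: `assign_gpus` and `env_for` are hypothetical helper names, and pinning a process through `CUDA_VISIBLE_DEVICES` is assumed to be acceptable for the runtime in use.

```python
from itertools import cycle


def assign_gpus(test_ids, gpu_ids):
    """Map each test id to a GPU id, cycling through gpu_ids in order.

    Hypothetical helper for illustration; not part of MFC.
    """
    if not gpu_ids:
        raise ValueError("at least one GPU id is required")
    # zip stops at the end of test_ids; cycle repeats gpu_ids as needed.
    return dict(zip(test_ids, cycle(gpu_ids)))


def env_for(gpu_id):
    """Environment overrides that restrict a child process to one device.

    Assumption: the GPU runtime honours CUDA_VISIBLE_DEVICES.
    """
    return {"CUDA_VISIBLE_DEVICES": str(gpu_id)}


if __name__ == "__main__":
    plan = assign_gpus(["case_a", "case_b", "case_c"], [0, 1])
    for test_id, gpu_id in plan.items():
        print(test_id, env_for(gpu_id))
```

Iterating over the tests in a fixed order keeps the assignment reproducible from run to run, which also helps when comparing timings later.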

@sbryngelson
Member

sbryngelson commented Oct 8, 2023

Will merge this once tests pass. They were failing earlier for a spurious reason that has since been fixed.

@sbryngelson
Member

[SUCCESS]
++ ok 'All modules have been loaded for GT Phoenix on GPUs.'
++ log 'OK > All modules have been loaded for GT Phoenix on GPUs.'
++ echo -e 'mfc: OK > All modules have been loaded for GT Phoenix on GPUs.'
mfc: OK > All modules have been loaded for GT Phoenix on GPUs.
++ return
++ nvidia-smi -L
++ wc -l
+ gpu_count=2
++ seq -s , 0 1
+ gpu_ids=0,1
++ nproc
+ ./mfc.sh test -a -b mpirun -j 24 --gpu -g 0,1
mfc: OK > (venv) Entered the Python virtual environment.
usage: ./mfc.sh test [-h] [--mpi] [--no-mpi] [--gpu] [--no-gpu] [--debug]
                     [--no-debug] [-j JOBS] [-v] [--no-fftw] [--no-hdf5]
                     [--no-silo] [-g GPUS [GPUS ...]] [-l] [-f FROM] [-t TO]
                     [-o L [L ...]] [-b {jsrun,srun,mpirun,mpiexec,N/A}] [-r]
                     [-a] [-% PERCENT] [-m MAX_ATTEMPTS] [--case-optimization]
                     [--generate | --add-new-variables]
./mfc.sh test: error: argument -g/--gpus: invalid int value: '0,1'
mfc: (venv) Exiting the Python virtual environment.
---------------------------------------
Begin Slurm Epilog: Oct-08-2023 12:54:15
Job ID:        3730688
Array Job ID:  _4294967294
User ID:       sbryngelson3
Account:       gts-sbryngelson3
Job name:      shb-test-jobs
Resources:     cpu=24,gres/gpu:v100=2,mem=96G,node=1
Rsrc Used:     cput=00:02:00,vmem=1656K,walltime=00:00:05,mem=0,energy_used=0
Partition:     gpu-v100
QOS:           embers
Nodes:         atl1-1-01-003-36-0
---------------------------------------
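
The error in the log above comes from `-g 0,1`. The usage text shows `-g GPUS [GPUS ...]` and the message reports an invalid int value, which suggests the option expects space-separated integers rather than one comma-separated string. A minimal Python sketch of that behaviour, assuming the option is declared with `nargs="+"` and `type=int` (the parser below is a hypothetical stand-in, not MFC's actual code, and `split_gpu_ids` is an illustrative helper):

```python
import argparse


def build_parser():
    # Hypothetical stand-in that models only the -g/--gpus option.
    # exit_on_error=False requires Python 3.9 or newer.
    parser = argparse.ArgumentParser(prog="./mfc.sh test", exit_on_error=False)
    parser.add_argument("-g", "--gpus", nargs="+", type=int, default=[0])
    return parser


def parse_gpus(argv):
    """Return the list of GPU ids parsed from argv."""
    return build_parser().parse_args(argv).gpus


def accepts(argv):
    """True if argv parses cleanly, False if argparse rejects it."""
    try:
        parse_gpus(argv)
        return True
    except (argparse.ArgumentError, SystemExit):
        return False


def split_gpu_ids(csv):
    """Turn a comma-separated string such as '0,1' into a list of ints."""
    return [int(token) for token in csv.split(",") if token.strip()]


if __name__ == "__main__":
    print(accepts(["-g", "0,1"]))    # comma-separated form
    print(parse_gpus(["-g", "0", "1"]))  # space-separated form
    print(split_gpu_ids("0,1"))
```

Under that assumption, either the CI script passes the ids separated by spaces (for example `-g 0 1`), or the comma-separated string is split before it reaches the parser.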

@sbryngelson sbryngelson merged commit c312cd7 into MFlowCode:master Oct 8, 2023
15 checks passed
@sbryngelson sbryngelson deleted the upstream-old-fixes branch October 8, 2023 20:11