sync #311

Closed: wants to merge 15 commits

7 changes: 6 additions & 1 deletion .github/workflows/docs.yml
@@ -11,7 +11,12 @@ jobs:
docs:
name: Build & Publish
runs-on: ubuntu-latest


if: github.repository == 'MFlowCode/MFC'
concurrency:
group: docs-publish
cancel-in-progress: true

steps:
- uses: actions/checkout@v3

13 changes: 13 additions & 0 deletions .github/workflows/links.yml
@@ -0,0 +1,13 @@
name: LinkChecker

on: push

jobs:
markdown-link-check:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@master
- uses: gaurav-nelson/github-action-markdown-link-check@v1
with:
config-file: '.link_config.json'
use-verbose-mode: 'yes'
63 changes: 63 additions & 0 deletions .github/workflows/phoenix/submit.sh
@@ -0,0 +1,63 @@
#!/bin/bash

set -e

usage() {
echo "Usage: $0 [script.sh] [cpu|gpu]"
}

if [ -n "$1" ]; then
sbatch_script_contents=$(cat "$1")
else
usage
exit 1
fi

sbatch_cpu_opts="\
#SBATCH -p cpu-small # partition
#SBATCH --ntasks-per-node=24 # Number of cores per node required
#SBATCH --mem-per-cpu=2G # Memory per core\
"

sbatch_gpu_opts="\
#SBATCH -CV100-16GB
#SBATCH -G2\
"

if [ "$2" == "cpu" ]; then
sbatch_device_opts="$sbatch_cpu_opts"
elif [ "$2" == "gpu" ]; then
sbatch_device_opts="$sbatch_gpu_opts"
else
usage
exit 1
fi

job_slug="$(basename "$1" | sed 's/\.sh$//' | sed 's/[^a-zA-Z0-9]/-/g')-$2"

sbatch <<EOT
#!/bin/bash
#SBATCH -Jshb-$job_slug # Job name
#SBATCH --account=gts-sbryngelson3 # charge account
#SBATCH -N1 # Number of nodes required
$sbatch_device_opts
#SBATCH -t 04:00:00 # Duration of the job (Ex: 15 mins)
#SBATCH -q embers # QOS Name
#SBATCH -o$job_slug.out # Combined output and error messages file
#SBATCH -W # Do not exit until the submitted job terminates.

set -e
set -x

cd "\$SLURM_SUBMIT_DIR"
echo "Running in \$(pwd):"

job_slug="$job_slug"
job_device="$2"

. ./mfc.sh load -c p -m $2

$sbatch_script_contents

EOT
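The `sbatch <<EOT` heredoc above mixes two kinds of variables: unescaped ones like `$job_slug`, which are expanded when the script is submitted, and escaped ones like `\$SLURM_SUBMIT_DIR`, which survive as literal text for the job to expand at runtime. A minimal, standalone sketch of that distinction (the variable names here are illustrative, not taken from the real script):

```shell
#!/bin/bash
# Hypothetical stand-in for a submit-time variable.
job_slug="my-test-cpu"

# Unescaped $job_slug is expanded NOW; \$SLURM_SUBMIT_DIR is left intact
# for whoever later executes the generated text.
generated=$(cat <<EOT
#SBATCH -Jshb-$job_slug
cd "\$SLURM_SUBMIT_DIR"
EOT
)

echo "$generated"
# → #SBATCH -Jshb-my-test-cpu
# → cd "$SLURM_SUBMIT_DIR"
```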

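The `job_slug` line in the script above turns an arbitrary script path into a scheduler-friendly job name. Worked in isolation, with a made-up filename:

```shell
#!/bin/bash
# Derive a job slug: strip the directory and .sh suffix, replace every
# non-alphanumeric character with '-', then append the device kind.
script="jobs/3D shock_bubble.sh"
device="gpu"

job_slug="$(basename "$script" | sed 's/\.sh$//' | sed 's/[^a-zA-Z0-9]/-/g')-$device"
echo "$job_slug"   # → 3D-shock-bubble-gpu
```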
20 changes: 20 additions & 0 deletions .github/workflows/phoenix/test.sh
@@ -0,0 +1,20 @@
#!/bin/bash

build_opts=""
if [ "$job_device" == "gpu" ]; then
build_opts="--gpu"
fi

./mfc.sh build -j 8 $build_opts

n_test_threads=8

if [ "$job_device" == "gpu" ]; then
gpu_count=$(nvidia-smi -L | wc -l)        # number of GPUs on the node
gpu_ids=$(seq -s ' ' 0 $(($gpu_count-1))) # 0 1 2 ... gpu_count-1
device_opts="-g $gpu_ids"
n_test_threads=$(($gpu_count*2))
fi

./mfc.sh test -a -j $n_test_threads $device_opts -- -c phoenix

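The GPU branch above derives a device list and a test-thread count from `nvidia-smi`. The same arithmetic can be checked without a GPU by substituting a fixed count (the `3` below is an arbitrary stand-in for `$(nvidia-smi -L | wc -l)`):

```shell
#!/bin/bash
gpu_count=3   # stand-in for: $(nvidia-smi -L | wc -l)

gpu_ids=$(seq -s ' ' 0 $(($gpu_count - 1)))  # space-separated ids
n_test_threads=$(($gpu_count * 2))           # two test threads per GPU

echo "$gpu_ids"         # → 0 1 2
echo "$n_test_threads"  # → 6
```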
30 changes: 8 additions & 22 deletions .github/workflows/ci.yml → .github/workflows/test.yml
@@ -90,9 +90,6 @@ jobs:
OPT1: ${{ matrix.mpi == 'mpi' && '--test-all' || '' }}
OPT2: ${{ matrix.debug == 'debug' && '-% 20' || '' }}

- name: Ensure empty diff
run: git diff --exit-code tests/

docker:
name: Github | Docker
runs-on: ubuntu-latest
@@ -103,9 +100,6 @@
- name: Test
run: sudo ./mfc.sh docker ./mfc.sh test -j $(nproc) -a

- name: Ensure empty diff
run: git diff --exit-code tests/

self:
name: Georgia Tech | Phoenix (NVHPC)
if: github.repository == 'MFlowCode/MFC'
@@ -120,21 +114,13 @@
- name: Clone
uses: actions/checkout@v3

- name: Build
run: |
. ./mfc.sh load -c p -m gpu
./mfc.sh build -j 2 $(if [ '${{ matrix.device }}' == 'gpu' ]; then echo '--gpu'; fi)

- name: Test
run: |
. ./mfc.sh load -c p -m gpu
mv misc/run-phoenix-release-${{ matrix.device }}.sh ./
sbatch run-phoenix-release-${{ matrix.device }}.sh
- name: Build & Test
run: bash .github/workflows/phoenix/submit.sh .github/workflows/phoenix/test.sh ${{ matrix.device }}

- name: Ensure empty diff
run: exit $(git status --porcelain tests/ | wc -l)
- name: Archive Logs
uses: actions/upload-artifact@v3
if: always()
with:
name: logs
path: test-${{ matrix.device }}.out

- name: Print
if: always()
run: |
cat test.out
7 changes: 5 additions & 2 deletions .gitignore
@@ -1,3 +1,8 @@
node_modules/
package.json
yarn.lock
docker-compose.yml

/build/
.vscode/
src/*/include/case.fpp
@@ -19,8 +24,6 @@ __pycache__
/tests/*/**
!/tests/*/golden.txt
!/tests/*/golden-metadata.txt
!/tests/*/case.py
!/tests/*/README.md

# NVIDIA Nsight Compute
*.nsys-rep
32 changes: 32 additions & 0 deletions .link_config.json
@@ -0,0 +1,32 @@
{
"ignorePatterns": [
{
"pattern": "examples.md"
}
],
"replacementPatterns": [
{
"pattern": "^.attachments",
"replacement": "file://some/conventional/folder/.attachments"
},
{
"pattern": "^/",
"replacement": "{{BASEURL}}/"
}
],
"httpHeaders": [
{
"urls": ["https://example.com"],
"headers": {
"Authorization": "Basic Zm9vOmJhcg==",
"Foo": "Bar"
}
}
],
"timeout": "20s",
"retryOn429": true,
"retryCount": 5,
"fallbackRetryDelay": "30s",
"aliveStatusCodes": [200, 206, 403]
}

16 changes: 8 additions & 8 deletions CMakeLists.txt
@@ -196,11 +196,11 @@ endif()
# * For each .fpp file found with filepath <dirpath>/<filename>.fpp, using a
# custom command, instruct CMake how to generate a file with path
#
# src/<target>/autogen/<filename>.f90
# src/<target>/fypp/<filename>.f90
#
# by running Fypp on <dirpath>/<filename>.fpp. It is important to understand
# that this does not actually run the pre-processor. Rather, it instructs
# CMake what to do when it finds a src/<target>/autogen/<filename>.f90 path
# CMake what to do when it finds a src/<target>/fypp/<filename>.f90 path
# in the source list for a target. Thus, an association is made from an .f90
# file to its corresponding .fpp file (if applicable) even though the
# generation is of the form .fpp -> .f90.
@@ -223,7 +223,7 @@ endif()
# one of them is modified).
#
# .fpp files in src/common are treated as if they were in src/<target> (not
# pre-processed to src/common/autogen/) so as not to clash with other targets'
# pre-processed to src/common/fypp/) so as not to clash with other targets'
# .fpp files (this has caused problems in the past).
#
# * Export, in the variable <target>_SRCs, a list of all source files (.f90)
Expand All @@ -250,7 +250,7 @@ macro(HANDLE_SOURCES target useCommon)
list(APPEND ${target}_SRCs ${common_F90s})
endif()

# src/[<target>,common]/*.fpp -> src/<target>/autogen/*.f90
# src/[<target>,common]/*.fpp -> src/<target>/fypp/*.f90
file(GLOB ${target}_FPPs CONFIGURE_DEPENDS "${${target}_DIR}/*.fpp")
if (${useCommon})
file(GLOB common_FPPs CONFIGURE_DEPENDS "${common_DIR}/*.fpp")
@@ -259,22 +259,22 @@ macro(HANDLE_SOURCES target useCommon)

# Locate src/[<target>,common]/include/*.fpp
file(GLOB ${target}_incs CONFIGURE_DEPENDS "${${target}_DIR}/include/*.fpp"
"${CMAKE_CURRENT_BINARY_DIR}/include/*.fpp")
"${CMAKE_BINARY_DIR}/include/${target}/*.fpp")

if (${useCommon})
file(GLOB common_incs CONFIGURE_DEPENDS "${common_DIR}/include/*.fpp")
list(APPEND ${target}_incs ${common_incs})
endif()

file(MAKE_DIRECTORY "${${target}_DIR}/autogen")
file(MAKE_DIRECTORY "${CMAKE_BINARY_DIR}/fypp/${target}")
foreach(fpp ${${target}_FPPs})
cmake_path(GET fpp FILENAME fpp_filename)
set(f90 "${${target}_DIR}/autogen/${fpp_filename}.f90")
set(f90 "${CMAKE_BINARY_DIR}/fypp/${target}/${fpp_filename}.f90")

add_custom_command(
OUTPUT ${f90}
COMMAND ${FYPP_EXE} -m re
-I "${CMAKE_CURRENT_BINARY_DIR}/include"
-I "${CMAKE_BINARY_DIR}/include/${target}"
-I "${${target}_DIR}/include"
-I "${common_DIR}/include"
-I "${common_DIR}"
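The association described above, in which each generated `.f90` is tied to its `.fpp` source via `add_custom_command`, can be reduced to a minimal sketch. The paths, target name, and `FYPP_EXE` variable below are placeholders mirroring the real macro, not a drop-in replacement:

```cmake
# Sketch: declare how a Fypp-generated .f90 is produced from its .fpp source.
# CMake only runs this command when the .f90 appears in some target's sources.
set(fpp "${CMAKE_SOURCE_DIR}/src/simulation/foo.fpp")
set(f90 "${CMAKE_BINARY_DIR}/fypp/simulation/foo.fpp.f90")

file(MAKE_DIRECTORY "${CMAKE_BINARY_DIR}/fypp/simulation")

add_custom_command(
    OUTPUT  ${f90}
    COMMAND ${FYPP_EXE} -m re "${fpp}" "${f90}"
    DEPENDS "${fpp}"
    COMMENT "Fypp ${fpp} -> ${f90}"
)

# Listing ${f90} in a target's sources triggers the command above.
add_library(simulation_sketch ${f90})
```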
2 changes: 1 addition & 1 deletion README.md
@@ -9,7 +9,7 @@
<img src="https://zenodo.org/badge/doi/10.1016/j.cpc.2020.107396.svg" />
</a>
<a href="https://github.com/MFlowCode/MFC/actions">
<img src="https://github.com/MFlowCode/MFC/actions/workflows/ci.yml/badge.svg" />
<img src="https://github.com/MFlowCode/MFC/actions/workflows/test.yml/badge.svg" />
</a>
<a href="https://lbesson.mit-license.org/">
<img src="https://img.shields.io/badge/License-MIT-blue.svg" />
4 changes: 2 additions & 2 deletions docs/documentation/case.md
@@ -1,8 +1,8 @@
# Case Files

Example Python case files, also referred to as *input files*, can be found in the [examples/](https://github.com/MFlowCode/MFC/tree/master/examples) directory. They print a Python dictionary containing input parameters for MFC. Their contents, and a guide to filling them out, are documented
in the user manual. A commented, tutorial script
can also be found in [examples/3d_sphbubcollapse/](https://github.com/MFlowCode/MFC/blob/master/examples/3D_sphbubcollapse/case.py).

## Basic Skeleton

73 changes: 32 additions & 41 deletions docs/documentation/running.md
@@ -2,7 +2,31 @@

MFC can be run using `mfc.sh`'s `run` command.
It supports both interactive and batch execution, the latter being designed for multi-socket systems, namely supercomputers, equipped with a scheduler such as PBS, SLURM, or LSF.
A full (and updated) list of available arguments can be acquired with `./mfc.sh run -h`.

MFC supports running simulations locally (on Linux, macOS, and Windows) as well as on
several supercomputer clusters, both interactively and through batch submission.

> [!IMPORTANT]
> Running simulations locally should work out of the box. On supported clusters,
> you can append `-c <computer name>` to the command line to instruct the MFC toolchain
> to use the template file `toolchain/templates/<computer name>.mako`. You can
> browse that directory and contribute your own files. Since systems and their schedulers
> do not have a standardized syntax for requesting certain resources, MFC can only provide
> support for a restricted subset of common or user-contributed configuration options.
>
> Adding a new template file or modifying an existing one will most likely be required if:
> - You are on a cluster that does not have a template yet.
> - Your cluster is configured with SLURM but interactive job launches fail when
> using `srun`. You might need to invoke `mpirun` instead.
> - Something in the existing default or computer template file is incompatible with
> your system or does not provide a feature you need.
>
> If `-c <computer name>` is left unspecified, it defaults to `-c default`.

Additional flags can be appended to the MPI executable call using the `-f` (i.e., `--flags`) option.

Please refer to `./mfc.sh run -h` for a complete list of arguments and options, along with their defaults.

## Interactive Execution

@@ -32,24 +56,16 @@ using 4 cores:
$ ./mfc.sh run examples/2D_shockbubble/case.py -t simulation post_process -n 4
```

On some computer clusters, MFC might select the wrong MPI program to execute your application
because it uses a general heuristic for its selection. Notably, `srun` is known to fail on some SLURM
systems when using GPUs or MPI implementations from different vendors, whereas `mpirun` functions properly.
To override the heuristic and manually specify which MPI program to run your application with, use the `-b <program name>` (i.e., `--binary`) option.


## Batch Execution

MFC detects which scheduler your system uses and handles the creation and submission of batch scripts.
The batch engine is requested via the `-e batch` option.
Whereas the interactive engine can execute all of MFC's codes in succession, the batch engine requires you to specify only one target with the `-t` option.
The number of nodes can be specified with the `-N` (i.e., `--nodes`) option.

We provide a list of (baked-in) submission batch scripts in the `toolchain/templates` folder.

```console
$ ./mfc.sh run examples/2D_shockbubble/case.py -e batch -N 2 -n 4 -t simulation -c <computer name>
```

@@ -60,26 +76,8 @@
Other useful arguments include:
- `-a <account name>` to identify the account to be charged for the job (i.e., `--account`).
- `-p <partition name>` to select the job's partition (i.e., `--partition`).

Since some schedulers do not have a standardized syntax for requesting certain resources, MFC can only provide support for a restricted subset of common configuration options.
If MFC fails to execute on your system, or if you wish to adjust how the program runs and how resources are requested, you are invited to modify the template batch script for your queue system.
Upon execution of `./mfc.sh run`, MFC fills in the template with runtime parameters to generate the batch file it submits.
These files are located in the [templates](https://github.com/MFlowCode/MFC/tree/master/toolchain/templates/) directory.
To request GPUs, modification of the template will be required on most systems.

- Lines that begin with `#>` are ignored and will not appear in the final batch script, not even as a comment.

- Statements of the form `${expression}` are string-replaced to provide runtime parameters, most notably execution options.
You can perform therein any Python operation recognized by the built-in `eval()` function.

As an example, one might request GPUs on a SLURM system using the following:

```
#SBATCH --gpus=v100-32:${gpus_per_node*nodes}
```

- Statements of the form `{MFC::expression}` tell MFC where to place the common code, across all batch files, that is required for proper execution.
They are not intended to be modified by users.
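The `${expression}` substitution described above can be illustrated in a few lines of Python. This is a simplified stand-in for MFC's actual template engine, not its implementation: it merely finds `${...}` spans and evaluates each one with `eval()` against a dictionary of runtime parameters.

```python
import re

def fill_template(text: str, params: dict) -> str:
    """Replace each ${expression} with the result of evaluating it
    against the given runtime parameters (illustrative only)."""
    return re.sub(
        r"\$\{([^}]+)\}",
        lambda m: str(eval(m.group(1), {}, params)),
        text,
    )

template = "#SBATCH --gpus=v100-32:${gpus_per_node*nodes}"
print(fill_template(template, {"gpus_per_node": 4, "nodes": 2}))
# → #SBATCH --gpus=v100-32:8
```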

**Disclaimer**: IBM's JSRUN on LSF-managed computers does not use the traditional node-based approach to
allocate resources. Therefore, MFC constructs equivalent resource sets from the task and GPU counts.

@@ -173,13 +171,6 @@
$ ./mfc.sh run examples/1D_vacuum_restart/restart_case.py -t post_process
- Oak Ridge National Laboratory's [Summit](https://www.olcf.ornl.gov/summit/):

```console
$ ./mfc.sh run examples/2D_shockbubble/case.py -e batch \
    -N 2 -n 4 -t simulation -a <redacted> -c summit
```

- University of California, San Diego's [Expanse](https://www.sdsc.edu/services/hpc/expanse/):

```console
$ ./mfc.sh run examples/2D_shockbubble/case.py -e batch -p GPU -t simulation \
    -N 2 -n 8 -g 8 -f="--gpus=v100-32:16" -b mpirun -w 00:30:00
```