Merge pull request #170 from xylar/add-omega-ctest-util
Add utility for building Omega and running ctests
xylar authored Feb 2, 2024
2 parents 5e7184b + 95b2739 commit 2921be9
Showing 4 changed files with 384 additions and 0 deletions.
93 changes: 93 additions & 0 deletions utils/omega/ctest/README.md
@@ -0,0 +1,93 @@
# Omega CTest utility

This utility helps Omega developers build the model and run
CTests for a given compiler.

The utility will check out submodules that Omega needs and build Omega with
the requested compilers (see below). On a compute node, the utility will also
run CTests directly. On a login node, it will create a job script for running
CTests and can optionally submit the job script.
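The login-node versus compute-node distinction can be sketched in a few lines. This sketch assumes, as the utility's script does, that being inside a Slurm allocation is signalled by the `SLURM_JOB_ID` environment variable; the function name `on_compute_node` is hypothetical and used only for illustration.

```python
import os


def on_compute_node(environ=None):
    """Return True if we appear to be inside a Slurm job allocation."""
    # Assumption: Slurm defines SLURM_JOB_ID inside a job allocation
    if environ is None:
        environ = os.environ
    return 'SLURM_JOB_ID' in environ


if on_compute_node():
    print('Compute node: CTests would be run directly')
else:
    print('Login node: a job script would be written instead')
```

Passing the environment as an argument keeps the check easy to exercise without an actual Slurm allocation.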

## Instructions

1. You must have followed the instructions for configuring
Polaris on a [supported machine](https://e3sm-project.github.io/polaris/main/developers_guide/quick_start.html#supported-machines),
specifying the compiler for which you wish to test Omega. The result
should be an activation script like:
```
load_dev_polaris_0.3.0-alpha.1_chrysalis_intel_openmpi.sh
```
You must be on a machine that supports both E3SM (present in Omega's
[config_machines.xml](https://github.com/E3SM-Project/Omega/blob/develop/cime_config/machines/config_machines.xml))
and Polaris (see [supported machines](https://e3sm-project.github.io/polaris/main/developers_guide/machines/index.html#supported-machines)).

2. Source the Polaris load script for the desired compiler, e.g.:
```
source load_dev_polaris_0.3.0-alpha.1_chrysalis_intel_openmpi.sh
```

3. Run the utility:
```
./utils/omega/ctest/omega_ctest.py
```
The utility will check out submodules and build Omega with the compilers
associated with the Polaris load script (e.g. `intel` in the example above).

The code is built in a subdirectory `build_omega/build_<machine>_<compiler>`
within the current directory (typically the base of the Polaris branch, not
the Omega branch).

**Flags**:

```
usage: omega_ctest.py [-h] [-o OMEGA_BRANCH] [-c] [-s] [-d]
[--cmake_flags CMAKE_FLAGS]
```

* `-o <path_to_omega_branch>`: point to a branch of Omega
(`e3sm_submodules/Omega` by default)

* `-c`: indicates that the build subdirectory should be removed first to
allow a clean build

* `-s`: if running the utility on a login node, submit the job script that
the utility generates (does nothing on a compute node)

* `-d`: build Omega in debug mode

* `--cmake_flags="<flags>"`: Extra flags to pass to the `cmake` command

4. If you are on a login node and didn't use the `-s` flag, you will need
to submit the batch job to run CTests yourself (perhaps after editing the
job script), e.g.:
```
sbatch build_omega/job_build_and_ctest_omega_chrysalis_intel.sh
```

If all goes well, you will see something like:
```
$ cat omega_ctest_chrysalis_intel.o464153
Test project /gpfs/fs1/home/ac.xylar/e3sm_work/polaris/add-omega-ctest-util/build_omega/build_chrysalis_intel
Start 1: DATA_TYPES_TEST
1/9 Test #1: DATA_TYPES_TEST .................. Passed 0.38 sec
Start 2: MACHINE_ENV_TEST
2/9 Test #2: MACHINE_ENV_TEST ................. Passed 0.98 sec
Start 3: BROADCAST_TEST
3/9 Test #3: BROADCAST_TEST ................... Passed 1.13 sec
Start 4: LOGGING_TEST
4/9 Test #4: LOGGING_TEST ..................... Passed 0.03 sec
Start 5: DECOMP_TEST
5/9 Test #5: DECOMP_TEST ...................... Passed 1.20 sec
Start 6: HALO_TEST
6/9 Test #6: HALO_TEST ........................ Passed 1.08 sec
Start 7: IO_TEST
7/9 Test #7: IO_TEST .......................... Passed 2.94 sec
Start 8: CONFIG_TEST
8/9 Test #8: CONFIG_TEST ...................... Passed 1.01 sec
Start 9: YAKL_TEST
9/9 Test #9: YAKL_TEST ........................ Passed 0.03 sec
100% tests passed, 0 tests failed out of 9
Total Test time (real) = 8.91 sec
```
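To check a job's output file programmatically rather than by eye, the CTest summary line can be parsed. The following is a minimal sketch, assuming the log contains a summary of the form shown above (`... N tests failed out of M`); the function name `parse_ctest_summary` is hypothetical and not part of the utility.

```python
import re


def parse_ctest_summary(log_text):
    """Return (failed, total) from a CTest summary line, or None."""
    # Matches e.g. "100% tests passed, 0 tests failed out of 9"
    match = re.search(r'(\d+) tests failed out of (\d+)', log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


example = '100% tests passed, 0 tests failed out of 9'
print(parse_ctest_summary(example))  # prints (0, 9)
```

A result of `None` means no summary line was found, which may indicate that the job ended before CTest finished.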
52 changes: 52 additions & 0 deletions utils/omega/ctest/build_and_ctest.template
@@ -0,0 +1,52 @@
#!/usr/bin/env bash

cwd=${PWD}

module load cmake

# quit on errors
set -e
# trace commands
set -x

{% if update_omega_submodule %}
cd {{ polaris_source_dir }}
git submodule update --init e3sm_submodules/Omega
{% endif %}

cd {{ omega_base_dir }}

git submodule update --init --recursive externals/YAKL externals/ekat \
    externals/scorpio cime

cd ${cwd}

{% if clean %}
rm -rf build_omega/{{ build_dir }}
{% endif %}
mkdir -p build_omega/{{ build_dir }}
cd build_omega/{{ build_dir }}

export METIS_ROOT={{ metis_root }}
export PARMETIS_ROOT={{ parmetis_root }}

cmake \
    -DOMEGA_BUILD_TYPE={{ build_type }} \
    -DOMEGA_CIME_COMPILER={{ compiler }} \
    -DOMEGA_CIME_MACHINE={{ machine }} \
    -DOMEGA_METIS_ROOT=${METIS_ROOT} \
    -DOMEGA_PARMETIS_ROOT=${PARMETIS_ROOT} \
    -DOMEGA_BUILD_TEST=ON \
    -Wno-dev \
    -S {{ omega_base_dir }}/components/omega \
    -B . {{ cmake_flags }}

./omega_build.sh

cd test

ln -sfn {{ omega_mesh_filename }} OmegaMesh.nc

{% if run_ctest %}
./omega_ctest.sh
{% endif %}
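For readers who want to see the `cmake` invocation with the template variables filled in, the sketch below assembles the same arguments as a Python list. It is illustrative only: the function name `cmake_args` and all paths are hypothetical placeholders, and splitting the extra flags with `shlex.split` is an assumption about how a quoted `--cmake_flags` string would best be handled outside a shell.

```python
import shlex


def cmake_args(build_type, compiler, machine, metis_root, parmetis_root,
               omega_base_dir, extra_flags=''):
    """Build the cmake argument list that mirrors the template above."""
    args = [
        'cmake',
        f'-DOMEGA_BUILD_TYPE={build_type}',
        f'-DOMEGA_CIME_COMPILER={compiler}',
        f'-DOMEGA_CIME_MACHINE={machine}',
        f'-DOMEGA_METIS_ROOT={metis_root}',
        f'-DOMEGA_PARMETIS_ROOT={parmetis_root}',
        '-DOMEGA_BUILD_TEST=ON',
        '-Wno-dev',
        '-S', f'{omega_base_dir}/components/omega',
        '-B', '.',
    ]
    # Extra flags arrive as one quoted string; split them shell-style
    args.extend(shlex.split(extra_flags))
    return args


# Placeholder paths, for illustration only
print(' '.join(cmake_args('Debug', 'intel', 'chrysalis',
                          '/path/to/metis', '/path/to/parmetis',
                          '/path/to/Omega',
                          extra_flags='-DCMAKE_VERBOSE_MAKEFILE=ON')))
```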
21 changes: 21 additions & 0 deletions utils/omega/ctest/job_script.template
@@ -0,0 +1,21 @@
#!/bin/bash
#SBATCH --job-name={{ job_name }}
{% if account != '' -%}
#SBATCH --account={{ account }}
{%- endif %}
#SBATCH --nodes={{ nodes }}
#SBATCH --output={{ job_name }}.o%j
#SBATCH --exclusive
#SBATCH --time={{ wall_time }}
{% if qos != '' -%}
#SBATCH --qos={{ qos }}
{%- endif %}
{% if partition != '' -%}
#SBATCH --partition={{ partition }}
{%- endif %}
{% if constraint != '' -%}
#SBATCH --constraint={{ constraint }}
{%- endif %}

cd {{ build_dir }}
./omega_ctest.sh
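The `partition` and `qos` values rendered into this template favor a debug queue when the machine configuration lists one. A minimal sketch of that selection rule follows; the helper name `prefer_debug` and the example queue names are hypothetical, and the real script reads its lists from the Polaris config rather than from literals.

```python
def prefer_debug(options, default):
    """Return the first option whose name contains 'debug', else default."""
    for option in options:
        if 'debug' in option:
            return option
    return default


# Hypothetical partition names, for illustration only
print(prefer_debug(['compute', 'debug', 'high'], default='compute'))
print(prefer_debug(['regular'], default='regular'))
```

Falling back to the default keeps the job script valid on machines that define no debug partition or QOS.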
218 changes: 218 additions & 0 deletions utils/omega/ctest/omega_ctest.py
@@ -0,0 +1,218 @@
#!/usr/bin/env python3

import argparse
import os
import subprocess

from jinja2 import Template

from polaris.config import PolarisConfigParser
from polaris.io import download
from polaris.job import _clean_up_whitespace, get_slurm_options


def make_build_script(machine, compiler, branch, build_only, mesh_filename,
                      debug, clean, cmake_flags):
    """
    Make a shell script for checking out Omega and its submodules, building
    Omega and its ctests, linking to testing data files, and running ctests.
    """

    polaris_source_dir = os.environ['POLARIS_BRANCH']
    metis_root = os.environ['METIS_ROOT']
    parmetis_root = os.environ['PARMETIS_ROOT']

    build_dir = f'build_{machine}_{compiler}'

    branch = os.path.abspath(branch)
    omega_submodule = os.path.join(polaris_source_dir, 'e3sm_submodules/Omega')
    update_omega_submodule = (branch == omega_submodule)

    this_dir = os.path.realpath(
        os.path.join(os.getcwd(), os.path.dirname(__file__)))

    template_filename = os.path.join(this_dir, 'build_and_ctest.template')

    with open(template_filename, 'r', encoding='utf-8') as f:
        template = Template(f.read())

    if debug:
        build_type = 'Debug'
    else:
        build_type = 'Release'

    if cmake_flags is None:
        cmake_flags = ''

    script = template.render(update_omega_submodule=update_omega_submodule,
                             polaris_source_dir=polaris_source_dir,
                             omega_base_dir=branch,
                             build_dir=build_dir,
                             machine=machine,
                             compiler=compiler,
                             metis_root=metis_root,
                             parmetis_root=parmetis_root,
                             omega_mesh_filename=mesh_filename,
                             run_ctest=(not build_only),
                             build_type=build_type,
                             clean=clean,
                             cmake_flags=cmake_flags)

    script = _clean_up_whitespace(script)

    build_omega_dir = os.path.abspath('build_omega')
    os.makedirs(build_omega_dir, exist_ok=True)

    if build_only:
        script_filename = f'build_omega_{machine}_{compiler}.sh'
    else:
        script_filename = f'build_and_ctest_omega_{machine}_{compiler}.sh'

    script_filename = os.path.join(build_omega_dir, script_filename)

    with open(script_filename, 'w', encoding='utf-8') as f:
        f.write(script)

    return script_filename


def download_mesh(config):
    """
    Download and symlink a mesh to use for testing.
    """
    base_url = config.get('download', 'server_base_url')
    database_root = config.get('paths', 'database_root')

    filepath = 'ocean/polaris_cache/global_convergence/icos/cosine_bell/' \
               'Icos480/mesh/mesh.230220.nc'

    url = f'{base_url}/{filepath}'
    download_path = os.path.join(database_root, filepath)
    download_target = download(url, download_path, config)
    return download_target


def write_job_script(config, machine, compiler, submit):
    """
    Write a job script for running the build script
    """

    if config.has_option('parallel', 'account'):
        account = config.get('parallel', 'account')
    else:
        account = ''

    nodes = 1

    partition, qos, constraint, _ = get_slurm_options(
        config, machine, nodes)

    wall_time = '0:15:00'

    # see if we can find a debug partition
    if config.has_option('parallel', 'partitions'):
        partition_list = config.getlist('parallel', 'partitions')
        for partition_local in partition_list:
            if 'debug' in partition_local:
                partition = partition_local
                break

    # see if we can find a debug qos
    if config.has_option('parallel', 'qos'):
        qos_list = config.getlist('parallel', 'qos')
        for qos_local in qos_list:
            if 'debug' in qos_local:
                qos = qos_local
                break

    job_name = f'omega_ctest_{machine}_{compiler}'

    this_dir = os.path.realpath(
        os.path.join(os.getcwd(), os.path.dirname(__file__)))
    template_filename = os.path.join(this_dir, 'job_script.template')

    with open(template_filename, 'r', encoding='utf-8') as f:
        template = Template(f.read())

    build_dir = os.path.abspath(
        os.path.join('build_omega', f'build_{machine}_{compiler}'))

    script = template.render(job_name=job_name, account=account,
                             nodes=f'{nodes}', wall_time=wall_time, qos=qos,
                             partition=partition, constraint=constraint,
                             build_dir=build_dir)
    script = _clean_up_whitespace(script)

    build_omega_dir = os.path.abspath('build_omega')
    script_filename = f'job_build_and_ctest_omega_{machine}_{compiler}.sh'
    script_filename = os.path.join(build_omega_dir, script_filename)

    with open(script_filename, 'w', encoding='utf-8') as f:
        f.write(script)

    if submit:
        args = ['sbatch', script_filename]
        print(f'\nRunning:\n {" ".join(args)}\n')
        subprocess.run(args=args, check=True)


def main():
    """
    Main function for building Omega and performing ctests
    """
    parser = argparse.ArgumentParser(
        description='Check out submodules, build Omega and run ctest')
    parser.add_argument('-o', '--omega_branch', dest='omega_branch',
                        default='e3sm_submodules/Omega',
                        help='The local Omega branch to test.')
    parser.add_argument('-c', '--clean', dest='clean', action='store_true',
                        help='Whether to remove the build directory and start '
                             'fresh')
    parser.add_argument('-s', '--submit', dest='submit', action='store_true',
                        help='Whether to submit a job to run ctests')
    parser.add_argument('-d', '--debug', dest='debug', action='store_true',
                        help='Whether to build Omega in debug mode')
    parser.add_argument('--cmake_flags', dest='cmake_flags',
                        help='Quoted string with additional cmake flags')

    args = parser.parse_args()

    machine = os.environ['POLARIS_MACHINE']
    compiler = os.environ['POLARIS_COMPILER']

    config = PolarisConfigParser()
    config.add_from_package('polaris', 'default.cfg')
    config.add_from_package('mache.machines', f'{machine}.cfg')
    config.add_from_package('polaris.machines', f'{machine}.cfg')

    submit = args.submit
    branch = args.omega_branch
    debug = args.debug
    clean = args.clean
    cmake_flags = args.cmake_flags

    if 'SLURM_JOB_ID' in os.environ:
        # already on a compute node so we will just run ctests directly
        submit = False
        # build_only must be defined on this branch too, otherwise the call
        # to make_build_script() below raises a NameError
        build_only = False
    else:
        build_only = True

    mesh_filename = download_mesh(config=config)

    script_filename = make_build_script(machine=machine, compiler=compiler,
                                        branch=branch, build_only=build_only,
                                        mesh_filename=mesh_filename,
                                        debug=debug, clean=clean,
                                        cmake_flags=cmake_flags)

    # clear environment variables and start fresh with those from login
    # so spack doesn't get confused by conda
    subprocess.check_call(f'env -i HOME="$HOME" bash -l {script_filename}',
                          shell=True)

    write_job_script(config=config, machine=machine, compiler=compiler,
                     submit=submit)


if __name__ == '__main__':
    main()
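The clean-environment call in `main()` relies on `env -i` to drop the current (conda) environment and on `bash -l` to rebuild it from the login profile. One possible alternative, sketched here under the assumption that `shell=True` is not otherwise needed, is to pass an explicit argument list; the helper name `clean_env_command` and the example script path are hypothetical.

```python
import os


def clean_env_command(script_filename, home=None):
    """Return the argument list for running a script in a fresh login shell."""
    if home is None:
        home = os.environ.get('HOME', '')
    # `env -i` clears the environment; only HOME is passed through, and
    # `bash -l` then sources the login profile before running the script
    return ['env', '-i', f'HOME={home}', 'bash', '-l', script_filename]


command = clean_env_command('build_omega/build_omega_chrysalis_intel.sh')
print(' '.join(command))
# To run it for real: subprocess.run(command, check=True)
```

An argument list avoids shell quoting problems if the script path ever contains spaces.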
