diff --git a/README.md b/README.md
index 9f9e0f2..0ee0fc2 100644
--- a/README.md
+++ b/README.md
@@ -1,8 +1,10 @@
# wraprun
-`wraprun` is a utility that enables independent execution of multiple MPI applications under a single `aprun` call.
+`wraprun` is a utility that enables independent execution of multiple MPI
+applications under a single `aprun` call.
## To install:
-`wraprun` includes a Smithy formula to automate deployment, for centers not using Smithy the build looks as follow:
+`wraprun` includes a Smithy formula to automate deployment. For centers not
+using Smithy, the build looks as follows:
```
$ mkdir build
@@ -10,25 +12,36 @@ $ cmake -DCMAKE_INSTALL_PREFIX=/path/to/install ..
$ make
$ make install
```
-Inside of `/path/to/install` a `bin` directory will be created containing the `wraprun` scripts and a `lib` directory will be created containing `libsplit.so`. The `WRAPRUN_PRELOAD` environment variable must be correctly set to point to `libsplit.so` and in the case of fortran applications `libfmpich.so` at runtime.
-e.g. `WRAPRUN_PRELOAD=/path/to/install/lib/libsplit.so:/path/to/mpi_install/lib/libfmpich.so`
+Inside of `/path/to/install` a `bin` directory will be created containing the
+`wraprun` scripts and a `lib` directory will be created containing
+`libsplit.so`. The `WRAPRUN_PRELOAD` environment variable must be correctly set
+to point to `libsplit.so` and, in the case of Fortran applications,
+`libfmpich.so` at runtime, e.g.
+`WRAPRUN_PRELOAD=/path/to/install/lib/libsplit.so:/path/to/mpi_install/lib/libfmpich.so`
-On some systems libfmpich has a programming environment specific suffix that must be taken into account:
-e.g. `WRAPRUN_PRELOAD=/path/to/install/lib/libsplit.so:/path/to/mpi_install/lib/libfmpich_pgi.so`
+On some systems `libfmpich` has a programming-environment-specific suffix that
+must be taken into account, e.g.
+`WRAPRUN_PRELOAD=/path/to/install/lib/libsplit.so:/path/to/mpi_install/lib/libfmpich_pgi.so`
## To run:
-Assuming that the module file created by the Smithy formula is used, or a similar one created, basic running looks like the following examples.
+Assuming that the module file created by the Smithy formula is used, or a
+similar one created, basic running looks like the following examples.
```
$ module load python wraprun
$ wraprun -n 80 ./foo.out : -n 160 ./bar.out ...
```
+A maximum of 2048 `:`-separated task groups is enforced to protect
+ALPS stability.
-In addition to the standard process placement flags available to aprun the `--w-cd` flag can be set to change the current working directory for each executable:
+In addition to the standard process placement flags available to `aprun`, the
+`--w-cd` flag can be set to change the current working directory for each
+executable:
```
$ wraprun -n 80 --w-cd /foo/dir ./foo.out : -n 160 --w-cd /bar/dir ./bar.out ...
```
-This is particularly useful for legacy Fortran applications that use hard coded input and output file names.
+This is particularly useful for legacy Fortran applications that use hard-coded
+input and output file names.
Multiple instances of an application can be placed on a node using
comma-separated PES syntax `PES1,PES2,...,PESN` syntax, for instance:
@@ -37,14 +50,18 @@ $ wraprun -n 2,2,2 ./foo.out : ...
```
would launch 3 two-process instances of foo.out on a single node.
-In this case the number of allocated nodes must be at least equal to the sum of processes in the comma-separated list of processing elements divided by the maximum number of processes per node.
+In this case the number of allocated nodes must be at least equal to the sum of
+processes in the comma-separated list of processing elements divided by the
+maximum number of processes per node.
This may also be combined with the `--w-cd` flag :
```
$ wraprun -n 2,2,2 --w-cd /foo/dir1,/foo/dir2,/foo/dir3 ./foo.out : ...
```
-For non MPI executables a wrapper application, `serial`, is provided. This wrapper ensures that all executables will run to completion before aprun exits. To use, place `serial` in front of your application and arguments:
+For non-MPI executables, a wrapper application, `serial`, is provided. This
+wrapper ensures that all executables will run to completion before aprun exits.
+To use, place `serial` in front of your application and arguments:
```
$ wraprun -n 1 serial ./foo.out -foo_args : ...
```
@@ -60,12 +77,12 @@ ${JOBNAME}.${JOBID}_w${INSTANCE}.${TASKID}.err
```
where `JOBNAME` is the batch job name (value of `$PBS_JOBNAME` for instance),
-`JOBID` is the batch job number (or PID of parent shell if `$PBS_JOBID` is unavailable),
-`INSTANCE` is the unique wraprun invocation called within the parent shell, and
-`TASKID` is the task index among all bundled tasks. The instance index is
-required so that multiple concurrent wraprun invocations in a single batch job
-do not collide with each other. The task index is fixed in the order that tasks are
-passed to wraprun such that for the following invocation:
+`JOBID` is the batch job number (or PID of parent shell if `$PBS_JOBID` is
+unavailable), `INSTANCE` is the unique wraprun invocation called within the
+parent shell, and `TASKID` is the task index among all bundled tasks. The
+instance index is required so that multiple concurrent wraprun invocations in a
+single batch job do not collide with each other. The task index is fixed in the
+order that tasks are passed to wraprun such that for the following invocation:
```
$ wraprun -n 1,2 ./foo.out : -n 3 ./bar.out
```
@@ -73,7 +90,8 @@ $ wraprun -n 1,2 ./foo.out : -n 3 ./bar.out
task '0' is the instance of `foo.out` having 1 PE; task '1' is the 2 PE split of
`foo.out`, and task '2' is the instance of `bar.out`.
-The default names can be overridden by supplying a basename path to the group flag `--w-oe`:
+The default names can be overridden by supplying a basename path to the group
+flag `--w-oe`:
```
$ wraprun -n 1,2 --w-oe name_a ./a.out : \
@@ -188,7 +206,9 @@ See the testing/example_config.yaml file for format information.
## Disclaimer
-`wraprun` works by intercepting all MPI function calls that contain an `MPI_Comm` argument. If an application calls an MPI function, containing an `MPI_Comm` argument, not included in `src/split.c` the results are undefined.
+`wraprun` works by intercepting all MPI function calls that contain an
+`MPI_Comm` argument. If an application calls an MPI function that takes an
+`MPI_Comm` argument but is not in `src/split.c`, the results are undefined.
If any executable is not dynamically linked the results are undefined.
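
Note on the README hunk covering comma-separated PES lists (`-n 2,2,2`): the stated node requirement can be sketched as a small calculation. This is a minimal sketch, not code from wraprun. `min_nodes` and `max_pes_per_node` are hypothetical names, and rounding up to a whole node is an assumption based on the phrase "at least equal to".

```python
def min_nodes(pes_list, max_pes_per_node):
    """Return the smallest node count that satisfies the README rule.

    pes_list: comma-separated PES string as passed to -n, e.g. '2,2,2'.
    max_pes_per_node: hypothetical per-node process limit for the system.
    """
    total = sum(int(pes) for pes in pes_list.split(','))
    # Integer ceiling division (assumed rounding); behaves the same on
    # Python 2.7 and Python 3.
    return (total + max_pes_per_node - 1) // max_pes_per_node
```

With an assumed limit of 4 processes per node, `min_nodes('2,2,2', 4)` gives 2, since 6 processes do not fit on one node.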
diff --git a/python/wraprun/api.py b/python/wraprun/api.py
index ebc6c7f..2c7a295 100644
--- a/python/wraprun/api.py
+++ b/python/wraprun/api.py
@@ -171,6 +171,10 @@ def add_task(self, string=None, **kwargs):
         self._rank_and_color = {
             k: v + 1 for k, v in task_group.last_rank_and_color().items()}
         self._task_groups.append(task_group)
+        if len(self._task_groups) > 2048:
+            raise WraprunError(
+                'Too many task groups (> 2048) in bundle: '
+                'Aborting to protect ALPS stability.')
         self._update_file(task_group)
     def _debug_mode(self):
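
Note on the `add_task` hunk above: the guard's behavior can be sketched in isolation. This is a minimal sketch under stated assumptions, not the wraprun implementation. `Bundle`, `MAX_TASK_GROUPS`, and `groups_accepted` are hypothetical stand-ins, and `WraprunError` is redefined locally so the example runs on its own.

```python
MAX_TASK_GROUPS = 2048  # limit quoted in the patch


class WraprunError(Exception):
    """Local stand-in for wraprun's error type."""


class Bundle(object):
    """Hypothetical stand-in that only tracks task groups."""

    def __init__(self):
        self._task_groups = []

    def add_task(self, task_group):
        # Mirrors the patch order: append first, then enforce the limit.
        self._task_groups.append(task_group)
        if len(self._task_groups) > MAX_TASK_GROUPS:
            raise WraprunError(
                'Too many task groups (> 2048) in bundle: '
                'Aborting to protect ALPS stability.')


def groups_accepted(n):
    """Count how many of n task groups a fresh Bundle accepts without error."""
    bundle = Bundle()
    accepted = 0
    for i in range(n):
        try:
            bundle.add_task('group-%d' % i)
        except WraprunError:
            break
        accepted += 1
    return accepted
```

Because the check runs after the append, the rejected group is still in the list when the error is raised. That is harmless if the error aborts the run, as the message suggests, but a caller that catches the error and continues would see 2049 entries.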
diff --git a/share/man/man1/wraprun.1 b/share/man/man1/wraprun.1
index 5d521af..c268ff4 100644
--- a/share/man/man1/wraprun.1
+++ b/share/man/man1/wraprun.1
@@ -1,4 +1,4 @@
-.TH WRAPRUN "1" "May 2016" "wraprun 0.2.3+" "User Commands"
+.TH WRAPRUN "1" "Aug 2016" "wraprun 0.2.4+" "User Commands"
.SH NAME
.B wraprun
\- an ensemble task wrapper for aprun
@@ -16,10 +16,10 @@ options] [: task ]...
.B wraprun
--w-conf file
.SH DESCRIPTION
-Wraps an arbitrary number of independent MPI and/or serial executables into an ensemble
-that runs under a single aprun call. MPI executables must be dynamically linked
-to run correctly under wraprun. However, serial applications can be run as-is
-when declared with the keyword 'serial'.
+Wraps independent MPI and/or serial executables into an ensemble that runs under
+a single aprun call. A maximum of 2048 task groups may be bundled. MPI
+executables must be dynamically linked to run correctly under wraprun. However,
+serial applications can be run as-is when declared with the keyword 'serial'.
.SH OPTIONS
.PP
.SS "Global Options"
diff --git a/wraprun_formula.rb b/wraprun_formula.rb
index d995bc0..105f918 100644
--- a/wraprun_formula.rb
+++ b/wraprun_formula.rb
@@ -1,6 +1,6 @@
class WraprunFormula < Formula
homepage "https://github.com/olcf/wraprun"
- url "https://github.com/olcf/wraprun/archive/v0.2.3.tar.gz"
+ url "https://github.com/olcf/wraprun/archive/v0.2.4.tar.gz"
supported_build_names /python2.7/, /python3/