
Building and Running MPICH (CH3) (OUTDATED: try the CH4 instructions first)

Howard Pritchard edited this page Apr 14, 2017 · 1 revision

This page describes how to build and run MPICH (CH3 device) to test the libfabric GNI provider. It is assumed you are building MPICH on a Cray XC system such as jupiter or edison/cori, and that you have already built and installed a copy of libfabric.

MPICH can be built to use the Cray PMI or SLURM PMI. Differences in the build procedure using the two PMIs are highlighted below.

Note this wiki describes building from the stable MPICH repo, not from devel.


Building and Installing MPICH

First, if you don't already have a clone of MPICH

% git clone git://git.mpich.org/mpich.git

Next, configure, build, and install MPICH. Some patching is required to get MPICH to work with the Cray PMI. The goal is to build MPICH so that it uses libfabric and works with aprun or srun as the job launcher. Note that you will need libtool 2.4.4 or newer to keep MPICH's configury happy.
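Before running autogen.sh, it is worth confirming the libtool requirement up front; this small check (a convenience sketch, not part of the MPICH build) prints the installed version so you can compare it against 2.4.4:

```shell
# Sanity check: MPICH's autogen.sh needs libtool 2.4.4 or newer.
# Prints the installed version, or a warning if libtool is missing.
if command -v libtool >/dev/null 2>&1; then
    libtool --version | head -n 1
else
    echo "libtool not found; install libtool >= 2.4.4 before running autogen.sh"
fi
```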

If you wish to use the Cray PMI library, download the Cray PMI patch file and apply it to the MPICH source:

% cd mpich
% git am 0001-netmod-ofi-changes-to-be-able-to-use-cray-pmi.patch

This patch step can be skipped if the SLURM PMI library will be used. Parts of the configuration procedure are the same whether using SLURM or Cray PMI:

% ./autogen.sh
% module load PrgEnv-gnu
% export MPID_NO_PMI=yes
% export MPID_NO_PM=yes

If using Cray PMI, also set the following environment variable so that MPICH will be configured to use the PMI2 API.

% export USE_PMI2_API=yes

For the Cray PMI, set the LDFLAGS and LIBS environment variables as follows:

% export LDFLAGS=`pkg-config --libs-only-L cray-pmi`
% export LIBS=`pkg-config --libs-only-l cray-pmi`

For SLURM PMI on Cori, set these environment variables as follows:

% export LDFLAGS="-L/usr/lib64/slurmpmi"
% export LIBS="-lpmi"

In either case, the configure line needs to point at the base of your libfabric installation and select the OFI nemesis netmod:

% ./configure --prefix=mpich_install_dir --with-ofi=your_libfabric_install_dir --with-device=ch3:nemesis:ofi
% make -j 8 install

Note that if you want to run MPI multi-threaded tests that use MPI_THREAD_MULTIPLE, you will need to configure MPICH as follows:

% ./configure --prefix=mpich_install_dir --with-ofi=your_libfabric_install_dir --enable-threads=multiple --with-device=ch3:nemesis:ofi

Running MPICH with libfabric

First you will need to build an MPI app using MPICH's compiler wrapper:

% export PATH=mpich_install_dir/bin:${PATH}
% mpicc -o my_app my_app.c
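If you don't already have a test program handy, a minimal MPI hello-world is enough to exercise the build; the file name my_app.c here simply matches the compile command above and is purely illustrative:

```shell
# Write a minimal MPI test program (any MPI hello-world will do;
# the name my_app.c just matches the mpicc command above).
cat > my_app.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF
```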

On Tiger and NERSC edison/cori, the application can be launched using srun:

% srun -n 2 -N 2 ./my_app

On systems using aprun, the application can be launched as follows:

% aprun -n 2 -N 1 ./my_app

IMPORTANT NOTE: If you are running on Cori and using the SLURM PMI library, you will need to set LD_LIBRARY_PATH, since the MPICH compiler scripts apparently do not embed an rpath:

% export LD_LIBRARY_PATH=/usr/lib64/slurmpmi:$LD_LIBRARY_PATH
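As a quick sanity check, ldd shows whether the dynamic linker now resolves the PMI library from the SLURM directory (the path is Cori-specific, and ./my_app stands in for whatever binary you built):

```shell
# Confirm the dynamic linker resolves libpmi from the SLURM PMI directory.
# /usr/lib64/slurmpmi is Cori-specific; substitute your own binary for ./my_app.
export LD_LIBRARY_PATH=/usr/lib64/slurmpmi:$LD_LIBRARY_PATH
ldd ./my_app | grep libpmi
```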

If you'd like to double-check against the sockets provider, do the following:

% export MPIR_CVAR_OFI_USE_PROVIDER=sockets
% srun -n 2 -N 2 ./my_app

This will force the OFI netmod to use the sockets provider.
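To confirm which provider was actually selected at runtime, libfabric's standard FI_LOG_LEVEL environment variable can be raised and the log output filtered; this is a rough sketch, since the exact log text varies between libfabric versions:

```shell
# Raise libfabric's log level so provider selection is visible on stderr,
# then filter for the provider name (log format varies by libfabric
# version, so treat this as a rough filter rather than a precise check).
export FI_LOG_LEVEL=info
srun -n 2 -N 2 ./my_app 2>&1 | grep -i provider
```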

Building and Testing OSU MPI benchmarks

OSU provides a relatively simple set of MPI benchmark tests which are useful for testing the GNI libfabric provider.

% wget http://mvapich.cse.ohio-state.edu/download/mvapich/osu-micro-benchmarks-5.0.tar.gz
% tar -zxvf osu-micro-benchmarks-5.0.tar.gz
% cd osu-micro-benchmarks-5.0
% ./configure CC=mpicc
% make

In the mpi/pt2pt and mpi/collective subdirectories there are a number of tests. To test, for example, MPICH send/recv message latency, osu_latency can be used:

% cd mpi/pt2pt
% srun -n 2 -N 2 ./osu_latency

Known Issues

The MPICH CH3 OFI netmod uses an unscalable shutdown algorithm in MPI_Finalize. For applications with limited inter-node communication patterns (nearest neighbor, etc.), this can become particularly problematic above 4K MPI processes.