Building and Running MPICH (CH4)

This page describes how to build and run MPICH (CH4 device) to test the libfabric GNI provider. It is assumed that you are building MPICH on a Cray XC system such as jupiter or edison/cori, and that you have already built and installed a copy of libfabric.

MPICH can be built to use either Cray PMI or SLURM PMI. Differences between the build procedures for the two PMIs are highlighted below.


Building and Installing MPICH

First, if you don't already have a clone of MPICH, get one:

% git clone git@github.com:pmodels/mpich.git

Next, configure, build, and install MPICH. Note that you will need libtool 2.4.4 or newer to keep MPICH's configury happy.
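
You can check which libtool version is on your PATH before starting, for example:

% libtool --version | head -1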

Building using Cray PMI

If you intend to use Cray PMI, you'll need to apply this patch first.
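
Assuming the patch has been saved locally as, say, cray-pmi.patch (a hypothetical filename; use whatever name you downloaded it under), it can be applied from the top of the MPICH clone:

% cd mpich
% git apply cray-pmi.patch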

After applying the patch, the following steps can be used to configure MPICH CH4:

% module load PrgEnv-gnu
% ./autogen.sh
% ./configure CFLAGS="-DMPIDI_CH3_HAS_NO_DYNAMIC_PROCESS" --with-pmi=cray --with-pm=none \
    --prefix=<path-to-mpich-install> --with-ofi=<path-to-ofi-libfabric-install>
% make -j install

Building using SLURM PMI

The following has been used to configure MPICH CH4 on Cray XC systems with SLURM PMI installed:

% ./autogen.sh
% module load PrgEnv-gnu
% export MPID_NO_PMI=yes
% export MPID_NO_PM=yes
% export USE_PMI2_API=yes

For SLURM PMI on Cori, set these environment variables as follows:

% export LDFLAGS="-L/usr/lib64/slurmpmi"
% export LIBS="-lpmi"
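
You can verify that the SLURM PMI library is actually present at that location:

% ls /usr/lib64/slurmpmi/libpmi*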

Unlike the Cray PMI build, no patch is needed to use MPICH/CH4 with SLURM PMI.

The configure line needs to include the location of the base of your libfabric install, as well as select the CH4 OFI netmod and the FI_MR_BASIC memory registration model:

% ./configure --prefix=mpich_install_dir --with-libfabric=path_to_libfabric_install --with-device=ch4:ofi --with-ch4-netmod-ofi-args=mr-basic
% make -j 8 install
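
Once the install completes, you can sanity-check the build; mpichversion (installed into the bin directory alongside the compiler wrappers) reports, among other things, the device MPICH was configured with:

% mpich_install_dir/bin/mpichversion | grep -i device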

Note that if you want to run multi-threaded MPI tests that use MPI_THREAD_MULTIPLE, you will need to configure MPICH as follows:

% ./configure --prefix=mpich_install_dir --with-ofi=your_libfabric_install_dir --enable-threads=multiple --with-device=ch4:ofi --with-ch4-netmod-ofi-args=mr-basic

Running MPICH with libfabric

First you will need to build an MPI app using MPICH's compiler wrapper:

% export PATH=mpich_install_dir/bin:${PATH}
% mpicc -o my_app my_app.c
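
If you don't have a test code handy, the my_app.c assumed above can be as simple as a minimal MPI hello world:

% cat > my_app.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF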

On Tiger and NERSC edison/cori, the application can be launched using srun:

% srun -n 2 -N 2 ./my_app

IMPORTANT NOTE: If you are running on Cori and using the SLURM PMI library, you will need to set LD_LIBRARY_PATH (the MPICH compiler scripts apparently do not add an rpath for it):

% export LD_LIBRARY_PATH=/usr/lib64/slurmpmi:$LD_LIBRARY_PATH
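
Afterwards, you can use ldd to verify which PMI library the application will pick up at run time:

% ldd ./my_app | grep -i pmi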

If you'd like to double-check against the sockets provider, do the following:

% export MPIR_CVAR_OFI_USE_PROVIDER=sockets
% srun -n 2 -N 2 ./my_app

This will force the OFI netmod to use the sockets provider.
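
If you want to confirm which provider libfabric actually selected, you can turn up libfabric's own logging; FI_LOG_LEVEL is a standard libfabric runtime variable, independent of MPICH:

% export FI_LOG_LEVEL=info
% srun -n 2 -N 2 ./my_app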

Building and Testing OSU MPI benchmarks

OSU provides a relatively simple set of MPI benchmarks that are useful for testing the GNI libfabric provider.

% wget http://mvapich.cse.ohio-state.edu/download/mvapich/osu-micro-benchmarks-5.0.tar.gz
% tar -zxvf osu-micro-benchmarks-5.0.tar.gz
% cd osu-micro-benchmarks-5.0
% ./configure CC=mpicc
% make

There are a number of tests in the mpi/pt2pt and mpi/collective subdirectories. To test, for example, MPICH send/recv message latency, osu_latency can be used:

% cd mpi/pt2pt
% srun -n 2 -N 2 ./osu_latency
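
Bandwidth and collective tests can be run the same way; for example, osu_bw lives alongside osu_latency in mpi/pt2pt, and osu_allreduce is in mpi/collective:

% srun -n 2 -N 2 ./osu_bw
% cd ../collective
% srun -n 4 -N 4 ./osu_allreduce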

Known Issues

The MPICH CH4 OFI netmod is under active development. Expect surprises, hangs, etc. when using it, especially with the GNI provider.