PPUG_System_Requirements

CESM Post Processing System Requirements

The CESM post processing code requires a number of custom modules, which are included in the distribution download, and some standard packages from the python community that may require system administration privileges to install.

External Components included with the Distribution

  • pyReshaper (github)
  • pyAverager (github)
  • ASAPPyTools (github)
  • Diagnostics packages from Atmosphere, Land, Sea-Ice, and Ocean working groups (NCAR CGD Subversion server)
  • python wrapper codes to call tools listed above (NCAR CGD Subversion server)

Notes about vanilla python vs. anaconda python

This package is known to work with vanilla python (>= v2.7.6) inside a virtualenv. We have not been able to get it to work successfully with an anaconda install of python, due to the requirements of the boot-strap packages listed below and their dependencies on underlying compiled C, C++ and Fortran libraries.
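As a minimal sketch (the environment path is a placeholder), a virtualenv based on the vanilla python can be created and activated like this:

    virtualenv /path/to/cesm-env           # create the environment with the vanilla python
    source /path/to/cesm-env/bin/activate  # subsequent pip installs go into the environment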

We are trying to port the CESM post-processing tools to an anaconda python (v2.7.11) on a small test cluster, but are finding that we still need to compile all the underlying dependency libraries to work with our particular infiniband switch software stack.

Required Standard Packages from the Python Community:

  • python (version >= 2.7.x but < 3.0)
  • virtualenv (version >= 12.0.7)
  • pip (version >= 1.5.6)
  • pyngl (version >= 1.4)
  • pynio (version >= 1.4)
  • mpi4py (version >= 1.3)
  • netCDF4-python (version >= 1.2.7)
  • numpy (version >= 1.8.1)
  • scipy (version >= 0.15.1)
  • matplotlib (version >= 1.4.3) & basemap toolkit
  • cf_units (version >= 1.1.3)
  • PythonMagick (version >= 0.5) (image format conversion - convert ncl ps files to png, gif, etc.) OPTIONAL
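A hedged installation sketch (the PyPI package names and version pins are assumptions; pyngl, pynio and basemap may have to be obtained outside of PyPI, and the compiled packages are subject to the problems described in the next section):

    pip install 'numpy>=1.8.1' 'scipy>=0.15.1' 'matplotlib>=1.4.3'
    pip install 'mpi4py>=1.3' 'netCDF4>=1.2.7' 'cf_units>=1.1.3'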

Possible problems that may be encountered when installing on an HPC system via virtualenv

This list may not be complete, and some of the items may be Yellowstone-specific. Pure-python modules should install without issues, so the list addresses only the packages that need a C/C++/Fortran compiler.

  1. On Yellowstone, the C/C++/Fortran compilers (as well as the rest of the compilation stack) are not available on the batch nodes, so the “pip install” must happen on the login nodes, on Geyser, or on Caldera.

  2. All the usual modules for loading the appropriate compilers should be used; however, the python building process does not always pick the right one. Usually the $FC, $CC and similar environment variables are recognized, and one can indicate the preferred compiler in that way, as sketched below. Sometimes the extremely old, system-provided (by RedHat) gcc v4.4.7 is used, even if a module for intel or for a more recent gcc is loaded and the environment variables are set. When this happens, case-by-case investigation is the best way to proceed.
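    For example (the compiler names and the package are placeholders, assuming the corresponding module is loaded), the compiler can usually be forced through the environment before building:

    export CC=$(which gcc)       # C compiler from the loaded module
    export FC=$(which gfortran)  # Fortran compiler from the loaded module
    pip install <package>        # setup.py should now pick these up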

  3. All the usual modules for loading the appropriate dependencies (libraries) should be used; however, oftentimes the python building process does not pick the right ones, or any at all. Each python package has its own way to specify the dependencies, such as --with-such-and-such=/path/to/such/and/such, which can be passed to setup.py from pip install with something like:

    pip install --install-option="--with-such-and-such=/path/to/such/and/such" <package>
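    As a concrete, hedged illustration: some packages read environment variables instead of install options. Older netCDF4-python releases, for instance, looked for variables such as $NETCDF4_DIR and $HDF5_DIR, so an install could look like:

    export NETCDF4_DIR=/path/to/netcdf   # placeholder paths to the compiled libraries
    export HDF5_DIR=/path/to/hdf5
    pip install netCDF4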

  4. Items 2 and 3 above, taken together, should cover the need for MPI (probably relevant only for mpi4py, which happens not to have any of these difficulties, at least when compiled with gnu/4.8.2).
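    For reference, a hedged sketch of pointing the mpi4py build at a specific MPI compiler wrapper (the $MPICC variable is documented by mpi4py; the wrapper path is a placeholder):

    env MPICC=/path/to/mpicc pip install 'mpi4py>=1.3'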

  5. Geyser uses a slightly different architecture than the rest of the Yellowstone nodes (the login, compute and Caldera nodes have identical architectures among themselves). Using on Geyser a virtual environment created elsewhere may crash with “Illegal Instruction”. See https://www2.cisl.ucar.edu/resources/yellowstone/code_dev/compiling#where for various workarounds.

  6. On Yellowstone, software is built with shared-library linking (the only mode supported by the vendor), using RPATH. In short, RPATH is a list of paths embedded into each shared-linked binary (both executables and libraries) that tells the operating system where to find the dependencies the binary is linked against. Using RPATH, a shared-linked binary behaves similarly to (but not exactly like) a statically-linked one.

    To properly embed the RPATH into binaries, one could use certain link options, which can be passed to the linker by the compiler. To make the environment user-friendly, Yellowstone uses a combination of compiler wrappers, environment variables and lmod (lua modules) files to “inject” the right RPATH into the binaries, without any explicit action by the user other than loading the appropriate module. Unfortunately, the python setup.py build process does not always work properly in this environment, and sometimes the binaries (usually .so libraries) created by a “pip install” or “python setup.py” do not have the proper RPATH set.
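    One hedged, generic workaround (not a Yellowstone-specific recipe; the path and package are placeholders) is to inject the RPATH explicitly through $LDFLAGS, which most setup.py builds pass along to the linker:

    LDFLAGS="-Wl,-rpath,/path/to/dependency/lib" pip install <package>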

    Fortunately, RPATH can also be changed after the linking stage, by editing the binary file. A useful tool for examining and modifying the RPATH of binaries is patchelf, installed on Yellowstone under /glade/apps/opt/usr/bin/patchelf. The options useful in this context are:

    patchelf --print-rpath <filename>

    patchelf --set-rpath <path> <filename>

    (note that the latter sets the whole path rather than appending a directory to it, so one must use the former, append to its output, and only then use the latter).
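    Putting those two options together, a minimal append sketch (the library name and the appended path are placeholders) looks like:

    old=$(patchelf --print-rpath mymodule.so)
    patchelf --set-rpath "$old:/path/to/dependency/lib" mymodule.so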

    Therefore, after every “pip install” or “python setup.py” run, it is advisable to check each binary that is created with “ldd”, to make sure there is nothing “not found”.
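    A hedged one-liner for scanning a whole virtualenv this way (assuming the environment is activated, so $VIRTUAL_ENV is set):

    find "$VIRTUAL_ENV" -name '*.so' | while read -r lib; do
        ldd "$lib" | grep -q 'not found' && echo "missing dependency: $lib"
    done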

    To complicate the matter, the batch nodes are different from the login, Geyser and Caldera nodes: since only the batch nodes are stateless, they do not have the same full-fledged operating system installation but a reduced one, with some operating system libraries missing. Therefore, ldd could succeed on a login/Geyser/Caldera node (where the compilation must have occurred) but fail on a batch node. So the whole process requires at least two separate LSF jobs: one for compiling (not on the batch nodes, as described in bullet #1 above) and one for the check described here. After the check, a “fix” may be needed with patchelf (which may happen in the same “check” job).
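    A minimal sketch of such a “check” job (the queue name, project code and environment path are placeholders, not a CSG-provided script):

    #!/bin/bash
    #BSUB -J rpath-check           # job name
    #BSUB -q regular               # batch queue (placeholder)
    #BSUB -P PROJECT123            # project/account code (placeholder)
    #BSUB -n 1
    #BSUB -W 00:10
    #BSUB -o rpath-check.%J.out

    # run ldd on a batch node itself, where the reduced OS may lack libraries
    find /path/to/cesm-env -name '*.so' | while read -r lib; do
        ldd "$lib" | grep -q 'not found' && echo "needs patchelf: $lib"
    done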

  7. For its own convenience, CSG keeps notes, logs and records of the problems encountered during the installation of packages. These are available to you if you would like to check them. We place them into the BUILD_DIR/logs/ subdirectory within the installation directory (e.g. /glade/apps/opt/mpi4py/1.3.1/gnu/4.8.2/BUILD_DIR/logs/). Note that we do not guarantee these to be accurate or complete, and they will also differ in “style” depending on which consultant worked on that particular installation.