
For those with experience setting up Intel clusters, using the OpenHPC repository and installing packages is a breeze. However, if you would like to use one of IBM's new POWER9 systems at your lab or company, the steps are a bit more complicated but the end result is fantastic performance.

In this article, I will cover everything from installing the OS to configuring open source software that is extremely useful in getting your HPC up and running smoothly.

Table of Contents

  1. Introduction
  2. Installing CentOS
  3. Configuring the BMC
  4. Installing xCAT
  5. Configuring xCAT Environment
  6. Installing the Slurm Job Scheduler
  7. Adding InfiniBand Support
  8. C/C++ Compiler
  9. CUDA Support

Introduction

Our friends over at IBM lent the Stanford High Performance Computing Center a couple of incredible machines to test out. The master node (also functioning as a login node) is an IBM LC922 system, while the AC922 will be our compute node.

While the systems came preinstalled with RHEL, at the HPCC we use open source software whenever possible, so we will be using CentOS 7 for this cluster.

As I went through the first tests with the LC922, I discovered that the CentOS 7.7 ISO provided through the official channels was not compatible with the machine. Thus, we will be using the 7.6 ISO.

Before we do anything, make sure that both your master and compute nodes are plugged in and ready to go!

Installing CentOS

The correct ISO can be downloaded from the Linux Kernel Archives. Navigate to this directory and download CentOS-7-power9-Everything-1810.iso onto your local machine.

Once the download is complete, burn it to a USB.
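If you are writing the image from a Linux workstation, a minimal sketch looks like the following. It assumes the USB stick shows up as /dev/sdX; double-check the device name with lsblk first, as dd will overwrite whatever it is pointed at.

lsblk
sudo dd if=CentOS-7-power9-Everything-1810.iso of=/dev/sdX bs=4M conv=fsync
sync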

Ensure that the LC922 is powered off and plug the USB into an empty port. Start the machine, wait a few minutes, and Petitboot (the bootloader) will prompt you to exit the boot process. Proceed with this action, as we want to modify some boot parameters.

At the top of Petitboot, all the available boot options are listed. Using your navigation keys, scroll to the USB that you burned CentOS to. Once it is highlighted, hit e on your keyboard. Petitboot will enter a new interface, allowing you to edit the boot parameters.

At the top of your screen, information about the USB is displayed. Make sure to write the UUID of the device (in 0000-00-00-00-00-00 format) down for safekeeping. Navigate to the field where the boot parameters can be edited, and clear it. Now, replace it with the following (replace the UUID with the one for your USB).

ro inst.stage2=hd:UUID=0000-00-00-00-00-00 inst.graphical console=tty0 console=hvc0

Once the new boot parameters are entered correctly, save the information and return back to the Petitboot home page. Boot into the USB by navigating to its location on the listing and pressing Enter.

The machine will boot into the CentOS installer. Select the appropriate timezone and language. Under Installation Destination, click "I would like to make additional space available." Click "Done", and you are brought to a new screen. Make sure that the disk that you would like to install CentOS to is selected for installation. Click "Delete All" on the bottom right, then click "Reclaim Space."

Under Software Selection, select GNOME Desktop on the left, and GNOME Applications, Compatibility Libraries, Development Tools, and Security Tools on the right. If you would like to install additional software, feel free to do so; the selections above are the bare minimum necessary to get up and running.

If you would like to configure a network connection (local or internet), plug in the respective cables and you can configure the options under Network and Hostname. Be sure to choose the interfaces wisely, as these will be used during the node deployment process.

After all your settings are set the way you like, click "Begin Installation." The CentOS installer will prompt you to set a root password as well as credentials for one user. Make sure that the user is categorized as an administrator.

The installation process will take between 15 and 30 minutes. Once the computer reboots, you may unplug the USB from the machine. Depending on how the boot order is configured in Petitboot, you may need to move the hard drive you installed CentOS on to the top of the boot device list.

After the LC922 successfully boots up, you should be able to log in with the user you created.

Configuring the BMC

The Baseboard Management Controller, or BMC, is an integral part of controlling compute nodes remotely. You can restart the system and power it on and off from anywhere. For this example, I will be configuring the BMC on the AC922 system.

Before getting started, identify the BMC network interface on the compute node and connect it to the master node with an ethernet cable.

Start the compute node, and following the initial boot process, it will enter Petitboot. Stop the boot there and enter the Petitboot console.

For the compute node, I'll be using the following configuration:

  • IP Address: 10.1.1.11 (known as COMPUTE_BMC_IP in the instructions down below)
  • Netmask: 255.0.0.0
  • Gateway: 10.1.1.1

Once you are in the console, run the following commands to set the static IP for the BMC interface:

ipmitool lan set 1 ipsrc static
ipmitool lan set 1 ipaddr COMPUTE_BMC_IP
ipmitool lan set 1 netmask 255.0.0.0
ipmitool lan set 1 defgw ipaddr 10.1.1.1
ipmitool mc reset cold

After running the last command, the system will hang for a couple of minutes while the BMC reinitializes. Once that is over, run ipmitool lan print 1 to confirm the changes.
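As an additional sanity check, you can query the BMC remotely from the master node over its new static IP. This is just a sketch that assumes the default OpenBMC credentials (root / 0penBmc here); substitute the username and password configured on your BMC.

ipmitool -I lanplus -H 10.1.1.11 -U root -P 0penBmc power status
ipmitool -I lanplus -H 10.1.1.11 -U root -P 0penBmc sdr list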

Installing xCAT

xCAT is software that allows you to manage the stateless deployment of nodes with ease.

First, identify the hostname and IP addresses of both the master node and compute node. In the commands below, they will be represented by variables such as EXTERNAL_IP for the master node.

To get started with the installation process, log onto your new master node and run the following commands.

echo "EXTERNAL_IP HOSTNAME" >> /etc/hosts"
echo "INTERNAL_IP HOSTNAME.localdomain HOSTNAME >> /etc/hosts"

xCAT relies on network protocols that your default firewall rules may prevent from working correctly. The firewall service on the master node must be disabled.

systemctl disable firewalld
systemctl stop firewalld
reboot

Once your machine reboots and you have verified that the firewall is disabled, xCAT can be installed.
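A quick way to confirm the firewall state after the reboot (firewalld should report inactive and disabled):

systemctl is-active firewalld
systemctl is-enabled firewalld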

First, add the EPEL repository:

yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

Now, using a script made by xCAT's creators, install the software:

wget https://raw.githubusercontent.com/xcat2/xcat-core/master/xCAT-server/share/xcat/tools/go-xcat -O - >/tmp/go-xcat
chmod +x /tmp/go-xcat

Run the script:

/tmp/go-xcat install

Configuring xCAT Environment

Your HPC system will need to have an accurate clock across all nodes; enabling NTP on your machine and selecting an appropriate time server is an essential start of the process. If your institution has its own time server, I would recommend using that one; otherwise, time.nist.gov is a safe bet.

systemctl enable ntpd.service
echo "server NTP_SERVER" >> /etc/ntp.conf
systemctl restart ntpd

Now, we will start configuring xCAT for node provisioning.

Register the internal network interface for DHCP:

chdef -t site dhcpinterfaces="xcatmn|INTERNAL_INTERFACE"

Transfer the CentOS-7-power9-Everything-1810.iso file that you downloaded earlier to the master node. Generate the initial operating system image using this command:

copycds ~/CentOS-7-power9-Everything-1810.iso

Of course, the path that follows copycds will depend on where you copied the ISO file to.

To verify that the base image has been generated, run

lsdef -t osimage

and the output should resemble the following:

centos7.6-ppc64le-install-compute  (osimage)
centos7.6-ppc64le-netboot-compute  (osimage)
centos7.6-ppc64le-statelite-compute  (osimage)

Because xCAT utilizes a chroot environment as the assembly location of the operating system, it is useful to store the location of this environment in an environment variable on the master node.

Add the following to ~/.bashrc or the equivalent file, and modify it depending on the operating system and architecture.

export CHROOT=/install/netboot/centos7.6/ppc64le/compute/rootimg/

To generate the chroot environment inside the respective folder, run

genimage centos7.6-ppc64le-netboot-compute

Now, enable the Yum package manager for the chroot environment:

yum-config-manager --installroot=$CHROOT --enable base 
cp /etc/yum.repos.d/epel.repo $CHROOT/etc/yum.repos.d

Install packages that will be necessary on the compute node:

yum install ntp kernel ipmitool --installroot=$CHROOT

Enable the time server on the compute node:

chroot $CHROOT systemctl enable ntpd 
echo "server INTERNAL_IP" >> $CHROOT/etc/ntp.conf

There are a number of files that you may want to sync from the master to the compute node. xCAT handles this by reading from a synclist. We are going to place this file in /install/custom/netboot.

mkdir -p /install/custom/netboot
chdef -t osimage -o centos7.6-ppc64le-netboot-compute synclists="/install/custom/netboot/compute.synclist" 
echo "/etc/passwd -> /etc/passwd" >> /install/custom/netboot/compute.synclist 
echo "/etc/group -> /etc/group" >> /install/custom/netboot/compute.synclist 
echo "/etc/shadow -> /etc/shadow" >> /install/custom/netboot/compute.synclist 

Once this process is completed, the chroot environment can be packed into an image:

packimage centos7.6-ppc64le-netboot-compute

Now, depending on how many compute nodes you would like to configure, the following command will vary.

mkdef -t node COMPUTE_NODE_NAME groups=compute,all ip=COMPUTE_INTERNAL_IP mac=COMPUTE_INTERNAL_MAC \
    netboot=petitboot arch=ppc64le bmc=COMPUTE_BMC_IP bmcpassword=0penBmc \
    mgt=ipmi serialport=0 serialspeed=115200

Before running this command, make sure to identify the local interface you would like to use on the compute node, as the MAC address must match for provisioning to work correctly. In addition, having the BMC configured correctly is important, since it allows you to control the node's power from the master.
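For additional compute nodes, the same pattern repeats with a new node definition. The sketch below uses hypothetical placeholder values; substitute the name, IP, MAC, and BMC address of your own node.

mkdef -t node power9-compute-2 groups=compute,all ip=COMPUTE2_INTERNAL_IP mac=COMPUTE2_INTERNAL_MAC \
    netboot=petitboot arch=ppc64le bmc=COMPUTE2_BMC_IP bmcpassword=0penBmc \
    mgt=ipmi serialport=0 serialspeed=115200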

Set the cluster domain in the xCAT database for network-wide name resolution (here, the master node's hostname is used as the domain).

chdef -t site domain=HOSTNAME

Now, the following commands will configure the network records in preparation for the first boot of the compute node:

makehosts
makenetworks
makedhcp -n
makedns -n
systemctl enable dhcpd.service
systemctl start dhcpd

To match the node definition created with mkdef with the chroot environment, run:

nodeset compute osimage=centos7.6-ppc64le-netboot-compute

Install IPMItool in order to control the compute node's power:

yum install ipmitool

After successfully running those commands, run rpower compute reset to reboot the compute node and boot into the new operating system.
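If you want to watch the node come up, xCAT can attach to its serial console with rcons. This sketch assumes the console server has been set up, for example with makegocons on recent xCAT releases (older releases use makeconservercf instead).

makegocons COMPUTE_NODE_NAME
rcons COMPUTE_NODE_NAME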

Installing the Slurm Job Scheduler

Now that we have a simple operating system running, it's time to install software that will allow us to run programs on the cluster. First, let's install Slurm, an open source job scheduler that is incredibly useful when submitting jobs throughout a cluster.

First, install the MariaDB database engine to store the accounting information for Slurm.

yum install mariadb-server mariadb-devel

Now, create the users required by Slurm and one of its dependencies, Munge, which handles authentication for submitted jobs.

export MUNGEUSER=1001
groupadd -g $MUNGEUSER munge
useradd -m -c "MUNGE Uid 'N' Gid Emporium" -d /var/lib/munge -u $MUNGEUSER -g munge -s /sbin/nologin munge

export SLURMUSER=1002
groupadd -g $SLURMUSER slurm
useradd -m -c "SLURM workload manager" -d /var/lib/slurm -u $SLURMUSER -g slurm -s /bin/bash slurm

Depending on how UIDs and GIDs are allocated on the system, you may need to modify MUNGEUSER and SLURMUSER. To check which IDs are currently in use, run getent passwd and getent group.
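For example, to quickly see whether IDs 1001 and 1002 are already taken:

getent passwd | awk -F: '$3 == 1001 || $3 == 1002'
getent group  | awk -F: '$3 == 1001 || $3 == 1002'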

Install Munge on both the master and chroot environment:

yum install munge munge-libs munge-devel
yum install munge munge-libs munge-devel --installroot=$CHROOT

Now we can start setting up Munge. Create a secret key on the master node using rng-tools:

yum install rng-tools
rngd -r /dev/urandom

/usr/sbin/create-munge-key -r
dd if=/dev/urandom bs=1 count=1024 > /etc/munge/munge.key
chown munge: /etc/munge/munge.key
chmod 400 /etc/munge/munge.key

We can add this secret key to the synclist we created earlier:

echo "/etc/munge/munge.key -> /etc/munge/munge.key" >> /install/custom/netboot/compute.synclist 

From the chroot environment, we can fix the permissions for the Munge files:

chroot $CHROOT chown -R munge: /etc/munge/ /var/log/munge/ /var/lib/munge/
chroot $CHROOT chmod 0700 /etc/munge/ /var/log/munge/ /var/lib/munge

Enable the Munge service on start:

chroot $CHROOT systemctl enable munge
systemctl enable munge
systemctl start munge

After these steps, Munge should be successfully installed, and we can now proceed with the installation of Slurm on the system.
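Before moving on, a quick local test on the master node confirms that the daemon is running and the key works; the credential should encode and decode successfully.

munge -n | unmunge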

Install Slurm dependencies:

yum install openssl openssl-devel pam-devel numactl numactl-devel hwloc hwloc-devel lua lua-devel readline-devel rrdtool-devel ncurses-devel man2html libibmad libibumad

yum install openssl openssl-devel pam-devel numactl numactl-devel hwloc hwloc-devel lua lua-devel readline-devel rrdtool-devel ncurses-devel man2html libibmad libibumad --installroot=$CHROOT

Because Slurm needs to be installed from source, it makes sense to create a temporary directory to store all the files:

mkdir -p /tmp/slurm

The latest version of the Slurm source code can be found on this page. For this guide, we will be using version 20.02.

cd /tmp/slurm
wget https://download.schedmd.com/slurm/slurm-20.02.0.tar.bz2

Install the rpm-build package, which will let us build RPM packages directly from the source tarball. There are a few additional packages that will facilitate the process.

yum install rpm-build
yum groupinstall "Development Tools"
yum install python36 perl-ExtUtils-MakeMaker

Now, compile Slurm:

rpmbuild -ta slurm-20.02.0.tar.bz2

This should take a few minutes, and once it's done compiling, change into this directory: cd /root/rpmbuild/RPMS/ppc64le/. In here, you will find all the Slurm rpms that can be used for install with yum.

Install Slurm onto the master node:

yum install ./slurm-20.02.0-1.el7.ppc64le.rpm ./slurm-devel-20.02.0-1.el7.ppc64le.rpm ./slurm-perlapi-20.02.0-1.el7.ppc64le.rpm ./slurm-torque-20.02.0-1.el7.ppc64le.rpm ./slurm-slurmdbd-20.02.0-1.el7.ppc64le.rpm ./slurm-slurmctld-20.02.0-1.el7.ppc64le.rpm

Install Slurm onto the compute node (via chroot environment):

yum install ./slurm-20.02.0-1.el7.ppc64le.rpm ./slurm-devel-20.02.0-1.el7.ppc64le.rpm ./slurm-perlapi-20.02.0-1.el7.ppc64le.rpm ./slurm-torque-20.02.0-1.el7.ppc64le.rpm ./slurm-slurmdbd-20.02.0-1.el7.ppc64le.rpm ./slurm-slurmd-20.02.0-1.el7.ppc64le.rpm --installroot=$CHROOT

Enable Slurm on startup:

systemctl enable slurmctld
chroot $CHROOT systemctl enable slurmd

After installing Slurm, you will need to configure its settings to match your environment. Luckily, the developers of the project have set up a website that allows you to generate a simple config file. For posterity, here is the one we are using at HPCC:

SlurmctldHost=HOSTNAME

MpiDefault=none
ProctrackType=proctrack/pgid
ReturnToService=2
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmdPidFile=/var/run/slurmd.pid
SlurmdSpoolDir=/var/spool/slurmd
SlurmUser=slurm
StateSaveLocation=/var/spool/slurmd
SwitchType=switch/none
TaskPlugin=task/none

SchedulerType=sched/backfill
SelectType=select/linear

AccountingStorageType=accounting_storage/none
ClusterName=cluster
JobAcctGatherType=jobacct_gather/none

SlurmctldLogFile=/var/log/slurmctld.log
SlurmdLogFile=/var/log/slurmd.log

# COMPUTE NODES
NodeName=COMPUTE_NODE_NAME CPUs=128 Sockets=2 CoresPerSocket=16 ThreadsPerCore=4 State=UNKNOWN
PartitionName=main Nodes=COMPUTE_NODE_NAME Default=YES MaxTime=INFINITE State=UP

To determine values such as the number of CPUs, sockets, cores, and threads, run lscpu on the compute node.
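For example, the relevant fields can be pulled out directly:

lscpu | egrep '^CPU\(s\)|Thread\(s\) per core|Core\(s\) per socket|Socket\(s\)'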

Save the file to /etc/slurm/slurm.conf (create directory if it doesn't exist), and add it to the synclist:

echo "/etc/slurm/slurm.conf -> /etc/slurm/slurm.conf" >> /install/custom/netboot/compute.synclist 

Configure permissions for Slurm log files on the master node:

mkdir /var/spool/slurmctld
chown slurm: /var/spool/slurmctld
chmod 755 /var/spool/slurmctld
touch /var/log/slurmctld.log
chown slurm: /var/log/slurmctld.log
touch /var/log/slurm_jobacct.log /var/log/slurm_jobcomp.log
chown slurm: /var/log/slurm_jobacct.log /var/log/slurm_jobcomp.log

Now for the chroot environment:

mkdir $CHROOT/var/spool/slurmd
chroot $CHROOT chown slurm: /var/spool/slurmd
chroot $CHROOT chmod 755 /var/spool/slurmd
touch $CHROOT/var/log/slurmd.log
chroot $CHROOT chown slurm: /var/log/slurmd.log

Apply the changes to the compute image, and restart the compute node:

packimage centos7.6-ppc64le-netboot-compute
rpower compute reset

Once the node is up and running again, you can verify that the node has been successfully configured by running sinfo. Output should look similar to this:

PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
main*        up   infinite      1   idle power9-compute-1

To run a quick example job on the compute node, run srun hostname, which should just output the hostname of the compute node.
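Batch jobs work the same way. Here is a minimal submission script sketch; the partition name main matches the slurm.conf above.

cat > hello.sbatch <<'EOF'
#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --partition=main
#SBATCH --nodes=1
#SBATCH --output=hello-%j.out
hostname
EOF

sbatch hello.sbatch
squeue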

That's it! Now you have a phenomenal and incredibly useful job scheduler installed on your cluster!

Adding InfiniBand Support

If you want nodes to be able to communicate with one another with high throughput and low latency, InfiniBand (IB) is the perfect technology. First, ensure that IB network cards are installed on all the nodes you want to connect.

Now, we can install the software necessary to set up IB on both the master and compute nodes.

yum groupinstall "Infiniband Support"
yum groupinstall "Infiniband Support" --installroot=$CHROOT

Install (optional) packages for network diagnosis:

yum install infiniband-diags perftest gperf
yum install infiniband-diags perftest gperf --installroot=$CHROOT

IB requires the rdma and opensm services to be enabled at startup:

systemctl enable rdma
systemctl enable opensm
chroot $CHROOT systemctl enable rdma
chroot $CHROOT systemctl enable opensm

Reboot the system to load the kernel modules and start rdma and opensm.

reboot

After it restarts, verify the status of rdma and opensm:

systemctl status rdma
systemctl status opensm

Check that the IB port is up and running using ibhosts; you should see output similar to:

Ca	: 0xec0d9a03008edc50 ports 1 "hpcc-power9 mlx5_0"

Now, we need to set the configuration for the IB interface, ib0. Navigate to /etc/sysconfig/network-scripts on the master node and you should see a file named ifcfg-ib0. Open it with your preferred editor. This is the configuration we are using, but feel free to change it in accordance with your preferences.

DEVICE=ib0
TYPE=infiniband
BOOTPROTO=static
IPADDR=192.168.1.1
NETMASK=255.255.0.0
NETWORK=192.168.0.0
ONBOOT=yes
NM_CONTROLLED=no
DEFROUTE=no
UUID=

Restart the network, and run ifconfig. You should see the ib0 interface with the correct IP Address, netmask, etc:

systemctl restart network
ifconfig

Once IB is configured on the master node, we can add it to the xCAT configuration (change commands based on your specific network config):

chdef -t network -o ib0 mask=255.255.0.0 net=192.168.0.0
chdef compute -p postbootscripts=confignics 
chdef power9-compute-1 nicips.ib0=192.168.1.2 nictypes.ib0="InfiniBand" nicnetworks.ib0=ib0

The IP Address for the IB interface on the compute node will be 192.168.1.2. Now, we need to regenerate the network configuration files for xCAT:

makenetworks
makedhcp -n

As always, apply the changes to the compute image and restart the node:

packimage centos7.6-ppc64le-netboot-compute
rpower compute reset

To test that the IB interface has been successfully set up on both nodes, try pinging the IP for the compute node from the master:

ping 192.168.1.2

If you don't receive any output or get connection/timeout errors, be sure to revisit the above instructions and debug any issues.
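Beyond a simple ping, the perftest tools installed earlier can measure actual IB bandwidth between the nodes. A minimal sketch: start the server side on the compute node, then point the client on the master at the compute node's IB address.

# on the compute node
ib_write_bw

# on the master node
ib_write_bw 192.168.1.2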

C/C++ Compiler

If you are going to be doing any programming in C-family languages, you are going to want the Development Tools group from yum installed. It includes gcc, make, and other useful programs for compilation.

yum groupinstall "Development Tools"
yum groupinstall "Development Tools" --installroot=$CHROOT 

Now, we can install IBM's XL C/C++ for Linux toolkit, which contains "high-performance compilers that can be used for developing complex, computationally intensive programs, including interlanguage calls."

On the master node, create a temporary directory and change into it:

mkdir /tmp/xlc
cd /tmp/xlc

Download the compiler from this site:

wget https://iwm.dhe.ibm.com/sdfdl/1v2/regs2/ca0g3632/CE/Xa.2/Xb.W5f2_Bc3N5i43w__5I0b9BcxKy3jEKUMobhdst9udYM/Xc.IBM_XL_C_CPP_V16.1.1.3_LINUX_COMMUNITY.tar.gz/Xd./Xf.LPr.F1az/Xg.10640456/Xi.swg-xlcfl/XY.regsrvs/XZ.4BypvuMeQ6Y4DPEracL7ss7GFLc/IBM_XL_C_CPP_V16.1.1.3_LINUX_COMMUNITY.tar.gz

Extract the files and create a folder inside /tmp/xlc:

tar xvf IBM_XL_C_CPP_V16.1.1.3_LINUX_COMMUNITY.tar.gz
mkdir /tmp/xlc/rpms

Move the rpms to the new folder:

mv images/littleEndian/rhel/* rpms/

Before going forward, one small change needs to be made to the name of a package:

mv rpms/xlc-license-community.16.1.1-16.1.1.3-190426.ppc64le.rpm rpms/xlc-license.16.1.1-16.1.1.3-190426.ppc64le.rpm

Download the xCAT kit for this specific compiler:

wget https://xcat.org/files/kits/hpckits/2.14/rhels7.6/xlc-16.1.1-1-ppc64le.NEED_PRODUCT_PKGS.tar.bz2

Build the kit using buildkit:

buildkit addpkgs xlc-16.1.1-1-ppc64le.NEED_PRODUCT_PKGS.tar.bz2 --pkgdir rpms/

Once that command finishes, there should be a file named xlc-16.1.1-1-ppc64le.tar.bz2 in the directory. Add it to the xCAT database:

addkit xlc-16.1.1-1-ppc64le.tar.bz2

Add the kit to the compute image:

addkitcomp --adddeps -i centos7.6-ppc64le-netboot-compute xlc.rte-compute-16.1.1-1-rhels-7-ppc64le -f

Apply changes:

packimage centos7.6-ppc64le-netboot-compute
rpower compute reset

That's it! IBM's XL C/C++ compilers are ready for use.
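As a quick smoke test, you can compile a trivial program with both gcc and the XL compiler. This sketch assumes the default XL install prefix of /opt/ibm/xlC/16.1.1; adjust the path if your installation landed elsewhere.

cat > hello.c <<'EOF'
#include <stdio.h>
int main(void) { printf("Hello from POWER9\n"); return 0; }
EOF

gcc hello.c -o hello_gcc && ./hello_gcc
/opt/ibm/xlC/16.1.1/bin/xlc hello.c -o hello_xlc && ./hello_xlc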

CUDA Support

If your compute node has GPUs, they are a very powerful tool when performing certain tasks, and NVIDIA's CUDA platform is extremely useful in this regard.

First, ssh into your compute node(s) and verify that there are GPUs installed:

lspci | grep NVIDIA

You should see output similar to this:

0004:04:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2 32GB] (rev a1)
0004:05:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2 32GB] (rev a1)
0035:03:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2 32GB] (rev a1)
0035:04:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2 32GB] (rev a1)

To perform the next installation instructions, you will need to note the kernel version of the compute node, which outputs with this command:

uname -r

Now, go back to the master node. We need to install some packages on the chroot environment.

Here are the first dependencies for CUDA: kernel, kernel-devel, kernel-tools, kernel-tools-libs, and kernel-bootwrapper.

Because the compute node and master node may have different kernel versions, install the packages into the compute image similar to this:

yum install kernel-4.14.0-115.el7a.0.1.ppc64le kernel-devel-... kernel-tools-... kernel-tools-libs-... kernel-bootwrapper-... --installroot=$CHROOT

Make sure to replace the kernel version with the correct one for your compute node. If yum gives an error stating that the package for the kernel version cannot be found, you may need to install the packages from a different source. I googled "kernel-4.14.0-115.el7a.0.1.ppc64le.rpm" and was able to find a site that hosted all the RPMs I needed.
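To double-check which kernel ended up in the chroot environment before packing the image, you can query its RPM database directly and compare against the output of uname -r on the compute node:

rpm --root=$CHROOT -q kernel kernel-devel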

Once the initial kernel dependencies are installed, we need to disable the udev memory auto-onlining rule, which is required before installing the NVIDIA components on POWER9 systems.

cp $CHROOT/lib/udev/rules.d/40-redhat.rules $CHROOT/etc/udev/rules.d
sed -i 's/SUBSYSTEM!="memory",.*GOTO="memory_hotplug_end"/SUBSYSTEM=="*", GOTO="memory_hotplug_end"/' $CHROOT/etc/udev/rules.d/40-redhat.rules

After these initial steps, it's time to install the repository for NVIDIA packages for your system version.

Download the NVIDIA GPU driver:

  • Go to NVIDIA Driver Download.
  • Select Product Type: Tesla.
  • Select Product Series: P-Series (for Tesla P100) or V-Series (for Tesla V100).
  • Select Product: Tesla P100 or Tesla V100.
  • Select Operating System: click Show all Operating Systems, then choose Linux POWER LE RHEL 7.
  • Select CUDA Toolkit: 10.2.
  • Click SEARCH to go to the download link.
  • Click Download to download the driver.

Make sure that the file you download has the .rpm extension. Now, we can add the repository to both the master node and the chroot environment:

yum install ./nvidia-driver-local-repo-rhel7-440.33.01-1.0-1.ppc64le.rpm
yum install ./nvidia-driver-local-repo-rhel7-440.33.01-1.0-1.ppc64le.rpm --installroot=$CHROOT

For this to work, be sure that you're in the same directory as the .rpm file or can locate it.

Install the NVIDIA GPU drivers:

yum install nvidia-driver-latest-dkms --installroot=$CHROOT

Now that we have the NVIDIA drivers successfully installed, we can install CUDA. Download the CUDA repository at this link, and as always, configure the options based on your own system. Install the repository:

yum install ./cuda-repo-rhel7-10-2-local-10.2.89-440.33.01-1.0-1.ppc64le.rpm
yum install ./cuda-repo-rhel7-10-2-local-10.2.89-440.33.01-1.0-1.ppc64le.rpm --installroot=$CHROOT

Install CUDA:

yum install cuda --installroot=$CHROOT

Now, we need to make a couple of changes to ensure that CUDA will run correctly on POWER9 systems. xCAT includes a useful post-installation script to deal with these changes. It can be enabled by using the following commands:

mkdir -p /install/custom/netboot/rh
cp /opt/xcat/share/xcat/netboot/rh/compute.rhels7.ppc64le.postinstall /install/custom/netboot/compute.postinstall

cat >>/install/custom/netboot/compute.postinstall <<-EOF

/install/postscripts/cuda_power9_setup
EOF

chdef -t osimage centos7.6-ppc64le-netboot-compute postinstall=/install/custom/netboot/compute.postinstall

Finally, enable the nvidia-persistenced daemon on startup for the compute node and add a PATH definition for CUDA commands:

chroot $CHROOT systemctl enable nvidia-persistenced
echo "export PATH=/usr/local/cuda-10.2/bin:/usr/local/cuda-10.2/NsightCompute-2019.1${PATH:+:${PATH}}" >> $CHROOT/root/.bashrc

Before applying the changes, we need to modify one last option in the xCAT configuration. We will need to compile some programs in order to use CUDA, so it is imperative that all compilers are properly configured. We already installed the basic development tools and IBM's additional C/C++ compilers, but there is an additional step: by default, xCAT excludes the /usr/include directory, which stores important header files, from the packed compute image. To reverse this, first identify xCAT's 'exclude' file:

lsdef -t osimage -o centos7.6-ppc64le-netboot-compute -i exlist

My exclude file is stored at /opt/xcat/share/xcat/netboot/centos/compute.centos7.exlist. Let's copy it to a new location so we can make modifications:

cp /opt/xcat/share/xcat/netboot/centos/compute.centos7.exlist /install/custom/netboot/compute.exlist

Now, edit the new file using your favorite editor, and you should see a line that looks like:

./usr/include*

Remove that line, and make the file the new exclude file for the current image:

chdef -t osimage -o centos7.6-ppc64le-netboot-compute exlist="/install/custom/netboot/compute.exlist" 

Apply changes:

packimage centos7.6-ppc64le-netboot-compute
rpower compute reset

Once the compute node powers up again, we can verify the CUDA installation:

cat /proc/driver/nvidia/version

Outputs:

NVRM version: NVIDIA UNIX ppc64le Kernel Module  440.33.01  Wed Nov 13 00:09:08 UTC 2019
GCC version:  gcc version 4.8.5 20150623 (Red Hat 4.8.5-39) (GCC)

And:

nvcc -V

Outputs:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Thu_Oct_24_17:58:26_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89

If you want to verify that CUDA runs correctly:

cd ~
cuda-install-samples-10.2.sh .
cd NVIDIA_CUDA-10.2_Samples
make

Once it is done compiling, you can run a few of the programs that it created:

~/NVIDIA_CUDA-10.2_Samples/bin/ppc64le/linux/release/deviceQuery
~/NVIDIA_CUDA-10.2_Samples/bin/ppc64le/linux/release/bandwidthTest

If both of these programs run correctly, you have a fully functioning CUDA installation. Congrats!

Sources

  1. Sloth Paradise - Setting Up Infiniband in Centos 6.7
  2. Sloth Paradise - How to Install Slurm on CentOS 7 Cluster
  3. IBM - RHEL NVIDIA GPU Driver Installation Guide
  4. xCAT - NVIDIA CUDA - RHEL 7.5
  5. xCAT - IBM XL Compilers