2/6. nevikw39
In practice, most supercomputers are clusters of computation nodes, each of which is equipped with CPUs, main memory, and so on. The nodes are interconnected by Ethernet or InfiniBand. The file system is usually shared by the nodes, sometimes even across clusters.
Generally, computation nodes do not have access to the Internet. On the other hand, we can connect to login nodes, submitting jobs or doing lightweight tasks there. Moreover, there are data transfer nodes dedicated to uploading or downloading files via SFTP or similar mechanisms.
It is parallel programming that makes full use of supercomputers.
Simple Linux Utility for Resource Management
Since there are quite a few computation nodes, it's necessary for us to have a manager to coordinate the resources.
Jobs can be categorized into two types: interactive and non-interactive. Interactive jobs are launched through a shell, whereas non-interactive ones are batched and run in the background.
In Slurm, partitions are groups of computation nodes that serve as queues accepting jobs. On Taiwania 3, partitions differ in the number of CPU cores; on Taiwania 2, they differ in the wall time.
There are several Slurm commands. I would casually divide them into two groups:

- Info
  - `squeue`: Show the status of running and pending jobs. Use `squeue -u $USER` to list your own jobs only.
  - `sinfo`: Show the status of nodes and partitions. To see a summary of partitions, use `sinfo -s`.
  - `scontrol`, `sacct`, `sstat`, ...
- Job
  - `sbatch`: Submit non-interactive jobs.
  - `salloc`: Request resources (mostly for interactive jobs).
  - `srun`: Launch a parallel job within the allocated resources (allocating them if necessary).
  - `scancel`: Kill a job that has not yet completed. When something goes wrong, stop all of your jobs with `scancel -u $USER`.
All jobs should be associated with an account with enough service units (SUs). On Taiwania, one's accounts can be shown by the command `wallet`.
The most vital resource we're concerned with is the number of CPU cores. Normally, it's enough for us to specify the total number of tasks / MPI ranks / processes. Nonetheless, we could also limit the maximum number of nodes in use or the number of tasks per node.
Pros and cons of interactive and non-interactive jobs:
- Flexibility
- Real-time feedback or not
- Resource utilization
- Batch processing or human interactions
To start an interactive job, request resources with `salloc` and launch a pseudo-terminal (pseudo-TTY, PTY) with `srun`:

```bash
salloc -A $ACCOUNT_NAME -p $PARTITION -n $N srun --pty bash
```
Remember to quit, since the allocation costs SUs every second even when idle!
For non-interactive jobs, submit with `sbatch`:

```bash
sbatch -A $ACCOUNT_NAME -p $PARTITION -n $N $COMMANDS
```
When we have more options and commands, it'd be better to put everything into a script and submit the job in a shorter form:

```bash
sbatch $SCRIPT
```

where the script is a Bash script like the following:
```bash
#!/bin/bash
#SBATCH -J pi-mt          # Job name
#SBATCH -A GOV111082      # Account
#SBATCH -p ct2k           # Partition
#SBATCH -o mt_out_%j.log  # Redirect `stdout` to file
#SBATCH -e mt_err_%j.log  # Redirect `stderr` to file
#SBATCH -n 1000           # `--ntasks`, number of tasks / MPI ranks / processes; $SLURM_NTASKS
#
# #SBATCH -c 1                  # `--cpus-per-task`, number of cores / threads **per** task / MPI rank / process; $SRUN_CPUS_PER_TASK
# #SBATCH -N 18                 # `--nodes`, **minimum** number of nodes!! $SLURM_NNODES
# #SBATCH --ntasks-per-node=56  # **maximum** number of tasks / MPI ranks / processes per node when used with `--ntasks`; $SLURM_NTASKS_PER_NODE
#
# $SLURM_NTASKS <= $SLURM_NNODES * $SLURM_NTASKS_PER_NODE

ml purge
# srun ...
```
Note that the shebang is necessary; `sbatch` appears to use it to launch the job.
In practice, we often approximate $\pi=3.1415926\dots=4\arctan1$. People have tried to compute $\pi$ throughout history. Nowadays, there are some common approaches to calculate its value:
- Monte Carlo method (this figure was adapted from Wikipedia)
- Gregory–Leibniz series, e.g., $\arctan$ at $1$ would be $\displaystyle1-\frac{1}{3}+\frac{1}{5}-\frac{1}{7}+\dots=\sum_{k=0}^\infty\frac{(-1)^k}{2k+1}$
- Integrals of some functions, e.g., $4\int_0^1\sqrt{1-x^2}\,dx$ or $4\int_0^1\frac{1}{1+x^2}\,dx$
Any of the above ways requires quite a few iterations to attain the desired precision. As a consequence, this task becomes a really good example to illustrate the power of parallel programming and HPC.
By default, there are GCC 4.8.5 and Open MPI 4.0.3 on Taiwania 3. In this template, we could use `make all CC=mpicc` to build the binaries. To launch the MPI programs: for MPI built with PMI libraries, e.g., Open MPI, `srun` is more convenient; for Intel MPI, which wasn't built with PMI, we should call `mpiexec.hydra -bootstrap slurm`, or `mpirun` for short.
There are many modules for different compiler versions on Taiwania 3, each with corresponding MPI implementations. Nevertheless, there are several pitfalls. For instance, the UCX setting `UCX_NET_DEVICES` was incorrectly set by Intel MPI for Intel Compiler 2022 and by Open MPI for GCC 10.2.0. Please refer to the job scripts provided.
- Modify the simple job script and submit it.
- Check out different numbers of tasks / MPI ranks / processes!!
- Rewrite `pi_integral.c` with MPI multiprocessing. Design your job script for it based on the templates provided.
- Try and observe the wall time of different compiler & MPI combinations.
Windows users no longer need MobaXterm, since Windows 10 has a built-in OpenSSH client. For X11 apps, one might install an X11 server, or WSLg would be of help. By the way, Windows Terminal looks far more delightful than the default console.
Run a VS Code server on Taiwania so that we can connect to it via VS Code, avoiding the annoying OTP.
- Log into Taiwania with your password and OTP.
- Download the server binary:

  ```bash
  mkdir -p ~/.local/bin
  wget -O ~/.local/bin/code-server https://aka.ms/vscode-server-launcher/x86_64-unknown-linux-musl
  ```

- We need the server to keep running in the background even after the current session is closed, so type `tmux` to create a session, then run `code-server`.
- For the first time, open GitHub Device Activation (this link would appear in the terminal) in your browser and enter the code shown in the terminal.
- Now we can detach from the `tmux` session: press Ctrl + B, then D.
- In your VS Code, update to at least version 1.73 and install the Remote Tunnels extension. Click the bottom-left corner and then connect to the tunnel. You might be required to log into your GitHub account. Voilà!