CHPC 2025 Student Cluster Competition

Welcome to the Center for High Performance Computing (CHPC)'s Student Cluster Competition (SCC) - Team Selection Round. This round requires each team to build a prototype multi-node compute cluster within the National Integrated Cyber Infrastructure Systems (NICIS) virtual compute cloud (described below).

The goal of this document is to introduce you to the competition platform and familiarise you with some Linux and systems administration concepts. This competition provides you with a fixed set of virtual resources that you will use to initialize a set of virtual machine instances based on your choice of Linux flavor.

Structure of the Competition

The CHPC invites applications from suitably qualified candidates to enter the CHPC Student Cluster Competition. The CHPC Student Cluster Competition gives undergraduate students at South African universities exposure to the High Performance Computing (HPC) Industry. The winning team will be entered into the ISC Student Cluster Competition hosted at the 2026 International Supercomputing Conference held in Hamburg, Germany.

You will be accessing all of the course work and material through this GitHub repository, which you and your team must check regularly to receive updates.

Getting Help

You are strongly encouraged to get help and even assist others by Opening and Participating in Discussions.

Tip

Active participation in the student discussions is an easy way to separate yourselves from the rest of the competition and make it easy for the instructors to notice you!

Timetable

Each day will comprise four lectures in the morning, with tutorials taking place in the afternoon. A PDF Version of the Timetable is available for you to download.

Scoring

Teams will be evaluated according to the following breakdown, with your progress in the tutorials and your final presentations carrying the most weight.

Component	Weight

Technical Knowledge Assessment	0.2
Tutorials	0.4
Cluster Design Presentation	0.4
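
As a rough illustration of how these weights combine, the sketch below computes a weighted total. It assumes each component is marked out of 100, which the breakdown above does not state; the function name `weighted_score` and the sample marks are hypothetical.

```shell
# Hypothetical helper: combine three component marks using the weights above.
# Assumption: each mark is a percentage (0-100); this guide does not say so.
weighted_score() {
    # $1 = Technical Knowledge Assessment
    # $2 = Tutorials
    # $3 = Cluster Design Presentation
    awk -v tka="$1" -v tut="$2" -v pres="$3" \
        'BEGIN { printf "%.1f\n", 0.2 * tka + 0.4 * tut + 0.4 * pres }'
}

# Made-up marks: 50 for the assessment, 75 for tutorials, 100 for the presentation.
weighted_score 50 75 100   # prints 80.0
```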

Instructions for Mentors

The role of mentors, instructors and volunteers is to provide leadership and guidance for the student competitors participating in this year's Center for High Performance Computing 2025 Student Cluster Competition.

In preparing your teams for the competition, your main goal is to teach and impart knowledge to the student participants in such a way that they are empowered and enabled to tackle the problems and benchmarking tasks themselves.

Hands-Off Rule (You may not touch the keyboard)

Under no circumstances whatsoever may mentors touch any competition hardware, whether it belongs to their own team or to another team. Mentors are encouraged to provide guidance and leadership to their (as well as other) teams.

If a mentor is found to be in contravention of this rule, their team may incur a penalty. Repeated infringements may result in disqualification of their team.

We monitor all network traffic!

Cheat Sheet

The table below lists Linux system commands and utilities that may help you debug problems with your clusters. Note that some of these utilities do not ship with the base deployment of every Linux flavor, so you may need to install the associated packages before using them.

Command	Description
ssh	Used for logging into a remote machine and for executing commands on it.
scp	SCP copies files between hosts on a network. It uses ssh for data transfer, and uses the same authentication and provides the same security as ssh.
wget / curl	Utilities for non-interactive download of files from the Web. They support the HTTP, HTTPS and FTP protocols.
top / htop / btop	Provides a dynamic real-time view of a running system. It can display system summary information as well as a list of processes or threads.
screen / tmux	Full-screen window manager that multiplexes a physical terminal between several processes (typically interactive shells).
ip a	Display IP addresses and property information.
dmesg	Prints the message buffer of the kernel. The output of this command typically contains the messages produced by the device drivers.
watch	Execute a program periodically, showing output fullscreen.
df -h	Report file system disk space usage.
ping	Used to verify that a device can communicate with another on a network.
lynx	Command-line based web browser (more useful than you think)
ctrl+alt+[F1...F6]	Open another shell session (multiple ‘desktops’)
ctrl+z	Suspend the foreground command (use ‘bg’ to resume it in the background)
du -h	Summarize disk usage of each FILE, recursively for directories.
lscpu	Command line utility that provides system CPU related information.
lstopo	View the topology of a Linux system.
inxi	Lists information related to your system's sensors, partitions, drives, networking, audio, graphics, CPU, system, etc.
hwinfo	Hardware probing utility that provides detailed info about various components.
lshw	Hardware probing utility that provides detailed info about various components.
/proc	Pseudo-filesystem that acts as the information and control center of the kernel, providing a communications channel between kernel space and user space. Many of the preceding commands query information provided by /proc, e.g. `cat /proc/cpuinfo`.
uname	Useful for determining information about your current flavor and distribution of your operating system and its version.
lsblk	Provides information about block devices (disks, hard drives, flash drives, etc) connected to your system and their partitioning schemes.
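
A few of these commands can be combined into a quick check when debugging a node. The sketch below is illustrative only: `node_summary` is a hypothetical helper, and it skips `lscpu` if that utility is not installed rather than assuming your distribution ships it.

```shell
# Hypothetical helper: print a short summary of the machine you are logged into.
node_summary() {
    # Kernel name and release.
    echo "kernel: $(uname -sr)"

    # Disk space usage of the root file system, in human-readable units.
    echo "root filesystem:"
    df -h /

    # Selected CPU details, only if lscpu is installed on this system.
    if command -v lscpu >/dev/null 2>&1; then
        lscpu | grep -E '^(Architecture|CPU\(s\)|Model name)' || true
    fi
}

node_summary
```

Running the same function on the head node and on each compute node gives a quick way to spot differences between them.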

Deliverables

You will need to submit the following for scoring and evaluation by the judges:

Cluster Design Assignment [40 %]
- One PDF Presentation Slide with Team Profiles. This slide must clearly indicate your Team Name and Institution. Below each team member's photograph, indicate their:
  - Name and surname,
  - Degree and year of study.
- Presentation Slides
- Short Technical Brief with Cluster Design Specifications
Technical Knowledge Assessment [20 %]
Tutorials [40 %]

Cluster Design Assignment

You are tasked with designing a small cluster, with at least three nodes, to the value of R 500 000.00 (ZAR), and presenting your design to the judging panel. In your design you must specify the hardware and software for an operational cluster and describe how it functions. The design must be based on servers and interconnects from HPE, and accessories from NVIDIA, AMD or Intel. You MUST use the prices you find in the Parts List Spreadsheet.

The primary purpose of your HPC cluster is to run the following applications and benchmarks as efficiently as possible:

In addition, your choice of design must take into consideration:

Base Platform (Server),
Target Processing Unit (CPU / GPU),
Memory, Networking and Storage Requirements,
System and Application Dependency Software Requirements,
Ease of Use (Build, Assembly, Deployment),
Efficiency, Performance, Power Consumption and Reliability and
Team Management, Coordination and Planning.

Important

You may submit an additional design that extends your small R 500 000.00 cluster, up to the value of R 5 000 000.00. You may use any of the above links for this exercise, using a Dollar to Rand conversion rate of 1:20. You may use GPUs from either AMD or NVIDIA. You may use CPUs from either AMD or Intel. You must use HPE as the base platform for your servers.

In this revised design, consider additional nodes, additional or higher-performance CPUs, additional RAM, GPUs, InfiniBand interconnects and any other aspects that you think would improve the performance of your initial cluster design.

This additional design should be no more than one slide, covering the price breakdown and the motivation for the additional component(s).
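
The conversion rate and budget limits above amount to simple arithmetic, sketched below. The helper names and the example prices are hypothetical; use the prices from the Parts List Spreadsheet for your real design.

```shell
# Exchange rate stated above: 1 US Dollar = 20 Rand.
USD_TO_ZAR=20

# Hypothetical helper: convert a whole-Dollar price to Rand.
usd_to_zar() {
    echo $(( $1 * USD_TO_ZAR ))
}

# Hypothetical helper: report whether a total (in Rand) fits a budget (in Rand).
within_budget() {
    if [ "$1" -le "$2" ]; then
        echo "within budget: R $(( $2 - $1 )) remaining"
    else
        echo "over budget by R $(( $1 - $2 ))"
    fi
}

# Made-up prices: three nodes at 7500 Dollars each plus a 1200 Dollar switch.
total_usd=$(( 3 * 7500 + 1200 ))
total_zar=$(usd_to_zar "$total_usd")
echo "total: R $total_zar"
within_budget "$total_zar" 500000
```

With these made-up prices the total is R 474000, which leaves R 26000 of the R 500 000.00 budget. Passing 5000000 as the second argument checks the extended design instead.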

You will present your findings in a short technical brief and specification detailing the specific components you've incorporated into your cluster. A spreadsheet with a clear breakdown of the price, quantity and name / code of each component would be useful. You must also present a 10-minute slideshow of your findings and cluster design.

The 10 minute slide presentation by the whole team must include your design decisions and the features of your cluster, including: cost, hardware, software, configuration and operation. Each member of the team is required to present even though you will be assessed as a team.

After the presentation the judging panel will have an opportunity to ask questions to each member of your team. All members of your team can be questioned about any part of the cluster, so make sure you are fully familiar with the design.

Caution

The deadline for submission of the Cluster Design Assignment is 23:00 on Friday the 11th July. Late submissions will be penalized.

Technical Knowledge Assessment

Each team must work together to answer and complete the Technical Knowledge Assessment to the best of their ability. Team Captains must email the team's answers to the organizers no later than 23:00 on 12th July. You are required to demonstrate your understanding of the concepts in YOUR OWN WORDS. Keep your answers succinct and to the point: your answer to each question should not exceed 2-3 lines.

Caution

The deadline for submission of the Technical Knowledge Assessment is 23:00 on Saturday the 12th July. Late submissions will be penalized.

Tutorials

You will be evaluated on your overall progress in the tutorials. Below you will find an overview, glossary and high-level breakdown of the tutorials. You must progress through four tutorials, which will be released daily. Your overall progress through the tutorials forms a large component of your score. By the end of the week you will have covered a considerable amount of content. Use the links provided if you need to refer to a specific section and are having trouble remembering where it is.

Warning

Please note that the tutorial content is subject to change at any time, and you must regularly check the main branch of this GitHub repository for updates.

Tutorial 1 deals with introducing concepts to users and getting them started with using the virtual lab, standing up the first virtual machine instance and connecting to it remotely. The content is as follows:

Tutorial 2 will demonstrate how to configure and stand up a compute node, and access it using a transparently created port-forwarding SSH tunnel between your workstation and your head node. You will then install a number of critical services across your cluster.
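
One common way to obtain this kind of transparent tunnel with OpenSSH is the `ProxyJump` option; whether the tutorial uses exactly this approach is an assumption. In the sketch below, the host aliases, user name and IP addresses are hypothetical placeholders, and `print_ssh_config` is an illustrative helper that only prints a client configuration fragment.

```shell
# Hypothetical helper: print an OpenSSH client configuration fragment.
# $1 = user name, $2 = head node address, $3 = compute node address
print_ssh_config() {
    cat <<EOF
Host headnode
    HostName $2
    User $1

Host compute01
    HostName $3
    User $1
    ProxyJump headnode
EOF
}

# Placeholder values only; replace them with your own user name and addresses.
print_ssh_config student 203.0.113.10 10.100.50.5
```

If the printed fragment were added to `~/.ssh/config` on your workstation, `ssh compute01` would connect to the compute node through the head node. The one-off equivalent is `ssh -J student@203.0.113.10 student@10.100.50.5`.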

Tutorial 3 will demonstrate how to configure, build, compile and install a variety of system software and applications. You will also be building these applications with different tools. Finally, you will learn how to run applications across your cluster.

Tutorial 4 demonstrates how to configure Docker containers to deploy a monitoring stack, comprising a metrics database service, an exporting / scraping service and a metric visualization service. You will then learn the very basics of how to visualize and interpret data. You will then learn how to automate the deployment of your Sebowa OpenStack infrastructure. Lastly, you'll deploy a scheduler and submit a job to it.

Lecture Slides and Video Recordings

The lecture slides are available for download - follow the link and download the raw files.

Day 1

Day 2

Contributing to the Project

Important

While we value your feedback, the following sections are primarily targeted at contributors to the project. As a student participating in the competition, do NOT spend your time working through any of the material below. However, we would love to have your contributions to the project after the competition.

You are strongly encouraged to contribute and improve the project by Opening and Participating in Discussions, Raising, Addressing and Resolving Issues. The following guide describes How to clone, push, and pull with git (beginners GitHub tutorial).

Steps to follow when editing existing content

In order to effectively manage the various workflows and stages of development, testing and deployment, the project is comprised of three primary branches:

main: Stable and production-ready deployment branch of the project.
stag: Staging branch which mirrors production and is used for integration testing of new features.
dev: Development branch for incorporating new features and bug fixes.

Editing the content directly requires Git, which you can use from a terminal application, Git for Windows PowerShell or Git for MobaXTerm.

Generate an SSH Key (or use an existing one).
Add your SSH key to your Git profile.
- Navigate to your 'Profile' and go to 'Settings'.
- Under 'Access', navigate to 'SSH and GPG Keys'
git clone a local copy of the repository, to your personal work space.

You can copy the command from GitHub itself.
```
git clone git@github.com:chpc-tech-eval/scc.git
```
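
For the key generation step listed above, a minimal sketch follows, assuming OpenSSH's `ssh-keygen` is installed. The helper name, the Ed25519 key type, the file path and the comment are illustrative choices, not requirements stated in this guide.

```shell
# Hypothetical helper: create an Ed25519 key pair at the given path.
# $1 = private key path, $2 = comment embedded in the public key (e.g. your email)
generate_ssh_key() {
    # -N "" sets an empty passphrase to keep the sketch non-interactive;
    # consider a real passphrase for keys you actually use.
    ssh-keygen -q -t ed25519 -N "" -C "$2" -f "$1"
}

# Usage (not run here): generate the key, then print the public half so it
# can be pasted into the 'SSH and GPG Keys' page of your GitHub settings.
#   generate_ssh_key "$HOME/.ssh/id_ed25519_scc" "you@example.com"
#   cat "$HOME/.ssh/id_ed25519_scc.pub"
```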
When starting work on a new feature or bug fix, create a feature branch off of the development branch and regularly get updates from dev to ensure that you remain consistent with any changes to dev:
```
git checkout dev
git pull origin dev
```
Create a new branch to work on, e.g. `git branch tutX/bugfix-or-new-feature` followed by `git checkout tutX/bugfix-or-new-feature`, or simply use the single command `git checkout -b tutX/bugfix-or-new-feature`.
- Give the branch a sensible name.
- You are encouraged to push the branch back to the remote so that collaborators can see what you are working on as you make the changes.

Make the appropriate changes and commit them locally:

git add <relative_path_to_changed_file(s)>
git commit -m "some_message_pertaining_to_changes_made"

When you have completed editing your feature, merge any remote changes from dev and then push your local changes, back upstream to the remote repository:

git pull origin dev # (optional) it is generally good practice to incorporate any changes in dev into your code early and often
git pull origin tutX/bugfix-or-new-feature # (optional) if you are collaborating on a specific feature with someone, it is important to incorporate their changes early and often
git push origin tutX/bugfix-or-new-feature

Once you are satisfied with your changes, eliminate all merge conflicts by pulling all remote changes and deviations into your local working copy with `git pull`.
- If you are confident that your feature has not deviated from the remote dev branch, use `git pull` to automatically fetch and merge remote changes from dev into your feature branch.
- Alternatively, if your branch is old, or depends on changes from the remote, use `git fetch` to fetch remote changes so that you can preview them before merging.
- Eliminate your local conflicts and merge all remote changes with `git merge`.
- Once all the conflicts have been resolved, and you've successfully merged all remote changes, push your branch upstream.
Create a pull request to the remote dev branch on GitHub, to incorporate your feature.
- Or another branch, if your feature branch was adding functionality to an existing feature branch.
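
The fetch, preview and merge steps above can be sketched as a small helper. This is one possible sequence under stated assumptions: the remote is named `origin`, the integration branch is `dev`, and `preview_and_merge_dev` is a hypothetical function name.

```shell
# Hypothetical helper: fetch dev, show what would come in, then merge it
# into the currently checked-out feature branch.
preview_and_merge_dev() {
    # Download remote changes without touching your working copy.
    git fetch origin dev || return 1

    # Commits that are on origin/dev but not yet on your branch.
    git log --oneline HEAD..origin/dev

    # Files changed on origin/dev since your branch diverged from it.
    git diff --stat HEAD...origin/dev

    # Merge; if conflicts are reported, resolve them, then 'git add' and 'git commit'.
    git merge --no-edit origin/dev
}

# Usage (not run here), from inside your clone with your feature branch checked out:
#   preview_and_merge_dev && git push origin tutX/bugfix-or-new-feature
```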

Syntax and Style

Use the following guide on GitHub Markdown Syntax Editing.
