Graphcore

Connection to Graphcore

Graphcore connection diagram

Log in to one of the Graphcore login nodes from your local machine. From the login node, ssh to one of the Graphcore nodes.

ssh ALCFUserID@gc-login-01.ai.alcf.anl.gov
# or
ssh ALCFUserID@gc-login-02.ai.alcf.anl.gov

ssh gc-poplar-02.ai.alcf.anl.gov
# or
ssh gc-poplar-03.ai.alcf.anl.gov
# or
ssh gc-poplar-04.ai.alcf.anl.gov

Clone Graphcore Examples

This hands-on session uses examples from the Graphcore Examples repository. Clone the repository into your home directory.

mkdir -p ~/graphcore
cd ~/graphcore
git clone https://github.com/graphcore/examples.git
cd examples

Job Queuing and Submission

ALCF's Graphcore POD64 system uses Slurm for job submission and queueing. Below are some of the important commands for using Slurm.

  • The Slurm command srun runs individual Python scripts. Use the --ipus= option to specify the number of IPUs required for the run, for example: srun --ipus=1 python mnist_poptorch.py
  • Jobs can also be submitted to the Slurm workload manager through a batch script with the sbatch command.
  • The squeue command provides information about jobs in the Slurm scheduling queue.
  • The scancel command signals or cancels jobs, job arrays, or job steps.
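When runs are launched from a driver script rather than typed by hand, the srun invocation above can be assembled programmatically. The following is a minimal sketch, not part of the ALCF tooling: the helper names build_srun_command and run_on_ipus are hypothetical, and the only Slurm option used is the --ipus= option described above.

```python
import shlex
import shutil
import subprocess


def build_srun_command(script, ipus=1, python="python"):
    """Return the argv list that runs `script` on `ipus` IPUs through srun."""
    if ipus < 1:
        raise ValueError("ipus must be at least 1")
    return ["srun", f"--ipus={ipus}", python, script]


def run_on_ipus(script, ipus=1):
    """Run the script through srun; return its exit code, or None if srun is missing."""
    command = build_srun_command(script, ipus)
    print("Running:", shlex.join(command))
    if shutil.which("srun") is None:
        # Not on a Slurm node: report the problem instead of raising FileNotFoundError.
        print("srun not found on PATH; command was not executed")
        return None
    return subprocess.run(command, check=False).returncode


# Example (on a Graphcore node): run_on_ipus("mnist_poptorch.py", ipus=1)
```

Building the command as a list rather than a single string avoids shell quoting problems when script paths contain spaces.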

Profiling

We will use PopVision Graph Analyzer and PopVision System Analyzer to produce and inspect profiles.

PopVision Graph Analyzer

To generate a profile for PopVision Graph Analyzer, set the POPLAR_ENGINE_OPTIONS environment variable on the command that runs the script:

$ POPLAR_ENGINE_OPTIONS='{"autoReport.all":"true", "autoReport.directory":"./graph_profile", "profiler.includeFlopEstimates": "true"}' python mnist_poptorch.py

This generates all graph profiling reports, including FLOP estimates, and saves them to the graph_profile directory.

To visualize the profiles, download the generated profiles to your local machine and open them with PopVision Graph Analyzer.
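The JSON value of POPLAR_ENGINE_OPTIONS is easy to misquote on the command line. As a hedged alternative, the same options can be serialized from Python and passed to a child process through its environment. This is a sketch under the assumption that the script is launched with subprocess; the helper name graph_profile_env is hypothetical, and the option keys are exactly those shown in the command above.

```python
import json
import os
import subprocess


def graph_profile_env(directory="./graph_profile", include_flops=True):
    """Return a copy of os.environ with POPLAR_ENGINE_OPTIONS set for graph profiling."""
    options = {
        "autoReport.all": "true",
        "autoReport.directory": directory,
        # Values are passed as strings, matching the shell example.
        "profiler.includeFlopEstimates": "true" if include_flops else "false",
    }
    env = dict(os.environ)  # copy, so the parent environment stays unchanged
    env["POPLAR_ENGINE_OPTIONS"] = json.dumps(options)
    return env


# Example (on a Graphcore node):
# subprocess.run(["python", "mnist_poptorch.py"], env=graph_profile_env())
```

Using json.dumps guarantees a well-formed JSON string regardless of the characters in the directory name.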

PopVision System Analyzer

To generate a profile for PopVision System Analyzer, set the PVTI_OPTIONS environment variable on the command that runs the script:

$ PVTI_OPTIONS='{"enable":"true", "directory": "./system_profile"}' python mnist_poptorch.py

This generates all system profiling reports and saves them to the system_profile directory.

To visualize the profiles, download the generated profiles to your local machine and open them with PopVision System Analyzer.
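Profile directories can contain several files, so it may be convenient to bundle a directory into a single archive before downloading it. The sketch below uses only the Python standard library; the helper name pack_profiles is hypothetical, and it makes no assumption about the names or formats of the files inside the directory.

```python
import tarfile
from pathlib import Path


def pack_profiles(profile_dir, archive_path=None):
    """Bundle a profile directory into a .tar.gz file and return the archive path."""
    profile_dir = Path(profile_dir)
    files = [p for p in profile_dir.rglob("*") if p.is_file()]
    if not files:
        raise FileNotFoundError(f"no profile files found in {profile_dir}")
    if archive_path is None:
        # Default: place the archive next to the directory, e.g. system_profile.tar.gz
        archive_path = profile_dir.parent / (profile_dir.name + ".tar.gz")
    archive_path = Path(archive_path)
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname stores only the directory name, not its full path on the node.
        tar.add(profile_dir, arcname=profile_dir.name)
    return archive_path


# Example: pack_profiles("./system_profile") or pack_profiles("./graph_profile")
```

After copying the archive to the local machine and extracting it, open the extracted directory in the matching PopVision tool.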

Software Stack

The Graphcore hands-on section consists of examples that use PyTorch and the Poplar software stack.

Useful Resources