Documentation: Improved README and Quick start (#105)
* Improved README and quickstart

* Notes for single GPUs

* typo
sofiemartins authored Sep 10, 2024
1 parent b4d43bf commit baf6066
Changed file: Doc/user_guide/getting_started.md (146 additions, 50 deletions)
<a href="https://asciinema.org/a/550942"><img src="https://asciinema.org/a/550942.svg" alt="asciicast" class="inline"></a>
\endhtmlonly
<!-- [![asciicast](https://asciinema.org/a/550942.svg)](https://asciinema.org/a/550942) -->

# Dependencies

* A C99 compiler (GCC, clang, icc). OpenMP can be used if supported by the compiler.
* If MPI is needed, an MPI implementation such as OpenMPI or MPICH. Use a CUDA-aware MPI implementation for multi-GPU support.
* If GPU acceleration is needed, CUDA 11.x and the nvcc compiler.
* Perl 5.x for compilation.
* [ninja build](https://ninja-build.org/) for compilation.

```bash
git clone https://github.com/claudiopica/HiRep
```
Make sure the build command `Make/nj` and `ninja` are in your `PATH`.
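
For example, assuming you cloned HiRep into your home directory, you could add the `Make` folder to your `PATH` as follows (the path is an assumption; adjust it to your setup, and note that `ninja` itself must be installed separately, e.g. via your package manager):

```bash
export PATH="$HOME/HiRep/Make:$PATH"
```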

## Adjust compilation options

Adjust the file `Make/MkFlags` to set the desired options.
The option file can be generated using the `Make/write_mkflags.pl` tool.
Use
```bash
write_mkflags.pl -h
```
for a list of available options. The most important ones include:

### Number of colors (NG)

```bash
NG=3
```
### Gauge group SU(NG) or SO(NG)

```bash
GAUGE_GROUP = GAUGE_SUN
#GAUGE_GROUP = GAUGE_SON
```

### Representation of fermion fields

```bash
REPR = REPR_FUNDAMENTAL
#REPR = REPR_SYMMETRIC
#REPR = REPR_ANTISYMMETRIC
#REPR = REPR_ADJOINT
```

### Lattice boundary conditions

Set the desired boundary conditions by adding one macro line for each direction, as in the example below.

The available options are:
1. BC_\<DIR\>_PERIODIC, for periodic boundary conditions
2. BC_\<DIR\>_ANTIPERIODIC, for antiperiodic boundary conditions
3. BC_\<DIR\>_THETA, for twisted boundary conditions: this associates a twisting angle with the fermion field in the specified direction \<DIR\>. The angle itself is specified in the input file.
4. BC_\<DIR\>_OPEN, for open boundary conditions; these can only be set in the T direction.

Example: antiperiodic boundary conditions in the time direction and periodic boundary conditions in the spatial directions:
```bash
MACRO += BC_T_ANTIPERIODIC
MACRO += BC_X_PERIODIC
MACRO += BC_Y_PERIODIC
MACRO += BC_Z_PERIODIC
```
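
Similarly, a sketch of twisted boundary conditions in the spatial directions, following the naming scheme above (the twist angles themselves are set in the input file):

```bash
MACRO += BC_T_ANTIPERIODIC
MACRO += BC_X_THETA
MACRO += BC_Y_THETA
MACRO += BC_Z_THETA
```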

### Parallelization

You can select a number of features via the `MACRO` variable. To compile with MPI, use

```bash
MACRO += WITH_MPI
```

For GPU acceleration on CUDA GPUs, enable GPU support and the new geometry. If you enable GPU support but forget to set the new geometry, the compilation will fail.

```bash
MACRO += WITH_GPU
MACRO += WITH_NEW_GEOMETRY
```
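
For example, a multi-GPU build for CUDA GPUs combines the MPI and GPU flags above (a sketch; pick the flags that match your target machine):

```bash
MACRO += WITH_MPI
MACRO += WITH_GPU
MACRO += WITH_NEW_GEOMETRY
```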

If you want to compile your code for AMD GPUs, additionally add the flag

```bash
MACRO += HIP
```

### Other standard options

```bash
MACRO += UPDATE_EO
```

enables even-odd preconditioning, so you never want to disable it.

```bash
MACRO += NDEBUG
```

**suppresses** debug output. If you remove this option, `HiRep` will print a large amount of additional output that is normally not needed.

```bash
MACRO += CHECK_SPINOR_MATCHING
```

This performs a check on the geometries of the spinors and is essential for debugging. In general, leaving it as a safety check does not hurt, but if you simulate with very small local lattices, you may want to disable it and check whether there is a performance improvement.

```bash
MACRO += IO_FLUSH
```

Flushes output to file immediately. If the simulation or analysis prints an unusually large amount of data, this may affect performance.
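
Taken together, a typical set of these standard options might look as follows (a sketch using only the flags described above):

```bash
MACRO += UPDATE_EO
MACRO += NDEBUG
MACRO += CHECK_SPINOR_MATCHING
```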

### Compiler options

To compile the code for your laptop, you only need to set the C compiler. For example
```bash
CC = gcc
CFLAGS = -Wall -O3
INCLUDE =
LDFLAGS =
```

If you want support for parallelization, you need to include the MPI compiler wrapper
```bash
CC = gcc
MPICC = mpicc
CFLAGS = -Wall -O3
GPUFLAGS =
INCLUDE =
LDFLAGS =
```

Another example: to use the Intel compiler and Intel's MPI implementation, and no CUDA, one could use:

```bash
CC = icc
MPICC = mpiicc
LDFLAGS = -O3
INCLUDE =
```

With a single NVIDIA GPU and without MPI:
```bash
CC = gcc
NVCC = nvcc
CXX = g++
LDFLAGS = -Wall -O3
GPUFLAGS =
INCLUDE =
```
Note that this compiles a fat binary, but you can also specify a target architecture via `GPUFLAGS`.
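
For example, to build only for NVIDIA A100 GPUs (compute capability 8.0), one could set:

```bash
GPUFLAGS = -arch=sm_80
```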

For a single AMD GPU, `nvcc` needs to be replaced by `hipcc`. On LUMI, the standard C and C++ compilers are `cc` and `CC`.

```bash
CC = cc
NVCC = hipcc
CXX = CC
LDFLAGS = -Wall -O3
GPUFLAGS =
INCLUDE =
```

For multi-GPU simulations on NVIDIA GPUs, you can set your choice of C, C++, MPI, and CUDA compilers and their options using the variables:
```bash
CC = gcc
MPICC = mpicc
NVCC = nvcc
CXX = g++
LDFLAGS = -Wall -O3
GPUFLAGS =
INCLUDE =
```

For AMD multi-GPU jobs on LUMI, it seems favorable to use `hipcc` instead of `CC`.
```bash
ENV = MPICH_CC=hipcc
CC = gcc
MPICC = cc
CFLAGS = -Wall -O3
NVCC = mpicc
GPUFLAGS = -w --offload-arch=gfx90a
INCLUDE =
LDFLAGS = --offload-arch=gfx90a
```

For more information on configuring the code for AMD GPUs, see the user guide on the GitHub pages.

## Compile the code

From the root folder just type:
```bash
nj
```
(this is a tool in the `Make/` folder: make sure it is in your path!)
The above will compile the `libhr.a` library and all the available executables in the HiRep distribution, including executables for dynamical fermions `hmc` and pure gauge `suN` simulations and all the applicable tests.
If you wish to compile only one of the executables, e.g., `suN`, just change to the corresponding directory, e.g., `PureGauge`, and execute the `nj` command from there.
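
For example, to build only the pure gauge executable `suN`:

```bash
cd PureGauge
nj
```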

All build artefacts, except the final executables, are located in the `build` folder at the root directory of the distribution.


# Run

## Adjust input file
As an example, we will use the `hmc` program, which can be found in the `HMC` directory (to create the executable type `nj` in that directory).
The `hmc` program will generate lattice configurations with dynamical fermions using a hybrid Monte Carlo algorithm. The program uses a number of parameters that need to be specified in an input file; see `HMC/input_file` for an example.
Input parameters are divided into different sections, such as global lattice size, number of MPI processes per direction, random number generator, run control variables, and definition of the lattice action to use for the run.
For example, for the basic run control variables, one can look at the section `Run control variables`:

```
gauge start = random
last conf = +1
```

The "+" in front of `last conf` specifies the number of additional trajectories to be generated after the chosen startup configuration. I.e. if the startup configuration is trajectory number 5 and `last conf = 6` then one additional trajectory will be generated, while if `last conf = +6` then six additional trajectories will be generated (i.e. the last configuration generated will be number 11).
The "+" in front of `last conf` specifies the number of additional trajectories to be generated after the chosen startup configuration. I.e., if the startup configuration is trajectory number 5 and `last conf = 6`, then one additional trajectory will be generated, while if `last conf = +6`, then six additional trajectories will be generated (i.e., the last configuration generated will be number 11).

## Execute Binary

When not using MPI, simply run:

```bash
./hmc -i input_file
```

where `hmc` is the binary generated from `hmc.c`. If you are using OpenMP, remember to set `OMP_NUM_THREADS` and other relevant environment variables to the desired value.
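
For example, a run with four OpenMP threads (the thread count here is just an illustration):

```bash
export OMP_NUM_THREADS=4
./hmc -i input_file
```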

For the MPI version, run

```bash
mpirun -np <number of MPI processes> ./hmc -i input_file
```

or follow the instructions for submitting your job script to Slurm. See the examples of submit scripts in the documentation.
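
A minimal Slurm submit script might look like the sketch below; the partition name, resource requests, and job name are placeholders that depend on your cluster:

```bash
#!/bin/bash
#SBATCH --job-name=hmc
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4   # one MPI process per GPU (the GPU code uses 1 GPU per MPI process)
#SBATCH --gres=gpu:4          # placeholder: request 4 GPUs on the node
#SBATCH --time=01:00:00
#SBATCH --partition=gpu       # placeholder partition name

# Launch one MPI rank per task; adjust to your site's recommended launcher.
srun ./hmc -i input_file
```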

The GPU version of the code uses 1 GPU per MPI process.

Only the MPI process rank 0 writes the output file, which is by default in a file called `out_0` in the current directory. The `-o` option allows you to set a different name for the output file.
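
For example, to write the output to a file with a custom name (the file name here is just an illustration):

```bash
./hmc -i input_file -o run1.out
```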

It is sometimes helpful to have output files from all MPI processes for debugging purposes. This can be enabled with the compilation option:
```bash
MACRO += LOG_ALLPIDS
```