README


********************************************************************************

                         openQCD Simulation Program

********************************************************************************


LATTICE THEORY

Currently the common features of the supported lattice theories are the
following:

* 4-dimensional hypercubic N0xN1xN2xN3 lattice with even sizes N0,N1,N2,N3.
  Open, Schrödinger functional (SF), open-SF or periodic boundary conditions
  in the time direction and periodic boundary conditions in the space
  directions.

* SU(3) gauge group, plaquette plus planar double-plaquette gauge action
  (Wilson, Symanzik, Iwasaki,...).

* O(a)-improved Wilson quarks in the fundamental representation of the gauge
  group. Among the supported quark multiplets are the classical ones (pure
  gauge, two-flavour theory, 2+1 and 2+1+1 flavour QCD), but doublets with a
  twisted mass and theories with many doublets, for example, are also
  supported.

The O(a)-improvement includes the boundary counterterms required for the
improvement of the correlation functions near the boundaries of the lattice in
the time direction if open, SF or open-SF boundary conditions are chosen. For
the quark fields phase-periodic boundary conditions in the space directions
are implemented too.


SIMULATION ALGORITHM

The simulation program is based on the HMC algorithm. For the heavier quarks,
a version of the RHMC algorithm is used. Several advanced techniques are
implemented that can be configured at run time:

* Nested hierarchical integrators for the molecular-dynamics equations, based
  on any combination of the leapfrog, 2nd order Omelyan-Mryglod-Folk (OMF) and
  4th order OMF elementary integrators, are supported.

* Twisted-mass Hasenbusch frequency splitting, with any number of factors
  and twisted masses. Optionally with even-odd preconditioning.

* Twisted-mass determinant reweighting.

* Deflation acceleration and chronological solver along the molecular-dynamics
  trajectories.

* A choice of solvers (CGNE, MSCG, SAP+GCR, deflated SAP+GCR) for the Dirac
  equation, separately configurable for each force component and
  pseudo-fermion action.

All of these depend on a number of parameters, whose values are passed to the
simulation program together with those of the action parameters (coupling
constants, quark masses, etc.) through a structured input parameter file.


PROGRAM FEATURES

All programs parallelize in 0,1,2,3 or 4 dimensions, depending on what is
specified at compilation time. They are highly optimized for machines with
current Intel or AMD processors, but will run correctly on any system that
complies with the ISO C89 (formerly ANSI C) and the MPI 1.2 standards.

For the purpose of testing and code development, the programs can also
be run on a desktop or laptop computer. All what is needed for this is
a compliant C compiler and a local MPI installation such as Open MPI.


DOCUMENTATION

The simulation program has a modular form, with strict prototyping and a
minimal use of external variables. Each program file contains a small number
of externally accessible functions whose functionality is described at the top
of the file.

The data layout is explained in various README files and detailed instructions
are given on how to run the main programs. A set of further documentation
files are included in the doc directory, where the normalization conventions,
the chosen algorithms and other important program elements are described.


COMPILATION

The compilation of the programs requires an ISO C89 compliant compiler and a
compatible MPI installation that complies with the MPI standard 1.2 (or later).

In the main and devel directories, a GNU-style Makefile is included which
compiles and links the programs (type "make" to compile everything; "make
clean" removes the files generated by "make"). The compiler options can be set
by editing the CFLAGS line in the Makefiles.

The Makefiles assume that the following environment variables are set:

  GCC             GNU C compiler command [Example: /usr/bin/gcc].

  MPI_HOME        MPI home directory [Example: /usr/lib64/mpi/gcc/openmpi].
                  The mpicc command used is the one in $MPI_HOME/bin and
                  the MPI libraries are expected in $MPI_HOME/lib.

  MPI_INCLUDE     Directory where the mpi.h file is to be found.

All programs are then compiled and linked using the mpicc command.

The compiler command to be used for the compilation of the modules and the
link step can be changed by editing the CC and CLINKER lines in the Makefiles.
Independently of what is specified there, the GCC compiler is used to resolve
the dependencies on the include files.


SSE/AVX ACCELERATION

Current Intel and AMD processors are able to perform arithmetic operations on
short vectors of floating-point numbers in just one or two machine cycles,
using SSE and/or AVX instructions.

Many programs in the module directories include inline-assembly SSE and AVX
code. Inline assembly is a GCC extension of the C language that may not be
supported by other compilers. On 64bit systems the code can be activated by
setting the compiler flags -Dx64 or -DAVX, respectively. In addition, SSE
prefetch instructions will be used if one of the following options is
specified:

  -DP4     Assume that prefetch instructions fetch 128 bytes at a time
           (Pentium 4 and related Xeons).

  -DPM     Assume that prefetch instructions fetch 64 bytes at a time
           (Athlon, Opteron, Pentium M, Core, Core 2 and related Xeons).

  -DP3     Assume that prefetch instructions fetch 32 bytes at a time
           (Pentium III).

These options have an effect only if -Dx64 or -DAVX is set. The option
-DAVX implies -Dx64. If none of these options is set, the programs do
not make use of any C language extensions and are fully portable.

The latest x86 processors furthermore support fused multiply-add (FMA3)
instructions. OpenQCD makes use of these if the option -DFMA3 is set
in addition to -DAVX (setting -DFMA3 alone has no effect).

On recent x86-64 machines the recommended compiler flags are thus

    -std=c89 -O -mno-avx -DAVX -DFMA3 -DPM

For older machines that do not support the AVX instruction set, the
recommended flags are

    -std=c89 -O -mno-avx -Dx64 -DPM

Aggressive optimization levels such as -O2 and -O3 tend to have little effect
on the execution speed of the programs, but the risk of generating wrong code
is higher.

AVX instructions and the option -mno-avx may not be known to old versions of
the GCC compiler, in which case one may be limited to SSE accelerations with
option string -std=c89 -O -Dx64 -DPM (or no acceleration at all).

If compilers other than GCC are used together with the option -Dx64 or -DAVX,
it is strongly recommended to verify the correctness of the compilation using
the check programs in the devel directory.


DEBUGGING FLAGS

For troubleshooting and parameter tuning, it may helpful to switch on some
debugging flags at compilation time. The simulation program then prints a
detailed report to the log file on the progress made in specified subprogram.

The available flags are:

-DCGNE_DBG         CGNE solver.

-DFGCR_DBG         GCR solver.

-FGCR4VD_DBG       GCR solver for the little Dirac equation.

-DMSCG_DBG         MSCG solver.

-DDFL_MODES_DBG    Deflation subspace generation.

-DMDINT_DBG        Integration of the molecular-dynamics equations.

-DRWRAT_DBG        Computation of the rational function reweighting
                   factor.


RUNNING A SIMULATION

The simulation programs reside in the directory "main". For each program,
there is a README file in this directory which describes the program
functionality and its parameters.

Running a simulation for the first time requires its parameters to be chosen,
which tends to be a non-trivial task. The syntax of the input parameter files
and the meaning of the various parameters is described in some detail in
main/README.infiles and doc/parms.pdf. Examples of valid parameter files are
contained in the directory main/examples.


EXPORTED FIELD FORMAT

The field configurations generated in the course of a simulation are written
to disk in a machine-independent format (see modules/misc/archive.c).
Independently of the machine endianness, the fields are written in little
endian format. A byte-reordering is therefore not required when machines with
different endianness are used for the simulation and the physics analysis.


AUTHORS

The initial release of the openQCD package was written by Martin Lüscher and
Stefan Schaefer. Support for Schrödinger functional boundary conditions was
added by John Bulava. Phase-periodic boundary conditions for the quark fields
were introduced by Isabel Campos. Several modules were taken over from the
DD-HMC program tree, which includes contributions from Luigi Del Debbio,
Leonardo Giusti, Björn Leder and Filippo Palombi.


ACKNOWLEDGEMENTS

In the course of the development of the openQCD code, many people suggested
corrections and improvements or tested preliminary versions of the programs.
The authors are particularly grateful to Isabel Campos, Dalibor Djukanovic,
Georg Engel, Leonardo Giusti, Björn Leder, Carlos Pena and Hubert Simma for
their communications and help.


LICENSE

The software may be used under the terms of the GNU General Public Licence
(GPL).


BUG REPORTS

If a bug is discovered, please send a report to <luscher@mail.cern.ch>.


ALTERNATIVE PACKAGES AND COMPLEMENTARY PROGRAMS

There is a publicly available BG/Q version of openQCD that takes advantage of
the machine-specific features of IBM BlueGene/Q computers. The version is
available at <http://hpc.desy.de/simlab/codes/>.

The openQCD programs currently do not support reweighting in the quark
masses, but a module providing this functionality can be downloaded from
<http://www-ai.math.uni-wuppertal.de/~leder/mrw/>.

Full-fledged QCD simulation programs tend to have many adjustable parameters.
In the case of openQCD, most parameters are passed to the programs through a
human-readable structured file. Liam Keegan's sleek graphical editor for these
parameter files offers some guidance and complains when inconsistent parameter
values are entered (see <http://lkeegan.github.io/openQCD-input-file-editor>).