writejdf
See Understanding the Compilation process to get the big picture.
The syntax is presented through a simple example, commented below.
The example is a simplified version of the Cholesky factorization in PaRSEC: the simplifications consist of removing the GPU-related code, the scheduling hints, and the benchmarking options used to measure the scheduling overheads of the engine.
First, the JDF begins with a preamble in C:
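A minimal sketch of such a preamble is shown below; the core_blas.h include and the exact path of the matrix header are illustrative assumptions and may differ in your setup:

```
extern "C" %{
/*
 * Everything between %{ and %} is dumped verbatim at the top of the
 * generated C file: includes, helper functions, global variables.
 */
#include "parsec.h"
/* Generic include file for tiled matrix distributions; the exact path
 * may differ between PaRSEC versions. */
#include "parsec/data_dist/matrix/matrix.h"
/* Prototypes of the kernels called by the task bodies (illustrative). */
#include "core_blas.h"
%}
```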
In this preamble, we add the includes that define the prototypes of the functions that each task will call when executed. We also include parsec.h and the generic include file for tiled matrix distributions. Function definitions or global variables can be added here, but caution should be used, since this code will be dumped as-is at the beginning of the generated C file.
Then, we define a set of objects that are used in the tasks below. These objects are recognized by the JDF translator, and thus they must follow a non-C syntax:
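A minimal sketch of such a globals section follows; the names uplo, NB, A and INFO, as well as the descriptor type, are illustrative assumptions:

```
uplo   [type = "int"]                       // LAPACK parameter used by the kernels (illustrative)
NB     [type = "int"]                       // tile size passed to the kernels (illustrative)
A      [type = "parsec_tiled_matrix_t *"]   // tiled matrix descriptor; the exact type name
                                            // depends on the PaRSEC version
INFO   [type = "int *"]                     // memory reference used by the POTRF body (illustrative)
```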
Here, for example, we define the matrix A as a C structure of the tiled-matrix descriptor type (more about this later, on the data distribution page), while the other globals are LAPACK parameters used by the kernels. The generator function that will be created by the JDF compiler takes all these objects as parameters, each one of the specified type (hence, the generator function prototype exposes exactly these arguments; see more explanations on How to Write a program that uses a PaRSEC-Enabled Operation).
A Cholesky factorization contains four kernels, called **POTRF, HERK, TRSM, and GEMM**. For each of these kernels, the JDF holds a parameterized task. Let us look closely at the first one, **POTRF**:
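A minimal sketch of such a task is shown below, reusing the globals sketched above and assuming the flow is named T (the body is elided here and discussed further below):

```
POTRF(k)

// Execution space: one POTRF task per diagonal tile
k = 0 .. A->mt-1

// Data affinity: run on the node that owns the tile A(k, k)
: A(k, k)

// Data flow of the single, read-write data element T
RW T <- (k == 0) ? A(k, k) : T HERK(k-1, k)   // from memory at the first step, from HERK afterwards
     -> T TRSM(k, k+1 .. A->mt-1)             // to every TRSM task of iteration k
     -> A(k, k)                               // final value written back to memory

BODY
{
    /* shown and commented below */
}
END
```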
The task is defined with a single parameter, k. This parameter must have an execution space that is entirely defined by the list of global objects declared at the beginning of the JDF. In this case, k is between 0 and A->mt-1 (inclusive), and since A is a parameter of the generator function, k is entirely defined. As you can see with **GEMM**, for example, multiple parameters can be used, and parameters can depend on the value of preceding parameters (in **GEMM**, the first parameter is defined by the globals, the second is defined by the possible values of the first and by the globals, and the third is defined by the possible values of the first two). This definition must be restrictive: if the task **POTRF(0)** exists according to its parameters (i.e. if k can be equal to 0), it will be scheduled by the runtime system.
Then, the task defines its data affinity: task **POTRF(k)** will run on the node that holds the part of the matrix pointed to by A(k, k). Each task must define a single data reference as its affinity. The entry is a reference to one of the data collection objects declared in the globals, with the appropriate number of parameters. See Write a program that uses a PaRSEC-Enabled Operation for more on data distributions.
After the data affinity comes the data flow of the task.
This data flow defines a single data element (T in the sketch above): the first line makes it flow into **POTRF(k)**, and once it has been processed by the task, the last two lines make it flow out of the task, towards other tasks or towards memory. Because the data element is accessed in read and write mode by the task, this is specified with the RW keyword in front of it. Each data element defined this way must have a single input flow (although multiple input flows are acceptable syntactically, they must be mutually exclusive, to ensure the data element is defined by a single task). The first line reads as "T comes from memory if k is equal to 0, in which case it is accessed directly from memory, at the location A(k, k); otherwise, T comes from the data element of the same name in the task **HERK(k-1, k)**". The second line reads as "T flows to the data element of the same name in the task **TRSM(k, k+1)**, to the data element of the same name in the task **TRSM(k, k+2)**, and so on, up to the data element of the same name in the task **TRSM(k, A->mt-1)**". If this set reduces to the empty set, T does not flow out of the task through this line. The last line reads as "once this data element has been processed by this task, no other task of the same operation will modify its value, and it should be copied back into memory, at the location A(k, k)".
The PaRSEC runtime engine does not guarantee that the data element will always be located at A(k, k) during the whole execution: it can be copied into temporary memory, for communication or optimization purposes. Nor does it guarantee that the data element will not point to A(k, k) when this is at all possible. That is the reason why the developer should always specify when the data can safely be written back to memory, and should never access the memory directly, but only through the data flow.
Data flows in the JDF must be symmetric: if **POTRF** declares that it receives data from **HERK**, **HERK** must declare that it sends data to **POTRF**, and the parameter mappings of the two declarations must be consistent, each being the inverse of the other.
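For instance, continuing the sketch above (task, flow and global names remain illustrative assumptions), the matching declaration on the **HERK** side could look like this; the guard m == k+1 is exactly the inverse of the k-1 appearing in **POTRF**'s input dependency:

```
HERK(k, m)

k = 0 .. A->mt-2
m = k+1 .. A->mt-1

: A(m, m)

// The other flow(s) and the body are elided in this sketch
RW T <- (k == 0) ? A(m, m) : T HERK(k-1, m)
     -> (m == k+1) ? T POTRF(m) : T HERK(k+1, m)   // when m == k+1, HERK(k, m) is the HERK(m-1, m)
                                                   // that POTRF(m) receives T from

BODY
{
    /* elided */
}
END
```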
Tasks can only define a limited number of data elements. If more data elements are necessary, it may be necessary to create pseudo-tasks to aggregate them.
In addition to ternary conditionals, the grammar accepts binary conditionals and most of the classical operations on integers.
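For illustration (these fragments are not taken from the original example), a dependency can be guarded by a binary conditional with no alternative branch, and ranges may use integer arithmetic:

```
// A guarded dependency with a single branch: nothing flows when the guard is false
-> (k < A->mt-1) ? T TRSM(k, k+1 .. A->mt-1)

// Classical integer operations are accepted in ranges and guards (illustrative bound)
m = k+1 .. (A->mt + 1) / 2
```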
The last element of the task definition is its body:
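A minimal sketch of such a body, assuming the PLASMA-style kernel CORE_dpotrf and the illustrative globals and flow name used in the sketches above:

```
BODY
{
    /* T is the tile delivered by the data flow; uplo, NB and INFO are globals.
     * NB is used both as the tile order and as the leading dimension here,
     * assuming square tiles. */
    CORE_dpotrf(uplo, NB, T, NB, INFO);
}
END
```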
The body is a piece of C code that can use the parameters of the task, the data elements defined in the task, and all the global objects that have been defined (although it should not use data references to data elements that flow in one of the tasks). For example, the body of **POTRF** uses a memory reference and an integer defined by the global objects (INFO and NB in the sketch above), which do not flow in the data flow, and it uses the data element defined by the data flow of this task (T in the sketch). In this case, the C code to call is minimal: it is just a simple call to the BLAS routine potrf with the appropriate parameters.
The JDF translator will create a piece of code that defines each data element as a function of the data flow, then dump this C code, which is not parsed or interpreted, and then a piece of code to handle the data element afterwards (potentially copying it back into memory, if needed, and/or passing it on to other tasks).
A JDF file can, and must, be converted into a compilable C source using the PaRSEC source-to-source precompiler, **parsec_ptgpp**. This translator is located in the parsec/interfaces/ptg/ptg-compiler sub-directory of the PaRSEC source tree, and is installed in the bin directory upon successful compilation.
The precompiler generates a **.c** source file and the associated **.h** header file corresponding to the JDF dataflow: these files contain the concise representation of the DAG and the events that trigger the progress of the entire algorithm.
**parsec_ptgpp** takes the following options:
- --debug|-d Enable bison debug output
- --input|-i Input file (JDF) (default '-')
- --output|-o Set the BASE name for .c, .h and function name (no default).
  Changing this value has precedence over the defaults of --output-c, --output-h, and --function-name
- --output-c|-C Set the name of the .c output file (default 'a.c' or BASE.c)
- --output-h|-H Set the name of the .h output file (default 'a.h' or BASE.h)
- --function-name|-f Set the unique identifier of the generated function.
  The generated function will be called PaRSEC_<ID>_new (default a)
- --dep-management|-M Select how dependency tracking is managed. Possible choices
  are 'index-array' or 'dynamic-hash-table' (default 'dynamic-hash-table')
- --noline Do not dump the JDF line number in the .c output file
- --line Force dumping the JDF line number in the .c output file
  (default: --line)
- --preproc|-E Stop after the preprocessing stage. The output is generated
  in the form of preprocessed source code, but it is not compiled.
- --showme Print the flags used to compile the preprocessed files
- --Werror Exit with a non-zero value if at least one warning is encountered
- --Wmasked Do NOT print warnings for masked variables
- --Wmutexin Do NOT print warnings for non-obvious mutual exclusion of input flows
- --Wremoteref Do NOT print warnings for potential remote memory references
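For example, assuming the dataflow above is saved in a file named potrf.jdf (an illustrative name), running `parsec_ptgpp -i potrf.jdf -o potrf` generates potrf.c and potrf.h, with the function name derived from the BASE name as described above.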
Your PaRSEC-Enabled operation is ready! To test it, you should Write a program that uses a PaRSEC-Enabled Operation.