Trilliasm Language Overview: Vector SIMD in C
The structure of this document is as follows:
- Quickstart Code Example
- Description of Trilliasm language constructs and their semantics
- Gluer rationale/behavior
To get started quickly with Trilliasm, it's easiest to take a look at the documented code example below (from programs-phil/spad/template).
[TODO: I would love to rewrite the below, and the template code it comes from, to use the "interleaved" vector/scalar style.]
void tril_template_vec(int mask)
{
  // This template uses separate scalar and vector code blocks, but they can be interspersed as well, as shown here:
  // https://github.com/cucapra/gem5-mesh/wiki/Trilliasm-Language-Overview:-Vector-SIMD-in-C
  // DO NOT PUT ANY CODE BEFORE VECTOR_EPOCH.
  // Any values you want the scalar and vector cores to have in common must be computed before entering this function.
#ifdef SCALAR_CORE
  VECTOR_EPOCH(mask);
  //---------------------------------
  // scalar core code interspersed with vissue
  ISSUE_VINST(init_label); // e.g.: this block deals with initial stack manipulation and initialization of variables
  //-----------------------------------
  // issue stack end portions of vector cores
  ISSUE_VINST(vector_stack_label);
  // devec with unique tag
  DEVEC(devec_0);
  // fence for all cores to ensure memory operations have completed
  asm volatile("fence\n\t");
  asm("trillium vissue_delim return scalar_return"); // return delimiter; delimiters can be of many types
  return;
  // all the vissue labels below:
init_label: // this name matches the vissue label name
  asm("trillium glue_point init"); // the name here ("init") matches the delimiter in the vector code
vector_stack_label:
  asm("trillium glue_point vector_stack"); // the name here ("vector_stack") matches the delimiter in the vector code
#elif defined VECTOR_CORE
  asm("trillium vissue_delim until_next init"); // until_next delimiter; the name (init) is the same as in the glue point above
  // vector core code
  asm("trillium vissue_delim return vector_stack"); // return delimiter
  return;
#endif
}
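The template relies on the preprocessor to give the scalar and vector cores different versions of the same function. As a minimal sketch of that mechanism outside of Trilliasm, the snippet below compiles one of two bodies depending on which symbol is defined. It assumes the build defines exactly one of `SCALAR_CORE` or `VECTOR_CORE` (for example via a `-D` compiler flag). `core_kind` is a hypothetical helper used only for illustration, and the fallback `#define` exists only so the snippet compiles on its own.

```c
/* Illustration only: core_kind() is a hypothetical helper, not part of Trilliasm. */

/* Assumption: the build normally defines exactly one of these symbols
 * (e.g. -DSCALAR_CORE). Fall back to the scalar variant so that this
 * snippet compiles stand-alone. */
#if !defined(SCALAR_CORE) && !defined(VECTOR_CORE)
#define SCALAR_CORE
#endif

const char *core_kind(void)
{
#ifdef SCALAR_CORE
    /* Only this branch is compiled into the scalar version of the function. */
    return "scalar";
#elif defined VECTOR_CORE
    /* Only this branch is compiled into the vector version of the function. */
    return "vector";
#endif
}
```

Compiled with no flags, `core_kind()` returns `"scalar"`; compiled with `-DVECTOR_CORE`, it returns `"vector"`.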
- Use `#ifdef SCALAR_CORE` / `#elif defined VECTOR_CORE` directives, as in the template above, to separate scalar and vector code.
- Mark Vissue Blocks with Trilliasm Delimiters and Gluepoints
  Contiguous statements can be named and grouped into a vissue-able (vissue) block by means of Trilliasm delimiters. There are 4 kinds of delimiters:
  - `until_next`: groups all statements following the delimiter until the next delimiter (of any kind) is found. All jump/branch instructions are automatically deleted from within this block.
  - `begin`/`end`: groups all statements after the `begin` delimiter and before the `end` delimiter. Also deletes jumps and branches.
    - You can optionally use `end at_jump` instead of just plain `end`. This almost ends the vector block, but not quite: it gathers up a few more instructions, until the next time we hit a jump/branch instruction. This is especially useful at the ends of loops, where GCC sometimes moves instructions past the `end` delimiter, so they'd otherwise be missed. See #51 for details.
    - [FOR DEBUGGING] All code blocks after an `end` delimiter and before any other delimiter (including jump/branch instructions) are assigned a vissue key of the form `trillium_junk<n>` (e.g., `trillium_junk0`, `trillium_junk1`, etc., in order of appearance). These code blocks are vissue-able just like any other code block.
  - `begin_if`/`end`: similar to `begin`/`end`, but doesn't drop jump/branch instructions. Use this if your vector block might contain control flow. (Like normal `begin`, it also supports `end at_jump`.)
  - `return`: a special delimiter that MUST be placed before both the scalar and vector return statements.
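To make the `until_next` rule concrete, here is a toy model of the grouping it describes: collect every line after a delimiter until the next delimiter, dropping jumps and branches along the way. This is only a sketch under stated assumptions, not the gluer's actual code. The function names are hypothetical, the input is assumed to be one assembly instruction per string, and the list of RISC-V-style jump/branch mnemonics is an illustrative, incomplete guess.

```c
#include <stddef.h>
#include <string.h>

/* Toy model only: these helpers are hypothetical and are not the gluer's code. */

/* A line counts as a delimiter if it carries the Trilliasm delimiter marker. */
static int is_delimiter(const char *line)
{
    return strstr(line, "trillium vissue_delim") != NULL;
}

/* Assumption: a small, incomplete set of RISC-V-style jump/branch mnemonics. */
static int is_jump_or_branch(const char *line)
{
    static const char *const mnemonics[] = {
        "j", "jal", "jr", "jalr", "beq", "bne", "blt", "bge", "bltu", "bgeu"
    };
    line += strspn(line, " \t");          /* skip leading whitespace */
    size_t len = strcspn(line, " \t");    /* length of the mnemonic  */
    for (size_t i = 0; i < sizeof mnemonics / sizeof mnemonics[0]; i++) {
        if (strlen(mnemonics[i]) == len && strncmp(line, mnemonics[i], len) == 0)
            return 1;
    }
    return 0;
}

/* Collects the block opened by the delimiter at lines[start]: every line up to
 * (but excluding) the next delimiter, with jumps and branches dropped.
 * Stores at most out_cap lines in out and returns how many were stored. */
size_t collect_until_next(const char *const lines[], size_t n, size_t start,
                          const char *out[], size_t out_cap)
{
    size_t count = 0;
    for (size_t i = start + 1; i < n && !is_delimiter(lines[i]); i++) {
        if (is_jump_or_branch(lines[i]))
            continue;                     /* jumps/branches are deleted */
        if (count < out_cap)
            out[count++] = lines[i];
    }
    return count;
}
```

Given the lines `trillium vissue_delim until_next init`, `addi sp,sp,-16`, `j .L2`, `lw a0,0(sp)`, `bne a0,zero,.L3`, followed by another delimiter, the collected block contains only the `addi` and `lw` lines.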
  Delimiter syntax:
  `asm("trillium vissue_delim /*delim-kind*/ /*vissue block name*/")`
  For each vissue block, there must be a corresponding `glue_point` within the "vissue label" (i.e., the argument to `ISSUE_VINST`) of the same name.
  Some notes on delimiters:
  - The semantics of the first delimiter are slightly different than usual: it incorporates not only the assembly corresponding to the contained vissue block, but also any "setup" assembly code that might be generated before it.
  - The `return` delimiter marks the assembly code corresponding to the `return` C statement, excluding the jump to the return address. This corresponds to "stack cleanup" code, and must be executed right before DEVEC.
  - As of this first version of Trilliasm, only the vector `return` vissue block must be issued explicitly before DEVEC. The scalar `return` delimiter must never be explicitly vissued, since the gluer will implicitly call it for you right before the DEVEC (and after the vector `return` block, though the order doesn't matter).
  - Although the same vissue block name can be used more than once, the last block defined with that name overrides all earlier ones. To avoid confusion, define each vissue block exactly once, in the most natural place (e.g., the steady-state loop in the three-phase prefetch loop).
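Because a delimiter and its glue point must agree on the block name, it can help to build both strings from a single token. The macros below are a sketch of one way to do that. `TRIL_DELIM_STR` and `TRIL_GLUE_STR` are hypothetical names, not part of Trilliasm, and the sketch assumes the toolchain accepts adjacent string literals inside `asm(...)`, as GCC does.

```c
/* Hypothetical helper macros (not part of Trilliasm): build the delimiter and
 * glue-point strings from bare tokens so the block name is spelled once. */
#define TRIL_DELIM_STR(kind, name) "trillium vissue_delim " #kind " " #name
#define TRIL_GLUE_STR(name)        "trillium glue_point " #name

/* Intended use inside a kernel (only meaningful when built through the
 * Trilliasm toolchain, so it is shown here as a comment):
 *
 *     asm(TRIL_DELIM_STR(until_next, init));   // vector code
 *     asm(TRIL_GLUE_STR(init));                // under the scalar vissue label
 */

/* Expanded strings, exposed so the result can be inspected. */
const char *const example_delim = TRIL_DELIM_STR(until_next, init);
const char *const example_glue  = TRIL_GLUE_STR(init);
```

Here `example_delim` expands to `"trillium vissue_delim until_next init"` and `example_glue` to `"trillium glue_point init"`, the same strings the template writes by hand.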
- Here are some things to do to get GCC to emit code predictable enough for Trilliasm to process:
  - Use `-fno-reorder-blocks`. The Trilliasm Makefile snippet does this by default.
  - In the vector code, use `do { ... } while (black_hole);` for "replacement" loops. (Plain `while` loops interact poorly with `-fno-reorder-blocks`.)
  - Do not put code before VECTOR_EPOCH.
  - If you see errors like `glue point vector_return is missing a label` and the label exists in the code, GCC is likely duplicating the labels. To resolve this, put `exit(1);` between glue points, as in the example below.
vec_body_init_label:
asm("trillium glue_point vec_body_init");
exit(1);
vec_body_label:
asm("trillium glue_point vec_body");
exit(1);
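For the `do { ... } while (black_hole);` recommendation above, the following sketch shows the shape of such a loop in plain C. It rests on an assumption this page does not spell out: that `black_hole` is a value the compiler cannot see through (modeled here as a `volatile int` that is zero at run time), so the loop structure survives optimization while the body comes first and the condition is checked after it. `run_replacement_loop` is a hypothetical name used only for illustration.

```c
/* Assumption: black_hole is opaque to the compiler. It is modeled here as a
 * volatile int that happens to be 0; the real definition may differ. */
volatile int black_hole = 0;

/* Hypothetical illustration: returns how many times the loop body ran. */
int run_replacement_loop(void)
{
    int iterations = 0;
    do {
        /* The vector loop body (and its delimiters) would go here. */
        iterations++;
    } while (black_hole);   /* condition is evaluated after the body */
    return iterations;
}
```

With `black_hole` equal to zero, the body runs exactly once and the function returns 1. The compiler still has to emit the loop test because it cannot assume the value of a `volatile` variable.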