How To Add An Instruction
This tutorial will walk through the process of adding a new instruction to the instruction set, implementing it in hardware and the emulator, and exposing it through the compiler/assembler. Although adding an instruction isn't a common task, walking through the process is a good way to get an overview of the various components of this project. Much of this work involves changes to the LLVM target backend. Having a good understanding of LLVM is useful, but it has a pretty steep learning curve. However, you can do a lot by copying how other instructions are implemented.
Because the instruction set and implementation are still in flux, this page is probably stale in places and not completely correct. Treat it as a general guide.
For this tutorial, we will add an instruction to branch if the number of set bits in a word is odd (odd parity).
Example:
boddp s0, loop
The steps are as follows:
- Pick Instruction Encoding
- Add Instruction Encoding Test
- Add Instruction to Compiler Backend
- Add Functional Tests
- Implement Instruction in Emulator
- Implement RTL
Three bits define the type of a branch instruction:
https://github.com/jbush001/NyuziProcessor/wiki/Instruction-Set#branch
We'll use an unused code, '101'.
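To make the bit layout concrete, here is a minimal Python sketch of how a branch instruction word could be packed. The field positions (a fixed 1111 prefix in bits 31-28, the branch type in bits 27-25, a 20-bit offset in bits 24-5, and the source register in bits 4-0) are inferred from the encoding test and emulator code elsewhere in this tutorial, not taken from the toolchain. `encode_branch` and `to_little_endian` are hypothetical helpers, not part of the project.

```python
import struct

BRANCH_PREFIX = 0b1111  # assumed fixed value of bits 31-28 for branch instructions
BRANCH_ODDP = 0b101     # the branch type code chosen above


def encode_branch(branch_type, src_reg, offset=0):
    """Pack branch fields into a 32-bit instruction word (hypothetical helper)."""
    if not 0 <= branch_type < 8:
        raise ValueError("branch type must fit in 3 bits")
    if not 0 <= src_reg < 32:
        raise ValueError("register index must fit in 5 bits")
    return (
        (BRANCH_PREFIX << 28)
        | (branch_type << 25)
        | ((offset & 0xFFFFF) << 5)  # 20-bit offset field, patched at link time
        | src_reg
    )


def to_little_endian(word):
    """Return the four instruction bytes in memory order, least significant first."""
    return list(struct.pack("<I", word))


word = encode_branch(BRANCH_ODDP, src_reg=5)
print([f"{b:08b}" for b in to_little_endian(word)])
```

With a zero offset and source register s5, this prints `['00000101', '00000000', '00000000', '11111010']`: the register lands in the low bits of the first byte and the type code in the last byte.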
This test ensures that the assembler generates the proper binary encoding for the instruction. It lives in the toolchain repository (tools/NyuziToolchain, which is a git subproject) under test/MC/Nyuzi. We'll update 'branch.s'. The instruction encoding wiki shows where the bits should go. This is a little-endian machine, so the bytes in brackets are listed least significant byte first. The A values represent the offset bits (or bytes) of the instruction that are patched when the program is linked.
boddp s5, target5 # CHECK: encoding: [0bAAA00101,A,A,0b1111101A]
# CHECK: fixup A - offset: 0, value: target5, kind: fixup_Nyuzi_PCRel_Branch
Run the test (in a separate shell, because we need to modify the environment) and ensure it fails, since the instruction has not been implemented yet.
$ export PATH=<source dir>/build/bin/:$PATH
$ llvm-lit .
-- Testing: 1 tests, 1 threads --
FAIL: LLVM :: MC/Nyuzi/branch.s (1 of 1)
Testing Time: 0.27s
********************
Failing Tests (1):
LLVM :: MC/Nyuzi/branch.s
Unexpected Failures: 1
The file llvm/lib/Target/Nyuzi/NyuziInstrInfo.td contains definitions of instructions, which are read by LLVM's TableGen tool. These definitions are used in several places: they automatically generate the assembler and disassembler patterns, as well as enabling instruction matching for the compiler. This instruction will only be exposed to the assembler, so we'll ignore compiler code generation in this document.
What's not discussed here is how to expose the instruction to C/C++ code. This is a little tricky for branch instructions. For other instruction types, like arithmetic, this can be done by adding an intrinsic or a DAG pattern that matches it.
We will add the new instruction to the section that specifies branches:
let isTerminator = 1 in {
  ...
  def BODDP : ConditionalBranchInst<
    (outs),
    (ins GPR32:$test, brtarget:$dest),
    "boddp $test, $dest",
    [], // Assembler only
    BT_All>; // Branch type argument; presumably this must map to the new '101' code
If you run the test now, it should pass:
$ llvm-lit .
-- Testing: 7 tests, 4 threads --
...
PASS: LLVM :: MC/Nyuzi/branch.s
...
In the NyuziProcessor tree, tests/core/isa/branch.S:
move s3, 3    # two set bits: even parity
move s4, 2    # one set bit: odd parity
...
# boddp, taken
boddp s4, 1f
should_not_get_here
1:
# boddp, not taken
boddp s3, 1f
b 2f
1: should_not_get_here
2:
If we run this now, it should fail because the instruction is not implemented in hardware yet:
$ ./runtest.py branch.S
FAIL
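When choosing operands for the taken and not-taken cases, it helps to check their parity first. The following is a small Python reference sketch; `has_odd_parity` is a hypothetical helper, and the 32-bit register width is an assumption.

```python
def has_odd_parity(value):
    """Return True if the low 32 bits of value contain an odd number of set bits."""
    return bin(value & 0xFFFFFFFF).count("1") % 2 == 1


# Candidate test operands: 2 has one set bit, 3 has two.
for value in (1, 2, 3, 7, 0xFFFFFFFF):
    print(f"{value:#x}: branch taken = {has_odd_parity(value)}")
```

A register holding 2 should take the branch, while one holding 3 should fall through.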
We can also add a randomized test. In generate_random.py:
BRANCH_TYPES = [
    ('boddp', True),
    ...
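As a rough illustration of how such a table can drive a randomized test, here is a self-contained Python sketch. It is not the code in generate_random.py: the meaning of the boolean (assumed here to say whether the branch takes a source register), the register range, and the `generate_branch` helper are all assumptions made for this example.

```python
import random

# Hypothetical table: (mnemonic, takes_source_register).
# The meaning of the boolean is an assumption made for this sketch.
BRANCH_TYPES = [
    ("boddp", True),
    ("b", False),
]


def generate_branch(rng, label):
    """Return one assembly line for a randomly chosen branch type."""
    mnemonic, takes_register = rng.choice(BRANCH_TYPES)
    if takes_register:
        # Register range s0-s7 is arbitrary for this example.
        return f"{mnemonic} s{rng.randint(0, 7)}, {label}"
    return f"{mnemonic} {label}"


rng = random.Random(1234)  # fixed seed so a failing sequence can be reproduced
for _ in range(4):
    print(generate_branch(rng, "1f"))
```

Seeding the generator is a deliberate choice: when a random sequence exposes a bug, the same sequence can be regenerated for debugging.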
We need the emulator to support this instruction for testing and verification. In the NyuziProcessor tree, tools/emulator/core.c:
void executeBranchInstruction(Thread *thread, unsigned int instr)
{
    int branchTaken;
    int srcReg = bitField(instr, 0, 5);

    switch (bitField(instr, 25, 3))
    {
        ...
        case 5: // boddp: taken when the source register has an odd number of set bits
            branchTaken = __builtin_popcount(getThreadScalarReg(thread, srcReg)) & 1;
            break;
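The decode step above can be modeled in a few lines of Python, which is handy for checking expected results by hand. This is a sketch that mirrors the field positions used in the C snippet (register in bits 4-0, branch type in bits 27-25). `bit_field` and `boddp_taken` are hypothetical names, and the register file is modeled as a plain list.

```python
BRANCH_ODDP = 5  # branch type code 0b101


def bit_field(word, low_bit, size):
    """Extract `size` bits from `word`, starting at bit `low_bit`."""
    return (word >> low_bit) & ((1 << size) - 1)


def boddp_taken(instr, scalar_regs):
    """Return True if a boddp instruction word would take its branch."""
    src_reg = bit_field(instr, 0, 5)
    branch_type = bit_field(instr, 25, 3)
    if branch_type != BRANCH_ODDP:
        raise ValueError("not a boddp instruction")
    value = scalar_regs[src_reg] & 0xFFFFFFFF
    return (bin(value).count("1") & 1) == 1


regs = [0] * 32
regs[5] = 2  # one set bit, so the branch is taken
print(boddp_taken(0xFA000005, regs))
```

The word 0xFA000005 is a boddp on s5 with a zero offset, assuming bits 31-28 are 1111 for branch instructions.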
In rtl/core/defines.v, add a new enumeration for this branch encoding:
// Instruction format E operation types
typedef enum logic[2:0] {
    BRANCH_ODDP = 3'b101,
    ...
Here's the meat of the implementation. In rtl/core/int_cycle_execute_stage.v:
if (of_instruction.is_branch)
begin
    case (of_instruction.branch_type)
        BRANCH_ODDP:
        begin
            branch_taken = ^of_operand1;
            is_conditional_branch = 1;
        end
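The unary `^` here is a reduction XOR: it XORs all bits of the operand together, which yields 1 exactly when an odd number of bits are set. The Python sketch below checks that this agrees with the popcount-based expression used in the emulator. Both function names are hypothetical, and a 32-bit operand width is assumed.

```python
from functools import reduce
from operator import xor


def reduction_xor(value, width=32):
    """Model a reduction XOR: XOR together every bit of a width-bit value."""
    bits = [(value >> i) & 1 for i in range(width)]
    return reduce(xor, bits, 0)


def popcount_parity(value, width=32):
    """Model the emulator's approach: population count, then keep the low bit."""
    return bin(value & ((1 << width) - 1)).count("1") & 1


for value in (0, 1, 2, 3, 7, 0x80000000, 0x12345678, 0xFFFFFFFF):
    assert reduction_xor(value) == popcount_parity(value)
print("reduction XOR and popcount parity agree")
```

Keeping the two formulations equivalent matters because the emulator serves as the reference model when verifying the hardware.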
Rebuild the Verilog model by typing 'make' in the hardware/ directory. Then run the test from the core directory:
$ ./runtest.py branch.S
PASS