|
| 1 | ++++ |
| 2 | +title = "Appendix: Trampolines" |
| 3 | +layout = "single" |
| 4 | ++++ |
| 5 | + |
| 6 | +Trampolines are used to interface between the Go runtime and the generated |
| 7 | +code, in two cases: |
| 8 | + |
| 9 | +- when we need to **enter the generated code** from the Go runtime. |
| 10 | +- when we need to **leave the generated code** to invoke a host function |
| 11 | + (written in Go). |
| 12 | + |
| 13 | +In this section we want to complete the picture of how a Wasm function gets |
| 14 | +translated from Wasm to executable code in the optimizing compiler, by |
| 15 | +describing how to jump into the execution of the generated code at run-time. |
| 16 | + |
| 17 | +## Entering the Generated Code |
| 18 | + |
| 19 | +At run-time, user code invokes a Wasm function through the public |
| 20 | +`api.Function` interface, using the methods `Call()` or `CallWithStack()`. The |
| 21 | +implementation of these methods, in turn, eventually invokes an assembly (ASM) |
| 22 | +**trampoline**. The signature of this trampoline in Go code is: |
| 23 | + |
| 24 | +```go |
| 25 | +func entrypoint( |
| 26 | + preambleExecutable, functionExecutable *byte, |
| 27 | + executionContextPtr uintptr, moduleContextPtr *byte, |
| 28 | + paramResultStackPtr *uint64, |
| 29 | + goAllocatedStackSlicePtr uintptr) |
| 30 | +``` |
| 31 | + |
| 32 | +- `preambleExecutable` is a pointer to the generated code for the preamble (see |
| 33 | + below). |
| 34 | +- `functionExecutable` is a pointer to the generated code for the function (as |
| 35 | + described in the previous sections). |
| 36 | +- `executionContextPtr` is a raw pointer to the `wazevo.executionContext` |
| 37 | + struct. This struct is used to save the state of the Go runtime before |
| 38 | +entering or leaving the generated code. It also holds shared state between the |
| 39 | +Go runtime and the generated code, such as the exit code that is used to |
| 40 | +terminate execution on failure, or suspend it to invoke host functions. |
| 41 | +- `moduleContextPtr` is a pointer to the `wazevo.moduleContextOpaque` struct. |
| 42 | + Its contents are basically pointers to module-instance-specific objects |
| 43 | + as well as functions. This is sometimes called "VMContext" in |
| 44 | + other Wasm runtimes. |
| 45 | +- `paramResultStackPtr` is a pointer to the slice where the arguments and |
| 46 | + results of the function are passed. |
| 47 | +- `goAllocatedStackSlicePtr` is an aligned pointer to the Go-allocated stack |
| 48 | + for holding values and call frames. For further details refer to |
| 49 | + [Backend § Prologue and Epilogue](../backend/#prologue-and-epilogue). |
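
The role of `paramResultStackPtr` can be illustrated with a minimal sketch. It assumes that a single `[]uint64` slice carries the parameters on the way in and the results on the way out, so it must be sized for whichever is longer. The helper `newParamResultStack` is hypothetical and is not part of wazero.

```go
package main

import "fmt"

// newParamResultStack is a hypothetical helper: it returns a slice large
// enough to hold either the parameters or the results, whichever is longer,
// with the parameters copied into the leading slots.
func newParamResultStack(params []uint64, resultCount int) []uint64 {
	n := len(params)
	if resultCount > n {
		n = resultCount
	}
	stack := make([]uint64, n)
	copy(stack, params)
	return stack
}

func main() {
	// Two parameters, one result: the slice needs two slots.
	stack := newParamResultStack([]uint64{40, 2}, 1)

	// A pointer to the first element is what an entrypoint-like function
	// would receive as paramResultStackPtr. An empty slice has no first
	// element, so the pointer stays nil in that case.
	var paramResultStackPtr *uint64
	if len(stack) > 0 {
		paramResultStackPtr = &stack[0]
	}
	fmt.Println(len(stack), *paramResultStackPtr)
}
```

Reusing the same slots for arguments and results avoids a second allocation per call, at the cost of overwriting the arguments once the call returns.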
| 50 | + |
| 51 | +The trampoline can be found in `backend/isa/<arch>/abi_entry_<arch>.s`. |
| 52 | + |
| 53 | +For each given architecture, the trampoline: |
| 54 | +- moves the arguments to specific registers to match the behavior of the entry preamble or trampoline function, and |
| 55 | +- finally, jumps into the generated code for the preamble. |
| 56 | + |
| 57 | +The **preamble** that the `entrypoint` function jumps to is generated per function signature. |
| 58 | + |
| 59 | +This is implemented in `machine.CompileEntryPreamble(*ssa.Signature)`. |
| 60 | + |
| 61 | +The preamble sets the fields in the `wazevo.executionContext`. |
| 62 | + |
| 63 | +At the beginning of the preamble: |
| 64 | + |
| 65 | +- Set a register to point to the `*wazevo.executionContext` struct. |
| 66 | +- Save the stack pointers, frame pointers, return addresses, etc. to that |
| 67 | + struct. |
| 68 | +- Update the stack pointer to point to `paramResultStackPtr`. |
| 69 | + |
| 70 | +The generated code relies on the assumption that the preamble has |
| 71 | +been entered through the aforementioned trampoline. Thus, it assumes that the |
| 72 | +arguments can be found in specific registers. |
| 73 | + |
| 74 | +The preamble then assigns the arguments pointed at by `paramResultStackPtr` to |
| 75 | +the registers and stack location that the generated code expects. |
| 76 | + |
| 77 | +Finally, it invokes the generated code for the function. |
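
The sequence of steps performed by the preamble can be modelled in plain Go. This is only a simulation under stated assumptions: `executionContext`, `machineState`, their fields, and `runPreamble` are hypothetical stand-ins, and the real preamble is emitted as machine code rather than written in Go. The sketch also assumes that results are written back to the same slice the arguments came from.

```go
package main

import "fmt"

// executionContext is a hypothetical stand-in for wazevo.executionContext;
// the field names are illustrative, not the real ones.
type executionContext struct {
	savedStackPointer uintptr
	savedFramePointer uintptr
	savedReturnAddr   uintptr
}

// machineState models the few registers this sketch cares about.
type machineState struct {
	sp, fp, ra uintptr
	argRegs    []uint64
}

// runPreamble mimics the preamble: it saves the Go-side state, loads the
// arguments into "registers", calls the function, and stores the results
// back into the shared slice.
func runPreamble(ctx *executionContext, m *machineState,
	paramResultStack []uint64, paramCount int,
	fn func(args []uint64) []uint64) {
	// Save stack pointer, frame pointer and return address.
	ctx.savedStackPointer = m.sp
	ctx.savedFramePointer = m.fp
	ctx.savedReturnAddr = m.ra

	// Move the arguments to where the generated code expects them.
	m.argRegs = append(m.argRegs[:0], paramResultStack[:paramCount]...)

	// Invoke the generated code and publish its results.
	results := fn(m.argRegs)
	copy(paramResultStack, results)
}

func main() {
	ctx := &executionContext{}
	m := &machineState{sp: 0x1000, fp: 0x1010, ra: 0x2000}
	stack := []uint64{40, 2}

	add := func(args []uint64) []uint64 { return []uint64{args[0] + args[1]} }
	runPreamble(ctx, m, stack, 2, add)

	fmt.Println(stack[0], ctx.savedStackPointer == m.sp)
}
```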
| 78 | + |
| 79 | +The epilogue reverses part of the process, finally returning control to the |
| 80 | +caller of the `entrypoint()` function, and the Go runtime. The caller of |
| 81 | +`entrypoint()` is also responsible for completing the cleanup procedure by |
| 82 | +invoking `afterGoFunctionCallEntrypoint()` (again, implemented in |
| 83 | +backend-specific ASM), which restores the stack pointers and returns |
| 84 | +control to the caller of the function. |
| 85 | + |
| 86 | +The arch-specific code can be found in |
| 87 | +`backend/isa/<arch>/abi_entry_preamble.go`. |
| 88 | + |
| 89 | +[wazero-engine-stack]: https://github.com/tetratelabs/wazero/blob/095b49f74a5e36ce401b899a0c16de4eeb46c054/internal/engine/compiler/engine.go#L77-L132 |
| 90 | +[abi-arm64]: https://tip.golang.org/src/cmd/compile/abi-internal#arm64-architecture |
| 91 | +[abi-amd64]: https://tip.golang.org/src/cmd/compile/abi-internal#amd64-architecture |
| 92 | +[abi-cc]: https://tip.golang.org/src/cmd/compile/abi-internal#function-call-argument-and-result-passing |
| 93 | + |
| 94 | + |
| 95 | +## Leaving the Generated Code |
| 96 | + |
| 97 | +In "[How do compiler functions work?][how-do-compiler-functions-work]", we |
| 98 | +already outlined how _leaving_ the generated code works with the help of a |
| 99 | +function. Here we complete the picture by briefly describing the code that |
| 100 | +is generated. |
| 101 | + |
| 102 | +When the generated code needs to return control to the Go runtime, it inserts a |
| 103 | +meta-instruction that is called `exitSequence` in both the `amd64` and `arm64` |
| 104 | +backends. This meta-instruction sets the `exitCode` in the |
| 105 | +`wazevo.executionContext` struct, restores the stack pointers and then returns |
| 106 | +control to the caller of the `entrypoint()` function described above. |
| 107 | + |
| 108 | +As described in "[How do compiler functions |
| 109 | +work?][how-do-compiler-functions-work]", the mechanism is essentially the same |
| 110 | +when invoking a host function or raising an error. However, when a host function |
| 111 | +is invoked, the `exitCode` also indicates the identifier of the host function to be |
| 112 | +invoked. |
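
One way an exit code can carry both the reason for leaving and a host function identifier is to pack the two into a single integer. The sketch below assumes a layout with the reason in the low 8 bits and the function index in the remaining bits. The type, the constants, and the bit layout are illustrative assumptions and may not match the actual definitions in wazero.

```go
package main

import "fmt"

// exitCode is a hypothetical packed exit code: low 8 bits hold the reason,
// the upper 24 bits hold an optional host function index.
type exitCode uint32

const (
	exitCodeOK exitCode = iota
	exitCodeTrap
	exitCodeCallHostFunction
)

// exitCodeMask selects the reason bits.
const exitCodeMask exitCode = 0xff

// callHostFunction builds an exit code that requests a call to the host
// function with the given index (which must fit in 24 bits in this sketch).
func callHostFunction(index uint32) exitCode {
	return exitCodeCallHostFunction | exitCode(index)<<8
}

// kind returns the reason for leaving the generated code.
func (c exitCode) kind() exitCode { return c & exitCodeMask }

// hostFunctionIndex returns the index stored in the upper bits.
func (c exitCode) hostFunctionIndex() uint32 { return uint32(c >> 8) }

func main() {
	c := callHostFunction(7)
	if c.kind() == exitCodeCallHostFunction {
		fmt.Println("call host function", c.hostFunctionIndex())
	}
}
```

With this layout the Go side can dispatch on `kind()` first and only decode the index for the cases that need it.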
| 113 | + |
| 114 | +The magic really happens in the `backend.Machine.CompileGoFunctionTrampoline()` |
| 115 | +method. This method is actually invoked when host modules are being |
| 116 | +instantiated. It generates a trampoline that is used to invoke such functions |
| 117 | +from the generated code. |
| 118 | + |
| 119 | +This trampoline implements essentially the same prologue as the `entrypoint()`, |
| 120 | +but it also reserves space for the arguments and results of the function to be |
| 121 | +invoked. |
| 122 | + |
| 123 | +A host function has the signature: |
| 124 | + |
| 125 | +```go |
| 126 | +func(ctx context.Context, stack []uint64) |
| 127 | +``` |
| 128 | + |
| 129 | +The function arguments in the `stack` parameter are copied over to the reserved |
| 130 | +slots of the real stack. For instance, on `arm64` the stack layout would look |
| 131 | +as follows (on `amd64` it would be similar): |
| 132 | + |
| 133 | +```goat |
| 134 | + (high address) |
| 135 | + SP ------> +-----------------+ <----+ |
| 136 | + | ....... | | |
| 137 | + | ret Y | | |
| 138 | + | ....... | | |
| 139 | + | ret 0 | | |
| 140 | + | arg X | | size_of_arg_ret |
| 141 | + | ....... | | |
| 142 | + | arg 1 | | |
| 143 | + | arg 0 | <----+ <-------- originalArg0Reg |
| 144 | + | size_of_arg_ret | |
| 145 | + | ReturnAddress | |
| 146 | + +-----------------+ <----+ |
| 147 | + | xxxx | | ;; might be padded to make it 16-byte aligned. |
| 148 | + +--->| arg[N]/ret[M] | | |
| 149 | + sliceSize| | ............ | | goCallStackSize |
| 150 | + | | arg[1]/ret[1] | | |
| 151 | + +--->| arg[0]/ret[0] | <----+ <-------- arg0ret0AddrReg |
| 152 | + | sliceSize | |
| 153 | + | frame_size | |
| 154 | + +-----------------+ |
| 155 | + (low address) |
| 156 | +``` |
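
The sizing of the `arg[i]/ret[i]` region in the diagram can be sketched as follows. The sketch assumes 8-byte slots (one `uint64` each), that arguments and results share slots so the region needs `max(args, rets)` entries, and that padding rounds this region alone up to a multiple of 16 bytes. The function names are hypothetical, and the real computation may also account for the `sliceSize` and `frame_size` words.

```go
package main

import "fmt"

// slotSize is the size in bytes of one uint64 stack slot.
const slotSize = 8

// sliceSizeInBytes returns the bytes needed for the shared arg/ret slots.
func sliceSizeInBytes(argCount, retCount int) int {
	n := argCount
	if retCount > n {
		n = retCount
	}
	return n * slotSize
}

// alignUp16 rounds n up to the next multiple of 16.
func alignUp16(n int) int {
	return (n + 15) &^ 15
}

func main() {
	size := sliceSizeInBytes(3, 1)
	fmt.Println(size, alignUp16(size))
}
```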
| 157 | + |
| 158 | +Finally, the trampoline jumps into the execution of the host function using the |
| 159 | +`exitSequence` meta-instruction. |
| 160 | + |
| 161 | +Upon return, the process is reversed. |
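
To make the host side concrete, here is a minimal sketch of a host function with the signature shown earlier. It assumes the convention suggested by the `arg[i]/ret[i]` slots in the diagram: arguments are read from the leading entries of `stack`, and results overwrite the same entries. `addHost` and `callAdd` are hypothetical names used only for illustration.

```go
package main

import (
	"context"
	"fmt"
)

// addHost is a hypothetical host function: it reads two 32-bit integers
// from stack[0] and stack[1] and writes their sum back to stack[0].
func addHost(ctx context.Context, stack []uint64) {
	x, y := uint32(stack[0]), uint32(stack[1])
	stack[0] = uint64(x + y)
}

// callAdd plays the role of the Go runtime side: it prepares the slice,
// invokes the host function, and reads the result from the first slot.
func callAdd(a, b uint32) uint32 {
	stack := []uint64{uint64(a), uint64(b)}
	addHost(context.Background(), stack)
	return uint32(stack[0])
}

func main() {
	fmt.Println(callAdd(40, 2))
}
```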
| 162 | + |
| 163 | +## Code |
| 164 | + |
| 165 | +- The trampoline to enter the generated function is implemented by the |
| 166 | + `backend.Machine.CompileEntryPreamble()` method. |
| 167 | +- The trampoline to return traps and invoke host functions is generated by |
| 168 | + the `backend.Machine.CompileGoFunctionTrampoline()` method. |
| 169 | + |
| 170 | +You can find arch-specific implementations in |
| 171 | +`backend/isa/<arch>/abi_go_call.go`, |
| 172 | +`backend/isa/<arch>/abi_entry_preamble.go`, etc. The trampolines are found |
| 173 | +under `backend/isa/<arch>/abi_entry_<arch>.s`. |
| 174 | + |
| 175 | +## Further References |
| 176 | + |
| 177 | +- Go's [internal ABI documentation][abi-internal] details a calling convention similar to the one we use in both the arm64 and amd64 backends. |
| 178 | +- Raphael Poss's [The Go low-level calling convention on |
| 179 | + x86-64][go-call-conv-x86] is also an excellent reference for `amd64`. |
| 180 | + |
| 181 | +[abi-internal]: https://tip.golang.org/src/cmd/compile/abi-internal |
| 182 | +[go-call-conv-x86]: https://dr-knz.net/go-calling-convention-x86-64.html |
| 183 | +[proposal-register-cc]: https://go.googlesource.com/proposal/+/master/design/40724-register-calling.md#background |
| 184 | +[how-do-compiler-functions-work]: ../../how_do_compiler_functions_work/ |
| 185 | + |