This repository has been archived by the owner on Jul 12, 2024. It is now read-only.

[DO NOT MERGE] Initial memory allocation and WASI support #135

Draft
wants to merge 1 commit into
base: main
from

tanishiking
Owner

This commit adds experimental [WASI preview1](https://github.com/WebAssembly/WASI/blob/main/legacy/preview1/docs.md) support (at least `fd_write` for now) via memory instructions.

**Memory Allocation overview**

To allocate memory in Wasm's linear memory, use `MemoryAllocator.allocate(bytes)`.
This method returns a `MemorySegment` representing a contiguous region of memory. The segment has methods to get and set values in that region (e.g., `getByte`, `setByte`), which are compiled into memory instructions (e.g., `i32.load8_s`/`i32.load8_u`, `i32.store8`).

`withMemoryAllocator` instantiates a `MemoryAllocator` for the duration of a block, and you can allocate memory segments inside that block.
When the block exits, all allocated memory segments are freed.

**Memory Allocation implementation**

The current `MemoryAllocator` implementation is very basic.
It manages a single "current" memory address. When `allocate(N)` is called, it returns a memory segment of N bytes starting at the current address and advances the current address by N bytes. (Alignment is not implemented yet.)
When freeing memory, the implementation resets the current memory address to 0.
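
The behaviour described above can be modelled in plain Scala. This is only a sketch under assumptions: `LinearMemory` is a hypothetical stand-in backed by a byte array (the real backend emits Wasm memory instructions instead), and the signatures of `MemoryAllocator`, `MemorySegment` and `withMemoryAllocator` here are guesses for illustration, not the ones in this PR.

```scala
object BumpAllocatorSketch {
  // Hypothetical stand-in for Wasm linear memory, including the single
  // global "current" address that the allocator bumps.
  final class LinearMemory(size: Int) {
    private val bytes = new Array[Byte](size)
    var currentAddress: Int = 0
    def load8(addr: Int): Byte = bytes(addr)
    def store8(addr: Int, value: Byte): Unit = { bytes(addr) = value }
  }

  // A contiguous region [start, start + size) of the linear memory.
  final class MemorySegment(memory: LinearMemory, val start: Int, val size: Int) {
    private def check(offset: Int): Unit =
      require(offset >= 0 && offset < size, s"offset $offset is outside the segment")
    def getByte(offset: Int): Byte = { check(offset); memory.load8(start + offset) }
    def setByte(offset: Int, value: Byte): Unit = {
      check(offset)
      memory.store8(start + offset, value)
    }
  }

  final class MemoryAllocator(memory: LinearMemory) {
    // Hands out `bytes` bytes starting at the current address, then bumps it.
    // No alignment handling, as in the description above.
    def allocate(bytes: Int): MemorySegment = {
      val start = memory.currentAddress
      memory.currentAddress = start + bytes
      new MemorySegment(memory, start, bytes)
    }
    // Freeing resets the global current address to 0.
    def free(): Unit = { memory.currentAddress = 0 }
  }

  // Runs `block` with a fresh allocator and frees everything on exit.
  def withMemoryAllocator[A](memory: LinearMemory)(block: MemoryAllocator => A): A = {
    val allocator = new MemoryAllocator(memory)
    try block(allocator) finally allocator.free()
  }
}
```

In this model, two consecutive `allocate` calls return adjacent segments, and the current address is back at 0 once the block exits.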

This implementation has some issues. For example, nesting `withMemoryAllocator` blocks can cause problems, because exiting an inner block frees all allocated memory, including segments still in use by the outer block.
To prevent this, we could either prohibit nesting `withMemoryAllocator` or free memory only when the outermost `withMemoryAllocator` block exits.
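
The second option (free only when the outermost block exits) could look roughly like the following. This is a sketch, not the PR's code: the depth counter, the global state in an object, and the simplified `allocate` that returns a raw address are all assumptions made for illustration.

```scala
object NestedAllocatorSketch {
  // Hypothetical global state: one bump pointer plus a nesting depth.
  private var currentAddress: Int = 0
  private var depth: Int = 0

  // Returns the start address of a freshly allocated region of `bytes` bytes.
  def allocate(bytes: Int): Int = {
    val start = currentAddress
    currentAddress += bytes
    start
  }

  // Only the outermost block resets the bump pointer.
  def withMemoryAllocator[A](block: => A): A = {
    depth += 1
    try block
    finally {
      depth -= 1
      if (depth == 0) currentAddress = 0
    }
  }

  def current: Int = currentAddress
}
```

The trade-off is that memory allocated in an inner block stays reserved until the outermost block exits.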

Alternatively, each allocator could record the memory segments it allocated and free only those segments when `free` is called. While this would be closer to how a conventional memory allocator works, it might be more than we need, considering that all allocation and deallocation happens only around the `withMemoryAllocator` scope.

Another concern is multi-threading. However, current Wasm is single-threaded.
While the [Wasm threads proposal](https://github.com/WebAssembly/threads) introduces shared memory and atomic load/store instructions, it should be fine for the linear memory used to communicate with WASI to be thread-local.

**WASI Support and Wasm Intrinsic Functions**

Currently only `fd_write` is supported from WASI preview1.

The `fdWrite` method in the `wasi` object is translated to a call to `fd_write` from the `wasi_snapshot_preview1` module during Wasm compilation.
For now, this intrinsic translation is implemented by matching against hardcoded class and method names. Hardcoding names for all WASI functions could work if we only need to support WASI preview1.
However, introducing annotations that specify which imported Wasm function to call (such as `@WasmFunction("wasi_snapshot_preview1", "fd_write")`, where the body of the annotated function would be replaced with a call to `fd_write`) would provide better usability, especially for supporting the Wasm component model / WASI preview2.
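
As a rough illustration of the annotation idea, here is a sketch. The `WasmFunction` annotation does not exist yet, the parameter names of `fdWrite` are assumptions based on the preview1 `fd_write` ABI, and the placeholder body only stands in for what the backend would replace.

```scala
import scala.annotation.StaticAnnotation

object WasmImportSketch {
  // Hypothetical annotation naming the Wasm import (module, function)
  // that a call to the annotated method should be compiled to.
  final class WasmFunction(module: String, name: String) extends StaticAnnotation

  object wasi {
    // Four i32 arguments and an i32 result, matching the method signature
    // hardcoded in this PR. Parameter names are assumptions.
    @WasmFunction("wasi_snapshot_preview1", "fd_write")
    def fdWrite(fd: Int, iovs: Int, iovsLen: Int, nwritten: Int): Int =
      // Placeholder body: the backend would swap this for the imported call.
      throw new UnsupportedOperationException("replaced by the Wasm backend")
  }
}
```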

Similarly, functions like `MemoryAllocator.allocate` and `MemorySegment.set` are also translated into Wasm intrinsic functions during linking, where they are replaced with the corresponding instruction sequences defined in Wasm.

During this process, a temporary data structure called `SWasmTree` is used. This serves as a design prototype for potentially introducing these data structures as SJSIR Node trees in the future, making it easier to work with them.
Comment on lines +1076 to +1077
case SpecialNames.WasmMemorySegmentClass =>
if (SpecialNames.loadMethodNames.contains(methodName)) {
Owner Author

Here, we match the className and methodName against the hardcoded SpecialNames, generate an SWasmTree (a kind of fake SJSIR tree), and call FunctionEmitter.emitIntrinsicFunction to inject an implementation.

@@ -2166,4 +2200,106 @@ object CoreWasmLib {
fb.buildAndAddToModule()
}

private def genAllocate()(implicit ctx: WasmContext): Unit = {
Owner Author

All allocate does is advance currentAddress by the given number of bytes. If the current memory size (as reported by memory.size) is too small, we call memory.grow. If that returns -1, the memory has reached its maximum size and we throw an exception.
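
The logic of genAllocate can be modelled in plain Scala as follows. This is a sketch under assumptions: `Memory` is a hypothetical stand-in that mimics `memory.size` and `memory.grow` (both counted in 64 KiB pages), and the exception type is an arbitrary choice, since the comment above does not say which exception is thrown.

```scala
object GrowingAllocatorSketch {
  // Wasm page size in bytes.
  val PageSize: Int = 65536

  // Hypothetical stand-in for a Wasm memory with a maximum size.
  final class Memory(initialPages: Int, maxPages: Int) {
    private var pages: Int = initialPages
    // Mirrors memory.size: current size in pages.
    def size: Int = pages
    // Mirrors memory.grow: returns the previous size in pages, or -1 on failure.
    def grow(delta: Int): Int =
      if (pages + delta > maxPages) -1
      else {
        val previous = pages
        pages += delta
        previous
      }
  }

  final class Allocator(memory: Memory) {
    private var currentAddress: Int = 0

    // Returns the start address of the new region and bumps currentAddress.
    def allocate(bytes: Int): Int = {
      val start = currentAddress
      val end = start.toLong + bytes
      val availableBytes = memory.size.toLong * PageSize
      if (end > availableBytes) {
        // Round the shortfall up to whole pages before growing.
        val pagesNeeded = ((end - availableBytes + PageSize - 1) / PageSize).toInt
        if (memory.grow(pagesNeeded) == -1)
          throw new IllegalStateException("linear memory reached its maximum size")
      }
      currentAddress = end.toInt
      start
    }
  }
}
```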

@@ -247,7 +258,7 @@ final class Emitter(config: Emitter.Config) {
// Finish the start function

fb.buildAndAddToModule()
- ctx.moduleBuilder.setStart(genFunctionName.start)
+ // ctx.moduleBuilder.setStart(genFunctionName.start)
Owner Author

The WASI application ABI requires the module to export a _start or _initialize function, and WASI sets up the runtime there (I suppose).

If we set a start function for the module, the Wasm module calls it as soon as the module is instantiated. If that start function calls a WASI function (from the Scala main method), the call fails because WASI has not been set up yet.

@@ -301,14 +312,20 @@ final class Emitter(config: Emitter.Config) {
|${moduleImports.mkString("\n")}
|
|import { load as __load } from './${config.loaderModuleName}';
|import { WASI } from 'wasi';
Owner Author

* @see
* https://www.w3.org/TR/wasm-core-2/#memory-instructions%E2%91%A4
*/
def apply(): MemoryArg = MemoryArg(0, 0)
Owner Author

@tanishiking tanishiking May 20, 2024

Memory instructions take their arguments both from operands on the stack and from their immediate arguments (MemoryArg).

For example, the following Wasm code loads an i32 value from memory address 300 (= 100 + 200).

i32.const 100
i32.load offset=200 align=0

Since we set the offset of the immediate MemoryArg to 0, the address comes entirely from the operand.
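
The address computation can be modelled in plain Scala as below. This is a sketch under assumptions: the field names of `MemoryArg` are guesses (the diff only shows `MemoryArg(0, 0)`), a `ByteBuffer` stands in for linear memory, and the model ignores the unsigned, non-wrapping address arithmetic that real Wasm uses.

```scala
import java.nio.{ByteBuffer, ByteOrder}

object MemoryArgSketch {
  // Hypothetical field names for the immediate arguments of a memory instruction.
  final case class MemoryArg(offset: Int, align: Int)

  // Models i32.load: the effective address is the i32 operand taken from the
  // stack plus the immediate offset. The alignment is only a hint and does
  // not change the result. Wasm linear memory is little-endian.
  def i32Load(memory: ByteBuffer, operand: Int, arg: MemoryArg): Int = {
    val effectiveAddress = operand + arg.offset
    memory.order(ByteOrder.LITTLE_ENDIAN).getInt(effectiveAddress)
  }
}
```

With an operand of 100 and an immediate offset of 200, this reads the same four bytes as an operand of 300 with an immediate offset of 0.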

@@ -0,0 +1,52 @@
package scala.scalajs.wasm
Owner Author

The tests are failing because this code is not available outside the sample project.
However, I'm not sure where to put this library code in our project 🤔

Comment on lines +370 to +371
factory.instantiateClass(WasmMemorySegmentClass, WasmMemorySegmentCtor),
factory.instantiateClass(WasmMemoryAllocatorClass, WasmMemoryAllocatorCtor)
Collaborator

This is what causes the test failures, because the backend now refuses to run at all without these classes.

Ideally, the backend should instead adapt to whether these classes are available or not. So they should not be declared as symbol requirements. Instead, only generate the helper functions that manipulate them if they exist in the list of LinkedClasses we receive.

Comment on lines +51 to +53
// WASI functions
val WASI = ClassName("scala.scalajs.wasm.wasi$")
val wasiFdWrite = MethodName("fdWrite", List(IntRef, IntRef, IntRef, IntRef), IntRef)
Collaborator

We should find a way to make these generic. The backend should know about memory instructions, because they are opcodes. But library functions coming from imports shouldn't be hard-coded. We should make it possible for users to define these interoperability anchors in libraries themselves.

It might be hard or impossible to do as long as we're not merged upstream, though. We would likely do this based on some sort of annotation (like @WasmImport), but annotations are not persisted in the IR in general.

@@ -246,6 +270,36 @@ object FunctionEmitter {
}

private type Env = Map[LocalName, VarStorage]

object SWasmTrees {
abstract sealed class SWasmTree {
Collaborator

If you want custom trees that integrate well within IR trees, you can use ir.Trees.Transient nodes. They carry an arbitrary payload that implements Transient.Value.
