[DO NOT MERGE] Initial memory allocation and WASI support #135

tanishiking · 2024-05-20T11:58:09Z

This commit adds experimental WASI preview1 (https://github.com/WebAssembly/WASI/blob/main/legacy/preview1/docs.md) support, limited to fd_write for now, via memory instructions.

Memory Allocation overview

To allocate memory in Wasm's linear memory, use MemoryAllocator.allocate(bytes). This method returns a MemorySegment representing a contiguous region of memory. MemorySegment has methods to get and set values in that region (e.g., getByte, setByte), which are compiled into Wasm memory instructions (e.g., i32.load8_s or i32.load8_u, i32.store8).

withMemoryAllocator instantiates a MemoryAllocator for the duration of a block, inside which you can allocate memory segments. When the block exits, all allocated memory segments are freed.

Memory Allocation implementation

The current MemoryAllocator implementation is very basic. It manages a single "current" memory address. When allocate(N) is called, it returns a memory segment of N bytes starting at the current address and advances the current address by N bytes. (Alignment is not implemented yet.) When freeing memory, the implementation resets the current memory address to 0.
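The behavior described above can be modeled on the JVM as a minimal sketch. It is not the PR's code: the linear memory is faked with a byte array, and the shape of withMemoryAllocator (a function taking the allocator) and the names BumpAllocatorModel, start, size, and currentAddress are assumptions made for illustration.

```scala
// Hypothetical JVM-side model of the bump allocator described above.
// In the real backend, getByte/setByte become Wasm memory instructions.
object BumpAllocatorModel {
  // Stand-in for Wasm linear memory: one 64 KiB page.
  private val linearMemory = new Array[Byte](65536)
  private var current = 0

  def currentAddress: Int = current

  final class MemorySegment(val start: Int, val size: Int) {
    private def check(offset: Int): Unit =
      require(offset >= 0 && offset < size, s"offset $offset out of bounds")

    def getByte(offset: Int): Byte = {
      check(offset)
      linearMemory(start + offset)
    }

    def setByte(offset: Int, value: Byte): Unit = {
      check(offset)
      linearMemory(start + offset) = value
    }
  }

  final class MemoryAllocator {
    // Hand out N bytes at the current address, then advance by N (no alignment).
    def allocate(bytes: Int): MemorySegment = {
      require(bytes >= 0 && current + bytes <= linearMemory.length, "out of memory")
      val segment = new MemorySegment(current, bytes)
      current += bytes
      segment
    }

    // Freeing resets the single shared address to 0.
    def free(): Unit = {
      current = 0
    }
  }

  // Assumed shape: the block receives the allocator; everything is freed on exit.
  def withMemoryAllocator[A](body: MemoryAllocator => A): A = {
    val allocator = new MemoryAllocator
    try body(allocator)
    finally allocator.free()
  }
}
```

Two consecutive calls to allocate(4) and allocate(8) inside one block yield segments starting at addresses 0 and 4, and the current address is back at 0 once the block exits.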

This implementation has some issues. For example, nesting withMemoryAllocator blocks can cause problems, because exiting an inner block frees all allocated memory, including segments that the outer block is still using. To prevent this, we could either prohibit nesting withMemoryAllocator or only free memory when exiting the outermost withMemoryAllocator block.
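The second option (free only when the outermost block exits) could look roughly like the sketch below. It uses a nesting depth counter, which is one possible way to do it and not something the PR implements; NestedScopes, depth, and demo are hypothetical names, and allocate returns a raw start address to keep the example short.

```scala
// Hypothetical sketch: reset the bump pointer only when the outermost scope exits.
object NestedScopes {
  private var currentAddress = 0
  private var depth = 0 // number of withMemoryAllocator blocks currently open

  // Returns the start address of the newly allocated region.
  def allocate(bytes: Int): Int = {
    val start = currentAddress
    currentAddress += bytes
    start
  }

  def withMemoryAllocator[A](body: => A): A = {
    depth += 1
    try body
    finally {
      depth -= 1
      // An inner block leaves the pointer alone; only the outermost exit frees.
      if (depth == 0) currentAddress = 0
    }
  }

  // Outer allocation, a nested block, then another outer allocation.
  def demo(): (Int, Int) = withMemoryAllocator {
    val outer = allocate(8)
    withMemoryAllocator { allocate(4) }
    val afterInner = allocate(8) // does not overlap `outer`
    (outer, afterInner)
  }
}
```

With the naive reset-on-every-exit behavior, the allocation after the inner block would start at address 0 again and overlap the first one. The trade-off is that memory allocated in inner blocks stays reserved until the outermost block ends.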

Alternatively, each allocator could record the memory segments it allocated and only free those segments when free is called. While this would be closer to how a conventional memory allocator behaves, it might be more than we need, considering that all memory allocation and deallocation happens only around the withMemoryAllocator scope.

Another concern is multi-threading. However, Wasm is currently single-threaded. While the Wasm threads proposal (https://github.com/WebAssembly/threads) introduces shared memory and atomic load/store instructions, the linear memory used for communicating with WASI can reasonably remain thread-local.

WASI Support and Wasm Intrinsic Functions

Currently, only fd_write from WASI preview1 is supported.

The fdWrite method in the wasi object is translated to a call to fd_write from the wasi_snapshot_preview1 module during Wasm compilation. For now, this intrinsic translation is implemented by matching against hardcoded class and method names. Hardcoding names for all WASI functions could work if we only support WASI preview1. However, introducing an annotation that specifies which imported Wasm function to call (such as @WasmFunction("wasi_snapshot_preview1", "fd_write"), where the body of the annotated function would be replaced with a call to fd_write) would provide better usability, especially for supporting the Wasm component model / WASI preview2.
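To make the annotation idea concrete, here is a minimal sketch of what the user-facing side might look like. The @WasmFunction annotation does not exist; its definition and parameter names are assumptions. Only the fdWrite shape (four Int parameters, Int result) follows the method name matched in this PR, and the parameter names are illustrative.

```scala
import scala.annotation.StaticAnnotation

// Hypothetical annotation naming the Wasm import (module, function) to call.
final class WasmFunction(module: String, name: String) extends StaticAnnotation

object wasi {
  // In the proposed design, the backend would discard this body and emit a call
  // to the imported wasi_snapshot_preview1.fd_write instead.
  @WasmFunction("wasi_snapshot_preview1", "fd_write")
  def fdWrite(fd: Int, iovs: Int, iovsLen: Int, nwritten: Int): Int =
    throw new UnsupportedOperationException(
      "placeholder body: meant to be replaced by a call to the Wasm import")
}
```

On a plain JVM the placeholder body simply throws, which makes an accidental call outside the Wasm backend easy to spot.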

Similarly, functions like MemoryAllocator.allocate and MemorySegment.set are also translated into Wasm intrinsic functions during linking, where they are replaced with the corresponding instruction sequences defined in Wasm.

During this process, a temporary data structure called SWasmTree is used. This serves as a design prototype for potentially introducing these data structures as SJSIR Node trees in the future, making it easier to work with them.


tanishiking · 2024-05-20T12:05:10Z

wasm/src/main/scala/org/scalajs/linker/backend/wasmemitter/ClassEmitter.scala

+      case SpecialNames.WasmMemorySegmentClass =>
+        if (SpecialNames.loadMethodNames.contains(methodName)) {


Here, we match the className and methodName against the hardcoded SpecialNames, generate an SWasmTree (a kind of fake SJSIR tree), and call FunctionEmitter.emitIntrinsicFucntion to inject an implementation.

tanishiking · 2024-05-20T12:07:42Z

wasm/src/main/scala/org/scalajs/linker/backend/wasmemitter/CoreWasmLib.scala

@@ -2166,4 +2200,106 @@ object CoreWasmLib {
    fb.buildAndAddToModule()
  }

+  private def genAllocate()(implicit ctx: WasmContext): Unit = {


All allocate does is shift currentAddress by the given number of bytes. If the current memory size (as reported by memory.size) is insufficient, we call memory.grow. If memory.grow returns -1, the memory has reached its maximum size, and we throw an exception.
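The grow-on-demand logic can be sketched as a JVM model like the one below. It is not the generated Wasm: memory.size and memory.grow are simulated with a page counter, and the maximum of 4 pages, the exception type, and the names GrowingAllocator and memorySizeInPages are assumptions. Integer overflow of the address is ignored for brevity.

```scala
// Hypothetical model of an allocate that grows memory when it runs short.
object GrowingAllocator {
  val PageSize = 65536         // a Wasm page is 64 KiB
  private val maxPages = 4     // assumed memory maximum, for illustration only
  private var pages = 1        // what memory.size would return
  private var currentAddress = 0

  def memorySizeInPages: Int = pages

  // Models memory.grow: returns the previous size in pages, or -1 on failure.
  private def memoryGrow(delta: Int): Int =
    if (pages + delta > maxPages) -1
    else {
      val previous = pages
      pages += delta
      previous
    }

  // Returns the start address of the allocated region.
  def allocate(bytes: Int): Int = {
    val start = currentAddress
    val end = start + bytes
    val available = pages * PageSize
    if (end > available) {
      // Round the shortfall up to whole pages.
      val missingPages = (end - available + PageSize - 1) / PageSize
      if (memoryGrow(missingPages) == -1)
        throw new IllegalStateException("linear memory exhausted")
    }
    currentAddress = end
    start
  }
}
```

Because memory.grow works in whole pages, the shortfall in bytes has to be rounded up before the call. When growing fails, the current address is left unchanged.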

tanishiking · 2024-05-20T12:15:00Z

wasm/src/main/scala/org/scalajs/linker/backend/wasmemitter/Emitter.scala

@@ -247,7 +258,7 @@ final class Emitter(config: Emitter.Config) {
    // Finish the start function

    fb.buildAndAddToModule()
-    ctx.moduleBuilder.setStart(genFunctionName.start)
+    // ctx.moduleBuilder.setStart(genFunctionName.start)


The WASI application ABI requires the module to export a _start or _initialize function, and WASI will set up the runtime there (I suppose).

If we set a start function for the module, the Wasm module calls it as soon as the module is instantiated. If that start function calls a WASI function (from the main method in Scala), the call fails because WASI has not been initialized yet.

tanishiking · 2024-05-20T12:15:16Z

wasm/src/main/scala/org/scalajs/linker/backend/wasmemitter/Emitter.scala

@@ -301,14 +312,20 @@ final class Emitter(config: Emitter.Config) {
      |${moduleImports.mkString("\n")}
      |
      |import { load as __load } from './${config.loaderModuleName}';
+      |import { WASI } from 'wasi';


https://nodejs.org/api/wasi.html

tanishiking · 2024-05-20T12:19:25Z

wasm/src/main/scala/org/scalajs/linker/backend/webassembly/Instructions.scala

+      * @see
+      *   https://www.w3.org/TR/wasm-core-2/#memory-instructions%E2%91%A4
+      */
+    def apply(): MemoryArg = MemoryArg(0, 0)


Memory instructions take their arguments both from an operand on the stack and from their immediate arguments (MemoryArg).

For example, the following Wasm code loads an i32 value from address 300 (= 100 + 200) in linear memory.

i32.const 100
i32.load offset=200 align=0

Since we set the offset of the immediate memory arg to 0, the address is determined entirely by the operand.
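The address arithmetic can be written out as a small sketch. The MemoryArg case class here is a local stand-in with named fields, not the one from Instructions.scala, and EffectiveAddress is a hypothetical name. Following the Wasm specification, the operand is treated as an unsigned 32-bit value and the sum is not wrapped.

```scala
// Hypothetical stand-in illustrating how a memory instruction computes its address.
object EffectiveAddress {
  final case class MemoryArg(offset: Int, align: Int)

  // effective address = operand (as unsigned i32) + immediate offset (as unsigned u32)
  def effectiveAddress(operand: Int, arg: MemoryArg): Long =
    (operand & 0xFFFFFFFFL) + (arg.offset & 0xFFFFFFFFL)
}
```

With an operand of 100 and MemoryArg(offset = 200, align = 0), the effective address is 300. With an offset of 0, the same address is reached by passing 300 as the operand.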

tanishiking · 2024-05-20T12:21:23Z

sample/src/main/scala/memory.scala

@@ -0,0 +1,52 @@
+package scala.scalajs.wasm


The tests are failing because this code is not available outside the sample project.
However, I'm not sure where to put this library code in our project 🤔

sjrd · 2024-05-22T08:43:47Z

wasm/src/main/scala/org/scalajs/linker/backend/wasmemitter/Emitter.scala

+      factory.instantiateClass(WasmMemorySegmentClass, WasmMemorySegmentCtor),
+      factory.instantiateClass(WasmMemoryAllocatorClass, WasmMemoryAllocatorCtor)


This is what causes the test failures, because the backend now refuses to run at all without these classes.

Ideally, the backend should instead adapt to whether these classes are available or not. So do not declare them as symbol requirements. Instead, only generate the helper functions that manipulate them if they exist in the list of LinkedClasses we receive.
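A rough sketch of that suggestion follows, using stand-in types. LinkedClass here is a simplified placeholder rather than the linker's class, the fully qualified class names are guesses based on the scala.scalajs.wasm package of the sample, and the helper labels are made up for illustration.

```scala
// Hypothetical sketch: generate memory helpers only for classes that were linked.
object OptionalIntrinsics {
  // Simplified placeholder for the linker's LinkedClass.
  final case class LinkedClass(className: String)

  // Assumed fully qualified names; the real ones live in SpecialNames.
  val WasmMemorySegmentClass = "scala.scalajs.wasm.MemorySegment"
  val WasmMemoryAllocatorClass = "scala.scalajs.wasm.MemoryAllocator"

  // Returns labels for the helpers that should be generated.
  def helpersToGenerate(classes: List[LinkedClass]): List[String] = {
    val present = classes.map(_.className).toSet
    val segmentHelpers =
      if (present.contains(WasmMemorySegmentClass)) List("segment load/store")
      else Nil
    val allocatorHelpers =
      if (present.contains(WasmMemoryAllocatorClass)) List("allocate/free")
      else Nil
    segmentHelpers ++ allocatorHelpers
  }
}
```

A program that never references the memory classes then produces an empty helper list, so the backend no longer needs them to be present.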

sjrd · 2024-05-22T08:53:19Z

wasm/src/main/scala/org/scalajs/linker/backend/wasmemitter/SpecialNames.scala

+  // WASI functions
+  val WASI = ClassName("scala.scalajs.wasm.wasi$")
+  val wasiFdWrite = MethodName("fdWrite", List(IntRef, IntRef, IntRef, IntRef), IntRef)


We should find a way to make these generic. The backend should know about memory instructions, because they are opcodes. But library functions coming from imports shouldn't be hard-coded. We should make it possible for users to define these interoperability anchors themselves, as libraries.

It might be hard/impossible to do as long as we're not merged upstream, though. We would likely do this based on some sort of annotation (like @WasmImport), but annotations are not persisted in the IR in general.

sjrd · 2024-05-22T08:56:14Z

wasm/src/main/scala/org/scalajs/linker/backend/wasmemitter/FunctionEmitter.scala

@@ -246,6 +270,36 @@ object FunctionEmitter {
  }

  private type Env = Map[LocalName, VarStorage]
+
+  object SWasmTrees {
+    abstract sealed class SWasmTree {


If you want custom trees that integrate well within IR trees, you can use ir.Trees.Transient nodes. They carry an arbitrary payload that implements Transient.Value.

This was referenced May 29, 2024

Make Scala.js Wasm backend suitable for standalone Wasm VMs scala-js/scala-js#4991

Open

WASI support tanishiking/scala-js#1

Draft
