Moving wasm-bindgen from "Future Compatible with" to "Polyfill for" WebIDL Bindings #1524

Closed
@fitzgen

Where we are Now

Here is a simplified view of our Rust and Wasm toolchain, its artifacts and the processes that input and output these artifacts:

+--------------+
| Rust Sources |
+--------------+
       |
       |
     rustc
       |
       |
       V
+----------------------------------------+
| Initial .wasm w/ wasm-bindgen metadata |
+----------------------------------------+
       |
       |
  wasm-bindgen CLI ----------.
       |                     |
       |                     |
       V                     V
+-------------+       +------------------+
| Final .wasm |       | JS Bindings Glue |
+-------------+       +------------------+

Right now the initial .wasm contains two things that are used by the wasm-bindgen CLI, and stripped from the final .wasm binary:

  • A wasm-bindgen-specific custom section. This contains a summary of all the imported and exported functions, structs, fields, and methods. It doesn't include functions' and methods' ABI types, however.

  • "Descriptor" functions. These are evaluated by the CLI and communicate the types being sent and received over the ABI boundary for a given function or method. We do this because we want the ABI representation of types to be trait-based, so we can customize it for different types, and it was the best way at the time to support that.
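
To make the descriptor idea concrete, here is a rough sketch of the shape, not wasm-bindgen's actual implementation. The trait name `Describe`, the `emit` hook (standing in for `__wbindgen_describe`), the numeric type codes, and `interpret` (standing in for the CLI's evaluation step) are all hypothetical names chosen for illustration.

```rust
use std::cell::RefCell;

// Hypothetical type codes; the real wasm-bindgen encoding differs.
const I32: u32 = 1;
const F64: u32 = 2;
const OPTION: u32 = 3;
const FUNCTION: u32 = 4;

thread_local! {
    // Stand-in for the CLI side, which records every described value.
    static LOG: RefCell<Vec<u32>> = RefCell::new(Vec::new());
}

// Stand-in for the imported `__wbindgen_describe` hook.
fn emit(code: u32) {
    LOG.with(|log| log.borrow_mut().push(code));
}

// Trait-based description: each type decides how it is described.
trait Describe {
    fn describe();
}

impl Describe for i32 {
    fn describe() {
        emit(I32);
    }
}

impl Describe for f64 {
    fn describe() {
        emit(F64);
    }
}

impl<T: Describe> Describe for Option<T> {
    fn describe() {
        emit(OPTION);
        T::describe();
    }
}

// The "descriptor function" a macro might generate for
// `fn scale(x: Option<i32>, by: f64) -> f64`.
fn describe_scale() {
    emit(FUNCTION);
    emit(2); // argument count
    <Option<i32> as Describe>::describe();
    <f64 as Describe>::describe();
    <f64 as Describe>::describe(); // return type
}

// Conceptually what the CLI does: run the descriptor, collect the codes.
fn interpret(descriptor: fn()) -> Vec<u32> {
    LOG.with(|log| log.borrow_mut().clear());
    descriptor();
    LOG.with(|log| log.borrow().clone())
}
```

Calling `interpret(describe_scale)` yields the flat list of codes for the function's signature. The point of the trait is that a new type only has to implement `describe` to participate, at the cost of the CLI having to execute code to learn the types.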

The generated JS bindings glue is responsible for the following things:

  • Translating JS values going to Wasm into their ABI representation.

  • Translating values coming out of Wasm to JS from their ABI representation into JS values.

  • Instantiating the Wasm module and hooking up imported functions.
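
As an illustration of what "ABI representation" means here: only numbers cross the Wasm boundary, so a string is typically passed as a pointer and a length into linear memory. The sketch below simulates that in plain Rust. `LinearMemory`, `pass_string`, and `read_string` are hypothetical names, and the bump allocation is a simplification; the real glue is JavaScript and uses the module's own allocator.

```rust
// Simulated Wasm linear memory with a trivial bump allocator.
struct LinearMemory {
    bytes: Vec<u8>,
}

impl LinearMemory {
    fn new() -> Self {
        LinearMemory { bytes: Vec::new() }
    }

    // Glue side: encode the string as UTF-8, copy it into memory, and
    // return the (pointer, length) pair that actually crosses the boundary.
    fn pass_string(&mut self, s: &str) -> (u32, u32) {
        let ptr = self.bytes.len() as u32;
        self.bytes.extend_from_slice(s.as_bytes());
        (ptr, s.len() as u32)
    }

    // Receiving side: rebuild the string from the pair, rejecting
    // out-of-bounds ranges and invalid UTF-8.
    fn read_string(&self, ptr: u32, len: u32) -> Option<&str> {
        let start = ptr as usize;
        let end = start.checked_add(len as usize)?;
        let slice = self.bytes.get(start..end)?;
        std::str::from_utf8(slice).ok()
    }
}
```

Every value that is not a plain number needs some translation of this kind in both directions, which is why the glue grows with the surface area of the bindings.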

Vision

We leverage WebIDL bindings and anyref to completely remove the wasm-bindgen-specific metadata and 99% of the JS bindings glue:

+--------------+
| Rust Sources |
+--------------+
       |
       |
     rustc
       |
       |
       V
+---------------------------------------------------+
| Final .wasm w/ WebIDL Bindings Section and anyref |
+---------------------------------------------------+
       |
       |
  wasm-bindgen CLI
       |
       |
       V
+-----------------------+
| JS Instantiation Glue |
+-----------------------+

rustc has first-class support for anyref and emitting the WebIDL bindings custom section.

The JS bindings glue is no longer responsible for translating between ABI representations and JS values:

  • The ABI representation of most JS values is simply anyref. In JS hosts, an anyref is a JS value, or at least the conversion is so transparent that they are effectively the same from our perspective.

  • Strings typically use WebIDL bindings' UTF-8 string shepherding facilities (unless they are trying to handle unpaired surrogates, in which case they can work with JS string values as anyrefs).
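
The unpaired-surrogate caveat comes from JS strings being sequences of UTF-16 code units, which may contain a lone surrogate that has no UTF-8 encoding. Rust's standard library shows the problem directly. The helper names `strict` and `lossy` below are only for illustration; `String::from_utf16` and `String::from_utf16_lossy` are the real std functions.

```rust
// Strict conversion to a Rust `String` (always valid UTF-8) fails when the
// input contains a lone surrogate.
fn strict(units: &[u16]) -> Option<String> {
    String::from_utf16(units).ok()
}

// Lossy conversion replaces each unpaired surrogate with U+FFFD, so the
// original JS string cannot be recovered afterwards.
fn lossy(units: &[u16]) -> String {
    String::from_utf16_lossy(units)
}
```

For example, the code units `[0x0068, 0xD800, 0x0069]` are rejected by `strict` and come out of `lossy` with the middle unit replaced. Code that must round-trip such strings exactly has to keep them as opaque JS values instead.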

The only thing that the generated JS is responsible for now is WebAssembly instantiation with all of the right WebIDL function imports, and polyfilling Wasm's ESM integration.

For Browsers that Don't Support WebIDL Bindings Yet

To support browsers that don't support WebIDL bindings yet in this future world, the wasm-bindgen CLI will also be able to act as a polyfill, removing the WebIDL bindings section from the .wasm binary, and instead emitting equivalent JS bindings glue (essentially what it emits today).

+--------------+
| Rust Sources |
+--------------+
       |
       |
     rustc
       |
       |
       V
+---------------------------------------------+
| .wasm w/ WebIDL Bindings Section and anyref |
+---------------------------------------------+
       |
       |
  wasm-bindgen CLI --------------------------------------.
       |                                                 |
       |                                                 |
       V                                                 V
+----------------------------------------+       +------------------+
| .wasm w/out WebIDL Bindings nor anyref |       | JS Bindings Glue |
+----------------------------------------+       +------------------+

An Intermediate Goal

Polyfill generating the WebIDL bindings section in the wasm-bindgen CLI:

+--------------+
| Rust Sources |
+--------------+
       |
       |
     rustc
       |
       |
       V
+------------------------------------------------------+
| Initial .wasm w/ wasm-bindgen metadata and no anyref |
+------------------------------------------------------+
       |
       |
  wasm-bindgen CLI
       |
       |
       V
+---------------------------------------------+
| .wasm w/ WebIDL Bindings Section and anyref |
+---------------------------------------------+

Then, to run on environments that don't support WebIDL bindings yet, we use the polyfill described in the previous section to emit a final .wasm and JS bindings glue that is equivalent to what we use today:

+--------------+
| Rust Sources |
+--------------+
       |
       |
     rustc
       |
       |
       V
+------------------------------------------------------+
| Initial .wasm w/ wasm-bindgen metadata and no anyref |
+------------------------------------------------------+
       |
       |
  wasm-bindgen CLI
       |
       |
       V
+---------------------------------------------+
| .wasm w/ WebIDL Bindings Section and anyref |
+---------------------------------------------+
       |
       |
  wasm-bindgen CLI --------------------------------------.
       |                                                 |
       |                                                 |
       V                                                 V
+----------------------------------------+       +------------------+
| .wasm w/out WebIDL Bindings, nor       |       | JS Bindings Glue |
| wasm-bindgen metadata, nor anyref      |       +------------------+
+----------------------------------------+

In practice, we won't need to literally invoke the wasm-bindgen CLI twice and serialize the .wasm to disk just to read it back again, but having the phases split and explicitly represented like this will be helpful for moving us forward, I suspect.
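
One way to picture the split is as two in-memory passes composed in a single CLI run. This is only a sketch of the phase structure: `Module`, `JsBindingsGlue`, and every function name below are hypothetical, and the three booleans just summarize what each box in the diagrams contains.

```rust
// Hypothetical summary of what a .wasm artifact contains at each stage.
#[derive(Debug, Clone, PartialEq)]
struct Module {
    wasm_bindgen_metadata: bool,
    webidl_bindings_section: bool,
    uses_anyref: bool,
}

// Placeholder for the generated JS bindings glue.
#[derive(Debug, PartialEq)]
struct JsBindingsGlue;

// What rustc emits today: wasm-bindgen metadata, no anyref.
fn rustc_output() -> Module {
    Module {
        wasm_bindgen_metadata: true,
        webidl_bindings_section: false,
        uses_anyref: false,
    }
}

// Phase 1: replace wasm-bindgen metadata with a WebIDL bindings section.
fn emit_webidl_bindings(input: Module) -> Module {
    let mut m = input;
    m.wasm_bindgen_metadata = false;
    m.webidl_bindings_section = true;
    m.uses_anyref = true;
    m
}

// Phase 2 (polyfill): strip the section and anyref, emit equivalent JS glue.
fn polyfill(input: Module) -> (Module, JsBindingsGlue) {
    let mut m = input;
    m.webidl_bindings_section = false;
    m.uses_anyref = false;
    (m, JsBindingsGlue)
}

// Both phases run in memory; phase 2 only runs when the host needs it.
fn build(host_supports_webidl_bindings: bool) -> (Module, Option<JsBindingsGlue>) {
    let with_bindings = emit_webidl_bindings(rustc_output());
    if host_supports_webidl_bindings {
        (with_bindings, None)
    } else {
        let (module, glue) = polyfill(with_bindings);
        (module, Some(glue))
    }
}
```

Keeping the intermediate `Module` as an explicit value between the two functions is what makes it possible to later drop phase 1 (once rustc emits the section itself) or phase 2 (once hosts support it) independently.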

Things we can do now

  • Fix the current experimental anyref implementation and upgrade it to non-experimental. It seems to have bugs where it isn't working in Firefox/SpiderMonkey anymore. Node.js >= 12.x also has support for anyref behind a flag, so we should start running our test suite with anyref on Node.js in CI, instead of just checking that it builds.

  • Build a crate to consume and emit the text and binary representations of the WebIDL bindings custom section. We can use this in walrus and wasm-bindgen and wherever else we find a need.

  • Stop emitting descriptor functions and interpreting them in the wasm-bindgen CLI. We do this by switching from descriptor functions to building up a static from associated consts. Essentially, this is a static array of exactly the same values that we currently pass through a series of calls to __wbindgen_describe.

  • After that, we should align the format of our custom section with the format of WebIDL bindings. We will emit a copy of this custom section for each #[wasm_bindgen] use, so it won't be 100% compatible with the WebIDL bindings section. Instead it will be like a pre-linked "object section", that we will link into a single section in the CLI.
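
A minimal sketch of the associated-const approach from the third item, under simplifying assumptions: the trait name `Describe`, the type codes, and the `SCALE_DESCRIPTOR` layout are hypothetical, and each type is described by a single word. Compound types would need multi-word descriptors composed at compile time, which is the harder part and is not shown here.

```rust
// Hypothetical type codes; the real wasm-bindgen encoding differs.
const I32: u32 = 1;
const F64: u32 = 2;
const FUNCTION: u32 = 4;

// Each type exposes its description as data instead of as code to run.
trait Describe {
    const CODE: u32;
}

impl Describe for i32 {
    const CODE: u32 = I32;
}

impl Describe for f64 {
    const CODE: u32 = F64;
}

// What a macro might generate for `fn scale(x: i32, by: f64) -> f64`:
// a static that a tool can read straight out of the binary, with no
// interpretation step.
static SCALE_DESCRIPTOR: [u32; 5] = [
    FUNCTION,
    2, // argument count
    <i32 as Describe>::CODE,
    <f64 as Describe>::CODE,
    <f64 as Describe>::CODE, // return type
];
```

The values are the same ones a descriptor function would have reported one call at a time; the difference is that they are fixed at compile time, so the CLI only has to parse bytes.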

Unresolved Questions

  • What is the bare minimum we need to get rustc (and presumably LLVM and lld) to support anyref enough to use them and WebIDL bindings together without wasm-bindgen post-processing the .wasm binary?

    • I'm assuming we want to immediately spill anyrefs to a shadow stack or heap, work with them as indices, and avoid the question of actually representing anyrefs at the Rust language level directly. At least for the foreseeable future.

    • So rustc and LLVM need some knowledge about the anyref table that we use for the shadow stack and heap, and what function to call to allocate into them? Presumably that function is returning an index for the anyref to move into, so that these functions can be authored in Rust, and don't have to touch an anyref directly.

    • Alternatively, it might be simpler to only have an anyref table that is used as a shadow stack at the LLVM level, and then separately at the Rust level also expose intrinsics for copying anyrefs from that shadow stack to a separate anyref table that we use as a heap? This might be a smaller set of moving parts that we have to get everyone to agree upon...

  • Presumably the user-facing building block for emitting WebIDL bindings is some sort of attribute on an extern declaration. Not clear to me exactly what this attribute looks like, but I think it should be pretty easy to design in comparison to the last point.

  • I think all the borrowed-vs-owned anyref stuff can be handled in the Rust source that the proc-macro generates when we are managing the anyref table from Rust instead of in the JS glue. I don't think this stuff leaks into anything the WebIDL bindings needs to know about (other than WebIDL imports returning strings, which it has facilities for). But I'm not 100% sure on this point.
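
To illustrate the "work with them as indices" idea, here is a sketch of only the heap half of such a table, written in ordinary Rust. `HostRef` is a stand-in for an opaque anyref (in a real module the values would live in a Wasm table, not a Rust `Vec`), and `AnyrefHeap`, `alloc`, `get`, and `dealloc` are hypothetical names. The shadow stack half is not shown.

```rust
// Stand-in for an opaque host reference.
type HostRef = String;

enum Slot {
    Occupied(HostRef),
    // Index of the next free slot, forming a free list.
    Free(Option<usize>),
}

struct AnyrefHeap {
    slots: Vec<Slot>,
    free_head: Option<usize>,
}

impl AnyrefHeap {
    fn new() -> Self {
        AnyrefHeap { slots: Vec::new(), free_head: None }
    }

    // Store a reference and return the index that Rust code carries around.
    fn alloc(&mut self, value: HostRef) -> usize {
        match self.free_head {
            Some(idx) => {
                let next = match self.slots[idx] {
                    Slot::Free(next) => next,
                    Slot::Occupied(_) => unreachable!("free list points at a live slot"),
                };
                self.free_head = next;
                self.slots[idx] = Slot::Occupied(value);
                idx
            }
            None => {
                self.slots.push(Slot::Occupied(value));
                self.slots.len() - 1
            }
        }
    }

    // Borrowed access: look at the reference without giving up the slot.
    fn get(&self, idx: usize) -> Option<&HostRef> {
        match self.slots.get(idx)? {
            Slot::Occupied(value) => Some(value),
            Slot::Free(_) => None,
        }
    }

    // Owned access: take the reference out and make the slot reusable.
    fn dealloc(&mut self, idx: usize) -> Option<HostRef> {
        if let Slot::Free(_) = self.slots.get(idx)? {
            return None;
        }
        let old = std::mem::replace(&mut self.slots[idx], Slot::Free(self.free_head));
        self.free_head = Some(idx);
        match old {
            Slot::Occupied(value) => Some(value),
            Slot::Free(_) => None,
        }
    }
}
```

In this model the borrowed-vs-owned distinction is just `get` versus `dealloc`, and callers only ever hold a `usize`, which is the property that would let these functions be authored in Rust without touching an anyref directly.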
