Description
Where we are Now
Here is a simplified view of our Rust and Wasm toolchain, its artifacts and the processes that input and output these artifacts:
+--------------+
| Rust Sources |
+--------------+
       |
       |
     rustc
       |
       |
       V
+----------------------------------------+
| Initial .wasm w/ wasm-bindgen metadata |
+----------------------------------------+
       |
       |
wasm-bindgen CLI ----------.
       |                   |
       |                   |
       V                   V
+-------------+   +------------------+
| Final .wasm |   | JS Bindings Glue |
+-------------+   +------------------+
Right now the initial .wasm contains two things that are used by the wasm-bindgen CLI, and stripped from the final .wasm binary:
- A wasm-bindgen-specific custom section. This contains a summary of all the imported and exported functions, structs, fields, and methods. It doesn't include functions' and methods' ABI types, however.
- "Descriptor" functions. These are evaluated by the CLI and communicate the types being sent and received over the ABI boundary for a given function or method. We do this because we want the ABI representation of types to be trait-based, so we can customize it for different types, and it was the best way at the time to support that.
The generated JS bindings glue is responsible for the following things:
- Translating JS values going to Wasm into their ABI representation.
- Translating values coming out of Wasm to JS from their ABI representation into JS values.
- Instantiating the Wasm module and hooking up imported functions.
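To make the trait-based descriptor idea above a bit more concrete, here is a minimal sketch. It is not the real wasm-bindgen code: the `Describe` trait, the tag constants and their numeric values, and `describe_example_fn` are all hypothetical, and a plain `Vec<u32>` stands in for whatever channel the CLI actually reads the descriptor stream from.

```rust
// Hypothetical type tags; the numeric values are made up for illustration.
const TAG_I32: u32 = 1;
const TAG_F64: u32 = 2;
const TAG_OPTION: u32 = 3;
const TAG_FUNCTION: u32 = 4;

/// Each type that can cross the ABI boundary knows how to describe itself.
/// Because this is a trait, every type can customize its own description.
trait Describe {
    fn describe(sink: &mut Vec<u32>);
}

impl Describe for i32 {
    fn describe(sink: &mut Vec<u32>) {
        sink.push(TAG_I32);
    }
}

impl Describe for f64 {
    fn describe(sink: &mut Vec<u32>) {
        sink.push(TAG_F64);
    }
}

impl<T: Describe> Describe for Option<T> {
    fn describe(sink: &mut Vec<u32>) {
        // A wrapper type emits its own tag, then defers to the inner type.
        sink.push(TAG_OPTION);
        T::describe(sink);
    }
}

/// Stand-in for a generated descriptor function for a hypothetical
/// `fn(i32, Option<f64>) -> f64`. A tool could run this and read the stream.
fn describe_example_fn() -> Vec<u32> {
    let mut sink = Vec::new();
    sink.push(TAG_FUNCTION);
    sink.push(2); // number of arguments
    i32::describe(&mut sink);
    <Option<f64>>::describe(&mut sink);
    f64::describe(&mut sink); // return type
    sink
}
```

The point of the sketch is only that the type information comes out of running code, which is why the CLI has to evaluate these functions rather than just read a section.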
Vision
We leverage WebIDL bindings and anyref to completely remove the wasm-bindgen-specific metadata and 99% of the JS bindings glue:
+--------------+
| Rust Sources |
+--------------+
       |
       |
     rustc
       |
       |
       V
+---------------------------------------------------+
| Final .wasm w/ WebIDL Bindings Section and anyref |
+---------------------------------------------------+
       |
       |
wasm-bindgen CLI
       |
       |
       V
+-----------------------+
| JS Instantiation Glue |
+-----------------------+
rustc has first-class support for anyref and emitting the WebIDL bindings custom section.
The JS bindings glue is no longer responsible for translating between ABI representations and JS values:
- The ABI representation of most JS values is simply anyref. In JS hosts, an anyref is a JS value, or at least the conversion is so transparent that they are effectively the same from our perspective.
- Strings typically use WebIDL bindings' UTF-8 string shepherding facilities (unless they are trying to handle unpaired surrogates, in which case they can work with JS string values as anyrefs).
The only thing that the generated JS is responsible for now is WebAssembly instantiation with all of the right WebIDL function imports, and polyfilling Wasm's ESM integration.
For Browsers that Don't Support WebIDL Bindings Yet
To support browsers that don't support WebIDL bindings yet in this future world, the wasm-bindgen CLI will also be able to act as a polyfill, removing the WebIDL bindings section from the .wasm binary, and instead emitting equivalent JS bindings glue (essentially what it emits today).
+--------------+
| Rust Sources |
+--------------+
       |
       |
     rustc
       |
       |
       V
+---------------------------------------------+
| .wasm w/ WebIDL Bindings Section and anyref |
+---------------------------------------------+
       |
       |
wasm-bindgen CLI --------------------------------------.
       |                                               |
       |                                               |
       V                                               V
+----------------------------------------+   +------------------+
| .wasm w/out WebIDL Bindings nor anyref |   | JS Bindings Glue |
+----------------------------------------+   +------------------+
An Intermediate Goal
Polyfill generating the WebIDL bindings section in the wasm-bindgen CLI:
+--------------+
| Rust Sources |
+--------------+
       |
       |
     rustc
       |
       |
       V
+------------------------------------------------------+
| Initial .wasm w/ wasm-bindgen metadata and no anyref |
+------------------------------------------------------+
       |
       |
wasm-bindgen CLI
       |
       |
       V
+---------------------------------------------+
| .wasm w/ WebIDL Bindings Section and anyref |
+---------------------------------------------+
Then, to run on environments that don't support WebIDL bindings yet, we use the polyfill described in the previous section to emit a final .wasm and JS bindings glue that is equivalent to what we use today:
+--------------+
| Rust Sources |
+--------------+
       |
       |
     rustc
       |
       |
       V
+------------------------------------------------------+
| Initial .wasm w/ wasm-bindgen metadata and no anyref |
+------------------------------------------------------+
       |
       |
wasm-bindgen CLI
       |
       |
       V
+---------------------------------------------+
| .wasm w/ WebIDL Bindings Section and anyref |
+---------------------------------------------+
       |
       |
wasm-bindgen CLI --------------------------------------.
       |                                               |
       |                                               |
       V                                               V
+----------------------------------------+   +------------------+
| .wasm w/out WebIDL Bindings, nor       |   | JS Bindings Glue |
| wasm-bindgen metadata, nor anyref      |   +------------------+
+----------------------------------------+
In practice, we won't need to literally invoke the wasm-bindgen CLI twice and serialize the .wasm to disk just to read it back again, but having the phases split and explicitly represented like this will be helpful for moving us forward, I suspect.
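As a rough illustration of what "phases split and explicitly represented" could mean inside a single CLI invocation, here is a small sketch. Everything in it is hypothetical: the `Module` struct is just three flags summarizing what a module carries, and the two phase functions do no real Wasm processing. It only shows the two passes composing in memory, with no intermediate file.

```rust
/// Hypothetical summary of what a module carries at a given stage.
#[derive(Debug, Clone, PartialEq)]
struct Module {
    bindgen_metadata: bool,
    webidl_section: bool,
    anyref: bool,
}

/// Phase 1: consume the wasm-bindgen metadata and emit a module with a
/// WebIDL bindings section and anyref.
fn emit_webidl_bindings(input: Module) -> Result<Module, String> {
    if !input.bindgen_metadata {
        return Err("expected wasm-bindgen metadata".to_string());
    }
    Ok(Module {
        bindgen_metadata: false,
        webidl_section: true,
        anyref: true,
    })
}

/// Phase 2 (polyfill): strip the WebIDL bindings section and anyref, and
/// produce JS bindings glue instead. The glue here is only a placeholder.
fn polyfill_webidl_bindings(input: Module) -> Result<(Module, String), String> {
    if !input.webidl_section {
        return Err("expected a WebIDL bindings section".to_string());
    }
    let output = Module {
        bindgen_metadata: false,
        webidl_section: false,
        anyref: false,
    };
    Ok((output, "// JS bindings glue (placeholder)".to_string()))
}

/// Both phases chained in memory, without writing the intermediate .wasm.
fn build_for_legacy_host(initial: Module) -> Result<(Module, String), String> {
    emit_webidl_bindings(initial).and_then(polyfill_webidl_bindings)
}
```

Keeping the intermediate module as a real value between the two functions means a host with WebIDL bindings support would simply stop after the first phase.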
Things we can do now
- Fix the current experimental anyref implementation and upgrade it to non-experimental. It seems to have bugs where it isn't working in Firefox/SpiderMonkey anymore. Node.js >= 12.x also has support for anyref behind a flag, so we should start running our test suite with anyref on Node.js in CI, instead of just checking that it builds.
- Build a crate to consume and emit the text and binary representations of the WebIDL bindings custom section. We can use this in walrus and wasm-bindgen and wherever else we find a need.
- Stop emitting descriptor functions and interpreting them in the wasm-bindgen CLI. We do this by switching from descriptor functions to building up a static from associated consts: essentially a static array of the exact same stuff that we pass to a series of calls to __wbindgen_describe now.
- After that, we should align the format of our custom section with the format of WebIDL bindings. We will emit a copy of this custom section for each #[wasm_bindgen] use, so it won't be 100% compatible with the WebIDL bindings section. Instead it will be like a pre-linked "object section" that we will link into a single section in the CLI.
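The "static from associated consts" item could look roughly like the sketch below. The `ConstDescribe` trait, the tag values, and `SCALE_DESCRIPTOR` are hypothetical names for illustration only, and the sketch sidesteps two real problems: how nested types such as `Option<T>` would concatenate their parts at compile time, and how the static would actually be laid out in a custom section.

```rust
// Hypothetical type tags; the numeric values are made up for illustration.
const TAG_I32: u32 = 1;
const TAG_F64: u32 = 2;
const TAG_FUNCTION: u32 = 4;

/// Instead of a `describe()` function that has to be run, each type exposes
/// its description as an associated const known at compile time.
trait ConstDescribe {
    const DESCRIPTOR: &'static [u32];
}

impl ConstDescribe for i32 {
    const DESCRIPTOR: &'static [u32] = &[TAG_I32];
}

impl ConstDescribe for f64 {
    const DESCRIPTOR: &'static [u32] = &[TAG_F64];
}

/// What a macro might emit for a hypothetical `fn scale(i32, f64) -> f64`:
/// a static built purely from consts, with no code to evaluate.
static SCALE_DESCRIPTOR: [&[u32]; 4] = [
    &[TAG_FUNCTION, 2], // function tag and argument count
    <i32 as ConstDescribe>::DESCRIPTOR,
    <f64 as ConstDescribe>::DESCRIPTOR,
    <f64 as ConstDescribe>::DESCRIPTOR, // return type
];

/// Helper for reading the parts back as one flat stream, i.e. the same
/// values a series of describe calls would have produced.
fn flatten(parts: &[&[u32]]) -> Vec<u32> {
    parts.iter().flat_map(|part| part.iter().copied()).collect()
}
```

Here `flatten(&SCALE_DESCRIPTOR)` yields the tag stream as data, so a tool would only need to read it, not interpret any functions.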
Unresolved Questions
- What is the bare minimum we need to get rustc (and presumably LLVM and lld) to support anyref enough to use it and WebIDL bindings together without wasm-bindgen post-processing the .wasm binary?
  - I'm assuming we want to spill anyrefs to a shadow stack or heap immediately, work with them as indices, and avoid the question of actually representing anyrefs at the Rust language level directly. At least for the foreseeable future.
  - So rustc and LLVM need some knowledge about the anyref table that we use for the shadow stack and heap, and what function to call to allocate into them? Presumably that function is returning an index for the anyref to move into, so that these functions can be authored in Rust, and don't have to touch an anyref directly.
  - Alternatively, it might be simpler to only have an anyref table that is used as a shadow stack at the LLVM level, and then separately at the Rust level also expose intrinsics for copying anyrefs from that shadow stack to a separate anyref table that we use as a heap? This might be a smaller set of moving parts that we have to get everyone to agree upon...
- Presumably the user-facing building block for emitting WebIDL bindings is some sort of attribute on an extern declaration. Not clear to me exactly what this attribute looks like, but I think it should be pretty easy to design in comparison to the last point.
- I think all the borrowed-vs-owned anyref stuff can be handled in the Rust source that the proc-macro generates when we are managing the anyref table from Rust instead of in the JS glue. I don't think this stuff leaks into anything the WebIDL bindings needs to know about (other than WebIDL imports returning strings, which it has facilities for). But I'm not 100% sure on this point.
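To make the "work with anyrefs as indices" idea from the first question easier to discuss, here is a small model of the shadow-stack-plus-heap variant. It is only a simulation under stated assumptions: a generic `T` stands in for an opaque anyref, ordinary `Vec`s stand in for Wasm tables, and the type and method names (`AnyrefTables`, `push_stack`, `promote`, and so on) are hypothetical, not proposed intrinsics.

```rust
/// Model of two anyref tables: a shadow stack for values borrowed during a
/// call, and a heap for values that Rust code keeps owning afterwards.
struct AnyrefTables<T> {
    stack: Vec<T>,
    heap: Vec<Option<T>>,
    free: Vec<usize>, // heap slots available for reuse
}

impl<T: Clone> AnyrefTables<T> {
    fn new() -> Self {
        AnyrefTables { stack: Vec::new(), heap: Vec::new(), free: Vec::new() }
    }

    /// Current stack height, recorded on entry to a call.
    fn stack_mark(&self) -> usize {
        self.stack.len()
    }

    /// Spill an incoming value to the shadow stack; Rust only sees the index.
    fn push_stack(&mut self, value: T) -> usize {
        self.stack.push(value);
        self.stack.len() - 1
    }

    /// Drop everything spilled since `mark` when the call returns.
    fn truncate_stack(&mut self, mark: usize) {
        self.stack.truncate(mark);
    }

    /// Copy a borrowed stack value into the heap so it can outlive the call.
    fn promote(&mut self, stack_index: usize) -> Option<usize> {
        let value = self.stack.get(stack_index)?.clone();
        Some(self.alloc_heap(value))
    }

    /// Store an owned value in the heap, reusing a freed slot if possible.
    fn alloc_heap(&mut self, value: T) -> usize {
        match self.free.pop() {
            Some(index) => {
                self.heap[index] = Some(value);
                index
            }
            None => {
                self.heap.push(Some(value));
                self.heap.len() - 1
            }
        }
    }

    /// Look up an owned value by heap index.
    fn get_heap(&self, index: usize) -> Option<&T> {
        self.heap.get(index)?.as_ref()
    }

    /// Release an owned value and make its slot reusable.
    fn free_heap(&mut self, index: usize) -> Option<T> {
        let value = self.heap.get_mut(index)?.take();
        if value.is_some() {
            self.free.push(index);
        }
        value
    }
}
```

In this model the borrowed-vs-owned distinction is just "stack index vs. heap index", which is the part I suspect can stay entirely in the generated Rust.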