This repository was archived by the owner on Apr 25, 2025. It is now read-only.

Unwinding as a modifier to branching #124

Open
@RossTate


I think there's a way to view unwinding as a modifier to branching. But before I get into that, I first want to make sure I'm on the same page as everyone with respect to branching.

Branching

Contrary to what their names suggest, branching instructions like br do not correspond at the assembly level to just jumping instructions. Jumping is only half of the story; the other half has to do with the stack.

Suppose an engine has some type that is managed using reference counting—call it refcounted. Then consider the branch in the following code that calls some function $foo : [] -> [refcounted]:

(block $target ;; [] -> []
  (call $foo)
  (br $target)
)

This branch does not compile to simply a jump to the assembly-code location corresponding to $target. It also cleans up the portion of the stack that is not relevant to $target. In this case, that involves decrementing the reference count of the refcounted value returned by $foo that is sitting on the stack. In a single-threaded engine, that might in turn result in freeing the corresponding memory space and consequently require recursively decrementing the counts of any values contained therein. Depending on how large this structure is and how much of it is no longer necessary, this might take a while. So a simple instruction like br can actually execute a fair amount behind the scenes depending on how the engine manages resources and the stack.

All this is to illustrate that a label like $target more accurately corresponds to an assembly-code location and a location within the stack, and likewise an instruction like br $target more accurately corresponds to "clean up the stack up to the stack location corresponding to $target and then jump to the assembly-code location corresponding to $target".
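To make the "clean up, then jump" reading concrete, here is a minimal Python sketch of a toy engine. Everything in it is an assumption made for illustration: the RefCounted class, the release helper, and the idea of representing a label as a stack height are hypothetical and not part of any real engine or of the proposal.

```python
class RefCounted:
    """Toy engine-managed value (hypothetical). It owns its children."""

    def __init__(self, name, children=()):
        self.name = name
        self.count = 1                  # one reference: the stack slot or the parent
        self.children = list(children)


def release(value, freed):
    """Drop one reference; on zero, free the value and release what it contains."""
    value.count -= 1
    if value.count == 0:
        freed.append(value.name)        # stands in for freeing the memory
        for child in value.children:
            release(child, freed)       # recursive decrement, as described above


def br(stack, label_height, freed):
    """Model of `br $target`: clean the stack down to the label, then 'jump'.

    A label is modeled as a stack height (assumption). The jump itself is
    modeled by simply returning to the caller.
    """
    while len(stack) > label_height:
        release(stack.pop(), freed)


# Usage: $foo left one refcounted value on the stack; $target expects none.
leaf = RefCounted("leaf")
root = RefCounted("root", [leaf])
stack, freed = [root], []
br(stack, 0, freed)
print(freed)
```

Even in this toy model, the amount of work done by br depends on how much is reachable from the discarded stack slots, not on the branch itself.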

Note, though, that "clean up" here only corresponds to the engine's resources on the stack. But what about the application's resources? That's where unwinding comes into play.

Unwinding

For the sake of this discussion, I am going to say that unwinders are specified using try instr1* unwind instr2* end, which executes instr1* but indicates that instr2* : [] -> [] should be used to "unwind the stack", i.e. to perform application-level cleanup. In a second, I'll get to how one causes the stack to be "unwound" rather than just "cleaned up".

Now consider some surface-level code using the common finally construct:

while (true) {
  File file = open(...);
  try {
    try {
      if (...)
        break;
    } catch (SomeException) {}
  } finally {
    close(file);
  }
}

Normally a break in a while loop would translate to a br in WebAssembly, but the finally clause in this snippet requires that its body be executed no matter how control leaves the try body. We could consider inlining the body of the finally at the break, but that results in code duplication, and it would incorrectly catch SomeException if one gets thrown by close(file) (nor does it work as well in other examples where there are other try/catch clauses surrounding the break).

Really what we want to do is to extend the semantics of "clean up the stack" that is already part of branching to incorporate "and execute unwinders". That is, we want to modify the branch instruction so that it also unwinds the stack.

One way we could enable this is to introduce an unwinding instruction that must precede a branching instruction, and its semantics is to modify that branching instruction to execute the unwinders in unwind clauses as it cleans up the stack. With this, the break instruction in the example above would translate to the instruction sequence unwinding (br $loop_exit).
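As a rough illustration of unwinding as a modifier, the following Python sketch models a branch that optionally runs unwinders while it cleans up the stack. The Unwinder class, the branch function, and the use of a stack height to stand for $loop_exit are all hypothetical names chosen for this sketch; they are assumptions, not a proposed API.

```python
class Unwinder:
    """Marker frame left on the stack by `try ... unwind ... end` (hypothetical)."""

    def __init__(self, action):
        self.action = action            # the instr2* body, modeled as a callable


def branch(stack, label_height, unwinding=False):
    """Model of `br $label`, or `unwinding (br $label)` when unwinding=True.

    Both forms pop the stack down to the label's height. Only the unwinding
    form also executes the unwinders it passes on the way.
    """
    while len(stack) > label_height:
        entry = stack.pop()
        if unwinding and isinstance(entry, Unwinder):
            entry.action()


log = []


def make_stack():
    # The loop body holds an open file and is inside a try/unwind region.
    return ["file", Unwinder(lambda: log.append("close(file)"))]


plain = make_stack()
branch(plain, 0)                        # plain br: the unwinder is skipped
modified = make_stack()
branch(modified, 0, unwinding=True)     # unwinding br: the unwinder runs
print(log)
```

The same branch function serves both cases, which is the sense in which unwinding modifies branching rather than being a separate kind of control transfer.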

Exception Handling

So far I haven't talked about exception handling, just unwinding. This illustrates that unwinding the stack, like cleaning up the stack, is its own concept. And although unwinding is an integral part of exception handling, bundling it with exception handling as the current proposal does is a misunderstanding of the concept.

But if unwinding is a separable component of exception handling, what is the other component? The answer to that depends on whether you're talking about single-phase exception handling or two-phase exception handling.

Single-Phase

Again, for the sake of this discussion, I am going to say that single-phase exception handling is done using try instr* catch $event $label, which indicates that any $event exceptions thrown from instr* should be caught and handled by $label (where the types of $event and $label match).

Now, consider the following WebAssembly program:

(try
  (try
    (throw $event)
  unwind
    ...
  end)
catch $event $label)

We can reduce this program to the following:

(try
  (try
    unwinding (br $label)
  unwind
    ...
  end)
catch $event $label)

When we can see the contents of the stack, we can replace a throw $event with an unwinding (br $label) to whatever label the event is currently bound to in the stack. That is, events are dynamically scoped variables that get bound to labels, and throw means "branch-and-unwind to whatever label the event is bound to in the current stack". (Of course, an important optimization is to unwind the stack as you search for these dynamically-scoped bindings.)

This suggests that we can break throw up into two parts: unwinding and br_stack $event. The latter is an instruction that just transfers control to and does necessary cleanup up to some label determined by the current stack. This instruction on its own could even have utility, say for more severe exceptions that want to bypass unwinders or guarantee transfer.
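The decomposition of throw into unwinding plus br_stack can be sketched in Python as follows. The tuple-based frame encoding and the function names are assumptions made only for this illustration. Note that the sketch unwinds while it searches for the binding, which is the optimization mentioned above rather than a separate lookup pass.

```python
def br_stack(stack, event, unwinding=False):
    """Model of `br_stack $event`, optionally modified by `unwinding`.

    Frames are tuples (hypothetical encoding):
      ("catch", event, label)  binds an event to a label
      ("unwind", action)       an unwinder from try/unwind
    Frames above the binding are popped; unwinders run only if unwinding=True.
    Returns the label that control transfers to.
    """
    while stack:
        frame = stack.pop()
        if frame[0] == "catch" and frame[1] == event:
            return frame[2]
        if frame[0] == "unwind" and unwinding:
            frame[1]()
    raise RuntimeError(f"no binding for {event} in the current stack")


def throw(stack, event):
    """throw $event modeled as unwinding + br_stack $event."""
    return br_stack(stack, event, unwinding=True)


log = []


def make_stack():
    # Outer try/catch binds $event to $label; inner try/unwind adds an unwinder.
    return [("catch", "$event", "$label"),
            ("unwind", lambda: log.append("unwinder ran"))]


target = throw(make_stack(), "$event")
print(target, log)
```

Calling br_stack directly with unwinding=False reaches the same label without running the unwinder, which corresponds to the "bypass unwinders" use mentioned above.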

Two-Phase

In two-phase exception handling, you use some form of stack inspection to determine the target label before you execute the unwinding branch to that label.

For the sake of this discussion, I'll say that an inspection is begun by using the instruction call_stack $call_tag, which looks up the stack for contexts of the form answer $call_tag instr1* within instr2* end (where execution is currently within instr2*), in which case the instructions instr1* are executed as the body of a dynamically-scoped function (see WebAssembly/design#1356 for more info).

As an example of unwinding in two-phase exception handling, consider the following C# code:

bool flag = …;
try {
    …
} catch (Exception) when (flag = !flag) {
    println("caught first");
} catch (Exception) {
    println("caught second");
}

This would be compiled to the following WebAssembly code (assuming C# throw compiles to call_stack $csharp_throw):

(call_tag $csharp_throw : [csharp_ref] -> [])

(block $outer
    (block $first
        (block $second
            (answer $csharp_throw ;; [csharp_ref] -> []
                (local.set $flag (i32.xor (i32.const 1) (local.get $flag)))
                unwinding (br_if $first (local.get $flag))
                unwinding (br $second)
            within
                …
                (br $outer)
            )
        ) ;; $second : [csharp_ref]
        (call $println "caught second")
        (br $outer)
    ) ;; $first : [csharp_ref]
    (call $println "caught first")
    (br $outer)
) ;; $outer : []

Notice that the answer $csharp_throw has a bunch of unwinding branches. Which of these gets executed depends on the state of the flag variable at the time the exception reaches the try in the C# source code. (Note that there are no try blocks or events in the compiled WebAssembly.) Depending on that flag, we'll either have an unwinding branch to $first or an unwinding branch to $second. In either case, the semantics is "clean up and unwind the stack up to the stack location corresponding to the chosen label and then jump to the assembly-code location corresponding to the chosen label". The difference between here and the original examples using finally is that the portion of the stack that needs to be cleaned up and unwound is not known statically. That is important for implementation (e.g. because it requires stack walking), but semantically speaking it is straightforward and aligns well with the existing abstractions in WebAssembly.
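The two phases can be modeled with a short Python sketch: first the stack is inspected without being modified to find the answer and let it pick a label, and only then does the unwinding branch pop frames. The frame encoding, the labels dictionary mapping label names to stack heights, and the helper names are hypothetical and exist only for this illustration.

```python
def call_stack(stack, tag, labels):
    """Model of `call_stack $tag` followed by the answer's unwinding branch.

    Frames are tuples (hypothetical encoding):
      ("answer", tag, handler)  handler() returns the name of the chosen label
      ("unwind", action)        an unwinder from try/unwind
    `labels` maps label names to stack heights (assumption).
    """
    # Phase 1: inspect the stack without popping anything.
    for frame in reversed(stack):
        if frame[0] == "answer" and frame[1] == tag:
            target = frame[2]()
            break
    else:
        raise RuntimeError(f"no answer for {tag} in the current stack")

    # Phase 2: unwinding branch to the label chosen in phase 1.
    while len(stack) > labels[target]:
        frame = stack.pop()
        if frame[0] == "unwind":
            frame[1]()
    return target


def run_once(state, log):
    """One execution of the compiled C# try block, with a throw inside it."""
    def handler():
        state["flag"] = not state["flag"]       # when (flag = !flag)
        return "$first" if state["flag"] else "$second"

    stack = [("answer", "$csharp_throw", handler),
             ("unwind", lambda: log.append("unwinder ran"))]
    labels = {"$first": 0, "$second": 0}
    return call_stack(stack, "$csharp_throw", labels)


state, log = {"flag": False}, []
print(run_once(state, log), run_once(state, log))
```

In this model the destination is fully determined before any frame is popped, so the unwinding step is the same loop as in a statically targeted branch; only the way the target was found differs.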

Summary

Regardless of whether we want to actually make an unwinding instruction, the important thing to note here is that unwinding is always done with a destination. How that destination is determined varies, but in most of the examples above the destination is known before unwinding begins.

The current proposal is about single-phase exception handling. But as I've tried to illustrate here, single-phase exception handling is really two concepts combined: dynamically-scoped branching and unwinding. So for the proposal to be extensible, it is important that its design for unwinding is compatible with other notions of destination, even if this proposal on its own solely enables dynamically-scoped destination labels (i.e. events). Ideas like those in #123 would help achieve this goal.
