Unwinding as a modifier to branching

I think there's a way to view unwinding as a modifier to branching. But before I get into that, I first want to make sure I'm on the same page as everyone with respect to branching.

## Branching

Contrary to what their names suggest, branching instructions like `br` do not correspond at the assembly level to just jumping instructions. Jumping is only half of the story. The other half has to do with the stack.

Suppose an engine has some type that is managed using reference counting (call it `refcounted`). Then consider the branch in the following code that calls some function `$foo : [] -> [refcounted]`:
```
(block $target ;; [] -> []
  (call $foo)
  (br $target)
)
```
This branch does not compile to simply a jump to the assembly-code location corresponding to `$target`. It also cleans up the portion of the stack that is not relevant to `$target`. In this case, that involves decrementing the reference count of the `refcounted` value returned by `$foo` that is sitting on the stack. In a single-threaded engine, that might in turn result in freeing the corresponding memory space and consequently require recursively decrementing the counts of any values contained therein. Depending on how large this structure is and how much of it is no longer necessary, this might take a while. So a simple instruction like `br` can actually execute a fair amount behind the scenes depending on how the engine manages resources and the stack.

All this is to illustrate that a label like `$target` more accurately corresponds to an assembly-code location *and* a location within the stack, and likewise an instruction like `br $target` more accurately corresponds to "clean up the stack up to the stack location corresponding to `$target` and then jump to the assembly-code location corresponding to `$target`".
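To make the "stack location plus code location" reading concrete, here is a minimal Python sketch of the `$foo` example. Everything in it is an assumption made for illustration: the `RefCounted` class, the `decref` helper, and the representation of a label as a `(stack height, code location)` pair are hypothetical and do not describe how any particular engine works.

```python
class RefCounted:
    """Hypothetical engine-managed value with a reference count."""
    def __init__(self, name, children=()):
        self.name = name
        self.count = 1                  # one reference: its owner
        self.children = list(children)

freed = []  # order in which values were freed

def decref(value):
    """Drop one reference; free recursively when the count hits zero."""
    value.count -= 1
    if value.count == 0:
        freed.append(value.name)
        for child in value.children:
            decref(child)

def br(stack, label):
    """Clean up the stack down to the label's height, then 'jump'."""
    height, code_location = label
    while len(stack) > height:
        decref(stack.pop())
    return code_location

# (block $target (call $foo) (br $target))
stack = []
target = (len(stack), "after $target")            # stack location + code location
inner = RefCounted("inner")
stack.append(RefCounted("foo_result", [inner]))   # result of (call $foo)
pc = br(stack, target)
```

In this sketch the single `br` ends up freeing both `foo_result` and the `inner` value it owned before control reaches the code location of `$target`.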

Note, though, that "clean up" here only corresponds to the engine's resources on the stack. But what about the application's resources? That's where unwinding comes into play.

## Unwinding

For the sake of this discussion, I am going to say that unwinders are specified using `try instr1* unwind instr2* end`, which executes `instr1*` but indicates that `instr2* : [] -> []` should be used to "unwind the stack", i.e. to perform application-level cleanup. In a second, I'll get to how one causes the stack to be "unwound" rather than just "cleaned up".

Now consider some surface-level code using the common `finally` construct:
```
while (true) {
  File file = open(...);
  try {
    try {
      if (...)
        break;
    } catch (SomeException) {}
  } finally {
    close(file);
  }
}
```
Normally a `break` in a `while` loop would translate to a `br` in WebAssembly, but the `finally` clause in this snippet requires that its body be executed no matter how control leaves the `try` body. We could consider inlining the body of the `finally` at the `break`, but that results in code duplication, plus it would result in incorrectly catching `SomeException` if one gets thrown by `close(file)` (nor does it work as well in other examples where there are other `try`/`catch` clauses surrounding the `break`).

Really what we want to do is to extend the semantics of "clean up the stack" that is already part of branching to incorporate "and execute unwinders". That is, we want to *modify* the branch instruction so that it also unwinds the stack.

One way we could enable this is to introduce an `unwinding` instruction that must precede a branching instruction, and its semantics is to modify that branching instruction to execute the unwinders in `unwind` clauses as it cleans up the stack. With this, the `break` instruction in the example above would translate to the instruction sequence `unwinding (br $loop_exit)`.
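As a rough illustration of the difference between `br $loop_exit` and `unwinding (br $loop_exit)`, here is a small Python sketch. The `Unwinder` class, the `br` function with its `unwinding` flag, and the `run_break` helper are all hypothetical names introduced for this sketch. It models the stack as a Python list in which `unwind` clauses appear as entries.

```python
class Unwinder:
    """Stack entry standing in for a `try ... unwind instr2* end` frame."""
    def __init__(self, body):
        self.body = body  # callable standing in for instr2*

def br(stack, label, log, unwinding=False):
    """Pop the stack down to the label; run unwinders only if `unwinding`."""
    height, code_location = label
    while len(stack) > height:
        entry = stack.pop()
        if isinstance(entry, Unwinder):
            if unwinding:
                entry.body(log)                  # application-level cleanup
        else:
            log.append(f"engine drops {entry}")  # engine-level cleanup
    return code_location

def run_break(unwinding):
    """Model the `break` inside the try/finally of the example above."""
    stack, log = [], []
    loop_exit = (len(stack), "$loop_exit")
    stack.append(Unwinder(lambda log: log.append("close(file)")))
    stack.append("temporary")
    target = br(stack, loop_exit, log, unwinding=unwinding)
    return target, log
```

Under these assumptions, `run_break(False)` only performs the engine's cleanup, while `run_break(True)` also records `close(file)` on its way to `$loop_exit`. Both take the same path through the stack; the modifier only changes what happens along the way.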

## Exception Handling

So far I haven't talked about exception handling, just unwinding. This illustrates that unwinding the stack, like cleaning up the stack, is its own concept. And although unwinding is an integral part of exception handling, bundling it with exception handling as the current proposal does is a misunderstanding of the concept.

But if unwinding is a separable component of exception handling, what is the other component? The answer to that depends on whether you're talking about single-phase exception handling or two-phase exception handling.

### Single-Phase

Again, for the sake of this discussion, I am going to say that single-phase exception handling is done using `try instr* catch $event $label`, which indicates that any `$event` exceptions thrown from `instr*` should be caught and handled by `$label` (where the types of `$event` and `$label` match).

Now, consider the following WebAssembly program:
```
(try
  (try
    (throw $event)
  unwind
    ...
  end)
catch $event $label)
```
We can reduce this program to the following:
```
(try
  (try
    unwinding (br $label)
  unwind
    ...
  end)
catch $event $label)
```

When we can see the contents of the stack, we can replace a `throw $event` with an `unwinding (br $label)` to whatever label the event is currently bound to in the stack. That is, events are dynamically scoped variables that get bound to labels, and `throw` means "branch-and-unwind to whatever label the event is bound to in the current stack". (Of course, an important optimization is to unwind the stack as you search for these dynamically-scoped bindings.)

This suggests that we can break `throw` up into two parts: `unwinding` and `br_stack $event`. The latter is an instruction that just transfers control to and does necessary cleanup up to some label determined by the current stack. This instruction on its own could even have utility, say for more severe exceptions that want to bypass unwinders or guarantee transfer.
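The decomposition of `throw` can be sketched in Python as follows. This is only a model under stated assumptions: the `Catch` and `Unwinder` stack entries, the `(height, code location)` label encoding, and the Python functions named `br_stack` and `throw` are hypothetical stand-ins for the instructions discussed above.

```python
class Unwinder:
    """Stack entry standing in for a `try ... unwind instr2* end` frame."""
    def __init__(self, body):
        self.body = body

class Catch:
    """Stack entry for `try instr* catch $event $label`: binds event to label."""
    def __init__(self, event, label):
        self.event = event
        self.label = label

def br(stack, label, log, unwinding=False):
    """Pop the stack down to the label; run unwinders only if `unwinding`."""
    height, code_location = label
    while len(stack) > height:
        entry = stack.pop()
        if unwinding and isinstance(entry, Unwinder):
            entry.body(log)
    return code_location

def br_stack(stack, event, log, unwinding=False):
    """Branch to whatever label `event` is bound to in the current stack."""
    for entry in reversed(stack):          # innermost binding wins
        if isinstance(entry, Catch) and entry.event == event:
            return br(stack, entry.label, log, unwinding)
    raise LookupError(f"{event} is not bound in the current stack")

def throw(stack, event, log):
    """`throw $event` modeled as `unwinding (br_stack $event)`."""
    return br_stack(stack, event, log, unwinding=True)

def example(use_throw):
    stack, log = [], []
    label = (len(stack), "$label")
    stack.append(Catch("$event", label))
    stack.append(Unwinder(lambda log: log.append("unwinder ran")))
    if use_throw:
        target = throw(stack, "$event", log)
    else:
        target = br_stack(stack, "$event", log)
    return target, log, len(stack)
```

Both variants reach `$label` and leave the stack empty. Only `example(True)` runs the unwinder, which is the "bypass unwinders" use of a bare `br_stack` mentioned above.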

### Two-Phase

In two-phase exception handling, you use some form of stack inspection to determine the target label *before* you execute the `unwinding` branch to that label.

For the sake of this discussion, I'll say that an inspection is begun by using the instruction `call_stack $call_tag`, which looks up the stack for contexts of the form `answer $call_tag instr1* within instr2* end` (where execution is currently within `instr2*`), in which case the instructions `instr1*` are executed as the body of a dynamically-scoped function (see WebAssembly/design#1356 for more info).

As an example of unwinding in two-phase exception handling, consider the following C# code:
```
bool flag = …;
try {
    …
} catch (Exception) when (flag = !flag) {
    println("caught first");
} catch (Exception) {
    println("caught second");
}
```

This would be compiled to the following WebAssembly code (assuming C# `throw` compiles to `call_stack $csharp_throw`):
```
(call_tag $csharp_throw : [csharp_ref] -> [])

(block $outer
    (block $first
        (block $second
            (answer $csharp_throw ;; [csharp_ref] -> []
                (local.set $flag (i32.xor (i32.const 1) (local.get $flag)))
                unwinding (br_if $first (local.get $flag))
                unwinding (br $second)
            within
                …
                (br $outer)
            )
        ) ;; $second : [csharp_ref]
        (call $println "caught second")
        (br $outer)
    ) ;; $first : [csharp_ref]
    (call $println "caught first")
    (br $outer)
) ;; $outer : []
```
Notice that the `answer $csharp_throw` has a bunch of `unwinding` branches. Which of these gets executed depends on the state of the `flag` variable at the time the exception reaches the `try` in the C# source code. (Note that there are no `try` instructions or events in the compiled WebAssembly.) Depending on that `flag`, we'll either have an `unwinding` branch to `$first` or an `unwinding` branch to `$second`. In either case, the semantics is "clean up and unwind the stack up to the stack location corresponding to the chosen label and then jump to the assembly-code location corresponding to the chosen label". The difference between here and the original examples using `finally` is that the portion of the stack that needs to be cleaned up and unwound is not known statically. That is important for implementation (e.g. because it requires stack walking), but semantically speaking it is straightforward and aligns well with the existing abstractions in WebAssembly.
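Here is a hedged Python sketch of the two phases for this example. The `Answer` and `Unwinder` classes, the `call_stack` and `br` functions, and the `run` helper are hypothetical modeling devices, not proposed instructions or engine internals. The point is only that the answer body runs while the stack is still intact and picks the destination before any unwinding happens.

```python
class Unwinder:
    """Stack entry standing in for a `try ... unwind instr2* end` frame."""
    def __init__(self, body):
        self.body = body

class Answer:
    """Stack entry for `answer $call_tag instr1* within instr2* end`."""
    def __init__(self, tag, handler):
        self.tag = tag
        self.handler = handler  # callable standing in for instr1*

def br(stack, label, log, unwinding=False):
    """Pop the stack down to the label; run unwinders only if `unwinding`."""
    height, code_location = label
    while len(stack) > height:
        entry = stack.pop()
        if unwinding and isinstance(entry, Unwinder):
            entry.body(log)
    return code_location

def call_stack(stack, tag, payload):
    """Phase 1: find the innermost answer and run it on the intact stack."""
    for entry in reversed(stack):
        if isinstance(entry, Answer) and entry.tag == tag:
            return entry.handler(payload)
    raise LookupError(f"no answer for {tag}")

def run(flag):
    """Model the compiled C# example, starting with the given `flag`."""
    stack, log = [], []
    state = {"flag": flag}
    first, second = (0, "$first"), (0, "$second")

    def answer(exception):
        state["flag"] = not state["flag"]            # flag = !flag
        log.append(f"phase 1 sees {len(stack)} stack entries")
        chosen = first if state["flag"] else second  # br_if $first / br $second
        return br(stack, chosen, log, unwinding=True)  # phase 2

    stack.append(Answer("$csharp_throw", answer))
    stack.append(Unwinder(lambda log: log.append("unwinder ran")))
    target = call_stack(stack, "$csharp_throw", "some exception")
    return target, log
```

With these assumptions, `run(False)` lands on `$first` and `run(True)` lands on `$second`. In both cases the log shows the filter observing the full stack before the unwinder runs.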

## Summary

Regardless of whether we want to actually make an `unwinding` instruction, the important thing to note here is that unwinding is *always done with a destination*. How that destination is determined varies, but in most of the examples above the destination is known *before unwinding begins*.

The current proposal is about single-phase exception handling. But as I've tried to illustrate here, single-phase exception handling is really two concepts combined: dynamically-scoped branching *and* unwinding. So for the proposal to be extensible, it is important that its design for unwinding is compatible with other notions of destination, even if this proposal on its own solely enables dynamically-scoped destination labels (i.e. events). Ideas like those in #123 would help achieve this goal.