externref/funcref distinction fundamental for toolchain? #150

Hello,

While implementing `ref.null` support in LLVM, I ran into an interesting issue.  The summary is that I think it makes sense to define an LLVM-specific top type encompassing both `externref` and `funcref`; an `anyref`, if you will.

Concretely, `ref.null` needs a type operand.  The instruction is specified as being `ref.null REFTYPE` (https://webassembly.github.io/reference-types/core/syntax/instructions.html#reference-instructions), and encodes as such in the binary.

Currently `REFTYPE` can only be `externref` or `funcref`.  However, with [typed function references](https://github.com/WebAssembly/function-references/blob/master/proposals/function-references/Overview.md#reference-types), the set becomes unbounded, as users can define their own types, e.g. `(func (i32 i32) -> (i64 f32))`.  So at least on the MC layer we will need `ref.null` to take a reftype operand.  I will therefore probably add a `Reftype` operand kind, similar in a way to the [`Signature` operand to block instructions](https://github.com/llvm/llvm-project/blob/master/llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td#L185).
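To make the operand idea concrete, here is a minimal sketch of what a reftype operand would have to carry and how `ref.null` would encode with it. The `RefType` struct and the helper names are hypothetical and are not LLVM's actual representation. The opcode `0xD0` and the `funcref`/`externref` bytes come from the reference-types binary format; the type-index case assumes the signed-LEB128 index encoding described in the typed function references proposal.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of a reftype operand (illustration only, not LLVM code).
struct RefType {
  enum class Kind { Func, Extern, TypeIndex };
  Kind kind;
  uint32_t typeIndex; // only meaningful when kind == Kind::TypeIndex
};

// Hypothetical convenience constructors.
RefType funcRef() { return RefType{RefType::Kind::Func, 0}; }
RefType externRef() { return RefType{RefType::Kind::Extern, 0}; }
RefType typedRef(uint32_t index) {
  return RefType{RefType::Kind::TypeIndex, index};
}

// Binary encodings from the reference-types proposal.
constexpr uint8_t kRefNullOpcode = 0xD0;
constexpr uint8_t kFuncRefByte = 0x70;
constexpr uint8_t kExternRefByte = 0x6F;

// Append a non-negative value as signed LEB128. Assumption: type indices
// are encoded as signed integers so they cannot collide with the
// single-byte abstract types above.
void appendSLEB128(std::vector<uint8_t> &out, int64_t value) {
  assert(value >= 0 && "this sketch only encodes type indices");
  for (;;) {
    uint8_t byte = static_cast<uint8_t>(value & 0x7F);
    value >>= 7;
    // Stop once the remaining value is zero and the sign bit is clear.
    if (value == 0 && (byte & 0x40) == 0) {
      out.push_back(byte);
      return;
    }
    out.push_back(byte | 0x80);
  }
}

// One ref.null instruction; the operand alone selects the type.
std::vector<uint8_t> encodeRefNull(const RefType &type) {
  std::vector<uint8_t> bytes{kRefNullOpcode};
  switch (type.kind) {
  case RefType::Kind::Func:
    bytes.push_back(kFuncRefByte);
    break;
  case RefType::Kind::Extern:
    bytes.push_back(kExternRefByte);
    break;
  case RefType::Kind::TypeIndex:
    appendSLEB128(bytes, type.typeIndex);
    break;
  }
  return bytes;
}
```

For instance, `encodeRefNull(externRef())` yields the bytes `D0 6F`, and a single instruction definition covers every reftype because the type lives entirely in the operand.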

To compare, the approach taken in the implementation of the table instructions was to provide e.g. [`TABLE.GET_externref` for tables returning externref, and `TABLE.GET_funcref` for those returning funcref](https://github.com/llvm/llvm-project/blob/master/llvm/lib/Target/WebAssembly/WebAssemblyInstrTable.td#L15).  Given the signature operand to `ref.null` though, it is not necessary from a target point of view to have two kinds of `ref.null`.

Which leads me to my proposal: what good does it do us in LLVM to distinguish `externref` and `funcref` values as different `MachineValueType`s?  The distinction is not sufficient to provide the information needed by `ref.null`, and yet it is not necessary for instructions like `table.get`.  It would be simpler if we could just treat all reference types the same.

In the case of `table.get` and similar instructions, it turns out that discriminating between `externref` and `funcref` is not necessary for the target encoding; the result of `table.get` is the type of the table.  We could remove the duplicate instruction definitions, and define `table.get` as just returning a value of type `anyref`.
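The point about `table.get` can be sketched as follows. The `Module` and `ElemType` types and the function names are hypothetical, for illustration only. The opcode `0x25` followed by an unsigned-LEB128 table index is the `table.get` encoding from the reference-types binary format.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model: each table declares its element type once.
enum class ElemType { FuncRef, ExternRef };

struct Module {
  std::vector<ElemType> tables; // indexed by table index
};

constexpr uint8_t kTableGetOpcode = 0x25;

// The result type of table.get is looked up from the table declaration;
// nothing in the instruction itself names it.
ElemType tableGetResultType(const Module &module, uint32_t tableIndex) {
  assert(tableIndex < module.tables.size() && "table index out of range");
  return module.tables[tableIndex];
}

// The encoding is the opcode plus the table index as unsigned LEB128,
// regardless of whether the table holds funcref or externref values.
std::vector<uint8_t> encodeTableGet(uint32_t tableIndex) {
  std::vector<uint8_t> bytes{kTableGetOpcode};
  do {
    uint8_t byte = static_cast<uint8_t>(tableIndex & 0x7F);
    tableIndex >>= 7;
    if (tableIndex != 0)
      byte |= 0x80; // more bytes follow
    bytes.push_back(byte);
  } while (tableIndex != 0);
  return bytes;
}
```

Since `encodeTableGet` never consults the element type, one instruction definition would be enough from the encoder's point of view.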

If this analysis is right, we should replace the `externref` and `funcref` MVTs with a single `anyref`.  Where the difference matters for the instruction encoding, the instruction will have to take a `Reftype` operand.  I will work up a patch.

One question is how to represent `ref.null` at the IR level.  Given that the set of types is unbounded (once we have typed function references), a quick-and-dirty way would be to define the intrinsic as `anyref __builtin_wasm_ref_null(const char *type)`, and pass either `"externref"` or `"funcref"` as an immediate string.  This is just a placeholder idea, I guess.

For context, it used to be that there was just `anyref` in the reference-types proposal, but it was later [changed to `externref` and `funcref`](https://github.com/WebAssembly/reference-types/pull/87).  This was essentially for run-time concerns, AFAIU: you might want to represent function references and GC objects differently, and forcing a top type onto them constrains the run-time in undesirable ways.  I get that.  But for the compiler, it doesn't seem to me like the difference buys us anything.

Cc @tlively @sbc100 @pmatos.  If this discussion might be better elsewhere, happy to take it there :)
