Skip to content

Resource arrays design #298

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: main
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
141 changes: 139 additions & 2 deletions proposals/0028-resource-arrays.md
Original file line number Diff line number Diff line change
Expand Up @@ -348,11 +348,148 @@ call %dx.types.Handle @dx.op.createHandleFromBinding(i32 217, %dx.types.ResBind

## Proposed solution

To avoid issues described
[above](#example-where-dxc-does-not-handle-local-arrays-correctly), we should
aim to resolve initialization of resource array handles early in Clang. That
involves intercepting the codegen for array access to emit the a resource class
constructor call, which will eventually be transformed to
`@dx.op.createHandleFromBinding` DXIL intrinsic. Creating local copies of
resource arrays is mostly handled by the existing array parameter passing code,
but additional work will likely be needed to support indexing subsets of
multi-dimensional arrays.

### Changes from DXC

One notable difference from DXC behavior that we propose for Clang concerns the
handling of unbounded resource arrays:

*Local declarations of unbounded resource arrays will not be allowed.*

Locally-scoped unbounded resource arrays are incorrect from a language
perspective, as are unbounded arrays used as function arguments. Local arguments
are initialized by creating a local copy of the array, which does not make sense
for arrays of unknown size.

Unbounded resource array declarations will only be allowed at global scope and
can only be referenced through the global variable with which they are declared.

## Detailed design

## Alternatives considered (Optional)
### Codegen for Resource Array Indexing

We need to intercept codegen for `ArraySubscriptExpr` in Clang codegen. If the
indexed array is a resource array declared at global scope and the expected
result is a single resource, the array element access should be generated as a
resource class constructor call with the appropriate indexing and binding
values.

For example in this simple shader:

```
RWBuffer<float> A[4] : register(u10);
RWStructuredBuffer<float> Out;

[numthreads(4,1,1)]
void main() {
Out[0] = A[2][1];
}
```

The code generated for the `A[2][1]` expression should be equivalent
`RWBuffer<float>(10, 0, 4, 2, "A")[1]` - that is, a constructor call followed by a
subscript operator invoked on the resource class. Note that the constructor
arguments represent the resource binding register `10`, space `0`, size of the
array `4`, and index in the resource array `2`.

Unlike individual resource declarations, resource arrays will not be initialized
by Sema. However, Sema must ensure that the constructor for the specific
resource template class is instantiated and its definition is emitted, so that
Clang codegen can call it.

### Codegen Changes to Handle Local Resource Arrays

Local copies of resource arrays should be mostly handled by the existing code
that creates array copies for function arguments and by the presence of copy
constructors on resource classes. Some additional work will likely be required
to support indexing sub-arrays of multi-dimensional resource arrays.

## Acknowledgments (Optional)
Consider this example:

```
RWBuffer<float> N[10][5] : register(u0);
RWBuffer<float> Out;

float foo(RWBuffer<float> P[5]) {
return P[3][0];
}

[numthreads(4,1,1)]
void main() {
Out[0] = foo(N[7]);
}
```

The `N[7]` expression should create a local copy of the sub-array of size `5`. The
changes needed to support this will most likely be required in the same part of
Clang code generation that handles `ArraySubscriptExpr`.

### Changed Needed in LLVM Passes

After Clang code generation and LLVM optimizations, the code related to resource
arrays will most likely require some changes in LLVM backend passes.

The existing `DXILForwardHandleAccesses` pass aims to eliminate redundant stores
and loads from resource handle globals. However, it currently expects resource
handles to be loaded from a global variable, which is not the case for resource
array element handles.

Additionally, local resource arrays and their copies tend to generate IR code
that includes resource handles stored in local variables (using `alloca` and
lifetime markers), with `load` and `store` operations on those handles through
an `i32` type.These will need to be either cleaned up in the
`DXILForwardHandleAccesses` pass, or changes may need to be made to Clang
codegen for array copying or to subsequent LLVM array optimizations to
eliminate these unwanted constructs. The scope of this work is currently TBD.

## Alternatives Considered

### Resolving Resource Array Indexing in LLVM Pass

Following the DXC approach, leaving resource array handle resolution until after
optimization, to be performed in a dedicated LLVM pass. We would like to avoid
this because this can get very complicated very fast. The DXC pass that handles
this is notoriously buggy (see example
[here](#example-where-dxc-does-not-handle-local-arrays-correctly)).

### Use Builtin Template Class to Represent Resource Arrays

Another approach considered is to encapsulate resource arrays in a built-in
template class defined in `HLSLExternalSemaSource`. The class could be named
`BoundResourceRange` to emphasize that it represents a set of resources
mapped/bound to a descriptor range, rather than a traditional array. The
template arguments would specify the array size and the resource type it
contains.Any resource array variable declaration at global scope would have its
type replaced with the corresponding `BoundResourceRange` specialization.

The class would provide a subscript operator (`[]`) that initializes individual
resource handles as they are accessed. It would support unbounded arrays, and
multi-dimensional arrays could be represented by nesting `BoundResourceRange`

Introducing such a class raises several new questions:
- Should it be directly spellable by users?
- Should users have the option to use either syntax?
- Should we allow explicit `BoundResourceRange` function arguments?
- Does the rewriter need to preserve the original array declaration syntax, or
is it acceptable to replace resource arrays with `BoundResourceRange`?
- Handling of resources arrays in Sema would need to support both
`BoundResourceRange` and resource arrays types.
- reating a local copy of a multi-dimensional array cannot be implemented by the
class alone and would require changes in Clang code generation anyway.

## Acknowledgments

Chris Bieneman
Justin Bogner
Tex Riddell

<!-- {% endraw %} -->