Implement the dot4add_i8packed HLSL Function #99220

Closed
@farzonl

Description

  • Implement dot4add_i8packed clang builtin
  • Link dot4add_i8packed clang builtin with hlsl_intrinsics.h
  • Add sema checks for dot4add_i8packed to CheckHLSLBuiltinFunctionCall in SemaChecking.cpp
  • Add codegen for dot4add_i8packed to EmitHLSLBuiltinExpr in CGBuiltin.cpp
  • Add codegen tests to clang/test/CodeGenHLSL/builtins/dot4add_i8packed.hlsl
  • Add sema tests to clang/test/SemaHLSL/BuiltIns/dot4add_i8packed-errors.hlsl
  • Create the int_dx_dot4add_i8packed intrinsic in IntrinsicsDirectX.td
  • Create the DXILOpMapping of int_dx_dot4add_i8packed to 163 in DXIL.td
  • Create the dot4add_i8packed.ll and dot4add_i8packed_errors.ll tests in llvm/test/CodeGen/DirectX/
  • Create the int_spv_dot4add_i8packed intrinsic in IntrinsicsSPIRV.td
  • In SPIRVInstructionSelector.cpp create the dot4add_i8packed lowering and map it to int_spv_dot4add_i8packed in SPIRVInstructionSelector::selectIntrinsic.
  • Create SPIR-V backend test case in llvm/test/CodeGen/SPIRV/hlsl-intrinsics/dot4add_i8packed.ll

DirectX

| DXIL Opcode | DXIL OpName | Shader Model | Shader Stages |
| --- | --- | --- | --- |
| 163 | Dot4AddI8Packed | 6.4 | () |

SPIR-V

OpSDot:

Description:

Signed integer dot product of Vector 1 and Vector 2.

Result Type must be an integer type whose Width must be greater than
or equal to that of the components of Vector 1 and Vector 2.

Vector 1 and Vector 2 must have the same type.

Vector 1 and Vector 2 must be either 32-bit integers (enabled by the
DotProductInput4x8BitPacked capability) or vectors of
integer type (enabled by the DotProductInput4x8Bit or
DotProductInputAll capability).

When Vector 1 and Vector 2 are scalar integer types, Packed Vector
Format must be specified to select how the integers are to be
interpreted as vectors.

All components of the input vectors are sign-extended to the bit width
of the result’s type. The sign-extended input vectors are then
multiplied component-wise and all components of the vector resulting
from the component-wise multiplication are added together. The resulting
value will equal the low-order N bits of the correct result R, where N
is the result width and R is computed with enough precision to avoid
overflow and underflow.
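
To make the packed case concrete, here is a minimal host-side sketch of the description above for two 32-bit scalars treated as four signed 8-bit components. The helper names `unpack_4x8_signed` and `sdot_4x8_packed` are hypothetical and are not part of LLVM or SPIR-V. The lane order chosen here (lane 0 in the least significant byte) does not affect the result, because the same order is applied to both inputs.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: interpret a 32-bit scalar as a vector of four signed
// 8-bit components, sign-extending each component to 32 bits.
inline std::array<int32_t, 4> unpack_4x8_signed(uint32_t packed) {
    std::array<int32_t, 4> lanes{};
    for (int i = 0; i < 4; ++i) {
        const int32_t byte = static_cast<int32_t>((packed >> (8 * i)) & 0xFFu);
        lanes[i] = (byte ^ 0x80) - 0x80;  // portable sign extension of 8 bits
    }
    return lanes;
}

// Hypothetical reference model of OpSDot with a 32-bit result type.
// Each product lies in [-16256, 16384], so the sum of four products always
// fits in int32_t and no truncation to the low-order bits is needed here.
inline int32_t sdot_4x8_packed(uint32_t v1, uint32_t v2) {
    const std::array<int32_t, 4> x = unpack_4x8_signed(v1);
    const std::array<int32_t, 4> y = unpack_4x8_signed(v2);
    int32_t sum = 0;
    for (int i = 0; i < 4; ++i)
        sum += x[i] * y[i];  // component-wise multiply, then add
    return sum;
}
```

For example, `sdot_4x8_packed(0x01020304u, 0x01010101u)` sums the lanes 4, 3, 2, 1 and yields 10.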

Capability:
DotProduct

Missing before version 1.6.

| Word Count | Opcode | Results | Operands |
| --- | --- | --- | --- |
| 5 + variable | 4450 | `<id>` Result Type, Result `<id>` | `<id>` Vector 1, `<id>` Vector 2, Optional Packed Vector Format |
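
As an illustration of the word layout, the sketch below assembles the six words of an OpSDot that carries the optional Packed Vector Format operand. It relies on two facts from the SPIR-V specification: the first word of an instruction holds the word count in its high 16 bits and the opcode in its low 16 bits, and the `PackedVectorFormat4x8Bit` enumerant has the value 0. The function name `encode_sdot_packed` and the idea of passing raw `<id>` numbers are assumptions made for this example only. This is not how the SPIR-V backend emits instructions.

```cpp
#include <array>
#include <cstdint>

constexpr uint32_t kOpSDot = 4450;                 // opcode listed in this issue
constexpr uint32_t kPackedVectorFormat4x8Bit = 0;  // SPIR-V enumerant value

// Hypothetical helper: build the words of an OpSDot instruction whose inputs
// are 32-bit scalars, so the optional Packed Vector Format operand is present.
inline std::array<uint32_t, 6> encode_sdot_packed(uint32_t resultTypeId,
                                                  uint32_t resultId,
                                                  uint32_t vector1Id,
                                                  uint32_t vector2Id) {
    const uint32_t wordCount = 6;  // 5 fixed words + 1 optional operand
    return std::array<uint32_t, 6>{{
        (wordCount << 16) | kOpSDot,  // high half: word count, low half: opcode
        resultTypeId,
        resultId,
        vector1Id,
        vector2Id,
        kPackedVectorFormat4x8Bit,
    }};
}
```

With the optional operand present, the first word is `0x00061162` (word count 6, opcode 4450 = 0x1162).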

Test Case(s)

Example 1

//dxc dot4add_i8packed_test.hlsl -T lib_6_8 -enable-16bit-types -O0

export int fn(uint p1, uint p2, int p3) {
    return dot4add_i8packed(p1, p2, p3);
}

HLSL:

Syntax

int dot4add_i8packed(uint a, uint b, int c);

Type Description

| Name | Template Type | Component Type | Size |
| --- | --- | --- | --- |
| ret | scalar | int | 1 |
| a | scalar | uint | 1 |
| b | scalar | uint | 1 |
| c | scalar | int | 1 |
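
For computing expected values in tests, a rough C++ reference model of this signature might look like the following. It assumes that `a` and `b` each pack four signed 8-bit values and that the accumulation into `c` wraps on overflow rather than saturating. The wraparound is an assumption of this sketch and is not stated in this issue. The names `signed_lane` and `dot4add_i8packed_ref` are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: sign-extend the byte at index `lane` of `packed`
// (lane 0 is the least significant byte).
inline int32_t signed_lane(uint32_t packed, int lane) {
    const int32_t byte = static_cast<int32_t>((packed >> (8 * lane)) & 0xFFu);
    return (byte ^ 0x80) - 0x80;
}

// Hypothetical reference model: c + dot(a, b) over four signed 8-bit lanes.
inline int32_t dot4add_i8packed_ref(uint32_t a, uint32_t b, int32_t c) {
    // Accumulate in unsigned arithmetic so overflow wraps instead of being
    // undefined behaviour. Wrapping is an assumption of this sketch.
    uint32_t acc = static_cast<uint32_t>(c);
    for (int lane = 0; lane < 4; ++lane)
        acc += static_cast<uint32_t>(signed_lane(a, lane) * signed_lane(b, lane));
    // Conversion back to int32_t is modular on two's-complement targets
    // (guaranteed since C++20).
    return static_cast<int32_t>(acc);
}
```

For instance, `dot4add_i8packed_ref(0x01020304u, 0x01010101u, 5)` adds the dot product 10 to the accumulator 5 and returns 15.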

Minimum Shader Model

This function is supported in the following shader models.

| Shader Model | Supported |
| --- | --- |
| Shader Model 6.4 and higher shader models | yes |

Shader Stages

See also

Labels

  • HLSL: HLSL Language Support
  • backend:DirectX
  • backend:SPIR-V
  • bot:HLSL
  • clang:codegen: IR generation bugs: mangling, exceptions, etc.
  • metaissue: Issue to collect references to a group of similar or related issues.
