Description
In all generated vector store instruction tests (for example, vse8.v), the resultdata area is never cleared or advanced between consecutive tests. Every test stores to the same base address and then reloads from that address, so any bytes not overwritten by the current test retain values written by a previous test. As a result, TEST_CASE checks read stale data and produce incorrect pass/fail results, most severely in masked tests where VL is small or zero and few (or no) bytes are actually written.
Affected instructions
All unit-stride, strided, and indexed vector store instructions where the test generator emits a la a0, resultdata reload after the store block, including but not limited to:
- vse8.v, vse16.v, vse32.v, vse64.v
- vsse8.v, vsse16.v, vsse32.v, vsse64.v
- vsoxei*.v, vsuxei*.v
- All masked (v0.t) variants of the above
Root cause
After every store test block the generator emits a reload of v1 from resultdata:
la a0, resultdata
vle8.v v1, (a0)
The pointer always resets to the base of resultdata with no offset increment and no zeroing. When the next test writes fewer elements than the previous one (e.g. masked store with VL=0), the unwritten bytes still hold data from the earlier test. The TEST_CASE macros then read those stale bytes as if they were produced by the current test.
Minimal example
# Test 1 — VL:2, mf8, e8, mask:false
la a0, resultdata
vse8.v v1, (a0) # writes 2 elements at resultdata+0
vle8.v v1, (a0) # reloads — ok here
TEST_CASE(34, t0, 0xf020100, ld t0, 0(a0); addi a0, a0, 8)
# ^^^ passes, but resultdata still holds 0xf020100 in bytes 0-3
# Test 2 — VL:0, mf8, e8, mask:true (should write nothing)
la a0, resultdata # ← same address, no clear, stale bytes still present
vse8.v v1, (a0), v0.t # VL=0 → writes nothing
vle8.v v1, (a0) # reloads stale bytes from test 1
TEST_CASE(38, t0, 0xf020100, ld t0, 0(a0); addi a0, a0, 8)
# ^^^ passes INCORRECTLY — value is leftover from test 1, not produced by test 2
Expected behaviour
Each test must observe only the bytes it wrote. Two approaches would fix this:
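- Option A: zero resultdata before each test (recommended). Emit a small clear sequence (e.g. vmv.v.i + vse) covering the maximum store footprint before every store test block. Simple, guaranteed isolation, no offset bookkeeping needed.
- Option B: advance the resultdata pointer. Track a running byte offset in the generator and bump it by the test's store footprint after each test, resetting to the base when the region would overflow. For a unit-stride store the footprint is at most LMUL × VLEN / 8 bytes (VLMAX = LMUL × VLEN / SEW elements of SEW / 8 bytes each); strided and indexed stores can span more. More efficient at runtime but requires careful offset tracking in the generator, and wrapping back to the base reintroduces stale bytes unless the region is cleared at that point.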
Option A is simpler to implement and less error-prone.
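A minimal sketch of the Option A clear sequence, assuming RVV 1.0 assembler syntax. RESULT_BYTES is a hypothetical constant standing in for the size of the resultdata region (the real size is not stated here), and v8 is an arbitrary register group chosen for illustration; the generator would need to pick values that match its own layout.

```asm
    # Hypothetical: size in bytes of the resultdata region to clear.
    .equ RESULT_BYTES, 1024

    # Zero resultdata in chunks of up to VLMAX bytes (SEW=8, LMUL=8).
    # Clobbers a0, t0, t1, v8-v15, vl and vtype.
    la      a0, resultdata          # a0 = current write pointer
    li      t1, RESULT_BYTES        # t1 = bytes still to clear
1:
    vsetvli t0, t1, e8, m8, ta, ma  # t0 = vl = bytes handled this pass
    vmv.v.i v8, 0                   # fill the first vl bytes of v8-v15 with 0
    vse8.v  v8, (a0)                # store vl zero bytes
    add     a0, a0, t0              # advance the pointer
    sub     t1, t1, t0              # reduce the remaining count
    bnez    t1, 1b                  # loop until the whole region is cleared
```

Because the loop overwrites vl, vtype, and v8-v15, it would have to be emitted before the test's own vsetvli and before its source registers are loaded, so that the store under test still sees the configuration and data the generator intended.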
Impact
- Masked store tests with VL=0 always read stale data, so the expected values in TEST_CASE are wrong.
- Any store test that writes fewer elements than the previous test inherits leftover bytes in the upper positions.
- Tests may spuriously pass (hiding real bugs in the implementation under test) or spuriously fail depending on test ordering.