Closed
Changes from 28 commits
31 commits
e6d6c98
CI: streamline Unity licensing (ULF/EBL); drop cache mounts & EBL-in-…
dsarno Sep 5, 2025
6e2a1b3
CI: support both ULF + EBL; validate ULF before -manualLicenseFile; r…
dsarno Sep 5, 2025
3a33f75
CI: activate EBL via container using UNITY_IMAGE; fix readiness regex…
dsarno Sep 5, 2025
f9141e4
CI: minimal patch — guard manualLicenseFile by ulf.ok, expand error p…
dsarno Sep 5, 2025
6a253ed
CI: harden ULF staging (printf+chmod); pass ULF_OK via env; use manua…
dsarno Sep 5, 2025
e69b460
CI: assert EBL activation writes entitlement to host mount; fail fast…
dsarno Sep 5, 2025
cb6c694
CI: use heredoc in wait step to avoid nested-quote issues; remove red…
dsarno Sep 5, 2025
124a872
CI: harden wait step (container status check, broader ready patterns,…
dsarno Sep 5, 2025
6f8695b
CI: wait step — confirm bridge readiness via status JSON (unity_port)…
dsarno Sep 5, 2025
deac721
CI: YAML-safe readiness fallback (grep/sed unity_port + bash TCP prob…
dsarno Sep 5, 2025
2ffcb56
CI: refine license error pattern to ignore benign LicensingClient cha…
dsarno Sep 5, 2025
3e89666
Improve Unity bridge wait logic in CI workflow
dsarno Sep 6, 2025
ece0836
Add comprehensive Unity workflow improvements
dsarno Sep 6, 2025
eadfa7b
Refine Unity workflow licensing and permissions
dsarno Sep 6, 2025
5a11cbf
fix workflow YAML parse
dsarno Sep 6, 2025
0d43cfd
Merge pull request #65 from dsarno/codex/fix-wait-gate-for-unity-lice…
dsarno Sep 6, 2025
1b63bb6
Normalize NL/T JUnit names and robust summary
dsarno Sep 6, 2025
ebbed23
Merge pull request #66 from dsarno/codex/fix-testcase-naming-mismatch…
dsarno Sep 6, 2025
0e2c324
Fix Python import syntax in workflow debug step
dsarno Sep 6, 2025
3bc7bf5
Improve prompt clarity for XML test fragment format
dsarno Sep 6, 2025
fd626ea
Fix problematic regex substitution in test name canonicalization
dsarno Sep 6, 2025
181f3ad
CI: NL/T hardening — enforce filename-derived IDs, robust backfill, s…
dsarno Sep 6, 2025
3e49259
fix: keep file ID when canonicalizing test names
dsarno Sep 6, 2025
40463d1
Merge pull request #67 from dsarno/codex/fix-testcase-naming-to-respe…
dsarno Sep 6, 2025
d0937e8
CI: move Unity Pro license return to teardown after stopping Unity; k…
dsarno Sep 6, 2025
c86c683
CI: remove revert helper & baseline snapshot; stop creating scripts d…
dsarno Sep 6, 2025
e1d8ac5
CI: remove mini workflow and obsolete NL prompts; redact email in all…
dsarno Sep 6, 2025
8234a5d
NL/T prompt: enforce allowed ops, require per-test fragment emission …
dsarno Sep 6, 2025
c92f605
NL suite: enforce strict NL-4 emission; remove brittle relabeling; ke…
dsarno Sep 6, 2025
2598516
NL/T: minimize transcript; tighten NL-4 console reads; add final erro…
dsarno Sep 6, 2025
7a73d98
CI: add staged report fragment promotion step (reports/_staging -> re…
dsarno Sep 7, 2025
45 changes: 0 additions & 45 deletions .claude/prompts/nl-unity-claude-tests-mini.md

This file was deleted.

122 changes: 112 additions & 10 deletions .claude/prompts/nl-unity-suite-full-additive.md
@@ -13,7 +13,25 @@ AllowedTools: Write,mcp__unity__manage_editor,mcp__unity__list_resources,mcp__un
2) Execute **all** NL/T tests in order using minimal, precise edits that **build on each other**.
3) Validate each edit with `mcp__unity__validate_script(level:"standard")`.
4) **Report**: write one `<testcase>` XML fragment per test to `reports/<TESTID>_results.xml`. Do **not** read or edit `$JUNIT_OUT`.

**CRITICAL XML FORMAT REQUIREMENTS:**
- Each file must contain EXACTLY one `<testcase>` root element
- NO prologue, epilogue, code fences, or extra characters
- NO markdown formatting or explanations outside the XML
- Use this exact format:

```xml
<testcase name="T-D — End-of-Class Helper" classname="UnityMCP.NL-T">
<system-out><![CDATA[
(evidence of what was accomplished)
]]></system-out>
</testcase>
```

- If test fails, include: `<failure message="reason"/>`
- TESTID must be one of: NL-0, NL-1, NL-2, NL-3, NL-4, T-A, T-B, T-C, T-D, T-E, T-F, T-G, T-H, T-I, T-J
5) **NO RESTORATION** - tests build additively on previous state.
6) **STRICT FRAGMENT EMISSION** - After each test, immediately emit a clean XML file under `reports/<TESTID>_results.xml` with exactly one `<testcase>` whose `name` begins with the exact test id. No prologue/epilogue or fences. If the test fails, include a `<failure message="..."/>` and still emit.
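
The fragment rules above (exactly one `<testcase>` root, no fences or stray text, `name` beginning with the test id) lend themselves to a mechanical check. The following is a minimal sketch of such a check, not something taken from this PR's workflow; the function name `check_fragment` and the wording of the problem messages are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Test ids as listed in the prompt.
ALLOWED_IDS = {
    "NL-0", "NL-1", "NL-2", "NL-3", "NL-4",
    "T-A", "T-B", "T-C", "T-D", "T-E",
    "T-F", "T-G", "T-H", "T-I", "T-J",
}


def check_fragment(test_id: str, text: str) -> list[str]:
    """Return a list of problems; an empty list means the fragment looks valid."""
    problems = []
    if test_id not in ALLOWED_IDS:
        problems.append(f"unknown test id: {test_id}")

    stripped = text.strip()
    # Code fences, prologue, or epilogue text show up as a wrong prefix/suffix.
    if not (stripped.startswith("<testcase") and stripped.endswith("</testcase>")):
        problems.append("extra characters before or after <testcase>")

    try:
        # fromstring rejects a second root element as well as malformed XML.
        root = ET.fromstring(stripped)
    except ET.ParseError as exc:
        problems.append(f"not well-formed XML: {exc}")
        return problems

    if root.tag != "testcase":
        problems.append(f"root element is <{root.tag}>, expected <testcase>")
    if not root.get("name", "").startswith(test_id):
        problems.append("name attribute does not begin with the test id")
    return problems
```

A caller would read `reports/<TESTID>_results.xml`, pass its contents to `check_fragment`, and treat any returned problem as a reason to rewrite the file.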

---

@@ -32,7 +50,12 @@ CI provides:
## Tool Mapping
- **Anchors/regex/structured**: `mcp__unity__script_apply_edits`
- Allowed ops: `anchor_insert`, `replace_method`, `insert_method`, `delete_method`, `regex_replace`
- For `anchor_insert`, always set `"position": "before"` or `"after"`.
- **Precise ranges / atomic batch**: `mcp__unity__apply_text_edits` (non‑overlapping ranges)
STRICT OP GUARDRAILS
- Do not use `anchor_replace`. Structured edits must be one of: `anchor_insert`, `replace_method`, `insert_method`, `delete_method`, `regex_replace`.
- For multi‑spot textual tweaks in one operation, compute non‑overlapping ranges with `mcp__unity__find_in_file` and use `mcp__unity__apply_text_edits`.

- **Hash-only**: `mcp__unity__get_sha` — returns `{sha256,lengthBytes,lastModifiedUtc}` without file body
- **Validation**: `mcp__unity__validate_script(level:"standard")`
- **Dynamic targeting**: Use `mcp__unity__find_in_file` to locate current positions of methods/markers
@@ -49,7 +72,8 @@ CI provides:
5. **Composability**: Tests demonstrate how operations work together in real workflows

**State Tracking:**
- Track file SHA after each test to ensure operations succeeded
- Track file SHA after each test (`mcp__unity__get_sha`) and use it as a precondition
for `apply_text_edits` in T‑F/T‑G/T‑I to exercise `stale_file` semantics.
- Use content signatures (method names, comment markers) to verify expected state
Comment on lines +87 to 89

💡 Verification agent

🧩 Analysis chain

Confirm the precondition parameter name used by `apply_text_edits`.

The prompt requires SHA preconditions but doesn't name the exact field. Verify the field the MCP server expects (e.g., `preconditionSha256` or equivalent) to avoid silent no-ops.


🏁 Script executed:

#!/bin/bash
# Locate the apply_text_edits handler and its request schema to confirm the precondition field name.
rg -nP -C3 '\bapply_text_edits\b|precondition|sha256|If-Match' UnityMcpBridge/UnityMcpServer~ --type py --type ts --type cs || true

Length of output: 26666


Specify `precondition_sha256` in the prompt
Use the exact `precondition_sha256` parameter name for `apply_text_edits` (not `preconditionSha256` or similar) to ensure the SHA is passed correctly and avoid silent no-ops.

🤖 Prompt for AI Agents
In .claude/prompts/nl-unity-suite-full-additive.md around lines 71 to 73, the
prompt references passing a file SHA to apply_text_edits but does not use the
exact parameter name; update the prompt to use the exact snake_case parameter
name precondition_sha256 (not preconditionSha256 or any other variant) wherever
apply_text_edits is described or an example call is shown so the SHA is passed
correctly and avoids silent no-ops.

- Validate structural integrity after each major change
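
The SHA precondition described under State Tracking can be sketched as follows. This is an illustration of the general optimistic-concurrency pattern, not the MCP server's implementation: `apply_with_precondition` and `StaleFileError` are hypothetical names, and only `precondition_sha256` and `stale_file` come from the prompt and the review comment.

```python
import hashlib


class StaleFileError(Exception):
    """Raised when the caller's SHA no longer matches the file (stale_file)."""


def sha256_of(text: str) -> str:
    """Hex SHA-256 of the UTF-8 encoding of text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def apply_with_precondition(
    current_text: str, new_text: str, precondition_sha256: str
) -> tuple[str, str]:
    """Accept new_text only if precondition_sha256 matches current_text.

    Returns (new_text, new_sha) so the caller can track the SHA for the
    next test's precondition.
    """
    actual = sha256_of(current_text)
    if precondition_sha256 != actual:
        raise StaleFileError(
            f"stale_file: expected {precondition_sha256}, found {actual}"
        )
    return new_text, sha256_of(new_text)
```

In the T-G flow, the first edit would pass the SHA recorded after the previous test, the second would deliberately reuse that now-outdated SHA and expect the stale-file error, and the retry would pass the SHA returned by the first edit.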

@@ -85,7 +109,8 @@ CI provides:
### NL-3. End-of-Class Content (Additive State C)
**Goal**: Demonstrate end-of-class insertions with smart brace matching
**Actions**:
- Use anchor pattern to find the class-ending brace (accounts for previous additions)
- Match the final class-closing brace by scanning from EOF (e.g., last `^\s*}\s*$`)
or compute via `find_in_file` + ranges; insert immediately before it.
- Insert three comment lines before final class brace:
```
// Tail test A
@@ -115,7 +140,7 @@ CI provides:
**Actions**:
- Use `find_in_file` to locate current `HasTarget()` method (modified in NL-1)
- Edit method body interior: change return statement to `return true; /* test modification */`
- Use `validate: "relaxed"` for interior-only edit
- Validate with `mcp__unity__validate_script(level:"standard")` for consistency
- Verify edit succeeded and file remains balanced
- **Expected final state**: State C + modified HasTarget() body

@@ -132,12 +157,14 @@ CI provides:
**Actions**:
- Use smart anchor matching to find current class-ending brace (after NL-3 tail comments)
- Insert permanent helper before class brace: `private void TestHelper() { /* placeholder */ }`
- Validate with `mcp__unity__validate_script(level:"standard")`
- **IMMEDIATELY** write clean XML fragment to `reports/T-D_results.xml` (no extra text). The `<testcase name>` must start with `T-D`. Include brief evidence and the latest SHA in `system-out`.
- **Expected final state**: State E + TestHelper() method before class end
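
NL-3 and T-D both insert immediately before the class-closing brace, and NL-3 suggests finding it by scanning from EOF for the last line matching `^\s*}\s*$`. A minimal sketch of that scan is shown here; `insert_before_final_brace` is a hypothetical helper, and it assumes the last brace-only line closes the class (which would not hold if the class were wrapped in a namespace block).

```python
import re

# A line that contains only a closing brace and optional whitespace.
CLOSING_BRACE = re.compile(r"^\s*}\s*$")


def insert_before_final_brace(source: str, new_lines: list[str]) -> str:
    """Insert new_lines just before the last brace-only line in source."""
    lines = source.split("\n")
    # Walk backwards from EOF so earlier method-closing braces are ignored.
    for index in range(len(lines) - 1, -1, -1):
        if CLOSING_BRACE.match(lines[index]):
            return "\n".join(lines[:index] + new_lines + lines[index:])
    raise ValueError("no closing brace line found")
```

Scanning from the end keeps the insertion point stable as earlier tests add content above it, which is why the same approach still works after NL-3's tail comments are in place.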

### T-E. Method Evolution Lifecycle (Additive State G)
**Goal**: Insert → modify → finalize a method through multiple operations
**Goal**: Insert → modify → finalize a field + companion method
**Actions**:
- Insert basic method: `private int Counter = 0;`
- Insert field: `private int Counter = 0;`
- Update it: find and replace with `private int Counter = 42; // initialized`
- Add companion method: `private void IncrementCounter() { Counter++; }`
- **Expected final state**: State F + Counter field + IncrementCounter() method
@@ -152,6 +179,7 @@ CI provides:
3. Add final class comment: `// end of test modifications`
- All edits computed from same file snapshot, applied atomically
- **Expected final state**: State G + three coordinated comments
- After applying the atomic edits, run `validate_script(level:"standard")` and emit a clean fragment to `reports/T-F_results.xml` with a short summary and the latest SHA.
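
The requirement that all T-F edits be computed from one snapshot and applied atomically over non-overlapping ranges can be sketched as below. This is an assumption-laden illustration: it uses character offsets, whereas the real `apply_text_edits` range format is not shown in this diff, and `Edit` and `apply_atomic` are hypothetical names.

```python
from typing import NamedTuple


class Edit(NamedTuple):
    start: int  # inclusive character offset in the snapshot
    end: int    # exclusive character offset in the snapshot
    text: str   # replacement text


def apply_atomic(source: str, edits: list[Edit]) -> str:
    """Apply all edits against one snapshot, or raise without changing anything."""
    ordered = sorted(edits, key=lambda e: (e.start, e.end))
    for edit in ordered:
        if not 0 <= edit.start <= edit.end <= len(source):
            raise ValueError("edit range out of bounds")
    # Validate every pair before touching the text, so failure is all-or-nothing.
    for previous, following in zip(ordered, ordered[1:]):
        if following.start < previous.end:
            raise ValueError("overlapping edits")
    result = source
    # Apply from the end of the file backwards so earlier offsets stay valid.
    for edit in reversed(ordered):
        result = result[: edit.start] + edit.text + result[edit.end :]
    return result
```

Applying edits back to front is the design choice that lets every range refer to the original snapshot; the overlap check is also what T-I's overlapping-edit case would be expected to trip.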

### T-G. Path Normalization Test (No State Change)
**Goal**: Verify URI forms work equivalently on modified file
@@ -161,13 +189,15 @@ CI provides:
- Second should return `stale_file`, retry with updated SHA
- Verify both URI forms target same file
- **Expected final state**: State H (no content change, just path testing)
- Emit `reports/T-G_results.xml` showing evidence of stale SHA handling and final SHA.

### T-H. Validation on Modified File (No State Change)
**Goal**: Ensure validation works correctly on heavily modified file
**Actions**:
- Run `validate_script(level:"standard")` on current state
- Verify no structural errors despite extensive modifications
- **Expected final state**: State H (validation only, no edits)
- Emit `reports/T-H_results.xml` confirming validation OK and including the latest SHA.

### T-I. Failure Surface Testing (No State Change)
**Goal**: Test error handling on real modified file
@@ -176,13 +206,18 @@ CI provides:
- Attempt edit with stale SHA (should fail cleanly)
- Verify error responses are informative
- **Expected final state**: State H (failed operations don't modify file)
- Emit `reports/T-I_results.xml` capturing error evidence and final SHA; file must contain one `<testcase>`.

### T-J. Idempotency on Modified File (Additive State I)
**Goal**: Verify operations behave predictably when repeated
**Actions**:
- Add unique marker comment: `// idempotency test marker`
- Attempt to add same comment again (should detect no-op)
- Remove marker, attempt removal again (should handle gracefully)
- **Insert (structured)**: `mcp__unity__script_apply_edits` with:
`{"op":"anchor_insert","anchor":"// Tail test C","position":"after","text":"\n // idempotency test marker"}`
- **Insert again** (same op) → expect `no_op: true`.
- **Remove (structured)**: `{"op":"regex_replace","pattern":"(?m)^\\s*// idempotency test marker\\r?\\n?","text":""}`
- **Remove again** (same `regex_replace`) → expect `no_op: true`.
- `mcp__unity__validate_script(level:"standard")`
- **IMMEDIATELY** write clean XML fragment to `reports/T-J_results.xml` with evidence of both `no_op: true` outcomes. The `<testcase name>` must start with `T-J` and include the latest SHA.
- **Expected final state**: State H + verified idempotent behavior
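
The idempotency behaviour T-J expects (a repeated insert or removal reports `no_op: true` and leaves the file unchanged) can be sketched with two small functions. These are illustrative stand-ins, not the `script_apply_edits` implementation; the function names and the `(text, no_op)` return shape are assumptions.

```python
import re


def anchor_insert_after(source: str, anchor: str, text: str) -> tuple[str, bool]:
    """Insert text right after the first anchor; no_op if it is already there."""
    position = source.find(anchor)
    if position < 0:
        raise ValueError("anchor not found")
    insert_at = position + len(anchor)
    if source.startswith(text, insert_at):
        return source, True  # already present: nothing to do
    return source[:insert_at] + text + source[insert_at:], False


def regex_replace(source: str, pattern: str, text: str) -> tuple[str, bool]:
    """Replace every match of pattern; no_op if the result equals the input."""
    updated = re.sub(pattern, text, source)
    return updated, updated == source
```

With the marker text and the pattern from the steps above, inserting twice and then removing twice should return the file to its pre-test contents, with the second call of each pair reporting a no-op.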


🛠️ Refactor suggestion

Make T‑J state semantics consistent (it leaves content unchanged).

T‑J’s header says “Additive State I”, while the steps remove what they insert and the expected state says “State H”. Keep it “No State Change” for clarity.

-### T-J. Idempotency on Modified File (Additive State I)
+### T-J. Idempotency on Modified File (No State Change)
-**Expected final state**: State H + verified idempotent behavior
+**Expected final state**: State H (verified idempotent behavior; no content change)

Suggested change
### T-J. Idempotency on Modified File (Additive State I)
**Goal**: Verify operations behave predictably when repeated
**Actions**:
- Add unique marker comment: `// idempotency test marker`
- Attempt to add same comment again (should detect no-op)
- Remove marker, attempt removal again (should handle gracefully)
- **Insert (structured)**: `mcp__unity__script_apply_edits` with:
`{"op":"anchor_insert","anchor":"// Tail test C","position":"after","text":"\n // idempotency test marker"}`
- **Insert again** (same op) → expect `no_op: true`.
- **Remove (structured)**: `{"op":"regex_replace","pattern":"(?m)^\\s*// idempotency test marker\\r?\\n?","text":""}`
- **Remove again** (same `regex_replace`) → expect `no_op: true`.
- `mcp__unity__validate_script(level:"standard")`
- **IMMEDIATELY** write clean XML fragment to `reports/T-J_results.xml` with evidence of both `no_op: true` outcomes. The `<testcase name>` must start with `T-J` and include the latest SHA.
- **Expected final state**: State H + verified idempotent behavior
### T-J. Idempotency on Modified File (No State Change)
**Goal**: Verify operations behave predictably when repeated
**Actions**:
- **Insert (structured)**: `mcp__unity__script_apply_edits` with:
`{"op":"anchor_insert","anchor":"// Tail test C","position":"after","text":"\n // idempotency test marker"}`
- **Insert again** (same op) → expect `no_op: true`.
- **Remove (structured)**: `{"op":"regex_replace","pattern":"(?m)^\\s*// idempotency test marker\\r?\\n?","text":""}`
- **Remove again** (same `regex_replace`) → expect `no_op: true`.
- `mcp__unity__validate_script(level:"standard")`
- **IMMEDIATELY** write clean XML fragment to `reports/T-J_results.xml` with evidence of both `no_op: true` outcomes. The `<testcase name>` must start with `T-J` and include the latest SHA.
- **Expected final state**: State H (verified idempotent behavior; no content change)

🤖 Prompt for AI Agents
In .claude/prompts/nl-unity-suite-full-additive.md around lines 207–218, the
test header and expected-final-state are inconsistent (header reads "Additive
State I" while steps undo their changes and expected state says "State H");
change the header to reflect "No State Change (Idempotency)" or similar, update
any inline semantics text to state that the test leaves content unchanged, and
replace the "Expected final state: State H" line with "Expected final state: No
State Change" so the header, steps, and expected outcome are consistent.

---
@@ -219,7 +254,8 @@ find_in_file(pattern: "public bool HasTarget\\(\\)")
1. Verify expected content exists: `find_in_file` for key markers
2. Check structural integrity: `validate_script(level:"standard")`
3. Update SHA tracking for next test's preconditions
4. Log cumulative changes in test evidence
4. Emit a per‑test fragment to `reports/<TESTID>_results.xml` immediately. If the test failed, still write a single `<testcase>` with a `<failure message="..."/>` and evidence in `system-out`.
5. Log cumulative changes in test evidence

**Error Recovery:**
- If test fails, log current state but continue (don't restore)
@@ -237,4 +273,70 @@ find_in_file(pattern: "public bool HasTarget\\(\\)")
5. **Better Failure Analysis**: Failures don't cascade - each test adapts to current reality
6. **State Evolution Testing**: Validates SDK handles cumulative file modifications correctly

This additive approach produces a more realistic and maintainable test suite that better represents actual SDK usage patterns.
This additive approach produces a more realistic and maintainable test suite that better represents actual SDK usage patterns.

---

BAN ON EXTRA TOOLS AND DIRS
- Do not use any tools outside `AllowedTools`. Do not create directories; assume `reports/` exists.

---

## XML Fragment Templates (T-F .. T-J)

Use these skeletons verbatim as a starting point. Replace the bracketed placeholders with your evidence and the latest SHA. Ensure each file contains exactly one `<testcase>` element and that the `name` begins with the exact test id.

```xml
<testcase name="T-F — Atomic Multi-Edit" classname="UnityMCP.NL-T">
<system-out><![CDATA[
Applied 3 non-overlapping edits in one atomic call:
- HasTarget(): added "// validated access"
- ApplyBlend(): added "// safe animation"
- End-of-class: added "// end of test modifications"
validate_script: OK
SHA: [sha-here]
]]></system-out>
</testcase>
```

```xml
<testcase name="T-G — Path Normalization Test" classname="UnityMCP.NL-T">
<system-out><![CDATA[
Edit via unity://path/... succeeded.
Same edit via Assets/... returned stale_file, retried with updated SHA: OK.
Final SHA: [sha-here]
]]></system-out>
</testcase>
```

```xml
<testcase name="T-H — Validation on Modified File" classname="UnityMCP.NL-T">
<system-out><![CDATA[
validate_script(level:"standard"): OK on the modified file.
SHA: [sha-here]
]]></system-out>
</testcase>
```

```xml
<testcase name="T-I — Failure Surface Testing" classname="UnityMCP.NL-T">
<system-out><![CDATA[
Overlapping edit: failed cleanly (error captured).
Stale SHA edit: failed cleanly (error captured).
File unchanged; final SHA: [sha-here]
]]></system-out>
</testcase>
```

```xml
<testcase name="T-J — Idempotency on Modified File" classname="UnityMCP.NL-T">
<system-out><![CDATA[
Insert marker after "// Tail test C": OK.
Insert same marker again: no_op: true.
regex_replace (remove marker): OK.
regex_replace (remove again): no_op: true.
validate_script: OK.
SHA: [sha-here]
]]></system-out>
</testcase>
```
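
The templates above share one shape, so a fragment can also be produced programmatically. The sketch below is an assumption for illustration (the prompt itself has the agent write the files directly); `render_fragment` is a hypothetical helper, and the caller supplies the full `name`, which must begin with the test id.

```python
from typing import Optional
from xml.sax.saxutils import quoteattr


def render_fragment(
    name: str,
    evidence: str,
    failure: Optional[str] = None,
    classname: str = "UnityMCP.NL-T",
) -> str:
    """Build a single <testcase> fragment with evidence inside CDATA."""
    # "]]>" would end the CDATA section early, so split it across two sections.
    safe_evidence = evidence.replace("]]>", "]]]]><![CDATA[>")
    parts = [f"<testcase name={quoteattr(name)} classname={quoteattr(classname)}>"]
    if failure is not None:
        # A failed test still emits a fragment, with a <failure> element added.
        parts.append(f"<failure message={quoteattr(failure)}/>")
    parts.append(f"<system-out><![CDATA[\n{safe_evidence}\n]]></system-out>")
    parts.append("</testcase>")
    return "\n".join(parts)
```

Quoting the attributes and guarding the CDATA terminator matters mainly for failure messages and console output, which can contain quotes, angle brackets, or `]]>`.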