Lift superpowers:code-reviewer agent into the requesting-code-review skill by obra · Pull Request #1299 · obra/superpowers

obra · 2026-04-28T20:08:16Z

What problem are you trying to solve?

The plugin had exactly one named agent (agents/code-reviewer.md, dispatched as superpowers:code-reviewer) used by two skills, while every other reviewer/implementer subagent in the repo is dispatched as general-purpose with the prompt template living alongside the skill that uses it.

That asymmetry caused real friction:

Two sources of truth, drifting independently. Both agents/code-reviewer.md (persona + checklist) and skills/requesting-code-review/code-reviewer.md (placeholder template) defined how a code review should look. Whenever someone updated one, the other rotted. PR fix(code-review): fix placeholder mismatch in reviewer template #915 is currently open fixing a placeholder-name mismatch ({PLAN_REFERENCE} vs {PLAN_OR_REQUIREMENTS}) that exists only because the template-shape definition is split across two files. My refactor folds the two into one self-contained template; the bug fix(code-review): fix placeholder mismatch in reviewer template #915 is fixing disappears as a side effect.
Codex users couldn't dispatch the named agent. skills/using-superpowers/references/codex-tools.md had a "Named agent dispatch" workaround section explaining how Codex users had to flatten the named agent into a worker dispatch — a section that documented behavior superpowers itself was the only consumer of.
PR fix: distinguish Skills from Agent types in subagent-driven-development Integration section #1078 is open trying to disambiguate Skills from Agent types in the SDD skill docs because Claude was getting confused about whether superpowers:code-reviewer was a skill or an agent. The named agent's existence is what created that confusion.

The thing that prompted the audit: I was re-reading the SDD skill prompt templates and noticed subagent-driven-development/code-quality-reviewer-prompt.md says Task tool (superpowers:code-reviewer) while every sibling file says Task tool (general-purpose). I asked "is there a reason for this asymmetry?" and there wasn't one — the named agent was added in v3.2.1 (2025-10-20) so users wouldn't need personal agent config, but the same goal is achieved by inlining the prompt the way every other reviewer in this repo already does.

What does this PR change?

Merges the system-prompt content of agents/code-reviewer.md into skills/requesting-code-review/code-reviewer.md to produce a single, self-contained Task-dispatch template, switches both call sites to dispatch general-purpose, removes the now-vestigial named-agent dispatch instructions from the platform tool-mapping references, and deletes the empty agents/ directory. Adds a Tier 3 behavioral test that verifies the dispatched reviewer actually catches real planted bugs.

Is this change appropriate for the core library?

Yes. It's a structural cleanup of core skill infrastructure — every user who runs requesting-code-review or subagent-driven-development is affected, and the change reduces complexity and surface area. No third-party dependencies introduced, nothing project-specific or domain-specific.

What alternatives did you consider?

Promote the other reviewers to named agents. Make spec-reviewer, plan-reviewer, etc. all into agents/*.md. Rejected: more moving parts, more YAML metadata to maintain, more cross-platform compatibility headaches (Codex can't dispatch named agents at all without the manual workaround).
Leave a stub agents/code-reviewer.md that re-exports the prompt template, for backward-compat with anyone wired against superpowers:code-reviewer outside the repo. Rejected: git grep shows zero external uses inside this repo, and named-agent registries are local to each user's harness install — there's no public API to break here.
Just fix the placeholder bug from fix(code-review): fix placeholder mismatch in reviewer template #915 in place. Rejected: that fixes the symptom while leaving the root cause (split template definition) intact. Drift will reintroduce a similar bug in six months.

Does this PR contain multiple unrelated changes?

No. Everything in the diff is in service of a single change: collapse the named code-reviewer agent into the skill that dispatches it, then prove with a Tier 3 test that the lifted template still produces effective reviewers. The SDD prompt-template update, the platform-doc cleanup, and the new behavioral test are all consequences of the lift.

A separate PR (#TBD-this-test-fix-already-merged-to-dev: e795530) fixed three pre-existing bugs in test-subagent-driven-development-integration.sh that prevented the SDD integration test from running its assertions at all. That PR landed on dev before this one because validating this refactor required a working integration test on the SDD pipeline.

Existing PRs

I have reviewed all open AND closed PRs for duplicates or prior art
Related PRs:
- fix(code-review): fix placeholder mismatch in reviewer template #915 (fix(code-review): fix placeholder mismatch in reviewer template) — open. Fixes the {PLAN_REFERENCE}/{PLAN_OR_REQUIREMENTS} drift bug. This PR supersedes it: the merged template only uses {PLAN_OR_REQUIREMENTS}, so the drift class of bug becomes structurally impossible. fix(code-review): fix placeholder mismatch in reviewer template #915 can be closed once this lands.
- fix: distinguish Skills from Agent types in subagent-driven-development Integration section #1078 (fix: distinguish Skills from Agent types in SDD Integration section) — open. Patches doc wording to disambiguate superpowers:code-reviewer from skills. With the named agent gone, the underlying confusion goes away — the doc patches in fix: distinguish Skills from Agent types in subagent-driven-development Integration section #1078 become redundant rather than necessary.
- No prior closed PRs attempted this lift.

Environment tested

Harness (e.g. Claude Code, Cursor)	Harness version	Model	Model version/ID
Claude Code (CLI, headless `claude -p`)	2.1.119	Claude	claude-opus-4-7 (Opus 4.7)

Evaluation

Initial prompt that started the session: "we currently use a ton of subagents as part of superpowers, but only one of those agent types is an actual agent in 'agents' — i don't think there's a good reason for that? can you study the code base and let me know? would it make sense to lift that agent prompt into the relevant skill as a progressive disclosure doc?"

Eval sessions run after the change:

1× tests/skill-triggering/run-test.sh requesting-code-review against the working tree → skill triggered correctly on a naive prompt about reviewing a diff.
3× tests/claude-code/test-requesting-code-review.sh (the new Tier 3 test) against the working tree → 5/5 assertions pass on every run. Each run plants real bugs (SQL injection, plaintext password "hash", credential logging) into a tiny project, dispatches the code reviewer via the actual skill, and asserts the reviewer catches each planted bug at Critical/Important severity and refuses to approve. The reviewer also caught issues I hadn't planted (timing-side-channel in login, missing tests, no error handling, return-shape ambiguity), which is the calibration we want.
2× tests/claude-code/test-subagent-driven-development-integration.sh against the working tree → 11/11 assertions pass. The SDD pipeline still produces working code with the new dispatch type.

Before/after delta (measured via session JSONL inspection):

Before this PR: SDD's session shows 4 dispatches as general-purpose and 3 as superpowers:code-reviewer. With agents/code-reviewer.md removed, those 3 dispatches would either fail or fall back, depending on harness version.
After this PR: SDD's session shows 7/7 dispatches as general-purpose. Zero superpowers:code-reviewer references anywhere in the pipeline. Spec compliance still rejects extra features, code is still produced and tested.

That's the smoking-gun evidence that the refactor reaches the running code path — not just the skill prose — and that it doesn't break the SDD review loop in the process.

Rigor

If this is a skills change: I used superpowers:writing-skills and completed adversarial pressure testing (paste results below)
This change was tested adversarially, not just on the happy path
I did not modify carefully-tuned content (Red Flags table, rationalizations, "human partner" language) without extensive evals showing the change is an improvement

The skill-content change here is structural (template merge + dispatch type), not behavior-shaping prose. No Red Flags tables, rationalization lists, or "human partner" wording were touched. The new template preserves the persona framing and review checklist from the old agents/code-reviewer.md, plus the structured-output and severity-categorization rules from the old code-reviewer.md template, deduplicated and tightened.

The adversarial test isn't a writing-skills eval — it's a behavioral test: plant deliberate Critical bugs in a diff, dispatch the reviewer via the actual skill, assert the reviewer flags them. That's the right test for this class of change because the change is about dispatch plumbing, not about how the reviewer thinks. If the lifted template produced sycophantic reviewers, the planted-bug test would fail.

Human review

A human has reviewed the COMPLETE proposed diff before submission

Jesse and I worked through the design and the diff together over the course of the session — including pushing back on my initial take that the existing test suite already had Tier 3 coverage (it did, but it was silently broken; that's what motivated the test infrastructure fix that landed on dev separately as e795530).

…skill The plugin had a single named agent (`agents/code-reviewer.md`) used by two skills, while every other reviewer/implementer subagent in the repo is dispatched as `general-purpose` with the prompt template living alongside its skill. That asymmetry had no upside and several costs: - Two sources of truth for the code review checklist (the agent file and `requesting-code-review/code-reviewer.md`), both drifting independently. - `Codex` users could not use the named agent directly; the codex-tools reference doc had a workaround section explaining how to flatten the named agent into a `worker` dispatch. - No third-party reliance on `superpowers:code-reviewer` inside this repo. Changes: - Merge `agents/code-reviewer.md` (persona + checklist) and `skills/requesting-code-review/code-reviewer.md` (placeholder template) into a single self-contained Task-dispatch template, matching the shape of `implementer-prompt.md`, `spec-reviewer-prompt.md`, etc. - Update `skills/requesting-code-review/SKILL.md` and `skills/subagent-driven-development/code-quality-reviewer-prompt.md` to dispatch `Task (general-purpose)` instead of the named agent. - Drop the now-obsolete "Named agent dispatch" workaround sections from `codex-tools.md` and `copilot-tools.md` — superpowers no longer ships any named agents, so those instructions documented nothing. - Delete `agents/code-reviewer.md` and the empty `agents/` directory. Tier 3 coverage for the change: a new behavioral test `tests/claude-code/test-requesting-code-review.sh` plants real bugs (SQL injection, plaintext password handling, credential logging) into a tiny project, runs the actual `requesting-code-review` skill against the working tree, and asserts the dispatched reviewer flags every planted issue at Critical/Important severity and refuses to approve the diff. Verified end-to-end on this branch: - The new test passes (5/5 assertions; reviewer caught all planted bugs and several others). - The existing SDD integration test still passes (7/7 subagents dispatched, all as `general-purpose`; spec compliance still rejects extra features; produced code is correct). - Session JSONLs confirm zero remaining `superpowers:code-reviewer` dispatches anywhere in the SDD pipeline.

arittr

LGTM

obra requested a review from arittr April 28, 2026 20:08

obra force-pushed the lift-code-reviewer-agent branch from 45975ec to 8d9d82b Compare April 30, 2026 21:22

arittr approved these changes Apr 30, 2026

View reviewed changes

arittr merged commit 471fe32 into dev Apr 30, 2026

obra mentioned this pull request May 4, 2026

Release v5.1.0 #1468

Merged

scicco mentioned this pull request May 5, 2026

feat: add plan review cycle skill #1473

Closed

6 tasks

This was referenced May 10, 2026

.cursor-plugin/plugin.json references deleted agents/ and commands/ directories #1514

Closed

tracking: 同步上游 superpowers v5.0.6 / v5.1.0 关键 skill 变更 jnMetaCode/superpowers-zh#19

Open

fuleinist mentioned this pull request May 11, 2026

fix: remove stale agents/commands refs from .cursor-plugin/plugin.json #1520

Merged

arittr mentioned this pull request May 12, 2026

requesting-code-review: no routing rule between code-quality-reviewer and mr-complexity-reviewer agents causes inconsistent behavior #1530

Closed

2 tasks

This was referenced May 17, 2026

Clarify simplification review in workflow gates #1567

Closed

Clarify simplification review in workflow gates bnema/softpowers#3

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Lift superpowers:code-reviewer agent into the requesting-code-review skill#1299

Lift superpowers:code-reviewer agent into the requesting-code-review skill#1299
arittr merged 1 commit into
devfrom
lift-code-reviewer-agent

obra commented Apr 28, 2026

Uh oh!

arittr left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

obra commented Apr 28, 2026

What problem are you trying to solve?

What does this PR change?

Is this change appropriate for the core library?

What alternatives did you consider?

Does this PR contain multiple unrelated changes?

Existing PRs

Environment tested

Evaluation

Rigor

Human review

Uh oh!

arittr left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants