[BUG] Agent Teams: in-process backend fans one agent name into N concurrent writers → completed task_assignments replayed to a worker (replay storm)

## Summary

In Agent Teams, the **in-process task backend fans a single logical agent name into multiple concurrent transcript writers**, and the harness delivers each pending `task_assignment` **once per live writer of that name**. The surviving writer therefore sees the *same* assignments arrive in waves — including assignments it has **already completed and reported** — producing a "replay storm" where an entire completed subtask set is re-dispatched back-to-back.
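To make the delivery arithmetic concrete, here is a toy model of the difference between per-name and per-writer delivery. It is only a sketch: the function names and the writer/task ids are hypothetical, and it does not model delivery order or timing in the real harness.

```python
def per_name_deliveries(agent_name, pending):
    """Expected behavior: each pending assignment reaches the name once."""
    return [(agent_name, task_id) for task_id in pending]


def fan_out_deliveries(writers, pending):
    """Reported behavior: each pending assignment is delivered once per live writer."""
    return [(writer, task_id) for writer in writers for task_id in pending]


pending = ["t1", "t2", "t3", "t4"]          # hypothetical subtask ids
writers = ["worker#a", "worker#b", "worker#c"]  # hypothetical sibling writers

expected = per_name_deliveries("worker", pending)
observed = fan_out_deliveries(writers, pending)
print(len(expected), len(observed))
```

With N live writers and M pending assignments, the model produces N × M deliveries instead of M, which is the shape of the replay described above.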

This is distinct from (but likely the same root-cause family as) #47930, which observed the **lead** looping on duplicate `task_assignment` echoes. Here the victim is a **worker/teammate**, and the replayed payload is a *whole completed subtask set*, not a single idle-notification echo.

## Version

claude 2.1.175 (Linux). Reproduced on multiple unrelated teammates in the same session (see cross-control below), so it is harness-level, not specific to one agent's code or prompt.

## Observed behavior

On 2026-06-12 ~20:06–20:08, a worker teammate that had **completed and reported** a multi-task assignment (4 subtasks) went idle. Its inbox then re-delivered the ENTIRE completed subtask set, back-to-back, as if freshly dispatched. The worker's own ack-and-skip guard contained it (nothing was redone), but each replay burns a full verify-and-decline round, and at fleet scale this is a recurring token/throughput drain. A less careful agent would have *duplicated* the work (competing PR / duplicate cards).
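For readers unfamiliar with the pattern, an ack-and-skip guard can be sketched as follows. This is a minimal in-memory illustration, not the affected worker's actual guard (which is not shown in this report); `AckAndSkipGuard` and `handle_assignments` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class AckAndSkipGuard:
    """Tracks task ids this worker has already completed and reported."""
    completed: set = field(default_factory=set)

    def should_run(self, task_id: str) -> bool:
        # Decline anything already completed; allow everything else.
        return task_id not in self.completed

    def mark_completed(self, task_id: str) -> None:
        self.completed.add(task_id)


def handle_assignments(guard, task_ids):
    """Return (executed, declined) for a stream of incoming assignment ids."""
    executed, declined = [], []
    for task_id in task_ids:
        if guard.should_run(task_id):
            executed.append(task_id)      # real work would happen here
            guard.mark_completed(task_id)
        else:
            declined.append(task_id)      # replay: acknowledge and skip
    return executed, declined
```

The guard prevents duplicated work, but each declined replay still costs the worker a turn, which is why it contains the problem without removing it.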

## Root cause (discriminated, not assumed)

We wrote a forensics tool that reconstructs the event from the session's subagent transcripts and **discriminates the cause** between two hypotheses:

- **H1 (re-loop):** a dispatcher re-reads the task list and re-delivers — would show the *same* task arriving more than once **to the same writer**, and would be sensitive to task status (deleted/completed tasks would stop replaying).
- **H2 (fan-out):** one agent name has multiple concurrent in-process transcript writers; the harness delivers each pending assignment once per writer.

The evidence selected **H2 (fan-out)**:

- **(a)** For the teammate name, **N > 1 concurrent transcript writers share a single start timestamp** (i.e. one logical name, many simultaneous in-process writers).
- **(b)** There is **NO within-writer re-delivery** — no single writer ever received the same task twice. This **falsifies H1** (a true re-loop would re-deliver to the same writer).
- **(c)** **Writer death times == delivery times** — each forked writer consumes one assignment and exits; the replays track the churn of sibling writers, not a polling loop.
- Replayed task ids were **below the delivery high-watermark** and **deleted tasks still replayed** — both inconsistent with a task-status-driven dispatcher re-read.
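The three checks can be expressed compactly. The sketch below is not the forensics script itself: the `Writer` record shape, the exact-equality timestamp comparison, and the verdict rule are all assumptions made for illustration (real transcripts may need a time tolerance).

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Writer:
    """One transcript writer (hypothetical record shape, not the real format)."""
    writer_id: str
    start: str                                      # start timestamp
    end: str                                        # end ("death") timestamp
    deliveries: list = field(default_factory=list)  # (task_id, timestamp) pairs


def shared_start(writers):
    """(a) At least two writers share one start timestamp."""
    counts = Counter(w.start for w in writers)
    return any(n > 1 for n in counts.values())


def no_within_writer_duplicates(writers):
    """(b) No single writer received the same task id twice."""
    for w in writers:
        ids = [task_id for task_id, _ in w.deliveries]
        if len(ids) != len(set(ids)):
            return False
    return True


def death_matches_delivery(writers):
    """(c) Every writer that received a task ended at its last delivery time."""
    fed = [w for w in writers if w.deliveries]
    return bool(fed) and all(w.end == w.deliveries[-1][1] for w in fed)


def verdict(writers):
    """Assumed rule: a within-writer duplicate points to H1; all checks passing points to H2."""
    if not no_within_writer_duplicates(writers):
        return "H1"
    if shared_start(writers) and death_matches_delivery(writers):
        return "H2"
    return "inconclusive"
```

For example, two writers with the same start time that each receive `t1` once and exit at that delivery time yield `H2`, while a single writer receiving `t1` twice yields `H1`.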

## Repro / forensics

We are not able to ship the raw transcripts, but the discriminating analysis is fully scripted and runs against any Agent-Teams session's subagent transcript directory:

```
mat948-replay-forensics.sh <session-subagents-dir> <agentType>
```

It prints, per writer, the start/end timestamps, the tasks delivered to that writer, and any within-writer duplicates, followed by the three discriminating checks (a)/(b)/(c) above and a verdict. Running it on the affected teammate produced the H2 verdict. Running it on an **unrelated worker in the same session** as a cross-control **reproduced the identical pattern** (N > 1 concurrent writers, no within-writer duplicates, death time == delivery time), confirming that it is a harness-level property of the in-process backend, not our agent code.

To reproduce from scratch: run an Agent Team in which a teammate is given a multi-subtask assignment, let it complete and report all subtasks, then let it go idle. Observe the inbox re-deliver the completed set. Inspect the session's subagent transcripts; you should find multiple concurrent writers for that one teammate name sharing a start timestamp.

## Impact

- Recurring wasted turns (each replay = a full verify-and-decline round) → token/cost drain at fleet scale (cost-relevant, cf. #47930, #47922).
- Correctness hazard: a worker without a robust ack-and-skip guard will **redo completed work** — duplicate PRs / duplicate tracker cards / competing branches.

## Suggested fix direction

Make completed-task re-dispatch impossible at the source, using one or both of:

- an **idempotency check on task status before delivery** (don't deliver a task already in a terminal/reported state);
- a guarantee that a single logical agent name maps to a **single in-process transcript writer**, rather than fanning out into concurrent writers, each of which independently drains the pending-assignment queue.
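A rough sketch of what these two safeguards might look like together follows. Everything here (`Dispatcher`, `TaskStatus`, the writer id format) is hypothetical and does not reflect the real harness internals, which we have not seen.

```python
from enum import Enum


class TaskStatus(Enum):
    PENDING = "pending"
    COMPLETED = "completed"
    DELETED = "deleted"


# Statuses that must never be (re-)delivered.
TERMINAL = {TaskStatus.COMPLETED, TaskStatus.DELETED}


class Dispatcher:
    def __init__(self, statuses):
        self.statuses = dict(statuses)  # task_id -> TaskStatus
        self.delivered = set()          # (agent_name, task_id) already sent
        self.writers = {}               # agent_name -> its single writer id

    def writer_for(self, agent_name):
        # One logical name maps to exactly one writer, created on first use.
        return self.writers.setdefault(agent_name, f"{agent_name}#0")

    def deliver(self, agent_name, task_id):
        """Return the target writer id, or None if delivery is suppressed."""
        status = self.statuses.get(task_id)
        if status is None or status in TERMINAL:
            return None                 # unknown or terminal: do not deliver
        key = (agent_name, task_id)
        if key in self.delivered:
            return None                 # already delivered to this name
        self.delivered.add(key)
        return self.writer_for(agent_name)
```

Keying the delivered set on the agent name rather than the writer is the design point: it makes delivery idempotent per logical teammate even if the number of writers changes.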

## Related

- #47930 — lead-session duplicate `task_assignment` echoes (likely same fan-out family, different victim).

