Summary
In Agent Teams, the in-process task backend fans a single logical agent name into multiple concurrent transcript writers, and the harness delivers each pending task_assignment once per live writer of that name. The surviving writer therefore sees the same assignments arrive in waves — including assignments it has already completed and reported — producing a "replay storm" where an entire completed subtask set is re-dispatched back-to-back.
This is distinct from (but likely the same root-cause family as) #47930, which observed the lead looping on duplicate task_assignment echoes. Here the victim is a worker/teammate, and the replayed payload is a whole completed subtask set, not a single idle-notification echo.
Version
claude 2.1.175 (Linux). Reproduced on multiple unrelated teammates in the same session (see cross-control below), so it is harness-level, not specific to one agent's code or prompt.
Observed behavior
On 2026-06-12 ~20:06–20:08, a worker teammate that had completed and reported a multi-task assignment (4 subtasks) went idle. Its inbox then re-delivered the ENTIRE completed subtask set, back-to-back, as if freshly dispatched. The worker's own ack-and-skip guard contained it (nothing was redone), but each replay burns a full verify-and-decline round, and at fleet scale this is a recurring token/throughput drain. A less careful agent would have duplicated the work (competing PR / duplicate cards).
Root cause (discriminated, not assumed)
We wrote a forensics tool that reconstructs the event from the session's subagent transcripts and discriminates the cause between two hypotheses:
- H1 (re-loop): a dispatcher re-reads the task list and re-delivers — would show the same task arriving more than once to the same writer, and would be sensitive to task status (deleted/completed tasks would stop replaying).
- H2 (fan-out): one agent name has multiple concurrent in-process transcript writers; the harness delivers each pending assignment once per writer.
The evidence selected H2 (fan-out):
- (a) For the teammate name, N > 1 concurrent transcript writers share a single start timestamp (i.e. one logical name, many simultaneous in-process writers).
- (b) There is NO within-writer re-delivery — no single writer ever received the same task twice. This falsifies H1 (a true re-loop would re-deliver to the same writer).
- (c) Writer death times == delivery times — each forked writer consumes one assignment and exits; the replays track the churn of sibling writers, not a polling loop.
- Replayed task ids were below the delivery high-watermark and deleted tasks still replayed — both inconsistent with a task-status-driven dispatcher re-read.
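The three checks can be sketched as follows. This is a minimal illustration, not the forensics script itself: the `Writer` record, its field names, the one-second tolerance, and the sample data are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Writer:
    """One transcript writer (hypothetical record shape, not the harness's format)."""
    agent_name: str
    start: float                           # start timestamp, epoch seconds
    end: float                             # last-activity ("death") timestamp
    deliveries: List[Tuple[float, str]]    # (delivery timestamp, task id)


def discriminate(writers: List[Writer], tolerance: float = 1.0) -> dict:
    """Apply checks (a)/(b)/(c) to the writers recorded for a single agent name."""
    # (a) more than one writer shares the same start timestamp
    start_counts = Counter(w.start for w in writers)
    concurrent = max(start_counts.values(), default=0) > 1

    # (b) some writer received the same task id more than once
    within_dup = any(
        len({task for _, task in w.deliveries}) < len(w.deliveries)
        for w in writers
    )

    # (c) every writer that received a delivery ended within `tolerance`
    #     seconds of its last delivery
    served = [w for w in writers if w.deliveries]
    death_tracks_delivery = bool(served) and all(
        abs(w.end - max(ts for ts, _ in w.deliveries)) <= tolerance
        for w in served
    )

    if within_dup:
        verdict = "H1 (re-loop)"
    elif concurrent and death_tracks_delivery:
        verdict = "H2 (fan-out)"
    else:
        verdict = "inconclusive"

    return {
        "concurrent_writers": concurrent,
        "within_writer_duplicates": within_dup,
        "death_tracks_delivery": death_tracks_delivery,
        "verdict": verdict,
    }


# Synthetic example: three writers for one name, same start, one task each.
sample = [
    Writer("worker-a", 100.0, 110.4, [(110.0, "t1")]),
    Writer("worker-a", 100.0, 125.2, [(125.0, "t1")]),
    Writer("worker-a", 100.0, 140.1, [(140.0, "t2")]),
]
print(discriminate(sample)["verdict"])  # H2 (fan-out)
```

Note that in the sample, task `t1` reaches the agent name twice but never reaches the same writer twice, which is the signature that separates fan-out from a re-loop.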
Repro / forensics
We are not able to ship the raw transcripts, but the discriminating analysis is fully scripted and runs against any Agent-Teams session's subagent transcript directory:
mat948-replay-forensics.sh <session-subagents-dir> <agentType>
It prints, per writer: start/end timestamps, the tasks delivered to that writer, and any within-writer duplicates; then the three discriminating checks (a)/(b)/(c) above and a verdict. Running it on the affected teammate produced the H2 verdict; running it on an unrelated worker in the same session as a cross-control reproduced the identical pattern (N>1 concurrent writers, no within-writer dup, death-time == delivery-time) — confirming it is a harness-level property of the in-process backend, not our agent code.
To reproduce from scratch: run an Agent Team where a teammate is given a multi-subtask assignment, let it complete and report all subtasks, then let it go idle. Observe the inbox re-deliver the completed set. Inspect the session's subagent transcripts and you should find multiple concurrent writers for that one teammate name sharing a start timestamp.
Impact
Suggested fix direction
Make completed-task re-dispatch impossible at the source: an idempotency check on task status before delivery (don't deliver a task already in a terminal/reported state), and/or ensure a single logical agent name maps to a single in-process transcript writer rather than fanning out into concurrent writers each of which independently drains the pending-assignment queue.
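Both directions can be sketched in a few lines. This is an illustrative model under stated assumptions, not the harness's actual code: `Dispatcher`, `TERMINAL_STATES`, and the status strings are names invented for the example.

```python
from collections import deque

# Assumed status names; the real harness may use different ones.
TERMINAL_STATES = {"completed", "reported", "deleted"}


class Dispatcher:
    """Toy dispatcher: one pending queue per agent name, status checked before delivery."""

    def __init__(self):
        self.status = {}   # task id -> status string
        self.queues = {}   # agent name -> deque of pending task ids

    def assign(self, agent_name, task_id):
        self.status[task_id] = "pending"
        self.queues.setdefault(agent_name, deque()).append(task_id)

    def mark(self, task_id, status):
        self.status[task_id] = status

    def next_assignment(self, agent_name):
        """Return the next deliverable task id for this name, or None."""
        queue = self.queues.get(agent_name, deque())
        while queue:
            # Popping from a queue keyed by name means each task is consumed
            # once per name, however many writers ask on that name's behalf.
            task_id = queue.popleft()
            if self.status.get(task_id) in TERMINAL_STATES:
                continue  # idempotency guard: skip work that is already finished
            self.status[task_id] = "delivered"
            return task_id
        return None


d = Dispatcher()
d.assign("worker-a", "t1")
d.assign("worker-a", "t2")
d.mark("t1", "completed")              # finished before it was ever dequeued
first = d.next_assignment("worker-a")   # "t2": the completed t1 is skipped
second = d.next_assignment("worker-a")  # None: a second writer finds nothing to replay
```

The two guards are independent: the per-name queue prevents fan-out duplication, and the status check covers any path that re-enqueues a task after it has reached a terminal state.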
Related
#47930: lead looping on duplicate task_assignment echoes (likely same fan-out family, different victim).