Summary
When the model streams a large tool-call argument value (e.g., a 20KB file_text for create), the event stream is functionally silent for 10–30 minutes between tool.execution_start and tool.execution_complete. A stall watchdog watching the event stream cannot distinguish "model is productively building a large output" from "session has hung."
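To make the ambiguity concrete, here is a minimal sketch of the kind of stall watchdog described above. The `StallWatchdog` class and its 5-minute timeout are hypothetical consumer code, not part of the SDK; timestamps are passed in explicitly so the timeline can be simulated.

```typescript
// Hypothetical consumer-side watchdog; nothing here is SDK API.
class StallWatchdog {
  private lastEventAt: number;
  private timeoutMs: number;

  constructor(timeoutMs: number, startedAt: number) {
    this.timeoutMs = timeoutMs;
    this.lastEventAt = startedAt;
  }

  // Call from the session event handler for every event received.
  touch(now: number): void {
    this.lastEventAt = now;
  }

  // True once no event has arrived for longer than the timeout.
  isStalled(now: number): boolean {
    return now - this.lastEventAt > this.timeoutMs;
  }
}

// Simulated timeline in milliseconds: the tool call starts at t=0 and the
// next event (tool.execution_complete) arrives 12 minutes later.
const watchdog = new StallWatchdog(5 * 60_000, 0);
watchdog.touch(0); // tool.execution_start
const flaggedAtMinute6 = watchdog.isStalled(6 * 60_000); // healthy session looks hung
watchdog.touch(12 * 60_000); // tool.execution_complete
const flaggedAfterComplete = watchdog.isStalled(12 * 60_000 + 1);
```

With only the two tool events to go on, the watchdog flags a healthy session at minute 6, and it would report exactly the same thing for a session that had really hung.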
Repro
const session = await client.createSession({ /* ... */ });
// Log every event with a timestamp so the silent gap is visible.
session.on((evt) => console.log(new Date().toISOString(), evt.type, evt.data));
await session.sendAndWait({
  prompt:
    "Create a file at ./report.json containing a JSON document " +
    "with at least 20KB of structured analysis data. Use the create tool.",
});
Observe the event stream between tool.execution_start (with empty arguments) and tool.execution_complete (with the full argument). Empirically, no events fire for the duration of argument construction.
Expected
A heartbeat event during argument streaming — byte count, token count, or even a simple tool.argument_progress ping — so consumers can tell a working session from a hung one.
Actual
tool.execution_start fires once, then silence for the duration of LLM argument streaming, then tool.execution_complete fires once when the full tool invocation finishes (which includes both arg construction AND tool execution).
assistant.streaming_delta exists with cumulative byte counts (totalResponseSizeBytes) and fires during response streaming, but it has not been verified whether it continues to fire during the tool-argument construction phase. This would be useful to clarify: if streaming_delta does cover that phase, the gap is consumer-side documentation; if it doesn't, the gap is upstream.
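One way to settle this is a small probe that counts which event types arrive between tool.execution_start and tool.execution_complete. The sketch below is hypothetical consumer code: it assumes only that each event has a string `type` field, as in the repro, and it tracks a single open tool call at a time. The events fed to it here are synthetic, so the counts show how the probe works and say nothing about real SDK behavior.

```typescript
// Hypothetical probe; assumes only that each event carries a string `type`.
type ProbeEvent = { type: string };

function createGapProbe() {
  const counts = new Map<string, number>();
  let inGap = false; // simplification: one tool call open at a time

  return {
    handle(evt: ProbeEvent): void {
      if (evt.type === "tool.execution_start") {
        inGap = true;
        return;
      }
      if (evt.type === "tool.execution_complete") {
        inGap = false;
        return;
      }
      if (inGap) {
        counts.set(evt.type, (counts.get(evt.type) ?? 0) + 1);
      }
    },
    // Event types seen while a tool call was open, with their counts.
    report(): Record<string, number> {
      const out: Record<string, number> = {};
      counts.forEach((n, type) => {
        out[type] = n;
      });
      return out;
    },
  };
}

// Synthetic event sequence, for illustration only.
const probe = createGapProbe();
const syntheticTypes = [
  "assistant.message_delta",
  "tool.execution_start",
  "assistant.streaming_delta",
  "assistant.streaming_delta",
  "tool.execution_complete",
  "assistant.message_delta",
];
for (const type of syntheticTypes) {
  probe.handle({ type });
}
const gapReport = probe.report();
```

In a real session the probe would be attached with `session.on((evt) => probe.handle(evt))`. An empty report after a large create call would indicate that streaming_delta does not cover the argument-construction phase.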
Evidence (SDK source)
nodejs/src/generated/session-events.ts: the SessionEvent union includes:
- assistant.streaming_delta (AssistantStreamingDeltaData.totalResponseSizeBytes), described as "Streaming response progress with cumulative byte count"
- assistant.reasoning_delta, assistant.message_delta: response-side
- tool.execution_progress, tool.execution_partial_result: fire AFTER execution_start for tools that emit stdout (bash, powershell)
There is no tool.argument_progress or equivalent event specifically scoped to in-flight tool-argument streaming.
Consumer impact
Consumers with long-running tool calls must either tune stall watchdogs to ceilings that mask real failures (15+ minutes) or route artifact completion through filesystem polling instead of session events.
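The first workaround amounts to a phase-aware silence budget. A minimal sketch, with hypothetical function names and illustrative thresholds (none of this is SDK API), shows why it masks real failures:

```typescript
// Hypothetical consumer-side policy; the thresholds are illustrative only.
const DEFAULT_SILENCE_MS = 60_000; // normal budget between events
const TOOL_CALL_SILENCE_MS = 15 * 60_000; // ceiling while a tool call is open

function silenceBudgetMs(toolCallOpen: boolean): number {
  return toolCallOpen ? TOOL_CALL_SILENCE_MS : DEFAULT_SILENCE_MS;
}

function exceedsSilenceBudget(toolCallOpen: boolean, silentForMs: number): boolean {
  return silentForMs > silenceBudgetMs(toolCallOpen);
}

// Ten minutes of silence is flagged outside a tool call, but a session that
// hangs during a tool call is not flagged until the 15-minute ceiling.
const flaggedOutsideToolCall = exceedsSilenceBudget(false, 10 * 60_000);
const flaggedDuringToolCall = exceedsSilenceBudget(true, 10 * 60_000);
```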
Suggested fix
Either:
- Confirm + document that assistant.streaming_delta continues firing during in-flight tool-argument construction. If so, this is a doc-only fix.
- Emit a tool.argument_progress event during LLM-streamed argument construction carrying byte/token count or a small delta. Even one event every ~10 seconds would let consumers tell working sessions from hung ones.
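For the second option, a rough sketch of what a throttled progress event could look like. Everything here is an assumption for discussion: the `ToolArgumentProgress` payload shape, the `toolCallId` and `totalArgumentBytes` field names, and the `createProgressThrottle` helper do not exist in the SDK.

```typescript
// Hypothetical payload; no such event exists in the SDK today.
interface ToolArgumentProgress {
  type: "tool.argument_progress";
  toolCallId: string;
  totalArgumentBytes: number; // cumulative bytes of argument text so far
}

// Returns a function to call for each streamed argument chunk. It emits at
// most one progress event per `intervalMs`, carrying the cumulative count.
function createProgressThrottle(
  toolCallId: string,
  intervalMs: number,
  emit: (evt: ToolArgumentProgress) => void,
) {
  let total = 0;
  let lastEmitAt = Number.NEGATIVE_INFINITY;

  return (chunkBytes: number, now: number): void => {
    total += chunkBytes;
    if (now - lastEmitAt >= intervalMs) {
      lastEmitAt = now;
      emit({ type: "tool.argument_progress", toolCallId, totalArgumentBytes: total });
    }
  };
}

// Simulation: 25 chunks of 100 bytes, one per second, 10-second interval.
const emitted: ToolArgumentProgress[] = [];
const onChunk = createProgressThrottle("call-1", 10_000, (evt) => {
  emitted.push(evt);
});
for (let i = 0; i < 25; i++) {
  onChunk(100, i * 1000);
}
```

In this simulation 25 chunks produce three events (at 0 s, 10 s, and 20 s), which is enough for a watchdog with a short timeout to keep resetting while the argument is still being built.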
Related
- #1273 (session.disconnect() is cooperative; doesn't abort sendAndWait or close the transport): the recovery path when this stall does happen is also broken, so the two issues compound.
Environment
- SDK: @github/copilot-sdk@0.3.0
- CLI: @github/copilot@1.0.45
- Node: 22 LTS
- OS: Windows 11
- Model: claude-sonnet-4-6