fix: Ban qwen3 tito model and handle non-tool appended messages in TITO dummy assistant #949
Previously `_build_dummy_assistant` assumed all appended messages were tool responses and always generated matching `tool_calls`. When a non-tool message (e.g. `user`) was appended, the template rendered incorrect turn-transition tokens (`<|observation|>` instead of `<|user|>`). Now the function only generates `tool_calls` for the leading contiguous tool messages. If the first appended message is not a tool role, no `tool_calls` are emitted so the template picks the correct boundary. Made-with: Cursor
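The behavior described above can be sketched roughly as follows. This is a minimal illustration rather than the actual implementation: the helper names `count_leading_tools` and `build_dummy_assistant_sketch` are hypothetical, and only the message shape (`role`, `tool_call_id`, `name`) and the fallback values are taken from the diff in this PR.

```python
from typing import Any


def count_leading_tools(appended_messages: list[dict[str, Any]]) -> int:
    """Count the contiguous run of ``tool`` messages at the start of the list."""
    count = 0
    for msg in appended_messages:
        if msg.get("role") != "tool":
            break
        count += 1
    return count


def build_dummy_assistant_sketch(appended_messages: list[dict[str, Any]]) -> dict[str, Any]:
    """Hypothetical stand-in for ``_build_dummy_assistant`` (leading-tools variant)."""
    num_leading_tools = count_leading_tools(appended_messages)
    assistant: dict[str, Any] = {"role": "assistant", "content": ""}
    # Only emit tool_calls when the appended messages start with tool responses;
    # otherwise the template should see a plain assistant turn.
    if num_leading_tools > 0:
        assistant["tool_calls"] = [
            {
                "id": msg.get("tool_call_id") or f"call0000{i}",
                "type": "function",
                "function": {
                    "name": msg.get("name") or "dummy_func",
                    "arguments": {},
                },
            }
            for i, msg in enumerate(appended_messages[:num_leading_tools])
        ]
    return assistant
```

With a leading `user` message the returned dict has no `tool_calls` key at all, which is what lets the template choose the user boundary instead of the observation boundary.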
Code Review
This pull request updates `_build_dummy_assistant` to handle leading tool messages in `appended_messages` correctly, so that the proper turn-transition tokens are rendered. Feedback suggests restoring the `reasoning_content` field for consistency with reasoning models, and simplifying the `tool_calls` list comprehension by using `enumerate` on a slice of the messages.
| assistant: dict[str, Any] = {"role": "assistant", "content": ""} | ||
| if num_leading_tools > 0: | ||
| assistant["tool_calls"] = [ | ||
| { | ||
| "id": resp.get("tool_call_id") or f"call0000{i}", | ||
| "id": appended_messages[i].get("tool_call_id") or f"call0000{i}", | ||
| "type": "function", | ||
| "function": { | ||
| "name": resp.get("name") or "dummy_func", | ||
| "name": appended_messages[i].get("name") or "dummy_func", | ||
| "arguments": {}, | ||
| }, | ||
| } | ||
| for i, resp in enumerate(tool_responses) | ||
| ], | ||
| } | ||
| for i in range(num_leading_tools) | ||
| ] |
The `reasoning_content` field was removed from the dummy assistant message. This field was present in the previous implementation (line 32) and is often necessary for reasoning models to render turn boundaries correctly (e.g., to ensure the reasoning block is closed). Unless it was removed intentionally to fix a specific issue, it should be restored.
Additionally, the `tool_calls` generation can be simplified by using `enumerate` on a slice of `appended_messages`.
| assistant: dict[str, Any] = { | |
| "role": "assistant", | |
| "content": "", | |
| "reasoning_content": " ", | |
| } | |
| if num_leading_tools > 0: | |
| assistant["tool_calls"] = [ | |
| { | |
| "id": msg.get("tool_call_id") or f"call0000{i}", | |
| "type": "function", | |
| "function": { | |
| "name": msg.get("name") or "dummy_func", | |
| "arguments": {}, | |
| }, | |
| } | |
| for i, msg in enumerate(appended_messages[:num_leading_tools]) | |
| ] |
…ix stability Without `reasoning_content: " "`, Qwen3's template inserts an empty <think> block only when the assistant is the last message, breaking the prefix-diff invariant. Add test_dummy_prefix_match to guard this. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
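To make the prefix-diff invariant concrete, here is a toy sketch. It works on strings rather than token ids, and both renderers are invented for illustration: `render_context_sensitive` only mimics the behavior described in the commit message (an empty `<think>` block added when the assistant is the last message) and is not Qwen3's real chat template. All function names are hypothetical.

```python
from typing import Any, Callable

Message = dict[str, Any]


def render_stable(messages: list[Message]) -> str:
    """Toy template whose output for a message never depends on what follows."""
    return "".join(f"<|{m['role']}|>{m['content']}" for m in messages)


def render_context_sensitive(messages: list[Message]) -> str:
    """Toy template that adds an empty think block only to a trailing assistant."""
    parts = []
    last = len(messages) - 1
    for i, m in enumerate(messages):
        body = m["content"]
        if m["role"] == "assistant" and i == last:
            body = "<think>\n\n</think>\n\n" + body
        parts.append(f"<|{m['role']}|>{body}")
    return "".join(parts)


def incremental_text(
    render: Callable[[list[Message]], str],
    base: list[Message],
    dummy: Message,
    appended: list[Message],
) -> str:
    """Return the text contributed by ``appended`` via a dummy-prefix diff."""
    prefix = render(base + [dummy])
    full = render(base + [dummy] + appended)
    # The invariant: rendering with the appended messages must extend the prefix.
    if not full.startswith(prefix):
        raise ValueError("dummy prefix is not a prefix of the full rendering")
    return full[len(prefix):]
```

With the stable renderer the diff yields just the appended turn. With the context-sensitive one the prefix contains a think block that disappears once another message follows, so the check fails.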
Non-contiguous tool messages (e.g. tool, system, tool) now all get matching tool_calls in the dummy assistant. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Qwen3's context-sensitive template (last_query_index / <think> block) breaks the dummy-prefix diff when user messages are appended. Raise ValueError at arg validation if --tito-allowed-append-roles includes 'user' with a non-glm47 --tito-model. Add test_dummy_prefix_match (GLM47-only) to guard the prefix invariant. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Qwen3's context-sensitive template (<think> block / last_query_index) breaks the dummy-prefix diff. Ban --tito-model=qwen3 at arg validation, remove the reasoning_content workaround, and scope TITO tests to GLM47. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
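A rough sketch of what banning the model at argument validation could look like. Only the `--tito-model` flag, the `qwen3` and `glm47` values, and the use of `ValueError` come from this PR; the function names, the `UNSUPPORTED_TITO_MODELS` set, the default value, and the error text are assumptions made for illustration.

```python
import argparse

# Assumption: template families whose rendering is not prefix-stable.
UNSUPPORTED_TITO_MODELS = {"qwen3"}


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser exposing only the flag discussed here."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--tito-model", default="glm47")
    return parser


def validate_tito_args(args: argparse.Namespace) -> None:
    """Reject unsupported TITO models before any rendering happens."""
    if args.tito_model in UNSUPPORTED_TITO_MODELS:
        raise ValueError(
            f"--tito-model={args.tito_model} is not supported: its chat template "
            "is context-sensitive and breaks the dummy-prefix diff."
        )


if __name__ == "__main__":
    validate_tito_args(build_parser().parse_args())
```

Failing at validation time means a misconfigured run stops immediately instead of silently producing wrong incremental tokens.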
| All ``tool`` messages in *appended_messages* (not just leading contiguous | ||
| ones) get matching ``tool_calls``. If there are no tool messages the dummy | ||
| assistant has no ``tool_calls`` — so the template renders the correct | ||
| turn-transition tokens (e.g. ``<|user|>`` instead of ``<|observation|>``). |
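The updated docstring can be read as the following sketch. The function name is hypothetical, and so is the choice to number fallback ids by the tool message's ordinal rather than by its position in the list; the PR only states that `tool_indices` is used for indexing.

```python
from typing import Any


def build_dummy_assistant_all_tools(appended_messages: list[dict[str, Any]]) -> dict[str, Any]:
    """Hypothetical variant that matches every ``tool`` message, contiguous or not."""
    # Positions of all tool messages, e.g. [0, 2] for (tool, system, tool).
    tool_indices = [
        i for i, msg in enumerate(appended_messages) if msg.get("role") == "tool"
    ]
    assistant: dict[str, Any] = {"role": "assistant", "content": ""}
    if tool_indices:
        assistant["tool_calls"] = [
            {
                # Assumption: the fallback id uses the ordinal n, not the position i.
                "id": appended_messages[i].get("tool_call_id") or f"call0000{n}",
                "type": "function",
                "function": {
                    "name": appended_messages[i].get("name") or "dummy_func",
                    "arguments": {},
                },
            }
            for n, i in enumerate(tool_indices)
        ]
    return assistant
```

For an appended sequence of `tool`, `system`, `tool`, this produces two `tool_calls` entries; for a sequence with no tool messages it produces none.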
For my sanity check, now our TITO would only support GLM style? Can we still keep it flexible somehow?
If it would be too hard to support Qwen3 chat template, I think it's still good here
And we don't need reasoning_content any more?
This workaround for GLM 4.7 breaks the Qwen implementation. The Qwen3 chat template needs a fix. This unblocks TITO development.
Summary
- Ban `--tito-model=qwen3`: Qwen3's context-sensitive template (`last_query_index`, `<think>` block) breaks the dummy-prefix diff, producing wrong incremental tokens. Raises `ValueError` at arg validation with a clear message.
- Remove the `reasoning_content: " "` workaround from `_build_dummy_assistant` (no longer needed with Qwen3 banned).
- `_build_dummy_assistant` now counts all tool messages in the appended messages (not just the leading contiguous ones), using `tool_indices` for correct indexing.
- Add `test_dummy_prefix_match` (GLM47-only) to guard the prefix invariant.

Test plan
- `ValueError` raised when `--tito-model=qwen3`

Made with Cursor
🤖 Generated with Claude Code