fix: Ban qwen3 tito model and handle non-tool appended messages in TITO dummy assistant #949

Open
guapisolo wants to merge 9 commits into main from fix/tito-non-tool-append

Collaborator

@guapisolo guapisolo commented Apr 7, 2026

Summary

  • Globally ban --tito-model=qwen3 — Qwen3's context-sensitive template (last_query_index, <think> block) breaks the dummy-prefix diff, producing wrong incremental tokens. Raises ValueError at arg validation with a clear message.
  • Remove the reasoning_content: " " workaround from _build_dummy_assistant (no longer needed with Qwen3 banned)
  • _build_dummy_assistant now counts all tool messages in appended (not just leading contiguous), using tool_indices for correct indexing
  • Adds test_dummy_prefix_match (GLM47-only) to guard the prefix invariant
  • TITO parametrized tests scoped to GLM47 only
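The ban described in the first bullet could look roughly like the sketch below. This is an illustration only: `validate_tito_args`, `parse_args`, and `BANNED_TITO_MODELS` are hypothetical names, and the `glm47` value in the usage note is assumed from the wording of this PR rather than taken from the actual argument parser.

```python
import argparse

# Hypothetical constant: the PR only states that qwen3 is banned.
BANNED_TITO_MODELS = {"qwen3"}


def validate_tito_args(args: argparse.Namespace) -> None:
    """Reject TITO models whose chat template breaks the dummy-prefix diff."""
    tito_model = getattr(args, "tito_model", None)
    if tito_model in BANNED_TITO_MODELS:
        raise ValueError(
            f"--tito-model={tito_model} is not supported: its chat template is "
            "context-sensitive (last_query_index, <think> block), so the "
            "dummy-prefix diff produces wrong incremental tokens."
        )


def parse_args(argv: list[str]) -> argparse.Namespace:
    """Parse the (illustrative) CLI and validate it before returning."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--tito-model", default=None)
    args = parser.parse_args(argv)
    validate_tito_args(args)
    return args
```

With this sketch, `parse_args(["--tito-model=qwen3"])` raises `ValueError` during validation, while a value such as `glm47` passes through unchanged.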

Test plan

  • 48 tests pass (GLM47 parametrized + Qwen3 boundary + factory)
  • Verified GLM47 prefix match for all trajectory × split combinations
  • Verified ValueError raised when --tito-model=qwen3
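To make the prefix invariant concrete, here is a toy sketch, not the project's test. Both render functions are invented stand-ins for chat templates: one renders each message independently, the other adds an empty think block only when the assistant message is last, which is the kind of context sensitivity this PR attributes to Qwen3. `incremental_tokens` is a hypothetical helper.

```python
from typing import Callable

Message = dict[str, str]
Render = Callable[[list[Message]], list[str]]


def render_context_free(messages: list[Message]) -> list[str]:
    """Toy template: each message renders the same way regardless of position."""
    out: list[str] = []
    for m in messages:
        out += [f"<|{m['role']}|>", m["content"]]
    return out


def render_context_sensitive(messages: list[Message]) -> list[str]:
    """Toy template: an assistant message gets an empty think block only when last."""
    out: list[str] = []
    for i, m in enumerate(messages):
        out.append(f"<|{m['role']}|>")
        if m["role"] == "assistant" and i == len(messages) - 1:
            out += ["<think>", "</think>"]
        out.append(m["content"])
    return out


def incremental_tokens(
    render: Render, prefix: list[Message], appended: list[Message]
) -> list[str]:
    """Return the tokens contributed by `appended`, or raise if the prefix changes."""
    before = render(prefix)
    after = render(prefix + appended)
    if after[: len(before)] != before:
        raise ValueError("prefix invariant violated: the diff would be wrong")
    return after[len(before):]


dummy = [{"role": "assistant", "content": ""}]
appended = [{"role": "user", "content": "hi"}]
```

Here `incremental_tokens(render_context_free, dummy, appended)` returns `["<|user|>", "hi"]`, while the context-sensitive variant raises `ValueError` because the rendering of the dummy assistant changes once a message follows it.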

Made with Cursor

🤖 Generated with Claude Code

Previously `_build_dummy_assistant` assumed all appended messages were
tool responses and always generated matching `tool_calls`. When a
non-tool message (e.g. `user`) was appended, the template rendered
incorrect turn-transition tokens (`<|observation|>` instead of
`<|user|>`).

Now the function only generates `tool_calls` for the leading contiguous
tool messages. If the first appended message is not a tool role, no
`tool_calls` are emitted so the template picks the correct boundary.

Made-with: Cursor
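As a rough illustration of the behavior this commit message describes, counting the leading contiguous tool messages might look like the following. `count_leading_tools` is a hypothetical helper name; the inline diff under review only shows a `num_leading_tools` variable, not how it is computed.

```python
from itertools import takewhile
from typing import Any


def count_leading_tools(appended_messages: list[dict[str, Any]]) -> int:
    """Count tool messages at the start of the list, stopping at the first non-tool role."""
    leading = takewhile(lambda m: m.get("role") == "tool", appended_messages)
    return sum(1 for _ in leading)
```

For `[tool, tool, user, tool]` this returns 2, and for a list that starts with a `user` message it returns 0, in which case no `tool_calls` would be emitted.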
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request updates the _build_dummy_assistant function to correctly handle leading tool messages within appended_messages, ensuring proper turn-transition tokens are rendered. Feedback suggests restoring the reasoning_content field to maintain consistency with reasoning models and simplifying the tool_calls list comprehension using enumerate on a slice of the messages.

Comment on lines +40 to +52
+    assistant: dict[str, Any] = {"role": "assistant", "content": ""}
+    if num_leading_tools > 0:
+        assistant["tool_calls"] = [
             {
-                "id": resp.get("tool_call_id") or f"call0000{i}",
+                "id": appended_messages[i].get("tool_call_id") or f"call0000{i}",
                 "type": "function",
                 "function": {
-                    "name": resp.get("name") or "dummy_func",
+                    "name": appended_messages[i].get("name") or "dummy_func",
                     "arguments": {},
                 },
             }
-            for i, resp in enumerate(tool_responses)
-        ],
-    }
+            for i in range(num_leading_tools)
+        ]
Contributor

medium

The reasoning_content field was removed from the dummy assistant message. This field was present in the previous implementation (line 32) and is often necessary for reasoning models to correctly render turn boundaries (e.g., to ensure the reasoning block is closed). Unless its removal was intentional to fix a specific issue, it should be restored.

Additionally, the tool_calls generation can be simplified using enumerate on a slice of appended_messages.

Suggested change
    assistant: dict[str, Any] = {
        "role": "assistant",
        "content": "",
        "reasoning_content": " ",
    }
    if num_leading_tools > 0:
        assistant["tool_calls"] = [
            {
                "id": msg.get("tool_call_id") or f"call0000{i}",
                "type": "function",
                "function": {
                    "name": msg.get("name") or "dummy_func",
                    "arguments": {},
                },
            }
            for i, msg in enumerate(appended_messages[:num_leading_tools])
        ]

guapisolo and others added 7 commits April 7, 2026 20:46
…ix stability

Without `reasoning_content: " "`, Qwen3's template inserts an empty
<think> block only when the assistant is the last message, breaking the
prefix-diff invariant. Add test_dummy_prefix_match to guard this.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Non-contiguous tool messages (e.g. tool, system, tool) now all get
matching tool_calls in the dummy assistant.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Qwen3's context-sensitive template (last_query_index / <think> block)
breaks the dummy-prefix diff when user messages are appended. Raise
ValueError at arg validation if --tito-allowed-append-roles includes
'user' with a non-glm47 --tito-model. Add test_dummy_prefix_match
(GLM47-only) to guard the prefix invariant.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Qwen3's context-sensitive template (<think> block / last_query_index)
breaks the dummy-prefix diff. Ban --tito-model=qwen3 at arg validation,
remove the reasoning_content workaround, and scope TITO tests to GLM47.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@guapisolo guapisolo changed the title fix: handle non-tool appended messages in TITO dummy assistant fix: Ban qwen3 tito model and handle non-tool appended messages in TITO dummy assistant Apr 7, 2026

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@guapisolo guapisolo requested a review from yushengsu-thu as a code owner April 7, 2026 23:01

All ``tool`` messages in *appended_messages* (not just leading contiguous
ones) get matching ``tool_calls``. If there are no tool messages the dummy
assistant has no ``tool_calls`` — so the template renders the correct
turn-transition tokens (e.g. ``<|user|>`` instead of ``<|observation|>``).
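The docstring above could correspond to something like this minimal sketch. It is not the merged implementation: the function name is written without the leading underscore to mark it as illustrative, and the use of the message's position in `appended_messages` for the fallback id is an assumption about what "using `tool_indices` for correct indexing" means.

```python
from typing import Any


def build_dummy_assistant(appended_messages: list[dict[str, Any]]) -> dict[str, Any]:
    """Build a dummy assistant with one tool_call per tool message, wherever it appears."""
    # Positions of every tool message, contiguous or not.
    tool_indices = [
        i for i, m in enumerate(appended_messages) if m.get("role") == "tool"
    ]
    assistant: dict[str, Any] = {"role": "assistant", "content": ""}
    if tool_indices:
        assistant["tool_calls"] = [
            {
                # Assumption: the fallback id is derived from the message's position.
                "id": appended_messages[i].get("tool_call_id") or f"call0000{i}",
                "type": "function",
                "function": {
                    "name": appended_messages[i].get("name") or "dummy_func",
                    "arguments": {},
                },
            }
            for i in tool_indices
        ]
    return assistant
```

For an appended sequence of tool, system, tool, this yields two `tool_calls`; for a lone `user` message the `tool_calls` key is absent, so a template would pick the user-turn boundary.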
Contributor

For my sanity check, now our TITO would only support GLM style? Can we still keep it flexible somehow?

If it would be too hard to support Qwen3 chat template, I think it's still good here

Contributor

And we don't need reasoning_content any more?

Collaborator Author

@guapisolo guapisolo Apr 8, 2026

> For my sanity check, now our TITO would only support GLM style? Can we still keep it flexible somehow?
>
> If it would be too hard to support Qwen3 chat template, I think it's still good here

This workaround for GLM 4.7 breaks the Qwen3 implementation. Supporting Qwen3 again needs a fix to the Qwen3 chat template. For now, this unblocks TITO development.
