fix: qwen3 nonstream parse with no or uncompleted think content #3748
# no </think> in previous or delta, reasoning content continues
return DeltaMessage(reasoning_content=delta_text)
# no <think> in previous or delta, all content
return DeltaMessage(content=delta_text)
What if the model does not have reasoning ability, but its output is returned as reasoning_content?
Any specific model name and model output for this case?
When the model does not have reasoning ability, the output should be normal content, not reasoning.
Hi, when enable_thinking=False, the parser is disabled on the main branch:
lmdeploy/lmdeploy/serve/openai/api_server.py
Line 508 in 5f0647f
if VariableInterface.reasoning_parser is not None and request.enable_thinking is not False:
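A minimal sketch of what the quoted condition implies, written as a standalone function. The helper name should_run_reasoning_parser is hypothetical and not part of lmdeploy; only the boolean expression is taken from the line above. The point it illustrates is that "is not False" treats an unset (None) enable_thinking the same as True, so the parser is skipped only when the request explicitly disables thinking.

```python
from typing import Optional


def should_run_reasoning_parser(parser: Optional[object],
                                enable_thinking: Optional[bool]) -> bool:
    """Hypothetical helper mirroring the quoted api_server.py condition."""
    # `is not False` lets both True and None (unset) through;
    # only an explicit enable_thinking=False disables parsing.
    return parser is not None and enable_thinking is not False


if __name__ == "__main__":
    dummy_parser = object()  # stand-in for a configured reasoning parser
    print(should_run_reasoning_parser(dummy_parser, None))
    print(should_run_reasoning_parser(dummy_parser, False))
    print(should_run_reasoning_parser(None, True))
```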
return reasoning_content, final_output

@classmethod
def _trim_newlines(cls, text: str):
Why perform _trim_newlines?
The output has the form <think>\n{reasoning_content}\n</think>, so this removes the \n before and after reasoning_content.
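To illustrate the trimming described here, the following is a rough sketch rather than the PR's actual _trim_newlines body, which is not shown in this thread. It assumes the helper simply strips newline characters from both ends; the names trim_newlines and extract_reasoning are made up for the example.

```python
THINK_START = "<think>"
THINK_END = "</think>"


def trim_newlines(text: str) -> str:
    """Assumed behaviour: drop leading/trailing newlines, keep inner ones."""
    return text.strip("\n")


def extract_reasoning(block: str) -> str:
    """Return the text between <think> and </think> without the wrapping newlines."""
    if not (block.startswith(THINK_START) and block.endswith(THINK_END)):
        raise ValueError("expected a complete <think>...</think> block")
    inner = block[len(THINK_START):-len(THINK_END)]
    return trim_newlines(inner)


if __name__ == "__main__":
    print(repr(extract_reasoning("<think>\nstep 1\nstep 2\n</think>")))
```

Newlines inside the reasoning text are preserved; only the ones adjacent to the tags are removed.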
Could you provide code to reproduce this?
@RunningLeon Was this fixed when you added support for the interns1 reasoning parser?
The first problem listed below should be fixed.
Motivation
The following behaviors of the qwen3 reasoning parser are fixed:
- When the output is cut off by max_tokens, the parser cannot handle output without the </think> token. (Mentioned in issue [Bug] Qwen3 Reasoning Parser parsing error #3664)
Modification
qwen_qwq_reasoning_parser.py
- When stream=False and enable_thinking=False, there's no <think> tag, so the whole model output should be regarded as content, not reasoning_content.
- When stream=False, enable_thinking=True and max_tokens is a small value, incomplete think content can be correctly parsed.
test_qwen3_parser.py
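The non-streaming cases described under Modification can be sketched as a small self-contained function with test-style checks. This is an illustration under stated assumptions, not the code in qwen_qwq_reasoning_parser.py: split_reasoning is a hypothetical name, the (reasoning_content, content) return order follows the snippet quoted in the review thread, and any text before <think> is assumed to be absent.

```python
from typing import Optional, Tuple

THINK_START = "<think>"
THINK_END = "</think>"


def split_reasoning(output: str) -> Tuple[Optional[str], Optional[str]]:
    """Split a full (non-streamed) model output into (reasoning_content, content)."""
    if THINK_START not in output:
        # No <think> tag (e.g. enable_thinking=False): everything is content.
        return None, output
    after_start = output.split(THINK_START, 1)[1]
    if THINK_END not in after_start:
        # Generation stopped before </think> (e.g. small max_tokens):
        # everything after <think> is incomplete reasoning.
        return after_start.strip("\n"), None
    reasoning, content = after_start.split(THINK_END, 1)
    # Empty content after </think> is reported as None (an assumption).
    return reasoning.strip("\n"), content.strip("\n") or None


if __name__ == "__main__":
    print(split_reasoning("plain answer"))
    print(split_reasoning("<think>\nunfinished thought"))
    print(split_reasoning("<think>\nwhy\n</think>\n\nanswer"))
```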
Checklist