Conversation

Contributor

@ywx217 commented Jul 18, 2025

Motivation

Two bugs in the Qwen3 reasoning parser are fixed:

  • In non-stream mode, when thinking is disabled, the whole output is wrongly returned as reasoning_content instead of content.
  • In non-stream mode, when the thinking content exceeds max_tokens, the parser cannot handle output that is truncated before the </think> token. (Mentioned in issue [Bug] Qwen3 Reasoning Parser parsing error #3664)

Modification

  • qwen_qwq_reasoning_parser.py
    • When stream=False and enable_thinking=False, there is no <think> tag, so the whole model output should be treated as content, not reasoning_content.
    • When stream=False, enable_thinking=True and max_tokens is small, incomplete think content is now parsed correctly.
  • test_qwen3_parser.py
    • related test cases added
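The two non-stream rules above can be sketched roughly as follows. This is a simplified, hypothetical illustration of the fixed behavior, not the actual lmdeploy parser code; the function name and return convention are assumptions:

```python
def extract_reasoning(text: str, enable_thinking: bool):
    """Sketch of the fixed non-stream parsing rules (hypothetical)."""
    start, end = "<think>", "</think>"
    if not enable_thinking:
        # Fix 1: thinking disabled, no <think> tag is emitted,
        # so the whole output is normal content.
        return None, text
    if end in text:
        # Normal case: split reasoning from content at </think>.
        reasoning, _, content = text.partition(end)
        return reasoning.removeprefix(start).strip("\n"), content.strip("\n")
    # Fix 2: output truncated by max_tokens before </think>;
    # everything generated so far is (incomplete) reasoning content.
    return text.removeprefix(start).strip("\n"), None
```

For example, a response truncated mid-thought, such as `"<think>\npartial"`, now yields the incomplete reasoning instead of failing to parse.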

Checklist

  1. Pre-commit or other linting tools are used to fix the potential lint issues.
  2. The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness.
  3. If the modification has a dependency on downstream projects of a newer version, this PR should be tested with all supported versions of downstream projects.
  4. The documentation has been modified accordingly, like docstring or example tutorials.

# no </think> in previous or delta, reasoning content continues
return DeltaMessage(reasoning_content=delta_text)
# no <think> in previous or delta, all content
return DeltaMessage(content=delta_text)
Collaborator

@RunningLeon commented Jul 21, 2025

What happens if the model does not have reasoning ability, but its output ends up returned as reasoning_content?

Contributor Author

@ywx217 commented Jul 21, 2025

Do you have a specific model name and model output for this case?

Contributor Author

When the model does not have reasoning ability, the output should be normal content, not reasoning_content.

Collaborator

Hi, when enable_thinking=False, the parser is already disabled on the main branch:

if VariableInterface.reasoning_parser is not None and request.enable_thinking is not False:

return reasoning_content, final_output

@classmethod
def _trim_newlines(cls, text: str):
Collaborator

Why perform _trim_newlines?

Contributor Author

The model output is formatted as <think>\n{reasoning_content}\n</think>, so _trim_newlines removes the \n before and after reasoning_content.

@lvhan028
Collaborator

Could you provide code to reproduce the issue?

@lvhan028
Collaborator

lvhan028 commented Aug 6, 2025

@RunningLeon Is it fixed when you supported interns1 reasoning parser?

@RunningLeon
Collaborator

@RunningLeon Is it fixed when you supported interns1 reasoning parser?

The first problem below should be fixed.
@ywx217 Hi, as for the second one, does a user really need the incomplete thinking_content in that case? If it is reasonable, this would be included.

  • When stream=False and enable_thinking=False, there's no <think> tag, so the whole model output should be regarded as content, not reasoning_content.
  • When stream=False, enable_thinking=True and max_token is a small value, incomplete think content can be correctly parsed.
