Garbage Output During Inference with Qwen3-1.7B #2866

@HayrapetyanZhirayr

While running inference with the Qwen3-1.7B model via tune run generate, the output is nonsensical and repetitive: the phrase “$2 an hour” is repeated hundreds of times.

Prompt Used:

Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?

Expected Behavior:
The model should return a coherent and correct numerical answer with reasoning, something like: “Weng earns $12 per hour. 50 minutes is 5/6 of an hour. So, she earned $10.” The same checkpoint with the same inference parameters behaves as expected in vLLM and transformers.
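
As a sanity check on the expected answer, the arithmetic can be reproduced in a few lines of Python. The variable names are illustrative and not part of the original report.

```python
from fractions import Fraction

hourly_rate = 12      # dollars per hour, taken from the prompt
minutes_worked = 50   # minutes of babysitting, taken from the prompt

# Convert minutes to a fraction of an hour, then scale the hourly rate.
hours_worked = Fraction(minutes_worked, 60)  # reduces to 5/6
earnings = hourly_rate * hours_worked

print(f"{minutes_worked} minutes = {hours_worked} hour; earnings = ${earnings}")
```

Using Fraction keeps the intermediate value exact, so the result is exactly 10 rather than a rounded float.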

Config Used (Partial):

checkpointer:
  checkpoint_dir: ./tune_models/Qwen3-1.7B/
  checkpoint_files:
    - model-00001-of-00002.safetensors
    - model-00002-of-00002.safetensors
  model_type: QWEN3

model:
  _component_: torchtune.models.qwen3.qwen3_1_7b_instruct

tokenizer:
  _component_: torchtune.models.qwen3.qwen3_tokenizer
  path: ./tune_models/Qwen3-1.7B/vocab.json
  merges_file: ./tune_models/Qwen3-1.7B/merges.txt

temperature: 0.0
top_k: 300
enable_kv_cache: true
dtype: bf16
device: cuda

Response:

<|im_start|>user
Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?<|im_end|>
<|im_start|>assistant
<think>
</think>

The question is asking how much Weng ( earns if she earns $2 an hour for babysitting. Let's break it down:

1. hour = $2 2 an hour = $2 an hour = $2 an hour

So, if she earns $2 an hour, and she works  2 hours, then she earns $22 2 $2 $2 an hour, and $2 an hour.

So, she earns $2 an hour, and she earns $2 $2 an hour, and $2 an hour.

So, $2 an hour, and $2 an hour, and $2 an hour, and $22 $2 an hour, and $2 an hour, and $2 and $2 an hour, and $2, and $2 an hour, and $22 and $2 an hour, and $2, and $2 and $2 an hour, and $2 an hour, and $22 $2 an hour, and $2 and $2 an hour, and $2 an hour, and $2 and $2 an hour, and $2 an hour, and $2 $2 an hour, and $2 an hour, and $2 and $2 an hour, and $2 an hour, and $2 $2 an hour, $2 an hour, and $2 an hour, and $2 and $2 an hour, and $2 an hour, and $2 $2 an hour, and $2 an hour, and $22 an hour, and $2 an hour, and $2 an hour, and $2, and $2 an hour, and $22 $2 an hour, and $2 an hour, and $2 an hour, and $2 $2 an hour, and $22, and $2 an hour, and $2, and $2 an hour, and $2, and $2 an hour, and $2, and $2 an hour, and $2 an hour, and $2 an hour, and $2 an, and $22 $2 an hour, and $2 an hour, and $2 an hour, and $2 $2 an hour, and $22, and $2 an hour, and $2 $2 an hour, and $2 an hour, and $2 an hour, and $2 $2 an hour, and $2, and $2 an hour, and $2 and $2 an hour, and $2 and $2 an hour, and $2 an hour, and $2 $2 an hour, $2 an hour, and $2 an hour, and $2 an hour, and $2 $2 an hour, and $2 an hour, and $2 and $2 an hour, and $2, and $2 an hour, and $22 and $2 an hour, and $2, and $2 an hour, and $2, and $22 $2 an hour, and $2 an hour, and $2 an hour, and $ $2 an hour, and $2 an hour, and $22, and $2 $2 an hour, and $2 an hour, and $2 an hour, and $2 $2 an hour, and $2 an hour, and $2 an hour, and $2 $2 an hour, and $2 and $2 an hour, and $2 an hour, and $2 $2 an hour ...

Possible Causes:

  • Since the checkpoint is fine and behaves as expected in other frameworks, there may be a bug in the qwen3 model implementation or in torchtune.generation.generate.
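
One possible way to narrow this down is to compare intermediate results between torchtune and a reference framework, starting with the token IDs each tokenizer produces for the same prompt. The helper below is a hypothetical sketch, not a torchtune API, and the ID lists are made-up placeholders standing in for real tokenizer output.

```python
from itertools import zip_longest
from typing import Optional, Sequence


def first_divergence(a: Sequence[int], b: Sequence[int]) -> Optional[int]:
    """Return the first index where two token-ID sequences differ, or None if identical."""
    # zip_longest pads the shorter sequence with None, so a length
    # mismatch is reported at the index where the shorter one ends.
    for i, (x, y) in enumerate(zip_longest(a, b)):
        if x != y:
            return i
    return None


# Placeholder IDs for illustration only; replace them with the real
# encodings of the same prompt from each framework.
reference_ids = [101, 7, 42, 9, 13]
torchtune_ids = [101, 7, 42, 8, 13]

idx = first_divergence(reference_ids, torchtune_ids)
if idx is None:
    print("Token IDs match; the tokenizer is unlikely to be the cause.")
else:
    print(
        f"Token IDs diverge at position {idx}: "
        f"{reference_ids[idx:idx + 3]} vs {torchtune_ids[idx:idx + 3]}"
    )
```

If the token IDs match, the same comparison could be applied to the generated token IDs from each framework to see where the outputs first part ways.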

Labels: bug