Commit 4724b86

Set float32 MLP output dtype for Qwen3
1 parent 7deeb0f commit 4724b86

1 file changed (+2, -0 lines)

exllamav3/models/qwen3.py

Lines changed: 2 additions & 0 deletions
@@ -115,6 +115,8 @@ def __init__(
                     key_gate = "gate_proj",
                     key_down = "down_proj",
                     qmap = "block.mlp",
+                    interm_dtype = torch.half,
+                    out_dtype = torch.float,
                 ),
             )
             for idx in range(config.num_hidden_layers)
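
The commit does not state why the MLP output dtype is set to float32 while the intermediate dtype stays at float16. One plausible motivation (an assumption, not something the commit confirms) is the limited range and precision of half precision. The sketch below uses only the Python standard library to emulate rounding a value to float16 and float32; the helper names `to_half` and `to_single` are hypothetical and are not part of exllamav3 or PyTorch.

```python
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 binary16 (float16) value


def to_half(x: float) -> float:
    """Round x to the nearest float16 value; return +/-inf if it is out of range."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except (OverflowError, struct.error):
        # struct refuses to pack values beyond the float16 range;
        # model that as saturation to infinity.
        return float("inf") if x > 0 else float("-inf")


def to_single(x: float) -> float:
    """Round x to the nearest float32 value (assumes x is within float32 range)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


if __name__ == "__main__":
    # Illustrative values only, not measurements from Qwen3.
    for value in (2048.7, 70000.0):
        print(value, "-> half:", to_half(value), "single:", to_single(value))
```

In this emulation, a value above `FP16_MAX` cannot be stored in float16 at all, and values near 2048 are rounded to a multiple of 2, whereas float32 keeps both much more faithfully. Whether Qwen3's MLP outputs actually reach such magnitudes is not established by the commit.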
