Commit e9c1c1d

Fix qk_norm comment (#769)
1 parent: b14325e

File tree

2 files changed: +2 −2 lines changed

ch05/11_qwen3/standalone-qwen3.ipynb

Lines changed: 1 addition & 1 deletion
@@ -436,7 +436,7 @@
 " \"n_layers\": 28, # Number of layers\n",
 " \"hidden_dim\": 3072, # Size of the intermediate dimension in FeedForward\n",
 " \"head_dim\": 128, # Size of the heads in GQA\n",
-" \"qk_norm\": True, # Whether to normalize queries and values in GQA\n",
+" \"qk_norm\": True, # Whether to normalize queries and keys in GQA\n",
 " \"n_kv_groups\": 8, # Key-Value groups for grouped-query attention\n",
 " \"rope_base\": 1_000_000.0, # The base in RoPE's \"theta\"\n",
 " \"dtype\": torch.bfloat16, # Lower-precision dtype to reduce memory usage\n",

pkg/llms_from_scratch/qwen3.py

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@
 "n_layers": 28, # Number of layers
 "hidden_dim": 3072, # Size of the intermediate dimension in FeedForward
 "head_dim": 128, # Size of the heads in GQA
-"qk_norm": True, # Whether to normalize queries and values in GQA
+"qk_norm": True, # Whether to normalize queries and keys in GQA
 "n_kv_groups": 8, # Key-Value groups for grouped-query attention
 "rope_base": 1_000_000.0, # The base in RoPE's "theta"
 "dtype": torch.bfloat16, # Lower-precision dtype to reduce memory usage
