Commit def4340 ("fix")
Author: Kartikay Khandelwal
1 parent: 61638bb

2 files changed: 9 additions, 3 deletions


recipes/configs/generate.yaml

Lines changed: 2 additions & 3 deletions
@@ -11,9 +11,8 @@ checkpointer:
     pytorch_model-00002-of-00003.bin,
     pytorch_model-00003-of-00003.bin
   ]
-  recipe_checkpoint: null
   output_dir: /tmp/Llama-2-13b-hf/
-  model_type: MISTRAL
+  model_type: LLAMA2
 
 device: cuda
 dtype: bf16
@@ -22,7 +21,7 @@ seed: 1234
 
 # Tokenizer arguments
 tokenizer:
-  _component_: torchtune.models.mistral.mistral_tokenizer
+  _component_: torchtune.models.llama2.llama2_tokenizer
   path: /tmp/Llama-2-13b-hf/tokenizer.model
 
 # Generation arguments; defaults taken from gpt-fast
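The tokenizer change above is wired up through the config's `_component_` field, a dotted import path that gets resolved to a callable at runtime. As a rough sketch of that dotted-path pattern (this is not torchtune's actual instantiation code; `instantiate_component` and the `collections.Counter` demo below are illustrative assumptions):

```python
from importlib import import_module


def instantiate_component(dotted_path: str, **kwargs):
    """Resolve a dotted path like the config's `_component_` value and
    call it with the remaining keyword arguments.

    Simplified sketch only; torchtune's real config instantiation is
    more involved (nested configs, positional args, etc.).
    """
    module_path, _, attr_name = dotted_path.rpartition(".")
    module = import_module(module_path)
    return getattr(module, attr_name)(**kwargs)


# Demo with a stdlib path; a path such as
# torchtune.models.llama2.llama2_tokenizer would resolve the same way.
counter = instantiate_component("collections.Counter", a=1, b=2)
```

Under this scheme, pointing `_component_` at a Mistral tokenizer while `model_type` says `LLAMA2` (or vice versa) fails only at runtime, which is why the commit fixes both fields together.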

recipes/generate.py

Lines changed: 7 additions & 0 deletions
@@ -18,6 +18,13 @@
 
 
 class InferenceRecipe:
+    """
+    Recipe for generating tokens from a dense Transformer-based LLM.
+
+    Currently this recipe supports single-GPU generation only. Speculative
+    decoding is not supported.
+    """
+
     def __init__(self, cfg: DictConfig) -> None:
         self._device = utils.get_device(device=cfg.device)
         self._dtype = utils.get_dtype(dtype=cfg.dtype)

0 commit comments
