diff --git a/docs/source/deep_dives/checkpointer.rst b/docs/source/deep_dives/checkpointer.rst
index 92dffc878d..7b2d2cdc84 100644
--- a/docs/source/deep_dives/checkpointer.rst
+++ b/docs/source/deep_dives/checkpointer.rst
@@ -293,8 +293,8 @@ For more details about each file, please check the End-to-End tutorial mentioned
     │   ├── adapter_model.pt
     │   ├── adapter_model.safetensors
     │   ├── config.json
-    │   ├── ft-model-00001-of-00002.safetensors
-    │   ├── ft-model-00002-of-00002.safetensors
+    │   ├── model-00001-of-00002.safetensors
+    │   ├── model-00002-of-00002.safetensors
     │   ├── generation_config.json
     │   ├── LICENSE.txt
     │   ├── model.safetensors.index.json
@@ -313,8 +313,8 @@ For more details about each file, please check the End-to-End tutorial mentioned
     │   ├── adapter_model.pt
     │   ├── adapter_model.safetensors
     │   ├── config.json
-    │   ├── ft-model-00001-of-00002.safetensors
-    │   ├── ft-model-00002-of-00002.safetensors
+    │   ├── model-00001-of-00002.safetensors
+    │   ├── model-00002-of-00002.safetensors
     │   ├── generation_config.json
     │   ├── LICENSE.txt
     │   ├── model.safetensors.index.json
@@ -394,7 +394,7 @@ you'll need to **update** the following fields in your configs:

 **resume_from_checkpoint**: Set it to True;

-**checkpoint_files**: change the path to ``epoch_{YOUR_EPOCH}/ft-model={}-of-{}.safetensors``;
+**checkpoint_files**: change the path to ``epoch_{YOUR_EPOCH}/model-{}-of-{}.safetensors``;

 Notice that we do **not** change our checkpoint_dir or output_dir. Since we are resuming from checkpoint, we know
 to look for it in the output_dir.
@@ -405,8 +405,8 @@ to look for it in the output_dir.
     # checkpoint files. Note that you will need to update this
     # section of the config with the intermediate checkpoint files
     checkpoint_files: [
-        epoch_{YOUR_EPOCH}/ft-model-00001-of-00002.safetensors,
-        epoch_{YOUR_EPOCH}/ft-model-00001-of-00002.safetensors,
+        epoch_{YOUR_EPOCH}/model-00001-of-00002.safetensors,
+        epoch_{YOUR_EPOCH}/model-00002-of-00002.safetensors,
     ]

     # set to True if restarting training
diff --git a/docs/source/tutorials/e2e_flow.rst b/docs/source/tutorials/e2e_flow.rst
index 8e3a098d3e..66f1429aad 100644
--- a/docs/source/tutorials/e2e_flow.rst
+++ b/docs/source/tutorials/e2e_flow.rst
@@ -142,8 +142,8 @@ There are 3 types of folders:
     │   ├── adapter_model.pt
     │   ├── adapter_model.safetensors
     │   ├── config.json
-    │   ├── ft-model-00001-of-00002.safetensors
-    │   ├── ft-model-00002-of-00002.safetensors
+    │   ├── model-00001-of-00002.safetensors
+    │   ├── model-00002-of-00002.safetensors
     │   ├── generation_config.json
     │   ├── LICENSE.txt
     │   ├── model.safetensors.index.json
@@ -168,7 +168,22 @@ There are 3 types of folders:
 Let's understand the files:

 - ``adapter_model.safetensors`` and ``adapter_model.pt`` are your LoRA trained adapter weights. We save a duplicated .pt version of it to facilitate resuming from checkpoint.
-- ``ft-model-{}-of-{}.safetensors`` are your trained full model weights (not adapters). When LoRA finetuning, these are only present if we set ``save_adapter_weights_only=False``. In that case, we merge the merged base model with trained adapters, making inference easier.
+- ``model-{}-of-{}.safetensors`` are your trained full model weights (not adapters). When LoRA finetuning, these are only present if we set ``save_adapter_weights_only=False``. In that case, we merge the base model with the trained adapters, making inference easier (see the sketch after this list).
 - ``adapter_config.json`` is used by Huggingface PEFT when loading an adapter (more on that later);
 - ``model.safetensors.index.json`` is used by Hugging Face ``from_pretrained()`` when loading the model weights (more on that later)
 - All other files were originally in the checkpoint_dir. They are automatically copied during training. Files over 100MiB and ending on .safetensors, .pth, .pt, .bin are ignored, making it lightweight.
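+
+The merged shards can be loaded back with Hugging Face Transformers for a quick check.
+This is a minimal sketch rather than part of the recipe; the ``epoch_1`` path below is
+an assumption, so point it at whichever epoch folder you want to load:
+
+.. code-block:: python
+
+    from transformers import AutoModelForCausalLM, AutoTokenizer
+
+    # Assumed output location; replace with your own output_dir/epoch_{N} folder
+    ckpt_dir = "/tmp/torchtune/llama3_2_1B/lora_single_device/epoch_1"
+
+    # from_pretrained() resolves the shards through model.safetensors.index.json
+    model = AutoModelForCausalLM.from_pretrained(ckpt_dir)
+    tokenizer = AutoTokenizer.from_pretrained(ckpt_dir)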
@@ -223,8 +223,8 @@ Notice that we are using the merged weights, and not the LoRA adapters.
   _component_: torchtune.training.FullModelHFCheckpointer
   checkpoint_dir: ${output_dir}
   checkpoint_files: [
-    ft-model-00001-of-00002.safetensors,
-    ft-model-00002-of-00002.safetensors,
+    model-00001-of-00002.safetensors,
+    model-00002-of-00002.safetensors,
   ]
   output_dir: ${output_dir}
   model_type: LLAMA3_2
@@ -299,8 +299,27 @@ Let's modify ``custom_generation_config.yaml`` to include the following changes.
   _component_: torchtune.training.FullModelHFCheckpointer
   checkpoint_dir: ${checkpoint_dir}
   checkpoint_files: [
-    ft-model-00001-of-00002.safetensors,
-    ft-model-00002-of-00002.safetensors,
+    model-00001-of-00002.safetensors,
+    model-00002-of-00002.safetensors,
   ]
   output_dir: ${output_dir}
   model_type: LLAMA3_2
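+
+Before running generation, it can be worth confirming that every shard referenced by the
+Hugging Face index file is actually present in ``checkpoint_dir``. The check below is only
+illustrative and uses an assumed path; it is not part of the recipe:
+
+.. code-block:: python
+
+    import json
+    from pathlib import Path
+
+    # Assumed location; use the epoch folder your checkpoint_dir points at
+    ckpt_dir = Path("/tmp/torchtune/llama3_2_1B/lora_single_device/epoch_1")
+
+    # model.safetensors.index.json maps every weight name to its shard file
+    index = json.loads((ckpt_dir / "model.safetensors.index.json").read_text())
+    shards = sorted(set(index["weight_map"].values()))
+
+    missing = [s for s in shards if not (ckpt_dir / s).exists()]
+    print(f"{len(shards)} shards referenced; missing: {missing or 'none'}")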