README.md (4 additions, 4 deletions)
@@ -11,11 +11,11 @@
 ### 📣 Recent updates 📣

 **February 2025**: Multi-node training is officially [open for business in torchtune](https://pytorch.org/torchtune/main/tutorials/multinode.html)! Full finetune on multiple nodes to take advantage of larger batch sizes and models.

-**December 2024**: torchtune now supports **Llama 3.3 70B**! Try it out by following our installation instructions [here](#installation-%EF%B8%8F), then run any of the configs [here](recipes/configs/llama3_3).
+**December 2024**: torchtune now supports **Llama 3.3 70B**! Try it out by following our installation instructions [here](#Installation), then run any of the configs [here](recipes/configs/llama3_3).

 **November 2024**: torchtune has released [v0.4.0](https://github.com/pytorch/torchtune/releases/tag/v0.4.0) which includes stable support for exciting features like activation offloading and multimodal QLoRA

 **November 2024**: torchtune has added [Gemma2](recipes/configs/gemma2) to its models!

 **October 2024**: torchtune added support for Qwen2.5 models - find the configs [here](recipes/configs/qwen2_5/)

-**September 2024**: torchtune has support for **Llama 3.2 11B Vision**, **Llama 3.2 3B**, and **Llama 3.2 1B** models! Try them out by following our installation instructions [here](#installation-%EF%B8%8F), then run any of the text configs [here](recipes/configs/llama3_2) or vision configs [here](recipes/configs/llama3_2_vision).
+**September 2024**: torchtune has support for **Llama 3.2 11B Vision**, **Llama 3.2 3B**, and **Llama 3.2 1B** models! Try them out by following our installation instructions [here](#Installation), then run any of the text configs [here](recipes/configs/llama3_2) or vision configs [here](recipes/configs/llama3_2_vision).
@@ -25,9 +25,9 @@
 torchtune is a PyTorch library for easily authoring, post-training, and experimenting with LLMs. It provides:

-- Hackable training recipes for SFT, knowledge distillation, DPO, PPO, GRPO, and quantization-aware training
+- Hackable training recipes for SFT, knowledge distillation, RL and RLHF, and quantization-aware training
 - Simple PyTorch implementations of popular LLMs like Llama, Gemma, Mistral, Phi, Qwen, and more
-- Best-in-class memory efficiency, performance improvements, and scaling, utilizing the latest PyTorch APIs
+- OOTB best-in-class memory efficiency, performance improvements, and scaling, utilizing the latest PyTorch APIs
 - YAML configs for easily configuring training, evaluation, quantization or inference recipes
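As a hedged illustration of the "YAML configs" bullet above (not part of the diff): recipes are parameterized by plain YAML files that can be loaded and overridden before launching. The config path and field names below are assumptions for illustration only.

```python
# Illustrative sketch, not part of this PR: torchtune recipe settings live in YAML
# files that a tool like OmegaConf can load and override. The config path and the
# exact field names here are assumed, not guaranteed to match the repo.
from omegaconf import OmegaConf

cfg = OmegaConf.load("recipes/configs/llama3_2/3B_lora_single_device.yaml")  # hypothetical path
cfg.batch_size = 4   # override a hyperparameter before running a recipe
cfg.epochs = 1
print(OmegaConf.to_yaml(cfg))  # inspect the fully resolved settings
```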
docs/source/tutorials/e2e_flow.rst (9 additions, 8 deletions)
@@ -29,7 +29,7 @@ Finetune your model
 -------------------

 First, let's download a model using the tune CLI. The following command will download the `Llama3.2 3B Instruct <https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/>`_
-model from the Hugging Face Hub and save it to the local filesystem. Hugging Face uploaded the original
+model from the Hugging Face Hub and save it the local filesystem. Hugging Face uploaded the original
 weights (``consolidated.00.pth``) and the weights compatible with the `from_pretrained() <https://huggingface.co/docs/huggingface_hub/main/en/guides/integrations#frompretrained>`_ API (``*.safetensors``).
 We don't need both so we'll ignore the original weights when downloading.
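As a hedged aside (not part of the diff): the download step above uses the ``tune download`` CLI; a rough Python equivalent is to fetch the same files with ``huggingface_hub`` while skipping the original consolidated weights. The local directory below is an example.

```python
# Rough Python equivalent of the download step described above (the tutorial itself
# uses the `tune download` CLI). local_dir is an example; gated repos need an HF token.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="meta-llama/Llama-3.2-3B-Instruct",
    local_dir="/tmp/Llama-3.2-3B-Instruct",
    ignore_patterns=["original/consolidated.00.pth"],  # skip the original weights; *.safetensors is enough
)
```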
@@ -168,15 +168,15 @@ There are 3 types of folders:
 Let's understand the files:

 - ``adapter_model.safetensors`` and ``adapter_model.pt`` are your LoRA trained adapter weights. We save a duplicated .pt version of it to facilitate resuming from checkpoint.
-- ``model-{}-of-{}.safetensors`` are your trained full model weights (not adapters). When LoRA finetuning, these are only present if we set ``save_adapter_weights_only=False``. In that case, we merge the base model with trained adapters, making inference easier.
+- ``model-{}-of-{}.safetensors`` are your trained full model weights (not adapters). When LoRA finetuning, these are only present if we set ``save_adapter_weights_only=False``. In that case, we merge the merged base model with trained adapters, making inference easier.
 - ``adapter_config.json`` is used by Huggingface PEFT when loading an adapter (more on that later);
 - ``model.safetensors.index.json`` is used by Hugging Face ``from_pretrained()`` when loading the model weights (more on that later)
-- All other files were originally in the checkpoint_dir. They are automatically copied during training. Files over 100MiB and ending in .safetensors, .pth, .pt, .bin are ignored, making it lightweight.
+- All other files were originally in the checkpoint_dir. They are automatically copied during training. Files over 100MiB and ending on .safetensors, .pth, .pt, .bin are ignored, making it lightweight.
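As a hedged illustration of the adapter files described in the list above (not part of the diff): Hugging Face PEFT attaches a saved adapter to a base model by reading ``adapter_config.json`` from the output directory. The checkpoint path below is hypothetical.

```python
# Illustrative sketch, not part of this PR: PEFT reads adapter_config.json from the
# output directory and loads adapter_model.safetensors on top of a base model.
# The checkpoint directory path is a hypothetical example.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
model = PeftModel.from_pretrained(base, "/tmp/finetune_output/epoch_0")  # dir containing adapter_config.json
```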
 Evaluate your model
 -------------------

-We've fine-tuned a model. But how well does this model really do? Let's determine this through structured evaluation and playing with it.
+We've fine-tuned a model. But how well does this model really do? Let's determine this through structured evaluation and playing around with it.

 .. _eval_harness_label:
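A hedged sketch of the "structured evaluation" step (not part of the diff): the tutorial drives EleutherAI's eval harness through a torchtune recipe; the snippet below calls the harness directly as a rough stand-in, with an illustrative checkpoint path and task name.

```python
# Rough stand-in for the tutorial's eval-harness step (the tutorial itself runs a
# torchtune recipe via the tune CLI). Checkpoint path and task choice are examples.
import lm_eval
from lm_eval.models.huggingface import HFLM

lm = HFLM(pretrained="/tmp/finetune_output/epoch_0")   # merged *.safetensors checkpoint
results = lm_eval.simple_evaluate(model=lm, tasks=["truthfulqa_mc2"])
print(results["results"])                              # per-task metrics
```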
@@ -364,8 +364,9 @@ to those in the previously-linked table.
 Use your model in the wild
 --------------------------

-Let's say we're happy with how our model is performing at this point - we want to do something with it! Productionize it for serving, publish on the Hugging Face Hub, etc.
-Since we handle checkpoint conversion, you can directly work with standard formats.
+Let's say we're happy with how our model is performing at this point - we want to do something with it! Productionize for serving, publish on the Hugging Face Hub, etc.
+As we mentioned above, one of the benefits of handling of the checkpoint conversion is that you can directly work with standard formats. This helps
+with interoperability with other libraries since torchtune doesn't add yet another format to the mix.

 Use with Hugging Face ``from_pretrained()``
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
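A hedged illustration of the ``from_pretrained()`` path this subsection covers (not part of the diff): because the merged weights and ``model.safetensors.index.json`` follow the standard Hugging Face layout, the output directory can be loaded directly. The directory path is hypothetical.

```python
# Illustrative sketch, not part of this PR: loading the merged torchtune output
# directory with the standard Hugging Face API. The directory path is hypothetical.
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt_dir = "/tmp/finetune_output/epoch_0"
tokenizer = AutoTokenizer.from_pretrained(ckpt_dir)
model = AutoModelForCausalLM.from_pretrained(ckpt_dir)  # reads model.safetensors.index.json
print(model.config.model_type)
```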
@@ -438,8 +439,8 @@ Use with vLLM
 `vLLM <https://docs.vllm.ai/en/latest/>`_ is a fast and easy-to-use library for LLM inference and serving. They include a lot of awesome features like
 state-of-the-art serving throughput, continuous batching of incoming requests, quantization, and speculative decoding.

-The library will load any .safetensors file. Since we already merged the full model weights and adapter weights, we can safely delete the
-adapter weights (or move them) so that vLLM doesn't get confused by those files.
+The library will load any .safetensors file. Since here we mixed both the full model weights and adapter weights, we have to delete the
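To round out the vLLM passage above with a hedged, illustrative snippet (not part of the diff): once the adapter files are removed or moved aside, the merged checkpoint directory can be queried with vLLM's offline API. The path, prompt, and sampling settings are examples.

```python
# Illustrative sketch, not part of this PR: offline generation with vLLM against the
# merged checkpoint directory. Path, prompt, and sampling settings are examples.
from vllm import LLM, SamplingParams

llm = LLM(model="/tmp/finetune_output/epoch_0")          # loads the *.safetensors shards
params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Tell me a joke about PyTorch."], params)
print(outputs[0].outputs[0].text)
```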