
Conversation

@ebsmothers
Contributor

Long-overdue updates to our README. Changes include:

1. Updating memory and tokens/sec numbers to report Llama3.1 instead of Llama2.

2. Providing an (IMO) clearer version of our table of supported recipes.

3. Generally reorganizing things. Now the flow is:

  • Intro
    • Models
    • Recipes
    • Memory efficiency and perf
  • Install
  • Getting Started
    • Downloading models
    • Running recipes
    • Modifying configs
  • Llama3 and 3.1
  • Community
    • Community Contributions
  • Acknowledgments
  • License
4. Updating the intro bullet points slightly for clarity; now the first three more explicitly outline the three tables that follow.
5. Removing the design principles section.

A bunch of other small/cosmetic changes, but these are the main ones.


pytorch-bot bot commented Sep 24, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/1664

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit b7507ea with merge base b4fea32:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label Sep 24, 2024
README.md Outdated
Alternatively, you can install a nightly build of torchtune to gain access to the latest features not yet available in the stable release.

```bash
pip install --pre torchtune --extra-index-url https://download.pytorch.org/whl/nightly/cpu
```
Contributor

Should we only recommend torchtune nightlies together with torch/torchao/torchvision nightlies? In other words, is it possible that someone will install stable torchtune with nightly torch and have issues?

Contributor Author

Yeah this is a good point. Will update

Contributor Author

FWIW we can support torchtune nightlies + PyTorch stable, but we probably won't mention it since it just complicates things.
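
A minimal sketch of the matching-nightlies install being discussed, assuming the CPU nightly index (a CUDA index URL would be used instead for GPU builds):

```bash
# Install the torch/torchvision/torchao nightlies and the torchtune nightly
# from the same nightly index so the versions stay in sync.
pip install --pre torch torchvision torchao --index-url https://download.pytorch.org/whl/nightly/cpu
pip install --pre torchtune --extra-index-url https://download.pytorch.org/whl/nightly/cpu
```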

 

### Downloading a model

Contributor

"Downloading a model" recommends Llama 3 instead of 3.1.
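
For reference, a hedged sketch of a 3.1 version of the 8B download command, mirroring the 70B command quoted below (the exact repo id and ignore pattern are assumptions):

```bash
# Download Llama 3.1 8B Instruct, skipping the original-format checkpoint
tune download meta-llama/Meta-Llama-3.1-8B-Instruct \
  --output-dir /tmp/Meta-Llama-3.1-8B-Instruct \
  --ignore-patterns "original/consolidated.00.pth"
```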

README.md Outdated
## Design Principles
---

## Llama3 and Llama3.1
Contributor

Why Llama 3? I think it's fine to just use 3.1. This may become stable soon, so maybe keeping it just as "Llama" is better.

Contributor Author

Yeah, I will drop 3 but probably keep 3.1 in, just to be explicit (otherwise it's not clear what we're referring to). One caveat: the tutorial in the docs is still for Llama3, not 3.1, so strictly speaking it's not 100% correct, but maybe that's a minor point.

README.md Outdated

## Llama3 and Llama3.1

torchtune supports fine-tuning for the Llama3/Llama3.1 8B, 70B, and 405B size models. You can fine-tune the 8B model with LoRA, QLoRA and full fine-tunes on one or more GPUs. You can also fine-tune the 70B model with QLoRA on a single device or LoRA and full-finetunes on multiple devices. Finally, you can fine-tune the 405B model on a single node with QLoRA. For all the details, take a look at our [tutorial](https://pytorch.org/torchtune/main/tutorials/llama3.html).
Contributor

IMO, less is more. No need to spell out 8B/70B/405B across three different lines. Something like this should be good:

"torchtune supports fine-tuning for the Llama3/Llama3.1 8B, 70B, and 405B size models with LoRA, QLoRA and full fine-tunes on one or more GPUs, depending on the model size. For all the details, take a look at our tutorial."

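For illustration, a single-device run under this condensed framing might look like the sketch below (the config name is an assumption based on torchtune's llama3_1 config directory):

```bash
# LoRA fine-tune Llama 3.1 8B on a single GPU
tune run lora_finetune_single_device --config llama3_1/8B_lora_single_device
```
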
@RdoubleA
Collaborator

I think we absolutely need a section in Getting Started on running on a custom dataset, maybe even listing out the dataset types we support and pointing to the docs for more details.

I don't see any mention of multimodal except in the install section (for why we depend on torchvision); we should be highlighting this somewhere.

README.md Outdated
LoRA 70B

Note that the download command for the Meta-Llama3 70B model slightly differs from download commands for the 8B models. This is because we use the HuggingFace [safetensor](https://huggingface.co/docs/safetensors/en/index) model format to load the model. To download the 70B model, run
Contributor

I don't think this warning is really necessary. But we should add a link to our models API reference, which covers downloading every model, not only Llama. IMO, Llama here should just be an example, not a full tutorial for downloading every Llama model.

README.md Outdated

Note that the download command for the Meta-Llama3 70B model slightly differs from download commands for the 8B models. This is because we use the HuggingFace [safetensor](https://huggingface.co/docs/safetensors/en/index) model format to load the model. To download the 70B model, run
```bash
tune download meta-llama/Meta-Llama-3.1-70b --hf-token <> --output-dir /tmp/Meta-Llama-3.1-70b --ignore-patterns "original/consolidated*"
```
Contributor

Not sure I fully follow why for Llama 8B we just share `tune run`, but for 70B we also have download instructions here.

README.md Outdated

torchtune is designed to be easy to understand, use and extend.
You can find a full list of all our Llama3 configs [here](recipes/configs/llama3) and Llama3.1 configs [here.](recipes/configs/llama3_1)
Contributor

This is nice, but I don't think we should only talk about Llama. Maybe shout out Llama 3.1 and replace llama3 with just recipes/configs.
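
One option along these lines: lean on `tune ls`, which already enumerates built-in recipes and their configs, rather than per-model config links. A sketch:

```bash
# List all built-in recipes and their example configs
tune ls

# Narrow the output to Llama 3.1 configs
tune ls | grep llama3_1
```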


## Community Contributions
### Community Contributions
Contributor

Are we keeping this? I remember we had a discussion about it a while ago.

Contributor Author

Yeah I planned to keep it for now

@ebsmothers
Contributor Author

> I think we absolutely need a section in Getting Started on running on a custom dataset, maybe even listing out the dataset types we support and pointing to the docs for more details.
>
> I don't see any mention of multimodal except in the install section (for why we depend on torchvision); we should be highlighting this somewhere.

I would punt that one to you and/or @pbontrager. The goal of these changes is just to get us from six months behind to approximately present-day. And IMO custom datasets can get hairy quickly; I think the README should be simple and to-the-point, leveraging our live docs as a reference. E.g. even axolotl's README basically only has two sentences on datasets and just points to their docs.
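
For context, even a two-sentence dataset pointer could carry one hedged override example, along these lines (my_org/my_dataset is hypothetical, and the dotted-key syntax assumes torchtune's command-line config overrides):

```bash
# Point the config's default dataset at a custom Hugging Face dataset
# via a command-line config override (my_org/my_dataset is hypothetical)
tune run lora_finetune_single_device --config llama3_1/8B_lora_single_device \
  dataset.source=my_org/my_dataset
```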

| 8 x A100 | LoRA | Llama2-70B | Batch Size = 4, Seq Length = 4096 | 26.4 GB | 3384 |
| 8 x A100 | Full Finetune * | Llama2-70B | Batch Size = 4, Seq Length = 4096 | 70.4 GB | 2032 |
| Fine-tuning method | Devices | Recipe | Example config(s) |
|:-:|:-:|:-:|:-:|
Contributor

I think it's worth adding another column with [speed optimized, memory optimized], or it may not be clear to the user why examples are duplicated and the numbers are different.

Contributor

Probably worth also adding bsz and max_seq_len.

Contributor Author

On the second point, I followed your advice and used fixed bsz and max_seq_len for all examples; I call this out in the note just above the table. I left out the speed- vs. memory-optimized column because it's really only applicable to about 50% of the rows, so it might be redundant. I tried to instead include the hardware type to indicate which rows can be run in a more memory-constrained environment.
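
As a reading aid, any table row could then be reproduced by overriding those fixed settings, roughly as in this sketch (batch_size and tokenizer.max_seq_len are assumed config keys):

```bash
# Re-run a benchmark row with the fixed batch size and sequence length
tune run lora_finetune_single_device --config llama3_1/8B_lora_single_device \
  batch_size=2 tokenizer.max_seq_len=2048
```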

README.md Outdated
To get started with fine-tuning your first LLM with torchtune, see our tutorial on [fine-tuning Llama2 7B](https://pytorch.org/torchtune/main/tutorials/first_finetune_tutorial.html). Our [end-to-end workflow](https://pytorch.org/torchtune/main/tutorials/e2e_flow.html) tutorial will show you how to evaluate, quantize and run inference with this model. The rest of this section will provide a quick overview of these steps with Llama2.
To get started with torchtune, see our [Fine-Tune Your First LLM Tutorial](https://pytorch.org/torchtune/main/tutorials/first_finetune_tutorial.html). Our [end-to-end workflow](https://pytorch.org/torchtune/main/tutorials/e2e_flow.html) tutorial will show you how to evaluate, quantize and run inference with this model. The rest of this section will provide a quick overview of these steps with Llama3.

If you have a more custom workflow or need additional information on torchtune components and recipes, please check out our documentation page
Contributor Author

TODO: update

@joecummings (Member) left a comment

Controversial, but I would be in favor of dropping the full section on Llama 3 and Llama 3.1. We already say what models we support and can highlight new model updates at the top of the README in recent updates section

@joecummings (Member) left a comment

meh

@codecov-commenter

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 69.13%. Comparing base (50b24e5) to head (cc4dbff).
Report is 7 commits behind head on main.

Additional details and impacted files
```
@@            Coverage Diff             @@
##             main    #1664      +/-   ##
==========================================
- Coverage   71.11%   69.13%   -1.98%     
==========================================
  Files         297      298       +1     
  Lines       15120    15165      +45     
==========================================
- Hits        10752    10485     -267     
- Misses       4368     4680     +312     
```
| Flag | Coverage Δ |
|---|---|
| | 69.13% <ø> (-1.98%) ⬇️ |

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.

@ebsmothers ebsmothers merged commit 30b8519 into meta-pytorch:main Sep 24, 2024
@ebsmothers ebsmothers deleted the readme-updates branch September 24, 2024 21:18
