Simplify README and prominently display recipes #2349
Conversation
🔗 Helpful Links: 🧪 see artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/2349.
✅ No failures as of commit 1eb4ad7 with merge base a965fb0.
Codecov Report: all modified and coverable lines are covered by tests ✅

|          |   main |  #2349 |    +/- |
|----------|-------:|-------:|-------:|
| Coverage | 65.98% | 66.12% | +0.14% |
| Files    |    366 |    366 |        |
| Lines    |  21685 |  21771 |    +86 |
| Hits     |  14308 |  14397 |    +89 |
| Misses   |   7377 |   7374 |     -3 |

View full report in Codecov by Sentry.
| Type of Weight Update | 1 Device | >1 Device | >1 Node |
|-----------------------|:--------:|:---------:|:-------:|
| Full | ✅ | ✅ | ✅ |
| [LoRA/QLoRA](https://pytorch.org/torchtune/stable/recipes/lora_finetune_single_device.html) | ✅ | ✅ | ❌ |
wow no love for DoRA
Nope, basically the same as LoRA / QLoRA and we say we support it in our actual docs.
| Full | ❌ | ❌ | ❌ |
| LoRA/QLoRA | ✅ | ✅ | ❌ |
Related to my other comment -- it does make me a little sad that we don't have any links to these. We really need to get at least some minimal docs page for each of our recipes. Do we have an issue tracking that?
We really need to do this, wow.
### Models
Personally I don't really like the decreased emphasis on models. My assumption is that most people are gonna want to see pretty quickly what we support, and this buries it a bit too much. (It's fine to move it down, but idk why we need to take it out of table format)
I'd vote for something in between. Showing each model family we support and pointing to the latest version, but not as verbose as the current table.
I went with something in-between. Happy?
Co-authored-by: salman <[email protected]>
Approval pls.
README.md (Outdated)
Example: ``tune run knowledge_distillation_distributed --config qwen2/1.5B_to_0.5B_KD_lora_distributed``
You can also run e.g. ``tune ls knowledge_distillation_distributed`` for a full list of available configs.
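For readers skimming the thread, the end-to-end flow around that example looks roughly like the sketch below. Only the recipe and config names come from the snippet above; the model IDs, output directories, and GPU count are illustrative assumptions.

```bash
# Fetch teacher and student weights from the Hugging Face Hub
# (these Qwen2 model IDs are illustrative assumptions, not taken from this PR).
tune download Qwen/Qwen2-1.5B-Instruct --output-dir /tmp/Qwen2-1.5B-Instruct
tune download Qwen/Qwen2-0.5B-Instruct --output-dir /tmp/Qwen2-0.5B-Instruct

# Optionally copy the packaged config locally to tweak hyperparameters.
tune cp qwen2/1.5B_to_0.5B_KD_lora_distributed ./my_kd_config.yaml

# Launch the distributed knowledge-distillation recipe, e.g. on 2 GPUs.
tune run --nproc_per_node 2 knowledge_distillation_distributed \
    --config qwen2/1.5B_to_0.5B_KD_lora_distributed
```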
**Reinforcement Learning + Reinforcement Learning from Human Feedback (RLHF)**
Suggested change: replace the heading **Reinforcement Learning + Reinforcement Learning from Human Feedback (RLHF)** with **Preference Learning / Reinforcement Learning**.
I agree this is the right way to think about it, but how many people would recognize "Preference Learning" over the term RLHF? I feel like RLHF is much more recognized, for better or worse.
yeah it's your call
| Method | Type of Weight Update | 1 Device | >1 Device | >1 Node |
|------------------------------|-----------------------|:--------:|:---------:|:-------:|
| [DPO](https://pytorch.org/torchtune/stable/recipes/dpo.html) | Full | ❌ | ✅ | ❌ |
| | LoRA/QLoRA | ✅ | ✅ | ❌ |
| PPO | Full | ✅ | ❌ | ❌ |
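As a concrete illustration of the single-device rows above, a hedged usage sketch; the config names are assumptions based on torchtune's published recipe list, not part of this PR:

```bash
# LoRA DPO on a single device (config name is an assumption).
tune run lora_dpo_single_device --config llama2/7B_lora_dpo_single_device

# Full-weight PPO on a single device (config name is an assumption).
tune run ppo_full_finetune_single_device --config mistral/7B_full_ppo_low_memory
```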
Suggested change: rename the **PPO** row to **PPO (RLHF)**. Up to you.
Just curious - why make the distinction here?
Our upcoming GRPO implementation != RLHF. GRPO is just another optimization algorithm; the R1-style recipe we're going to land uses verifiable (rather than human) rewards, so it's just RL. You could use PPO with verifiable rewards too, and that also wouldn't be RLHF.
Ahh I see, I clearly have not looked closely enough at our PPO recipe.
SalmanMohammadi left a comment:
last nits
pbontrager left a comment:
LGTM! Thanks
Why? Our README was too long and cluttered and did not focus on the strengths of our library.
Changes: