Conversation

**@joecummings** (Member) commented Feb 5, 2025

Why? Our README was too long and cluttered and did not focus on the strengths of our library.

Changes:

  • Rework recipes section to clearly define the sections of the post-training lifecycle
  • Move new recipes section to the front
  • Remove old models from the model table and instead point people to a full list of models if they want to see them
  • Hide large commands
  • Move citations before license (b/c it's more relevant)
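For reference, the "hide large commands" item is typically done in a GitHub README with the HTML `<details>`/`<summary>` element, which collapses long content behind a toggle. A minimal sketch (reusing a command that appears later in this PR):

````markdown
<details>
<summary>Show full command</summary>

```bash
tune run knowledge_distillation_distributed --config qwen2/1.5B_to_0.5B_KD_lora_distributed
```

</details>
````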

**@pytorch-bot** (bot) commented Feb 5, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/2349

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 1eb4ad7 with merge base a965fb0 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Feb 5, 2025
**@codecov-commenter** commented Feb 6, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 66.12%. Comparing base (7f3e70e) to head (c53fbd4).
Report is 2 commits behind head on main.

Additional details and impacted files
```
@@            Coverage Diff             @@
##             main    #2349      +/-   ##
==========================================
+ Coverage   65.98%   66.12%   +0.14%
==========================================
  Files         366      366
  Lines       21685    21771      +86
==========================================
+ Hits        14308    14397      +89
+ Misses       7377     7374       -3
```


@joecummings joecummings changed the title Simplify recipes section of readme Rework recipes section of README and simplify models ref Feb 6, 2025
| Type of Weight Update | 1 Device | >1 Device | >1 Node |
|-----------------------|:--------:|:---------:|:-------:|
| Full ||||
| [LoRA/QLoRA](https://pytorch.org/torchtune/stable/recipes/lora_finetune_single_device.html) ||||
**Contributor:** wow no love for DoRA

**Member Author:** Nope, it's basically the same as LoRA/QLoRA, and we say we support it in our actual docs.
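For context on why LoRA and QLoRA share a row: both add the same trainable low-rank update to a frozen base weight (QLoRA just quantizes the base). A minimal NumPy sketch of the idea — not torchtune's implementation, all names here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r = 16, 8, 2

W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection, zero-init
alpha = 16.0                            # scaling hyperparameter

def lora_forward(x):
    # base path plus the low-rank update B @ A, scaled by alpha / r
    return x @ W.T + (x @ A.T @ B.T) * (alpha / r)

x = rng.normal(size=(4, d_in))
# with B zero-initialized, the adapter starts as an exact no-op
assert np.allclose(lora_forward(x), x @ W.T)
```

Only `A` and `B` (rank `r` matrices) are trained, which is why the device-support story is essentially identical across the LoRA-family variants.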

Comment on lines +55 to +56
| Full ||||
| LoRA/QLoRA ||||
**Contributor:** Related to my other comment -- it does make me a little sad that we don't have any links to these. We really need to get at least a minimal docs page for each of our recipes. Do we have an issue tracking that?

**Member Author:** We really need to do this, wow.

### Models
**Contributor:** Personally, I don't really like the decreased emphasis on models. My assumption is that most people are going to want to see pretty quickly what we support, and this buries it a bit too much. (It's fine to move it down, but I don't know why we need to take it out of table format.)

**Contributor:** I'd vote for something in between: show each model family we support and point to the latest version, but not as verbose as the current table.

**Member Author:** I went with something in between. Happy?

@joecummings joecummings changed the title Rework recipes section of README and simplify models ref Simplify README and prominently display recipes Feb 12, 2025
**@SalmanMohammadi** (Contributor): sorry

joecummings and others added 3 commits February 12, 2025 12:51
Co-authored-by: salman <[email protected]>
Co-authored-by: salman <[email protected]>
Co-authored-by: salman <[email protected]>
**@joecummings** (Member Author):

> sorry

Approval pls.

README.md Outdated
Example: ``tune run knowledge_distillation_distributed --config qwen2/1.5B_to_0.5B_KD_lora_distributed`` <br />
You can also run e.g. ``tune ls knowledge_distillation_distributed`` for a full list of available configs.

**Reinforcement Learning + Reinforcement Learning from Human Feedback (RLHF)**
**Contributor** suggested change:

```diff
-**Reinforcement Learning + Reinforcement Learning from Human Feedback (RLHF)**
+**Preference Learning / Reinforcement Learning**
```
**Member Author:** I agree this is the right way to think about it, but how many people would recognize "Preference Learning" over the term RLHF? I feel like RLHF is much more recognized, for better or worse.

**Contributor:** yeah, it's your call

|------------------------------|-----------------------|:--------:|:---------:|:-------:|
| [DPO](https://pytorch.org/torchtune/stable/recipes/dpo.html) | Full ||||
| | LoRA/QLoRA ||||
| PPO | Full ||||
**Contributor** suggested change:

```diff
-| PPO | Full ||||
+| PPO (RLHF) | Full ||||
```

up to you
**Member Author:** Just curious: why make the distinction here?

**Contributor:** Our upcoming GRPO implementation != RLHF. GRPO is just another optimization algorithm; the R1-style recipe we're going to land uses verifiable (rather than human) rewards, so it's just RL. You could use PPO with verifiable rewards too, and it also wouldn't be RLHF.
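To make the distinction concrete, here is a minimal sketch of a verifiable reward (all names hypothetical, not torchtune's API): the signal comes from a programmatic check rather than a learned human-preference model, so optimizing against it with PPO or GRPO is RL, but not RLHF.

```python
def verifiable_reward(completion: str, target: str) -> float:
    """Return 1.0 iff the final line of the completion matches the target answer.

    No human preference model is involved: the reward is computed by a
    deterministic check, which is what makes this "verifiable" RL.
    """
    answer = completion.strip().splitlines()[-1].strip()
    return 1.0 if answer == target else 0.0

print(verifiable_reward("Some chain of reasoning...\n42", "42"))  # 1.0
print(verifiable_reward("Some chain of reasoning...\n41", "42"))  # 0.0
```

An RLHF setup would instead score completions with a reward model trained on human preference pairs; the optimization algorithm (PPO, GRPO, ...) is orthogonal to where the reward comes from.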

**Member Author:** Ahh, I see. Clearly I have not looked closely enough at our PPO recipe.

**@SalmanMohammadi** (Contributor) left a comment: last nits

**@pbontrager** (Contributor) left a comment: LGTM! Thanks

@joecummings joecummings merged commit f67ccda into meta-pytorch:main Feb 13, 2025
17 checks passed
@joecummings joecummings deleted the simplify-readme branch February 13, 2025 15:42