Simplify README and prominently display recipes #2349
Conversation
🔗 Helpful Links: 🧪 see artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/2349.
✅ No failures as of commit 1eb4ad7 with merge base a965fb0.
Codecov Report: all modified and coverable lines are covered by tests ✅

|          |   main |  #2349 |    +/- |
|----------|-------:|-------:|-------:|
| Coverage | 65.98% | 66.12% | +0.14% |
| Files    |    366 |    366 |        |
| Lines    |  21685 |  21771 |    +86 |
| Hits     |  14308 |  14397 |    +89 |
| Misses   |   7377 |   7374 |     -3 |

View full report in Codecov by Sentry.
| Type of Weight Update | 1 Device | >1 Device | >1 Node |
|-----------------------|:--------:|:---------:|:-------:|
| Full | ✅ | ✅ | ✅ |
| [LoRA/QLoRA](https://pytorch.org/torchtune/stable/recipes/lora_finetune_single_device.html) | ✅ | ✅ | ❌ |
wow no love for DoRA
Nope, basically the same as LoRA / QLoRA and we say we support it in our actual docs.
| Full | ❌ | ❌ | ❌ |
| LoRA/QLoRA | ✅ | ✅ | ❌ |
Related to my other comment -- it does make me a little sad that we don't have any links to these. We really need to get at least some minimal docs page for each of our recipes. Do we have an issue tracking that?
We really need to do this, wow.
### Models
Personally I don't really like the decreased emphasis on models. My assumption is that most people are gonna want to see pretty quickly what we support, and this buries it a bit too much. (It's fine to move it down, but idk why we need to take it out of table format)
I'd vote for something in between. Showing each model family we support and pointing to the latest version, but not as verbose as the current table.
I went with something in-between. Happy?
Co-authored-by: salman <[email protected]>
Approval pls.
README.md (Outdated)
Example: ``tune run knowledge_distillation_distributed --config qwen2/1.5B_to_0.5B_KD_lora_distributed``
You can also run e.g. ``tune ls knowledge_distillation_distributed`` for a full list of available configs.
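For readers skimming the thread, the end-to-end flow around that example looks roughly like the sketch below. Only the recipe and config names come from the snippet above; the model IDs, output directories, and GPU count are illustrative assumptions.

```bash
# Fetch teacher and student weights from the Hugging Face Hub
# (these Qwen2 model IDs are illustrative assumptions, not taken from this PR).
tune download Qwen/Qwen2-1.5B-Instruct --output-dir /tmp/Qwen2-1.5B-Instruct
tune download Qwen/Qwen2-0.5B-Instruct --output-dir /tmp/Qwen2-0.5B-Instruct

# Optionally copy the packaged config locally to tweak hyperparameters.
tune cp qwen2/1.5B_to_0.5B_KD_lora_distributed ./my_kd_config.yaml

# Launch the distributed knowledge-distillation recipe, e.g. on 2 GPUs.
tune run --nproc_per_node 2 knowledge_distillation_distributed \
    --config qwen2/1.5B_to_0.5B_KD_lora_distributed
```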
**Reinforcement Learning + Reinforcement Learning from Human Feedback (RLHF)**
Suggested change: replace the heading **Reinforcement Learning + Reinforcement Learning from Human Feedback (RLHF)** with **Preference Learning / Reinforcement Learning**.
I agree this is the right way to think about it, but how many people would recognize "Preference Learning" over the term RLHF? I feel like RLHF is much more recognized, for better or worse.
yeah it's your call
| Method | Type of Weight Update | 1 Device | >1 Device | >1 Node |
|------------------------------|-----------------------|:--------:|:---------:|:-------:|
| [DPO](https://pytorch.org/torchtune/stable/recipes/dpo.html) | Full | ❌ | ✅ | ❌ |
| | LoRA/QLoRA | ✅ | ✅ | ❌ |
| PPO | Full | ✅ | ❌ | ❌ |
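As a concrete illustration of the single-device rows above, a hedged usage sketch; the config names are assumptions based on torchtune's published recipe list, not part of this PR:

```bash
# LoRA DPO on a single device (config name is an assumption).
tune run lora_dpo_single_device --config llama2/7B_lora_dpo_single_device

# Full-weight PPO on a single device (config name is an assumption).
tune run ppo_full_finetune_single_device --config mistral/7B_full_ppo_low_memory
```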
Suggested change: rename the **PPO** row to **PPO (RLHF)**. Up to you.
Just curious - why make the distinction here?
Our upcoming GRPO implementation != RLHF. GRPO is just another optimization algorithm; the R1-style recipe we're going to land uses verifiable (rather than human) rewards, so it's just RL. You could use PPO with verifiable rewards too, and that also wouldn't be RLHF.
Ahh I see, I clearly have not looked closely enough at our PPO recipe.
SalmanMohammadi left a comment:
last nits
pbontrager left a comment:
LGTM! Thanks
Why? Our README was too long and cluttered and did not focus on the strengths of our library.
Changes: