update teacher checkpointer paths in KD config #2496
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/2496
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 3609dba with merge base dab36d2.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
pbontrager left a comment:
I'm uncomfortable with this PR because it makes a lot of assumptions about where things are saved during fine-tuning that might break in the future. It also makes this recipe dependent on having run another recipe first, which is new. Could you add more context on how common it is to finetune the teacher first?
```diff
 teacher_checkpointer:
   _component_: torchtune.training.FullModelHFCheckpointer
-  checkpoint_dir: /tmp/Meta-Llama-3.1-8B-Instruct/
+  checkpoint_dir: /tmp/torchtune/llama3_1_8B/lora/epoch_0
```
Why is epoch 0 the right default here?
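For context, a rough sketch of what the full `teacher_checkpointer` block might look like with the new path. Only `_component_` and `checkpoint_dir` come from the diff above; the `checkpoint_files` list, `recipe_checkpoint`, `output_dir`, and `model_type` values are illustrative assumptions, not copied from the shipped config.

```yaml
# Sketch only: the teacher checkpointer now points at the LoRA-finetuned
# teacher's epoch_0 output instead of the base instruct checkpoint.
# checkpoint_files, recipe_checkpoint, output_dir, and model_type below
# are illustrative assumptions, not values from the actual config.
teacher_checkpointer:
  _component_: torchtune.training.FullModelHFCheckpointer
  checkpoint_dir: /tmp/torchtune/llama3_1_8B/lora/epoch_0
  checkpoint_files: [
    model-00001-of-00004.safetensors,
    model-00002-of-00004.safetensors,
    model-00003-of-00004.safetensors,
    model-00004-of-00004.safetensors
  ]
  recipe_checkpoint: null
  output_dir: ${output_dir}
  model_type: LLAMA3
```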
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

```
@@           Coverage Diff           @@
##             main    #2496   +/-   ##
=======================================
  Coverage   23.15%   23.15%
=======================================
  Files         379      379
  Lines       22838    22838
=======================================
  Hits         5289     5289
  Misses      17549    17549
```

☔ View full report in Codecov by Sentry.
@pbontrager these are fair comments. Tbh I'm not sure what the right thing to do here is, but would point to GRPO where we do something pretty similar (though ofc that is just in dev for now). The results are better when the teacher model is finetuned first, as discussed in the blog post. So based on that I claim this is the right thing to do, but understand your point around the usage of …
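To spell out the assumption behind the new default (not part of the PR itself): the teacher is first finetuned with the Llama 3.1 8B LoRA recipe, whose checkpoints are assumed to land in per-epoch subdirectories under its `output_dir`, and the KD config then reads the first of those. A minimal sketch, with the directory layout assumed rather than taken from the actual configs:

```yaml
# Illustrative sketch of how the two configs are assumed to line up.

# In the teacher's LoRA finetune config: checkpoints are assumed to be
# written under output_dir, one subdirectory per epoch (epoch_0, epoch_1, ...).
output_dir: /tmp/torchtune/llama3_1_8B/lora

# In the KD config: the teacher checkpointer reads the first epoch's output.
teacher_checkpointer:
  checkpoint_dir: /tmp/torchtune/llama3_1_8B/lora/epoch_0
```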
Fix checkpoint path in one of our KD configs