update teacher checkpointer paths in KD config #2496
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/2496
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 3609dba with merge base dab36d2.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
pbontrager left a comment:
I'm uncomfortable with this PR because it makes a lot of assumptions about where things are saved during fine-tuning that might break in the future. It also makes this recipe dependent on having run another recipe first, which is new. Could you add more context on how common it is to finetune the teacher first?
```diff
 teacher_checkpointer:
   _component_: torchtune.training.FullModelHFCheckpointer
-  checkpoint_dir: /tmp/Meta-Llama-3.1-8B-Instruct/
+  checkpoint_dir: /tmp/torchtune/llama3_1_8B/lora/epoch_0
```
Why is epoch 0 the right default here?
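For context, a rough sketch of what the full `teacher_checkpointer` block might look like with the new path. Only `_component_` and `checkpoint_dir` come from the diff above; the `checkpoint_files` list, `recipe_checkpoint`, `output_dir`, and `model_type` values are illustrative assumptions, not copied from the shipped config.

```yaml
# Sketch only: the teacher checkpointer now points at the LoRA-finetuned
# teacher's epoch_0 output instead of the base instruct checkpoint.
# checkpoint_files, recipe_checkpoint, output_dir, and model_type below
# are illustrative assumptions, not values from the actual config.
teacher_checkpointer:
  _component_: torchtune.training.FullModelHFCheckpointer
  checkpoint_dir: /tmp/torchtune/llama3_1_8B/lora/epoch_0
  checkpoint_files: [
    model-00001-of-00004.safetensors,
    model-00002-of-00004.safetensors,
    model-00003-of-00004.safetensors,
    model-00004-of-00004.safetensors
  ]
  recipe_checkpoint: null
  output_dir: ${output_dir}
  model_type: LLAMA3
```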
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

```
@@           Coverage Diff           @@
##             main    #2496   +/-   ##
=======================================
  Coverage   23.15%   23.15%
=======================================
  Files         379      379
  Lines       22838    22838
=======================================
  Hits         5289     5289
  Misses      17549    17549
```

☔ View full report in Codecov by Sentry.
@pbontrager these are fair comments. Tbh I'm not sure what the right thing to do here is, but would point to GRPO where we do something pretty similar (though ofc that is just in dev for now). The results are better when the teacher model is finetuned first, as discussed in the blog post. So based on that I claim this is the right thing to do, but understand your point around the usage of …
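To spell out the assumption behind the new default (not part of the PR itself): the teacher is first finetuned with the Llama 3.1 8B LoRA recipe, whose checkpoints are assumed to land in per-epoch subdirectories under its `output_dir`, and the KD config then reads the first of those. A minimal sketch, with the directory layout assumed rather than taken from the actual configs:

```yaml
# Illustrative sketch of how the two configs are assumed to line up.

# In the teacher's LoRA finetune config: checkpoints are assumed to be
# written under output_dir, one subdirectory per epoch (epoch_0, epoch_1, ...).
output_dir: /tmp/torchtune/llama3_1_8B/lora

# In the KD config: the teacher checkpointer reads the first epoch's output.
teacher_checkpointer:
  checkpoint_dir: /tmp/torchtune/llama3_1_8B/lora/epoch_0
```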
Fix checkpoint path in one of our KD configs