readme updates for full DPO distributed recipe #2363

ebsmothers · 2025-02-07T21:56:23Z

Update our README to include the DPO full-finetune distributed recipe now that #2275 has landed.

pytorch-bot · 2025-02-07T21:56:27Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/2363

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit a0b3a51 with merge base fb52557:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

SalmanMohammadi · 2025-02-08T14:13:23Z

README.md

 | Quantization-Aware Training and LoRA Finetuning | ❌ | ✅ | ❌ | [qat_lora_finetune_distributed](recipes/qat_lora_finetune_distributed.py)| [Llama3 8B QAT](recipes/configs/llama3/8B_qat_lora.yaml)
-| Direct Preference Optimization | ✅ | ✅ | ❌ | [lora_dpo_single_device](recipes/lora_dpo_single_device.py) <br> [lora_dpo_distributed](recipes/lora_dpo_distributed.py) | [Llama2 7B single-device](recipes/configs/llama2/7B_lora_dpo_single_device.yaml) <br> [Llama2 7B distributed](recipes/configs/llama2/7B_lora_dpo.yaml)
+| Direct Preference Optimization: Full Finetuning | ❌ | ✅ | ❌ | [full_dpo_distributed](recipes/full_dpo_distributed.py) | [Llama3.1 8B DPO](recipes/configs/llama3_1/8B_full_dpo.yaml)
+| Direct Preference Optimization with LoRA | ✅ | ✅ | ❌ | [lora_dpo_single_device](recipes/lora_dpo_single_device.py) <br> [lora_dpo_distributed](recipes/lora_dpo_distributed.py) | [Llama2 7B single-device](recipes/configs/llama2/7B_lora_dpo_single_device.yaml) <br> [Llama2 7B distributed](recipes/configs/llama2/7B_lora_dpo.yaml)


Suggested change

-| Direct Preference Optimization with LoRA | ✅ | ✅ | ❌ | [lora_dpo_single_device](recipes/lora_dpo_single_device.py) <br> [lora_dpo_distributed](recipes/lora_dpo_distributed.py) | [Llama2 7B single-device](recipes/configs/llama2/7B_lora_dpo_single_device.yaml) <br> [Llama2 7B distributed](recipes/configs/llama2/7B_lora_dpo.yaml)
+| LoRA Direct Preference Optimization | ✅ | ✅ | ❌ | [lora_dpo_single_device](recipes/lora_dpo_single_device.py) <br> [lora_dpo_distributed](recipes/lora_dpo_distributed.py) | [Llama2 7B single-device](recipes/configs/llama2/7B_lora_dpo_single_device.yaml) <br> [Llama2 7B distributed](recipes/configs/llama2/7B_lora_dpo.yaml)

Also, we have more up-to-date 3.1 8B configs we could point to, if you'd like :)

readme updates for full DPO distributed recipe

bc32679

ebsmothers requested a review from joecummings February 7, 2025 21:56

facebook-github-bot added the CLA Signed label Feb 7, 2025

ebsmothers requested review from acisseJZhong and felipemello1 February 7, 2025 21:56

SalmanMohammadi reviewed Feb 8, 2025

SalmanMohammadi approved these changes Feb 8, 2025

felipemello1 approved these changes Feb 8, 2025

comments

a0b3a51

ebsmothers merged commit b3964af into meta-pytorch:main Feb 10, 2025
17 checks passed
