1 parent c53fbd4, commit 1eb4ad7
README.md
@@ -56,7 +56,7 @@ You can also run e.g. ``tune ls lora_finetune_single_device`` for a full list of
 Example: ``tune run knowledge_distillation_distributed --config qwen2/1.5B_to_0.5B_KD_lora_distributed`` <br />
 You can also run e.g. ``tune ls knowledge_distillation_distributed`` for a full list of available configs.

-#### Reinforcement Learning + Reinforcement Learning from Human Feedback (RLHF)
+#### Reinforcement Learning / Reinforcement Learning from Human Feedback (RLHF)

 | Method | Type of Weight Update | 1 Device | >1 Device | >1 Node |
 |------------------------------|-----------------------|:--------:|:---------:|:-------:|