
Commit 1da83d6

RdoubleA and kartikayk authored
Add documentation on adding new recipe params (#311)
Co-authored-by: Kartikay Khandelwal <[email protected]>
1 parent cfedcae commit 1da83d6

File tree

4 files changed: +182 -22 lines changed

docs/source/api_ref_utilities.rst

Lines changed: 5 additions & 5 deletions
@@ -7,7 +7,7 @@ torchtune.utils
 .. _dist_label:
 
 Distributed
-------------
+-----------
 
 .. autosummary::
    :toctree: generated/
@@ -19,8 +19,8 @@ Distributed
 
 .. _mp_label:
 
-Mixed Precsion
---------------
+Mixed Precision
+---------------
 
 .. autosummary::
    :toctree: generated/
@@ -58,7 +58,7 @@ Metric Logging
    metric_logging.DiskLogger
 
 Data
------
+----
 
 .. autosummary::
    :toctree: generated/
@@ -68,7 +68,7 @@ Data
    collate.padded_collate
 
 Checkpoint saving & loading
-----------------------------
+---------------------------
 
 .. autosummary::
    :toctree: generated/

docs/source/examples/configs.rst

Lines changed: 163 additions & 0 deletions
.. _config_tutorial_label:

=================
Configs Deep-Dive
=================

This tutorial will guide you through writing configs for running recipes and
designing params for custom recipes.

.. grid:: 2

    .. grid-item-card:: :octicon:`mortar-board;1em;` What you will learn

        * How to write a YAML config and run a recipe with it
        * How to create a params dataclass for a custom recipe
        * How to effectively use configs, CLI overrides, and dataclasses for running recipes

    .. grid-item-card:: :octicon:`list-unordered;1em;` Prerequisites

        * Be familiar with the :ref:`overview of TorchTune<overview_label>`
        * Make sure to :ref:`install TorchTune<install_label>`
        * Understand the :ref:`fundamentals of recipes<recipe_deepdive>`

Where do parameters live?
-------------------------

There are two primary entry points for you to configure parameters: **configs** and
**CLI overrides**. Configs are YAML files that define all the
parameters needed to run a recipe in a single location. These can be overridden on the
command line for quick changes and experimentation without modifying the config.

If you are planning to make a custom recipe, you will need to become familiar
with the **recipe dataclass**, which collects all of your arguments from the config and
CLI and passes them into the recipe itself. Here, we will discuss all three concepts:
**configs**, **CLI**, and **dataclasses**.

Recipe dataclasses
------------------

Parameters should be organized in a single dataclass that is passed into the recipe.
This serves as a single source of truth for the details of a fine-tuning run that can
be easily validated in code and shared with collaborators for reproducibility.

.. code-block:: python

    from dataclasses import dataclass

    @dataclass
    class FullFinetuneParams:
        # Model
        model: str = ""
        model_checkpoint: str = ""

In the dataclass, all fields should have defaults assigned to them.
If a reasonable value cannot be assigned or it is a required argument,
use the null value for that data type as the default and ensure that it is set
by the user in the :code:`__post_init__` (see :ref:`Parameter Validation<parameter_validation_label>`).
The dataclass should go in the :code:`recipes/params/` folder, and the name of
the file should match the name of the recipe file you are creating.
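For instance, a required string field can use the empty string as its null default, while an
optional integer can default to :code:`None`. A minimal sketch (the :code:`seed` field here is
hypothetical, added only for illustration):

.. code-block:: python

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class FullFinetuneParams:
        # "" is the null value for str; required fields are checked in __post_init__
        model: str = ""
        model_checkpoint: str = ""
        # hypothetical optional field: None is its null default
        seed: Optional[int] = None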
In general, you should expose the minimal number of parameters you need to run and experiment with your recipes.
Exposing an excessive number of parameters will lead to bloated configs, which are more error-prone, harder to read, and harder to manage.
On the other hand, hardcoding all parameters will prevent quick experimentation without a code change. Only parametrize what is needed.

To link the dataclass object with config and CLI parsing,
you can use the :class:`~torchtune.utils.argparse.TuneArgumentParser` object and
funnel the parsed arguments into your dataclass.

.. code-block:: python

    import argparse

    from torchtune import utils

    # `FullFinetuneParams` and `recipe` are assumed to be defined earlier in the recipe file
    if __name__ == "__main__":
        parser = utils.TuneArgumentParser(
            description=FullFinetuneParams.__doc__,
            formatter_class=argparse.RawDescriptionHelpFormatter,
        )
        # Get user-specified args from config and CLI and create params for recipe
        args, _ = parser.parse_known_args()
        args = vars(args)
        params = FullFinetuneParams(**args)

        logger = utils.get_logger("DEBUG")
        logger.info(msg=f"Running finetune_llm.py with parameters {params}")

        recipe(params)

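Conceptually, the funneling step above is just a dictionary splat into the dataclass. A minimal
sketch with assumed values:

.. code-block:: python

    # minimal sketch: config and CLI args arrive as one flat dict (values assumed)
    args = {"model": "llama2_7b", "model_checkpoint": "/tmp/model.pt"}
    params = FullFinetuneParams(**args)  # a key with no matching field raises a TypeError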
.. _parameter_validation_label:

Parameter validation
--------------------

To validate arguments for your dataclass and recipe, use the :code:`__post_init__` method to house any checks and raise any exceptions.

.. code-block:: python

    # assumes `from dataclasses import fields` at the top of the params file
    def __post_init__(self):
        for param in fields(self):
            if getattr(self, param.name) == "":
                raise TypeError(f"{param.name} needs to be specified")

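For example, here is a minimal, self-contained sketch reusing the fields from above. Constructing
the params without a required value fails immediately:

.. code-block:: python

    from dataclasses import dataclass, fields

    @dataclass
    class FullFinetuneParams:
        model: str = ""
        model_checkpoint: str = ""

        def __post_init__(self):
            for param in fields(self):
                if getattr(self, param.name) == "":
                    raise TypeError(f"{param.name} needs to be specified")

    # raises TypeError: model_checkpoint needs to be specified
    FullFinetuneParams(model="llama2_7b")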
Writing configs
---------------

Once you've set up a recipe and its params, you need to create a config to run it.
Configs serve as the primary entry point for running recipes in TorchTune. They are
expected to be YAML files that simply list out values for the parameters you want to define
for a particular run. The config parameters should be a subset of the dataclass parameters;
there should not be any new fields that are not already in the dataclass. Any parameters that
are not specified in the config will take on the default value defined in the dataclass.

.. code-block:: yaml

    dataset: alpaca
    seed: null
    shuffle: True
    model: llama2_7b
    ...

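In TorchTune the config file is parsed for you, but conceptually the YAML simply fills in
dataclass fields. A minimal sketch, assuming the config filename from this tutorial and a
dataclass that defines matching fields:

.. code-block:: python

    import yaml

    # assumed filename; each YAML key must match a dataclass field
    with open("alpaca_llama2_full_finetune.yaml") as f:
        cfg = yaml.safe_load(f)  # e.g. {"dataset": "alpaca", "seed": None, "shuffle": True, ...}
    params = FullFinetuneParams(**cfg)  # fields absent from the YAML keep their dataclass defaults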
Command-line overrides
----------------------

To enable quick experimentation, you can specify override values for parameters in your config
via the :code:`tune` command. These should be specified with the flag :code:`--override k1=v1 k2=v2 ...`

For example, to run the :code:`full_finetune` recipe with custom model and tokenizer directories on a GPU, you can provide overrides:

.. code-block:: bash

    tune full_finetune --config alpaca_llama2_full_finetune --override model_directory=/home/my_model_checkpoint tokenizer_directory=/home/my_tokenizer_checkpoint device=cuda

The order of precedence for these parameter sources, from highest to lowest, is: CLI overrides, config, dataclass defaults.

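For example (hypothetical values), if the dataclass defaults :code:`seed` to null and the config
sets :code:`shuffle: True`, a CLI override wins for any key it names:

.. code-block:: bash

    # CLI (highest precedence) sets shuffle and seed; untouched keys fall back
    # to the config value, then to the dataclass default
    tune full_finetune --config alpaca_llama2_full_finetune --override shuffle=False seed=1234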
Testing configs
---------------

If you plan on contributing your config to the repo, we recommend adding it to the testing suite. TorchTune has tests for every config added to the library, ensuring that it instantiates the dataclass and runs the recipe correctly.

To add your config to this test suite, simply update the dictionary in :code:`recipes/tests/configs/test_configs`.

.. code-block:: python

    config_to_params = {
        os.path.join(ROOT_DIR, "alpaca_llama2_full_finetune.yaml"): FullFinetuneParams,
        ...,
    }

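Conceptually, the test only needs to check that every config file can construct its params
dataclass. A minimal sketch (not the actual test code):

.. code-block:: python

    import yaml

    for config_path, params_cls in config_to_params.items():
        with open(config_path) as f:
            # should not raise: every key must match a field, and __post_init__ checks must pass
            params_cls(**yaml.safe_load(f))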
Linking recipes and configs with :code:`tune`
---------------------------------------------

In order to run your custom recipe and configs with :code:`tune`, you must update the :code:`_RECIPE_LIST`
and :code:`_CONFIG_LISTS` in :code:`recipes/__init__.py`.

.. code-block:: python

    _RECIPE_LIST = ["full_finetune", "lora_finetune", "alpaca_generate", ...]
    _CONFIG_LISTS = {
        "full_finetune": ["alpaca_llama2_full_finetune"],
        "lora_finetune": ["alpaca_llama2_lora_finetune"],
        "alpaca_generate": [],
        "<your_recipe>": ["<your_config>"],
    }

Running your recipe
-------------------

If everything is set up correctly, you should be able to run your recipe just like the existing library recipes using the :code:`tune` command:

.. code-block:: bash

    tune <recipe> --config <config> --override ...
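For example, with the recipe and config names used throughout this tutorial:

.. code-block:: bash

    tune full_finetune --config alpaca_llama2_full_finetune --override device=cuda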

docs/source/examples/recipe_deepdive.rst

Lines changed: 13 additions & 17 deletions
@@ -51,38 +51,28 @@ What Recipes are not?
 
 - **Monolithic Trainers.** A recipe is **not** a monolithic trainer meant to support every possible feature through 100s of flags.
 - **Generalized entry-points.** A recipe is **not** meant to support every possible model architecture or fine-tuning method.
-- **Wrappers around external frameworks.** A recipe is **not** meant to be a wrapper around external frameworks. These are fully written in native-PyTorch using TorchTune building blocks. Dependencies are primarily in the form of additional utilities or interoperability with the surrounding ecosystem (eg: EluetherAI's evaluation harness).
-
-
-Configs
--------
-
-If you're new to TorchTune or to LLMs generally, configs would be the first concept to understand and get familiar with.
-If you're an advanced user writing your own recipes, adding config files will improve your experimentation velocity and
-ability to collaborate on experiments.
-
-- TODO - point to config tutorial after this is landed
+- **Wrappers around external frameworks.** A recipe is **not** meant to be a wrapper around external frameworks. These are fully written in native-PyTorch using TorchTune building blocks. Dependencies are primarily in the form of additional utilities or interoperability with the surrounding ecosystem (eg: EleutherAI's evaluation harness).
 
 
 Recipe Script
 -------------
 
-This is the primary entry point for each recipe and provides the user with control over how the recipe is setup, how models are
+This is the primary entry point for each recipe and provides the user with control over how the recipe is set up, how models are
 trained and how the subsequent checkpoints are used. This includes:
 
 - Setting up of the environment
 - Parsing and validating configs
 - Training the model
 - Post-training operations such as evaluation, quantization, model export, generation etc
-- Setting up multi-stage training (eg: Distillation) using multiple Recipe classes
+- Setting up multi-stage training (eg: Distillation) using multiple recipe classes
 
 
 Scripts should generally structure operations in the following order:
 
 - Extract and validate training params
-- Intialize th Recipe Class which in-turn intializes recipe state
+- Initialize the recipe class which in turn initializes recipe state
 - Load and Validate checkpoint to update recipe state if resuming training
-- Initialize recipe components (model, tokeinzer, optimizer, loss and dataloader) from checkpoint (if applicable)
+- Initialize recipe components (model, tokenizer, optimizer, loss and dataloader) from checkpoint (if applicable)
 - Train the model
 - Clean up recipe state after training is complete
 
@@ -115,7 +105,7 @@ An example script looks something like this:
 Recipe Class
 ------------
 
-The Recipe Class carries the core logic for training a model. Each class implements a relevant interface and exposes a
+The recipe class carries the core logic for training a model. Each class implements a relevant interface and exposes a
 set of APIs. For fine-tuning, the structure of this class is as follows:
 
 Initialize recipe state including seed, device, dtype, metric loggers, relevant flags etc:
@@ -154,7 +144,7 @@ Load checkpoint, update recipe state from checkpoint, initialize components and
 
 
 
-Run Forward and backward across all epochs and save checkpoint at end of each epoch
+Run forward and backward across all epochs and save checkpoint at end of each epoch
 
 .. code-block:: python
 
@@ -192,3 +182,9 @@ Cleanup recipe state
 
     self.metric_loggers.close()
     ...
+
+Running Recipes with Configs
+----------------------------
+
+To run a recipe with a set of user-defined parameters, you will need to write a config file.
+You can learn all about configs in our :ref:`config tutorial<config_tutorial_label>`.

docs/source/index.rst

Lines changed: 1 addition & 0 deletions
@@ -83,6 +83,7 @@ TorchTune tutorials.
 
    examples/finetune_llm
    examples/recipe_deepdive
+   examples/configs
 
 .. toctree::
    :glob:
