`docs/DOCUMENTATION.md` (4 additions, 7 deletions)
````diff
@@ -104,12 +104,7 @@ def _build_input_queue(
 ###### Model initialization
 
 ```python
-def init_model_fn(
-    self,
-    rng: RandomState,
-    dropout_rate: Optional[float] = None,
-    aux_dropout_rate: Optional[float] = None
-) -> initial model parameters
+def init_model_fn(self, rng: RandomState) -> initial model parameters
 ```
````
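The updated, fixed signature can be exercised with a minimal sketch like the following. Everything here beyond the `init_model_fn(self, rng)` signature is an assumption for illustration: the `LinearWorkload` class and its parameter shapes are hypothetical, and `RandomState` is assumed to be compatible with NumPy's `np.random.RandomState`.

```python
import numpy as np

# Assumption: the spec's RandomState is (compatible with) NumPy's RandomState.
RandomState = np.random.RandomState


class LinearWorkload:
    """Hypothetical workload sketch: init_model_fn now takes only the RNG.

    Dropout rates are no longer initialization arguments; they are supplied
    at forward-pass time instead.
    """

    def init_model_fn(self, rng: RandomState) -> dict:
        # Returns the initial model parameters, deterministic given `rng`.
        return {"w": rng.standard_normal((4, 2)), "b": np.zeros(2)}


# A submission may call this again, e.g. to restart after a failed
# training effort, but it cannot change the function itself.
params = LinearWorkload().init_model_fn(RandomState(0))
```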
```diff
 - Unlike in the *Model Track*, this function, which initializes the parameters of the model, is fixed. While it can be called by the submission (e.g., to restart the model after a failed training effort), it cannot be changed.
@@ … @@
 - `logits_output_batch` is the model output before the output activation
 - `new_model_state` is for batch norm or similar side effects and will only be updated if `update_batch_norm` is set
 - `hyperparameters` will contain only dropout rates, which are used in the models that support them. These can be tuned or will default to documented model-specific values. Note that adding additional dropout would be considered changing the model, which is not allowed, but tuning the rate of existing dropout layers can be considered a regularizer, so we allow it. There should be at most two dropout rates in a model (if there are more than two, we will reuse the same values).
+- `dropout_rate` is used in the model forward pass. If not provided, the workload's default value is used (see below for the list of defaults).
```
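The fallback behaviour described in the added `dropout_rate` bullet can be sketched as follows. This is not the benchmark's actual API: the `ToyWorkload` class, the `model_fn` shape, the default value `0.1`, and the NumPy-based inverted dropout are all assumptions made for illustration.

```python
from typing import Optional

import numpy as np


class ToyWorkload:
    """Hypothetical workload illustrating the dropout_rate fallback."""

    DEFAULT_DROPOUT_RATE = 0.1  # assumed, stands in for a documented workload default

    def model_fn(
        self,
        params: dict,
        x: np.ndarray,
        rng: np.random.RandomState,
        dropout_rate: Optional[float] = None,
    ) -> np.ndarray:
        # If the submission passes no rate, fall back to the workload default.
        rate = self.DEFAULT_DROPOUT_RATE if dropout_rate is None else dropout_rate
        h = x @ params["w"]
        if rate > 0.0:
            keep = rng.random_sample(h.shape) >= rate
            h = h * keep / (1.0 - rate)  # inverted dropout scaling
        return h
```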