Hotfix: Set float32 as default dtype for testing tiny models #4770

Merged
albertvillanova merged 2 commits into huggingface:main from albertvillanova:fix-4748 on Jan 6, 2026

@albertvillanova

@albertvillanova albertvillanova commented Jan 2, 2026

Member

Set float32 as default dtype for testing tiny models, after the merge in transformers of this PR:

Fix #4748.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@qgallouedec

Member

After investigating, here are a few points that help explain what's happening here:

Transformers dtype default behavior changed

With transformers<=4.57, if we omit dtype in from_pretrained, the model is loaded in float32 by default. However, if we pass dtype="auto", the dtype follows the model config / checkpoint metadata:

from transformers import AutoModelForCausalLM  # v4.57.2

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-0.6B")
model.dtype  # torch.float32

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-0.6B", dtype="auto")
model.dtype  # torch.bfloat16

Starting with transformers v5, dtype="auto" appears to be the new default (which is better IMO), so models may now be loaded directly in bf16/fp16 depending on the model/config:

from transformers import AutoModelForCausalLM  # v5.0.0

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-0.6B")
model.dtype  # torch.bfloat16

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-0.6B", dtype="auto")
model.dtype  # torch.bfloat16

This explains why some tests that previously ran in float32 now run in float16/bfloat16, and as you pointed out, this can lead to situations where some parameters are not updated.
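One plausible mechanism for the unchanged parameters (an assumption here, the thread does not spell it out) is that a small update gets rounded away when the weight is stored in bfloat16, which keeps only 8 bits of significand. The sketch below emulates bfloat16 rounding in pure Python; the weight and step values are hypothetical and no real optimizer is involved.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_bfloat16(x: float) -> float:
    """Round to the nearest bfloat16 value (round-to-nearest-even).

    bfloat16 is the upper 16 bits of a float32, so the lower 16 bits are
    rounded away. NaN/inf handling is omitted for brevity.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (bits + rounding_bias) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


weight = 1.0
step = 1e-3  # hypothetical value of lr * grad

updated_fp32 = to_float32(weight - step)
updated_bf16 = to_bfloat16(weight - step)

print(updated_fp32 != weight)  # True: float32 keeps the small change
print(updated_bf16 != weight)  # False: bfloat16 rounds back to 1.0
```

A check like "parameter has changed after one training step" would pass in the float32 case and fail in the bfloat16 case, even though the update itself was computed correctly.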

Longer-term: TRL should provide training-oriented defaults

More broadly, I think TRL should aim to provide safe and stable defaults for training.

In particular, we should distinguish between:

  • weight dtype at load time (how parameters are stored)
  • compute dtype during training (forward/backward autocast, grad scaling, etc.)

From a training stability perspective, the most robust default is usually:

  1. load weights in float32 by default, unless the user explicitly requests otherwise
  2. use mixed precision as a training-time optimization (bf16)

The second point is already aligned with TRL defaults (e.g. enabling mixed precision in configs):

bf16: bool | None = field(
    default=None,
    metadata={
        "help": "Whether to use bf16 (mixed) precision instead of 32-bit. Requires Ampere or higher NVIDIA "
        "architecture or Intel XPU or using CPU (use_cpu) or Ascend NPU. If not set, it defaults to `True` if "
        "`fp16` is not set."
    },
)

self.bf16 = not (self.fp16) if self.bf16 is None else self.bf16
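To make the resolution rule in the line above easier to read, here is the same expression as a standalone function. `resolve_bf16` is a hypothetical name used only for illustration; it is not part of TRL.

```python
from typing import Optional


def resolve_bf16(bf16: Optional[bool], fp16: bool) -> bool:
    """Mirror of the quoted rule: an explicit bf16 wins, otherwise bf16 = not fp16."""
    return (not fp16) if bf16 is None else bf16


print(resolve_bf16(None, False))   # True: nothing set, bf16 mixed precision is enabled
print(resolve_bf16(None, True))    # False: fp16 requested, bf16 stays off
print(resolve_bf16(False, False))  # False: explicit user choice is respected
```

So mixed precision is on by default, and only an explicit `bf16=False` or `fp16=True` turns it off.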

However, it looks like the load dtype often follows the model dtype, which can implicitly put users/tests into fp16/bf16 without intent:

dtype = kwargs.get("dtype", "auto")

Proposal

A longer-term solution could be:

  1. Make the default load dtype fp32 when the user passes a model ID
  2. In tests that manually load models (e.g. this one), explicitly set dtype=float32 so the tests don’t depend on upstream defaults

The key idea is: we should not end up training in the model dtype unless it’s intentional, especially in tests that are not meant to validate this specific (and likely unstable) case.
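As a rough sketch of the first point of the proposal, the default could be applied wherever the loading kwargs are assembled. The helper name `with_default_load_dtype` and its parameters are hypothetical and only illustrate the idea; this is not existing TRL code.

```python
from typing import Any, Optional


def with_default_load_dtype(
    model_init_kwargs: Optional[dict] = None,
    default_dtype: str = "float32",
) -> dict:
    """Return a copy of the kwargs with `dtype` set unless the user already chose one."""
    kwargs: dict = dict(model_init_kwargs or {})  # copy so the caller's dict is untouched
    kwargs.setdefault("dtype", default_dtype)
    return kwargs


print(with_default_load_dtype())                       # {'dtype': 'float32'}
print(with_default_load_dtype({"dtype": "bfloat16"}))  # {'dtype': 'bfloat16'}
```

The resulting dict would then be forwarded to `from_pretrained(model_id, **kwargs)`, so an explicit user choice such as `"bfloat16"` or `"auto"` is preserved, and only the unset case falls back to float32.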

@albertvillanova

albertvillanova commented Jan 5, 2026

Member Author

Thanks for your review, @qgallouedec: I totally agree.

In this PR I was running a preliminary test to confirm that setting float32 as the default precision at load time fixes the CI failures, which it does: https://github.com/huggingface/trl/actions/runs/20663612268/job/59331232203?pr=4770

In line with your long-term proposal, I agree we should set float32 as the default precision at load time.

@qgallouedec qgallouedec left a comment

Member

lgtm then!

@albertvillanova albertvillanova changed the title from "Set float32 as default dtype for testing tiny models" to "Hotfix: Set float32 as default dtype for testing tiny models" Jan 6, 2026
@albertvillanova albertvillanova merged commit ca16441 into huggingface:main Jan 6, 2026
8 of 9 checks passed
albertvillanova added a commit to albertvillanova/trl that referenced this pull request Jan 6, 2026

Development

Successfully merging this pull request may close these issues.

CI fails with test dependencies: AssertionError: Parameter has not changed

3 participants