[fully_async, trainer] fix: sync optimizer total steps before trainer initialization by mikequan0425 · Pull Request #6684 · verl-project/verl

mikequan0425 · 2026-06-10T11:08:46Z

What does this PR do?

Add concise overview of what this PR aims to achieve or accomplish. Reference related GitHub issues and PRs that help with the review.

Experiments showed that the learning rate (lr) stayed at 0 throughout training. The issue does not occur when the script is launched via main_ppo; it appears only in fully async mode. See #6683 for a detailed description of the problem.

In short, optim.total_training_steps is not assigned correctly when the trainer's optimizer is initialized.

Design & Code Changes

Demonstrate the high-level design if this PR is complex, and list the specific changes.

The simplest approach would be to set actor_rollout_ref.actor.optim.total_training_steps directly in the config. However, the actual number of trainer steps should be computed as total_rollout_steps / (required_samples * trigger_parameter_sync_step).

Therefore, this PR provides a method to assign the corresponding value to optim.total_training_steps when creating the trainer.
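The calculation described above can be sketched as follows. This is only an illustration of the formula as stated in this description, not the code in the PR: the function name is hypothetical, floor division is an assumption, and required_samples is taken to be ppo_mini_batch_size * require_batches, as in the diff excerpt quoted in the review.

```python
def trainer_steps_as_described(
    total_rollout_steps: int,
    ppo_mini_batch_size: int,
    require_batches: int,
    trigger_parameter_sync_step: int,
) -> int:
    """Hypothetical helper mirroring the formula in the PR description."""
    # Samples the trainer needs before it performs one actor update.
    required_samples = ppo_mini_batch_size * require_batches
    # Assumption: round down when the division is not exact.
    return total_rollout_steps // (required_samples * trigger_parameter_sync_step)


# Example with made-up numbers: 4096 rollout steps, 64 samples per update,
# parameters synced every 4 updates.
steps = trainer_steps_as_described(4096, 32, 2, 4)
print(steps)
```

Whether the division by trigger_parameter_sync_step belongs in the value handed to the optimizer depends on how often the LR scheduler is stepped.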

Checklist Before Starting

Search for similar PRs. Paste at least one query link here: https://github.com/verl-project/verl/pulls?q=is%3Apr+is%3Aopen+fully+async+lr
Format the PR title as [{modules}] {type}: {description} (This will be checked by the CI)
- {modules} include fsdp, megatron, veomni, sglang, vllm, rollout, trainer, ci, training_utils, recipe, hardware, deployment, ray, worker, single_controller, misc, perf, model, algo, env, tool, ckpt, doc, data, cfg, reward, fully_async, one_step_off
- If this PR involves multiple modules, separate them with , like [megatron, fsdp, doc]
- {type} is in feat, fix, refactor, chore, test
- If this PR breaks any API (CLI arguments, config, function signature, etc.), add [BREAKING] to the beginning of the title.
- Example: [BREAKING][fsdp, megatron] feat: dynamic batching

Test

For changes that can not be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluation results, etc.

API and Usage Example

Demonstrate how the API changes if any, and provide usage example(s) if possible.

# Add code snippet or script demonstrating how to use this

Checklist Before Submitting

Important

Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

Read the Contribute Guide.
Apply pre-commit checks: pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always
Add / Update the documentation.
Add unit or end-to-end test(s) to the CI workflow to cover all the code. If not feasible, explain why: ...
Once your PR is ready for CI, send a message in the ci-request channel in the verl Slack workspace. (If not accessible, please try the Feishu group (飞书群).)
If your PR is related to the recipe submodule, please also update the reference to the submodule commit via git submodule update --remote or cd recipe && git pull origin main.

gemini-code-assist

Code Review

This pull request refactors FullyAsyncTrainer to resolve and set the total training steps in the configuration prior to worker initialization. It extracts the configuration-setting logic into _set_total_training_steps_in_config, adds a helper method _resolve_total_training_steps_before_init to compute the steps dynamically, and invokes this resolution during init_workers. There are no review comments, and no additional feedback is provided.


CLAassistant · 2026-06-10T11:15:53Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Luosuu

Found two issues in the pre-init total step calculation that should be addressed before this is safe.

Luosuu · 2026-06-14T08:21:42Z

+
+        required_samples = (
+            self.config.actor_rollout_ref.actor.ppo_mini_batch_size * self.config.async_training.require_batches
+        )


optim.total_training_steps is consumed by the LR scheduler, and the scheduler is stepped on every actor update (_fit_update_actor -> update_actor), not only when parameters are synced. Dividing by trigger_parameter_sync_step makes the schedule finish trigger_parameter_sync_step times too early; if the progress bar/checkpoint version wants sync steps, please keep that separate from the optimizer scheduler steps.
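To make the concern concrete, here is a minimal sketch that assumes a plain linear-decay schedule as a stand-in for whatever scheduler the actor actually uses. All names and numbers are hypothetical. It shows that if the scheduler is stepped on every actor update but is given a total that was divided by trigger_parameter_sync_step, the learning rate reaches zero after only a fraction of the updates.

```python
def linear_decay_lr(base_lr: float, step: int, total_steps: int) -> float:
    """Stand-in schedule: decay linearly from base_lr to 0 over total_steps."""
    remaining = max(0.0, 1.0 - step / total_steps)
    return base_lr * remaining


def first_zero_lr_step(total_steps: int, actor_updates: int, base_lr: float = 1e-6):
    """Return the first actor update at which the LR is 0, or None."""
    for step in range(1, actor_updates + 1):
        if linear_decay_lr(base_lr, step, total_steps) == 0.0:
            return step
    return None


actor_updates = 100              # hypothetical number of actor updates in a run
trigger_parameter_sync_step = 4  # hypothetical sync interval

# Total divided by the sync interval: the schedule is exhausted early.
divided_total = actor_updates // trigger_parameter_sync_step
early = first_zero_lr_step(divided_total, actor_updates)

# Total equal to the number of actor updates: the schedule ends with the run.
on_time = first_zero_lr_step(actor_updates, actor_updates)

print(early, on_time)
```

With these numbers the divided total exhausts the schedule at update 25 instead of update 100, which is the factor of trigger_parameter_sync_step described above.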

Luosuu · 2026-06-14T08:21:42Z

@@ -266,7 +271,16 @@ def set_total_train_steps(self, total_training_steps):
        except Exception as e:
            print(f"Warning: Could not set total_training_steps in config. Structure missing? Error: {e}")



This uses the raw configured rollout.total_rollout_steps, but the rollouter later computes the effective count as min(config.rollout.total_rollout_steps, len(train_dataloader) * total_epochs) and also supports None. Since the optimizer is already constructed during init_workers, the later set_total_train_steps() call cannot rebuild the scheduler, so dataset-limited or unset configs still get the wrong LR schedule here.
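The effective count mentioned in this comment could be resolved before worker initialization along these lines. This is a sketch under assumptions: the function name is hypothetical, and treating None as "limited only by the dataset" is a guess at how the rollouter handles an unset value, not something confirmed here.

```python
from typing import Optional


def effective_rollout_steps(
    configured_total: Optional[int],
    dataloader_len: int,
    total_epochs: int,
) -> int:
    """Hypothetical helper: cap the configured rollout steps by the dataset size."""
    # Upper bound imposed by the data: batches per epoch times number of epochs.
    dataset_limit = dataloader_len * total_epochs
    if configured_total is None:
        # Assumption: an unset value means "run through the whole dataset".
        return dataset_limit
    return min(configured_total, dataset_limit)


# Made-up numbers: 50 batches per epoch, 2 epochs.
print(effective_rollout_steps(1000, 50, 2))  # configured value exceeds the data
print(effective_rollout_steps(None, 50, 2))  # unset value
print(effective_rollout_steps(80, 50, 2))    # configured value is the tighter bound
```

Feeding a value like this into the pre-init calculation, rather than the raw configured one, would keep the scheduler total consistent in dataset-limited or unset configurations.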

Set total training steps before trainer initialization.

79130e4

mikequan0425 requested review from ArronHZG and wuxibin89 as code owners June 10, 2026 11:08

gemini-code-assist Bot reviewed Jun 10, 2026


fix pre-commit

74235c2

Luosuu reviewed Jun 14, 2026

mikequan0425 wants to merge 2 commits into verl-project:main from mikequan0425:async_optim_lr_fix
