VeRL-Omni

Easy, fast, and stable RL training for diffusion and omni-modality models

VeRL-Omni is a general RL training framework focused on multimodal generative models, built on top of verl.

It originated from the multimodal generation RL effort in verl and now has a dedicated home so it can evolve in a more focused way.

News 🔥

[2026-06] DiffusionNFT and Diffusion DPO are integrated with verified recipes on Qwen-Image/SD3.5. Wan2.2 is now supported for video generation tasks.

Why `VeRL-Omni`

Multimodal generative RL training differs from text-only LLM RL not only in model structure, but also in I/O patterns, compute characteristics, and runtime bottlenecks. As this space grows, it deserves a dedicated training repository that can evolve quickly around its own constraints.

Scope

VeRL-Omni targets RL post-training for three families of generative models:

Diffusion generative models for image, video, and audio — e.g., Qwen-Image, Wan2.2.
Unified multimodal understanding + generation models — e.g., BAGEL, HunyuanImage-3.0.
Omni-modality models that jointly handle text, image, audio, and video — e.g., Qwen3-Omni.

What we focus on

Optimized rollout: vLLM-Omni as a rollout backend for high-throughput multimodal generation.
Flexible and async multi-reward serving: Support for multi-reward serving (HPSv3, GenRM-OCR, UnifiedReward, etc.), an HTTP scorer, and asynchronous reward computation that overlaps with the rollout phase.
Modular training backends: Selectable VeOmni and FSDP2 backends with combinable parallelism (USP/TP/DP) for distributed training.
Stability tools: Improved diffusion RL stability through rollout correction and deterministic rollout, reward, and trainer execution.
End-to-end examples and benchmarks: Validated recipes for co-located sync and fully-async RL on the model families above.
High training throughput: On our reference Qwen-Image FlowGRPO setup, VeRL-Omni achieves ~25% higher end-to-end throughput than the diffusers-based flow_grpo implementation, driven by vLLM-Omni rollout, the FSDP2 trainer, and asynchronous reward computation overlapped with rollout, among other optimizations.
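
The idea of overlapping reward computation with rollout, mentioned in the list above, can be illustrated with a minimal `asyncio` sketch. This is not VeRL-Omni's API: `generate_batch`, `score_aesthetic`, `score_ocr`, `REWARD_WEIGHTS`, and `run_overlapped` are hypothetical names, the latencies are simulated with `asyncio.sleep`, and the weighted sum is only one possible way to combine several rewards.

```python
import asyncio

# Hypothetical weights for combining two reward signals (an assumption,
# not a VeRL-Omni default).
REWARD_WEIGHTS = {"aesthetic": 0.7, "ocr": 0.3}


async def generate_batch(step: int) -> list[str]:
    """Stand-in for a rollout backend: returns fake sample IDs."""
    await asyncio.sleep(0.05)  # simulated generation latency
    return [f"step{step}-sample{i}" for i in range(4)]


async def score_aesthetic(sample: str) -> float:
    """Stand-in for a remote reward model call."""
    await asyncio.sleep(0.02)  # simulated scoring latency
    return float(len(sample) % 5)


async def score_ocr(sample: str) -> float:
    """Stand-in for a second, independent reward model."""
    await asyncio.sleep(0.02)
    return 1.0 if sample.endswith("0") else 0.0


async def score_batch(batch: list[str]) -> list[float]:
    """Query both scorers concurrently and combine them per sample."""

    async def score_one(sample: str) -> float:
        aesthetic, ocr = await asyncio.gather(
            score_aesthetic(sample), score_ocr(sample)
        )
        return (
            REWARD_WEIGHTS["aesthetic"] * aesthetic
            + REWARD_WEIGHTS["ocr"] * ocr
        )

    return list(await asyncio.gather(*(score_one(s) for s in batch)))


async def run_overlapped(num_steps: int) -> list[list[float]]:
    """Score batch k in the background while batch k+1 is generated."""
    rewards: list[list[float]] = []
    pending = None  # scoring task for the previous batch
    for step in range(num_steps):
        # While this await is suspended, the pending scoring task runs.
        batch = await generate_batch(step)
        if pending is not None:
            rewards.append(await pending)
        pending = asyncio.create_task(score_batch(batch))
    if pending is not None:
        rewards.append(await pending)  # drain the last batch
    return rewards


if __name__ == "__main__":
    for step, batch_rewards in enumerate(asyncio.run(run_overlapped(3))):
        print(step, batch_rewards)
```

In this sketch the scoring task for one batch is created before the next batch is requested, so scoring latency is hidden behind generation latency whenever scoring is the faster of the two.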

Getting Started 🚀

Visit our documentation to learn more.

Model and Algorithm Support 🎨

Model	Category	Modality	Algorithm	Status
Qwen-Image	Diffusion generator	Text → Image	FlowGRPO (+ CPS/SDE)	✅
Qwen-Image	Diffusion generator	Text → Image	MixGRPO	✅
Qwen-Image	Diffusion generator	Text → Image	GRPO-Guard	✅
Qwen-Image	Diffusion generator	Text → Image	DiffusionNFT	✅
Qwen-Image	Diffusion generator	Text → Image	DPO	✅
Wan2.2	Diffusion generator	Text → Video	DanceGRPO	✅
LTX2.3	Diffusion generator	Text → Video + Audio	FlowGRPO	WIP
BAGEL	Unified understand + gen	Text + Image	FlowGRPO	✅
HunyuanImage-3.0	Unified understand + gen	Text + Image	MixGRPO	Planned
HunyuanImage-3.0	Unified understand + gen	Text + Image	SRPO	Planned
Qwen3-Omni-Thinker	Omni-modality	Text / Image / Video / Audio	GSPO	WIP
SD3.5	Diffusion generator	Text → Image	DPO	✅

Ascend NPU Support 💠

VeRL-Omni now supports Ascend NPU. To install it and get started with FlowGRPO training on Ascend NPU, see our Ascend NPU Quickstart Guide.

Roadmap 🗺

Future work is tracked here:

RFC: Multi-modal Generation RL 2026Q2 Roadmap

Contributing 🤝

Contributions are welcome.

See the contribution guide.

Acknowledgement 🌟

VeRL-Omni builds on the engineering foundations developed in verl and is closely aligned with multimodal inference systems such as vLLM-Omni.

Citation 📚

If you find the project helpful, please cite:

@misc{verlomni_github,
  title        = {{VeRL-Omni: Easy, Fast, and Stable RL Training for Diffusion and Omni-Modality Models}},
  author       = {Yongxiang Huang and Cheung Kawai and Jingan Zhou and Yingshu Chen and {openYuanrong Team} and Xibin Wu},
  year         = {2026},
  howpublished = {\url{https://github.com/verl-project/verl-omni}},
  urldate      = {2026-04-28}
}
