feat(rollout): conditionally enable tower connector LoRA for VLMs #6670

Open
luoshijiang wants to merge 1 commit into verl-project:main from luoshijiang:feat/vlm-tower-connector-lora
@luoshijiang

Summary

This PR adds conditional support for enabling vLLM's enable_tower_connector_lora flag in the vLLM rollout server, allowing LoRA training on vision components (vision tower + vision projection) of vision-language models (VLMs) during RL training.

Motivation

By default, verl only supports LoRA for the language model. When users want to train the vision part of a VLM via LoRA (by setting freeze_vision_model=False / freeze_vision_projection=False), the vLLM rollout server also needs to be aware of this via the enable_tower_connector_lora engine arg. Previously this had to be hard-coded, which would incorrectly apply to non-VLM models as well.

Changes

  • Modified verl/workers/rollout/vllm_rollout/vllm_async_server.py to conditionally add enable_tower_connector_lora=True when:
    1. The model is a VLM (detected via hasattr(hf_config, "vision_config"), consistent with trtllm_async_server.py)
    2. At least one of freeze_vision_model or freeze_vision_projection is set to False
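
The two conditions above can be sketched as a standalone predicate. This is a minimal illustration, not the code in the PR: the function name `should_enable_tower_connector_lora` is hypothetical, and the LoRA config is assumed to behave like a plain dict whose freeze flags default to True.

```python
from types import SimpleNamespace


def should_enable_tower_connector_lora(hf_config, lora_config: dict) -> bool:
    """Hypothetical helper: decide whether to pass enable_tower_connector_lora."""
    # Condition 1: the model is a VLM, detected by the presence of vision_config.
    if not hasattr(hf_config, "vision_config"):
        return False
    # Condition 2: at least one vision component is trainable.
    # Both flags are assumed to default to True (frozen) when unset.
    vision_frozen = lora_config.get("freeze_vision_model", True) and lora_config.get(
        "freeze_vision_projection", True
    )
    return not bool(vision_frozen)


# Stand-in configs for illustration only.
text_only = SimpleNamespace()
vlm = SimpleNamespace(vision_config={})

print(should_enable_tower_connector_lora(text_only, {"freeze_vision_model": False}))
print(should_enable_tower_connector_lora(vlm, {}))
print(should_enable_tower_connector_lora(vlm, {"freeze_vision_projection": False}))
```

Under these assumptions, only the last call enables the flag: the first model has no `vision_config`, and the second keeps both vision components frozen.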

Backward Compatibility

  • Non-VLM models: Unaffected — no vision_config attribute, so the flag is never set
  • VLMs with frozen vision (default): Unaffected — both freeze flags default to True per megatron_peft.py
  • VLMs with trainable vision: Now correctly passes enable_tower_connector_lora=True to vLLM

Usage Example

actor_rollout_ref:
  actor:
    freeze_vision_tower: False
  model:
    lora:
      freeze_vision_model: False
      freeze_vision_projection: False

Related

Detection pattern follows existing VLM detection in:

  • verl/workers/rollout/trtllm_rollout/trtllm_async_server.py:93-95
  • verl/workers/engine/fsdp/transformer_impl.py:919

Checklist

  • Code follows existing code style
  • Tested with a VLM (e.g., Qwen3.5-9B, a native VLM) with trainable vision LoRA
  • Tested with a VLM with frozen vision tower (default behavior preserved)

Co-Authored-By: XuAn@cpic

Add conditional logic in vllm_async_server.py to set enable_tower_connector_lora only when the model is a VLM (detected via vision_config) and vision components are not frozen. Non-VLM models and VLMs with frozen vision towers remain unaffected.

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request updates vllm_async_server.py to enable tower connector LoRA for Vision-Language Models (VLMs) when vision components are trainable. It detects VLMs by checking for a vision_config attribute in hf_config and enables enable_tower_connector_lora if the vision model or projection is not frozen. I have no feedback to provide as there are no review comments.


is_vlm = hasattr(self.model_config.hf_config, "vision_config")
if is_vlm:
    vision_frozen = self.model_config.lora.get("freeze_vision_model", True) and \
        self.model_config.lora.get("freeze_vision_projection", True)

Collaborator

This needs a vLLM version/feature gate, or the PR should bump the minimum supported vLLM version. setup.py still allows vllm>=0.8.5, and vLLM 0.8.5 does not define enable_tower_connector_lora, so any VLM run that reaches this branch will fail during CLI parsing with an unknown argument.
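
One possible shape for such a feature gate is sketched below. It assumes the installed vLLM exposes its engine arguments as a dataclass, so the presence of a field can be checked before the flag is emitted. The helper names and the two stand-in classes are hypothetical and exist only to make the example runnable without vLLM.

```python
from dataclasses import dataclass, fields, is_dataclass


def supports_engine_arg(engine_args_cls: type, name: str) -> bool:
    """Return True if the dataclass declares a field called `name`."""
    if not is_dataclass(engine_args_cls):
        return False
    return any(f.name == name for f in fields(engine_args_cls))


# Stand-ins for an older and a newer engine-args class (illustration only).
@dataclass
class OldEngineArgs:
    enable_lora: bool = False


@dataclass
class NewEngineArgs:
    enable_lora: bool = False
    enable_tower_connector_lora: bool = False


def maybe_add_tower_connector_flag(args: dict, engine_args_cls: type, wanted: bool) -> dict:
    """Add the flag only when requested and supported by the engine-args class."""
    out = dict(args)  # copy so the caller's dict is not mutated
    if wanted and supports_engine_arg(engine_args_cls, "enable_tower_connector_lora"):
        out["enable_tower_connector_lora"] = True
    return out


print(maybe_add_tower_connector_flag({"enable_lora": True}, OldEngineArgs, wanted=True))
print(maybe_add_tower_connector_flag({"enable_lora": True}, NewEngineArgs, wanted=True))
```

This sketch silently skips the flag on unsupported versions. Whether to skip, warn, or raise a clear error instead is a design choice; the alternative raised in the comment, bumping the minimum vLLM version in setup.py, avoids the gate entirely.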
