[v0.5.10][4] Fix _no_split_modules set not subscriptable in transformers >=5.0 #931

Open
yueming-yuan wants to merge 3 commits into fix/processor-return-tensors from fix/fsdp-no-split-modules-set
@yueming-yuan
Collaborator

Summary

  • model._no_split_modules changed from list to set in transformers >=5.0
  • set doesn't support [0] indexing, causing TypeError: 'set' object is not subscriptable
  • All FSDP CI tests fail at model initialization
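The failure described above can be reproduced without transformers installed. This is a minimal sketch, assuming only that `_no_split_modules` holds class-name strings; the value `LlamaDecoderLayer` is illustrative and `first_no_split_module` is a hypothetical helper, not code from this PR:

```python
def first_no_split_module(no_split_modules):
    """Return one element from a list or a set without indexing."""
    return next(iter(no_split_modules))


# Illustrative stand-ins for model._no_split_modules (assumed values).
as_list = ["LlamaDecoderLayer"]  # list, as described for transformers < 5.0
as_set = {"LlamaDecoderLayer"}   # set, as described for transformers >= 5.0

# Indexing works on the list but raises TypeError on the set.
assert as_list[0] == "LlamaDecoderLayer"
try:
    as_set[0]
    indexing_error = None
except TypeError as exc:
    indexing_error = str(exc)

# next(iter(...)) works on both container types.
from_list = first_no_split_module(as_list)
from_set = first_no_split_module(as_set)
```

Note that for a set with more than one element, `next(iter(...))` returns an arbitrary member, which is sufficient for a non-emptiness and not-None sanity check but not for selecting a specific class.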

Test plan

  • FSDP CI tests pass

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request updates the apply_fsdp2 function to handle _no_split_modules as either a list or a set, ensuring compatibility with different versions of the transformers library. The reviewer suggested explicitly converting layer_cls_to_wrap to a set to improve the efficiency of membership checks in subsequent operations.


  layer_cls_to_wrap = model._no_split_modules
- assert len(layer_cls_to_wrap) > 0 and layer_cls_to_wrap[0] is not None
+ assert len(layer_cls_to_wrap) > 0 and next(iter(layer_cls_to_wrap)) is not None
Contributor

medium

While next(iter(layer_cls_to_wrap)) correctly handles both the list and set forms of _no_split_modules across transformers versions, it is more idiomatic to convert layer_cls_to_wrap to a set once at the beginning. When transformers < 5.0 provides a list, this also makes each membership check in the subsequent loop (line 689) $O(1)$ instead of $O(N)$.

Suggested change
- assert len(layer_cls_to_wrap) > 0 and next(iter(layer_cls_to_wrap)) is not None
+ layer_cls_to_wrap = set(model._no_split_modules)
+ assert layer_cls_to_wrap and next(iter(layer_cls_to_wrap)) is not None
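The effect of the suggested change can be sketched in isolation. The loop at line 689 is not shown in this thread, so the sketch below assumes it is a class-name membership test; `select_modules_to_wrap` and the class names are hypothetical and stand in for the real module traversal in apply_fsdp2:

```python
def select_modules_to_wrap(module_class_names, no_split_modules):
    """Filter class names against the no-split set (hypothetical helper)."""
    # Normalize once: accepts a list (older transformers) or a set (newer).
    layer_cls_to_wrap = set(no_split_modules)
    # An empty set is falsy, so this also replaces the len(...) > 0 check.
    assert layer_cls_to_wrap and next(iter(layer_cls_to_wrap)) is not None
    # Each membership test against a set is O(1) on average.
    return [name for name in module_class_names if name in layer_cls_to_wrap]


# Illustrative class names, standing in for module.__class__.__name__ values.
names = ["Embedding", "DecoderLayer", "DecoderLayer", "Linear"]
wrapped_from_list = select_modules_to_wrap(names, ["DecoderLayer"])
wrapped_from_set = select_modules_to_wrap(names, {"DecoderLayer"})
```

In practice `_no_split_modules` usually has only a few entries, so the gain from the set conversion is mostly uniform handling of both container types rather than measurable speed.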

@yueming-yuan yueming-yuan changed the base branch from bump-sglang-v0.5.10 to fix/processor-return-tensors April 8, 2026 21:39
2 participants