fix: rope_scaling unhashable dict error with transformers>=5.1.0 #182

Open
chenwenxiaolive wants to merge 2 commits into GeeeekExplorer:main from chenwenxiaolive:fix/rope-scaling-unhashable-dict

@chenwenxiaolive chenwenxiaolive commented Mar 8, 2026

Summary

  • Fix TypeError: unhashable type: 'dict' when using Qwen3 with transformers>=5.1.0
  • Newer transformers versions initialize rope_scaling as a dict (e.g. {'rope_theta': 1000000, 'rope_type': 'default'}) instead of None, which breaks @lru_cache in get_rope
  • Explicitly pass rope_scaling=None in Qwen3Attention since RoPE scaling is not yet implemented
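The failure described in these bullets can be reproduced without transformers installed. The snippet below is a minimal sketch: `get_rope` here is a stand-in with an assumed signature, not the repository's actual function, and `call_with` is a hypothetical helper added only for the demonstration. It shows that `@lru_cache` must hash every argument to build its cache key, so a dict argument fails before the function body runs, while `None` does not.

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def get_rope(head_size, rotary_dim, max_position, base, rope_scaling=None):
    # Stand-in body (assumption): the real function would build a rotary
    # embedding module. Only the caching behavior matters here.
    return (head_size, rotary_dim, max_position, base, rope_scaling)


def call_with(rope_scaling):
    """Return 'ok', or the TypeError message raised during the cache lookup."""
    try:
        get_rope(128, 128, 40960, 1000000, rope_scaling)
        return "ok"
    except TypeError as exc:
        return str(exc)


# Dict of the shape quoted in the summary: hashing the cache key fails.
dict_result = call_with({"rope_theta": 1000000, "rope_type": "default"})
# None is hashable, so the call goes through.
none_result = call_with(None)

print(dict_result)
print(none_result)
```

Passing `rope_scaling=None` avoids the error because the cache key no longer contains an unhashable object; it does not make RoPE scaling itself work.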

Fixes #167

Test plan

  • Verified with Qwen3-0.6B model, generation works correctly after the fix

Newer versions of transformers (>=5.1.0) initialize rope_scaling as a
dict instead of None, which causes TypeError with @lru_cache since
dicts are unhashable. Since RoPE scaling is not yet implemented,
explicitly pass None to avoid the error.

Fixes GeeeekExplorer#167

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@chenwenxiaolive chenwenxiaolive left a comment

LGTM

Seas0 commented Mar 27, 2026

I think this param shall be processed instead of directly ignored...

@chenwenxiaolive

@Seas0 Now handled rope_scaling explicitly instead of ignoring it.
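The thread does not show what the explicit handling looks like. As an illustration only, one common way to keep `@lru_cache` while still accepting a dict is to convert the dict into a hashable tuple before the cached call. The sketch below makes several assumptions: the names `freeze_rope_scaling` and `_get_rope_cached` are hypothetical, the `get_rope` signature is assumed, the dict values are assumed to be flat scalars, and only the `"default"` rope type is accepted.

```python
from functools import lru_cache


def freeze_rope_scaling(rope_scaling):
    """Turn a flat rope_scaling dict into a sorted tuple of items (hashable)."""
    if rope_scaling is None:
        return None
    # Sorting makes the key independent of dict insertion order.
    return tuple(sorted(rope_scaling.items()))


@lru_cache(maxsize=1)
def _get_rope_cached(head_size, rotary_dim, max_position, base, frozen_scaling):
    scaling = dict(frozen_scaling) if frozen_scaling is not None else {}
    rope_type = scaling.get("rope_type", "default")
    if rope_type != "default":
        # Assumption: scaled RoPE variants are rejected rather than ignored.
        raise NotImplementedError(f"rope_type {rope_type!r} is not supported")
    # Stand-in return value; a real implementation would build the module.
    return ("rope", head_size, rotary_dim, max_position, base)


def get_rope(head_size, rotary_dim, max_position, base, rope_scaling=None):
    # Public entry point: accepts a dict or None, caches on the frozen form.
    frozen = freeze_rope_scaling(rope_scaling)
    return _get_rope_cached(head_size, rotary_dim, max_position, base, frozen)
```

With this shape, a dict such as `{'rope_theta': 1000000, 'rope_type': 'default'}` no longer raises `TypeError`, and an unsupported `rope_type` fails loudly instead of being silently dropped.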

Development

Successfully merging this pull request may close these issues.

rope argument inconsistent
