Add support for Qwen3-Omni-30B-A3B-Thinking #677
Conversation
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.
Signed-off-by: ajrasane <[email protected]>
examples/llm_ptq/hf_ptq.py (outdated)

```python
    "qwen3omni only supports one dataset for calibration, can extend this in the future"
)
assert processor is not None, "The processor must be set for qwen3omni model."
dataset_name = args.dataset[0] if args.dataset else "scienceqa"
```
do we still recommend scienceqa as the default calib dataset?
Changed this to cnn_dailymail
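For reference, a minimal sketch of the resulting default; `pick_calib_dataset` is a hypothetical helper for illustration, while the real code assigns inline as in the snippet above:

```python
from argparse import Namespace

# Hypothetical helper mirroring the change discussed above: fall back to
# cnn_dailymail instead of scienceqa when no calibration dataset is given.
def pick_calib_dataset(args: Namespace) -> str:
    return args.dataset[0] if args.dataset else "cnn_dailymail"

print(pick_calib_dataset(Namespace(dataset=None)))           # cnn_dailymail
print(pick_calib_dataset(Namespace(dataset=["scienceqa"])))  # scienceqa
```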
```python
    num_samples=args.calib_size[0],
)
elif model_type == "qwen3omni":
    assert len(args.calib_size) == 1, (
```
For this part, I think we may want to host it in a model-specific Python file/module, e.g. llm_ptq/models/qwen3omni.py.
@shengliangxu WDYT?
We do not need to do it for now; I'll come up with a full design doc and then we can convert the whole repo afterwards. Even if we separate things out now, we may still refactor these anyway.
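For illustration, a rough sketch of the kind of per-model split being floated; the module path comes from the comment above, while the entry-point name and body are assumptions, not code from this PR:

```python
# Hypothetical examples/llm_ptq/models/qwen3omni.py: keep the qwen3omni-specific
# calibration setup behind one entry point so hf_ptq.py only dispatches on model_type.
def get_calib_dataset_name(args, processor) -> str:
    assert len(args.calib_size) == 1, (
        "qwen3omni only supports one dataset for calibration, can extend this in the future"
    )
    assert processor is not None, "The processor must be set for qwen3omni model."
    return args.dataset[0] if args.dataset else "cnn_dailymail"
```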
examples/llm_ptq/hf_ptq.py (outdated)

```python
# if args.verbose:
#     mtq.print_quant_summary(full_model)

import contextlib
```
move to the top
Done
```python
torch.cuda.empty_cache()

free_mem_before, max_allocated_before = _get_free_gpu_mem()
is_enc_dec = model_type_is_enc_dec(model)
```
can we merge this into _model_requires_generate?
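A sketch of what that merge might look like; the first helper below stubs out `model_type_is_enc_dec`, and folding it into `_model_requires_generate` is the suggestion, not existing code:

```python
def model_type_is_enc_dec(model) -> bool:
    # Stand-in for the existing helper referenced in the snippet above.
    return getattr(getattr(model, "config", None), "is_encoder_decoder", False)

def _model_requires_generate(model) -> bool:
    # Hypothetical merged check: encoder-decoder models become one of the cases
    # that require model.generate() during calibration, so the separate
    # is_enc_dec flag at the call site can go away.
    return model_type_is_enc_dec(model)
```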
```python
self.tokenizer = tokenizer
# Handle invalid device values that can come from multi-GPU models with device_map="auto"
if device is None or str(device) in ("auto", "meta", "cpu"):
    device = "cuda"
```
Maybe print a warning?
And does this effectively amount to `if "cuda" not in str(device): device = "cuda"`?
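A minimal sketch combining both suggestions, under the assumption that any non-CUDA device string should trigger the fallback; `resolve_device` is a hypothetical name:

```python
import warnings

def resolve_device(device) -> str:
    # Tightened version of the check above: warn on any non-CUDA device and
    # fall back to "cuda", which subsumes the explicit (None, "auto", "meta",
    # "cpu") list.
    if device is None or "cuda" not in str(device):
        warnings.warn(f"Got device={device!r} from device_map; falling back to 'cuda'.")
        return "cuda"
    return str(device)

print(resolve_device("meta"))    # warns, returns 'cuda'
print(resolve_device("cuda:0"))  # returns 'cuda:0'
```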
Signed-off-by: ajrasane <[email protected]>
Signed-off-by: ajrasane <[email protected]>
Signed-off-by: ajrasane <[email protected]>
Signed-off-by: ajrasane <[email protected]>
Signed-off-by: ajrasane <[email protected]>
```python
model_is_already_quantized = is_quantized(model)

model_type = get_model_type(model)
if model_type == "qwen3omni" and os.environ.get("DISABLE_TALKER", "0") == "1":
```
I think we probably need to find a better way for configurations like this
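One possible direction, sketched as an assumption rather than a decision from this thread: promote the environment variable to an explicit CLI flag so the behavior shows up in --help:

```python
import argparse

# Hypothetical replacement for the DISABLE_TALKER environment variable:
# an explicit flag on the hf_ptq.py argument parser.
parser = argparse.ArgumentParser()
parser.add_argument(
    "--disable-talker",
    action="store_true",
    help="Skip quantizing the qwen3omni talker submodule.",
)
args = parser.parse_args(["--disable-talker"])
assert args.disable_talker is True
```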
Signed-off-by: ajrasane <[email protected]>
Comment out import and registration of Qwen3OmniMoe classes.
Signed-off-by: Chenjie Luo <[email protected]>
Force-pushed 8410674 to 7f80e6f
Signed-off-by: ajrasane <[email protected]>
Force-pushed 7f80e6f to 0c4b38f
What does this PR do?

Type of change: Model support

Overview:

Usage

Testing

Able to quantize the model and generate output.

Before your PR is "Ready for review"