@ajrasane
Contributor

What does this PR do?

Type of change:
Model support

Overview:

Usage

python hf_ptq.py \
    --pyt_ckpt_path Qwen/Qwen3-Omni-30B-A3B-Thinking \
    --qformat fp8 \
    --calib_size 512 \
    --export_path ./qwen3_omni_30b_fp8 \
    --trust_remote_code \
    --batch_size 2 \
    --calib_size 2 \
    --attn_implementation flash_attention_2
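
Note that the command above passes `--calib_size` twice (512, then 2). Which value takes effect depends on how `hf_ptq.py` declares the flag, which is not shown in this conversation. As a minimal sketch, assuming a plain `argparse` option with the default `store` action (an assumption about the script, not something confirmed here), the last occurrence wins:

```python
import argparse

# Hypothetical parser: only the flag name is taken from the command above;
# the type and default are assumptions made for illustration.
parser = argparse.ArgumentParser()
parser.add_argument("--calib_size", type=str, default="512")

# Same flag given twice, in the same order as in the usage command.
args = parser.parse_args(["--calib_size", "512", "--calib_size", "2"])
print(args.calib_size)  # prints 2
```

If that assumption holds, the run would calibrate with 2 samples rather than 512.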

Testing

Able to quantize model and generate output

example outputs before ptq: ['<think>\nGot it, which states are we talking about? Wait, the user didn\'t list any states. Oh, maybe the problem is missing the list? Wait, no, maybe this is a standard question where the options are implied? Wait, no, the user probably forgot to include the options. Wait, but maybe in the original context, there were states listed, but here it\'s cut off. Wait, no, looking back: the user says "Which of these states is farthest north?" but didn\'t provide the "these states" part. Oh, maybe this is a common question where the options are like Maine, Florida, etc. Wait, but maybe the user made a mistake. Wait, no, maybe in the problem, the states are implied by the context. Wait, no, let\'s think: the farthest north state in the US is Alaska, but if it\'s contiguous US, it\'s Minnesota or North Dakota? Wait, no, North Dakota is farther north than Minnesota. Wait, but maybe the options are different. Wait, but the user didn\'t list the states. Wait, maybe this is a trick question where the answer is Alaska, but let\'s check. Wait, no, the user probably forgot to include the options. Wait, but maybe in the original problem, the states are given, but here it\'s missing. Wait, no, maybe the user is referring to a standard set. Wait, let\'s think: common states for such questions: Alaska, Maine, North Dakota, Minnesota, etc. Alaska is the northernmost state, with its northernmost point at 71°23\' N latitude. The contiguous US has North Dakota as the northernmost, but Alaska is a state. So if Alaska is an option, it\'s Alaska. But since the user didn\'t list the states, maybe they expect Alaska. Wait, but maybe the question is from a specific set. Wait, no, the user probably made a mistake, but in standard US geography, the northernmost state is Alaska. Let\'s confirm: Alaska\'s northernmost point is Cape Prince of Wales at 71°23\' N, while the contiguous US has North Dakota at about 49° N, so Alaska is way farther north. So if Alaska is one of the options, it\'s Alaska. Since the user didn\'t list the states, but this is a common question, the answer is Alaska.\n</think>\n\nTo determine which state is farthest north, we analyze the **geographic latitude** of U.S. states. Among all U.S. states, **Alaska** is the northernmost. Its northernmost point (Cape Prince of Wales) lies at approximately **71°23′ N latitude**, far surpassing the northern limits of contiguous states like North Dakota (≈49° N). Even if the question refers to contiguous states only, North Dakota is the northernmost, but since Alaska is a state and the question does not specify "contiguous," **Alaska** is the correct answer.  \n\n**Answer:** Alaska']
--------
example outputs after ptq: ['<think>\nGot it, ```json\n{\n  "question": "Which of these states is farthest north?",\n  "answer": "Alaska"\n}\n```\n</think>\n\nAlaska']

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes
  • Did you write any new necessary tests?: No
  • Did you add or update any necessary documentation?: No
  • Did you update Changelog?: No

@ajrasane ajrasane self-assigned this Dec 11, 2025
@copy-pr-bot

copy-pr-bot bot commented Dec 11, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

"qwen3omni only supports one dataset for calibration, can extend this in the future"
)
assert processor is not None, "The processor must be set for qwen3omni model."
dataset_name = args.dataset[0] if args.dataset else "scienceqa"
Collaborator

do we still recommend scienceqa as the default calib dataset?

Contributor Author

Changed this to cnn_dailymail
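
For reference, the single-dataset restriction and the new default discussed in this thread could be folded into one small helper. This is only a sketch: `pick_calib_dataset` and `DEFAULT_CALIB_DATASET` are hypothetical names that do not appear in the PR, and raising `ValueError` instead of using `assert` is a choice made here for illustration.

```python
from typing import Optional, Sequence

# Default named in the reply above; the constant name itself is hypothetical.
DEFAULT_CALIB_DATASET = "cnn_dailymail"


def pick_calib_dataset(datasets: Optional[Sequence[str]]) -> str:
    """Return the single calibration dataset name, or the default if none is given."""
    if datasets and len(datasets) > 1:
        # Message text mirrors the assertion message quoted in the diff.
        raise ValueError("qwen3omni only supports one dataset for calibration")
    return datasets[0] if datasets else DEFAULT_CALIB_DATASET


print(pick_calib_dataset(None))           # prints cnn_dailymail
print(pick_calib_dataset(["scienceqa"]))  # prints scienceqa
```

An explicit exception survives `python -O`, which strips `assert` statements, so the check would still run in optimized mode.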

        num_samples=args.calib_size[0],
    )
elif model_type == "qwen3omni":
    assert len(args.calib_size) == 1, (
Collaborator

For this part, I think we may want to host it in a model-specific Python file or module, e.g. llm_ptq/models/qwen3omni.py.

@shengliangxu WDYT?

Contributor

We do not need to do it for now, I'll come up with a full design doc and then we can convert the whole repo afterwards. Even if we separate things out now, we may still refactor these anyway.

# if args.verbose:
# mtq.print_quant_summary(full_model)

import contextlib
Collaborator

move to the top

Contributor Author

Done

torch.cuda.empty_cache()

free_mem_before, max_allocated_before = _get_free_gpu_mem()
is_enc_dec = model_type_is_enc_dec(model)
Collaborator

can we merge this into _model_requires_generate?

self.tokenizer = tokenizer
# Handle invalid device values that can come from multi-GPU models with device_map="auto"
if device is None or str(device) in ("auto", "meta", "cpu"):
    device = "cuda"
Collaborator

maybe print a warning?

And does it mean if "cuda" not in str(device): device="cuda"?
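
A rough sketch of what the warning could look like, and of how the two conditions differ. `resolve_device` and its `strict` parameter are hypothetical names, not part of the PR; `strict=False` reproduces the check in the diff, and `strict=True` is the broader `"cuda" not in str(device)` reading from the question above.

```python
import warnings


def resolve_device(device, strict: bool = False):
    """Hypothetical helper: fall back to "cuda" for unusable device values.

    strict=False mirrors the check in the diff (None, "auto", "meta", "cpu").
    strict=True falls back whenever "cuda" is not in str(device).
    """
    name = "None" if device is None else str(device)
    if strict:
        needs_fallback = "cuda" not in name
    else:
        needs_fallback = device is None or name in ("auto", "meta", "cpu")
    if needs_fallback:
        warnings.warn(f"Device {name!r} is not usable here, falling back to 'cuda'.")
        return "cuda"
    return device  # returned unchanged, e.g. "cuda:1"


print(resolve_device("auto"))    # prints cuda (and emits a warning)
print(resolve_device("cuda:1"))  # prints cuda:1
```

The two variants disagree on any value outside the listed set that does not contain "cuda": the check in the diff passes it through, while the broader check replaces it.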

model_is_already_quantized = is_quantized(model)

model_type = get_model_type(model)
if model_type == "qwen3omni" and os.environ.get("DISABLE_TALKER", "0") == "1":
Collaborator

I think we probably need to find a better way for configurations like this
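
One possible direction, sketched under assumptions: expose the switch as a command-line flag and keep the environment variable only as its default. The `--disable_talker` flag and `build_parser` are hypothetical and not part of the PR; only the `DISABLE_TALKER` variable and its "1" convention come from the diff.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser exposing the DISABLE_TALKER switch as a flag."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--disable_talker",
        action="store_true",
        # The environment variable still works, but only as the default.
        default=os.environ.get("DISABLE_TALKER", "0") == "1",
        help="Mirrors the DISABLE_TALKER environment variable (hypothetical flag).",
    )
    return parser


args = build_parser().parse_args([])
print(args.disable_talker)
```

With this shape the setting shows up in `--help` and in the parsed arguments, so downstream code can check `args.disable_talker` instead of reading `os.environ` directly.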

ajrasane and others added 2 commits December 17, 2025 03:37
Comment out import and registration of Qwen3OmniMoe classes.

Signed-off-by: Chenjie Luo <[email protected]>
@ajrasane ajrasane force-pushed the ajrasane/qwen3-omni-30B branch from 8410674 to 7f80e6f Compare December 17, 2025 08:27
@ajrasane ajrasane force-pushed the ajrasane/qwen3-omni-30B branch from 7f80e6f to 0c4b38f Compare December 17, 2025 08:32