sycl: unified semantics of block offset calculation #14814

Alcpz · 2025-07-22T11:44:28Z

The original intent of these block structs was to avoid calculating block indices inside the mmvq kernels when computing the offsets of the weights and their scales.

This PR refactors the block index calculation in the reordered mmvq kernels for q4_K and q6_K to have the same behavior as q4_0.

It also changes how traits:: is used inside the struct to match the style of Q4_K and Q6_K, which was agreed to be cleaner.
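For readers unfamiliar with the pattern, here is a minimal sketch of the idea: the block struct owns the layout arithmetic, so a kernel only passes a block index and gets byte offsets back. Everything in it is an assumption made for illustration. The struct name, the function names, and the "all quants first, then one fp16 scale per block" layout are hypothetical and are not copied from ggml-sycl.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical reordered layout for a q4_0-like type (illustrative names,
// not the identifiers used in ggml-sycl): the packed 4-bit quants of all
// blocks come first, followed by one 16-bit scale per block.
struct reordered_q4_0_sketch {
    struct traits {
        static constexpr std::size_t qk          = 32;  // weights per block
        static constexpr std::size_t qr          = 2;   // weights packed per byte
        static constexpr std::size_t scale_bytes = 2;   // one fp16 scale per block
    };

    // Byte offset of the packed quants of block `block_index`.
    static constexpr std::size_t quant_offset(std::size_t block_index) {
        return block_index * (traits::qk / traits::qr);
    }

    // Byte offset of the scale of block `block_index`; the scales start
    // after the quants of all `nblocks` blocks.
    static constexpr std::size_t scale_offset(std::size_t block_index, std::size_t nblocks) {
        return nblocks * (traits::qk / traits::qr) + block_index * traits::scale_bytes;
    }
};

// Kernel-side code passes a block index and never repeats the layout arithmetic.
inline std::uint8_t first_quant_byte(const std::uint8_t * base, std::size_t block_index) {
    return base[reordered_q4_0_sketch::quant_offset(block_index)];
}

// Compile-time checks of the arithmetic above.
static_assert(reordered_q4_0_sketch::quant_offset(3) == 48, "3 blocks * 16 bytes");
static_assert(reordered_q4_0_sketch::scale_offset(3, 8) == 134, "8 * 16 + 3 * 2");
```

Keeping both offset functions keyed on the same block index is what lets different quantization types share one calling convention in the kernels, which is the kind of unification this PR describes for q4_K and q6_K.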

ggml/src/ggml-sycl/vecdotq.hpp

s-Nick

Thank you for cleaning it up. LGTM

* origin/master:
  docs : update HOWTO‑add‑model.md for ModelBase and new model classes (ggml-org#14874)
  ggml : remove invalid portPos specifiers from dot files (ggml-org#14838)
  context : restore preemptive sched reset when LLAMA_SET_ROWS=0 (ggml-org#14870)
  mtmd : fix 32-bit narrowing issue in export-lora and mtmd clip (ggml-org#14503)
  rpc : check for null buffers in get/set/copy tensor endpoints (ggml-org#14868)
  sched : fix multiple evaluations of the same graph with pipeline parallelism (ggml-org#14855)
  musa: upgrade musa sdk to rc4.2.0 (ggml-org#14498)
  sync : ggml
  cmake : fix usage issues (ggml/1257)
  ggml-cpu : remove stdlib include from repack.cpp (ggml/1276)
  context : perform output reorder lazily upon access after sync (ggml-org#14853)
  chat : fix kimi-k2 chat template (ggml-org#14852)
  sycl: fixed semantics of block offset calculation (ggml-org#14814)
  llama : fix MiniCPM inference after Granite Four changes (ggml-org#14850)
  docs: add libcurl-dev install hint for Linux distros (ggml-org#14801)
  metal : fix fusion across different encoders (ggml-org#14849)
  sycl: fix undefined variable in work group size check (ggml-org#14843)
  convert : text-only support for GLM-4.1V-9B-Thinking (ggml-org#14823)
  CUDA: fix overflow in FA, tune performance (ggml-org#14840)
  CUDA: fix compilation with GGML_CUDA_F16 (ggml-org#14837)

sycl: fixed semantics of block offset calculation

eca0e47

Alcpz requested review from s-Nick and Rbiessy July 22, 2025 11:44

github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) and SYCL (https://en.wikipedia.org/wiki/SYCL - GPU programming language) labels Jul 22, 2025

Rbiessy reviewed Jul 23, 2025


ggml/src/ggml-sycl/vecdotq.hpp (resolved)

s-Nick approved these changes Jul 24, 2025


AD2605 approved these changes Jul 24, 2025


Rbiessy approved these changes Jul 24, 2025


Alcpz merged commit cb4a63a into ggml-org:master Jul 24, 2025
47 checks passed

taronaeo pushed a commit to taronaeo/llama.cpp-s390x that referenced this pull request Jul 25, 2025

sycl: fixed semantics of block offset calculation (ggml-org#14814)

07a4930
