embeddings: fix extraction of CLS pooling results #14927

iamlemec · 2025-07-28T19:46:20Z

This fixes #14848. The code that computes the embedding output positions in llm_graph_input_cls::set_input was combining the CLS and RANK cases, while it makes more sense to leave RANK on its own and combine CLS and LAST.

ggerganov

Have you verified the results? I have some doubts, because #14848 states that #14217 caused the difference, and before #14217 we were also handling CLS and RANK together, as we do now. So if this fix is correct, I don't see why the results before #14217 would have been considered OK.

iamlemec · 2025-07-29T14:24:09Z

Tried a few different models with various pooling types and they match up to the pre-#14217 numbers. I think the crux of it is that in the CLS/RANK path

if (pos == 0) {
    data[seq_id] = s*n_seq_tokens + i;
}

got replaced with

data[seq_idx] = i;

which effectively turns CLS/RANK into LAST given how tokens are usually ordered. The behavior with this patch is that CLS is now effectively FIRST. This is slightly different from pos=0, but I don't believe this would ever make a difference, as any model using CLS is going to be non-causal and be confined to single-batch processing.
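
To make the difference concrete, here is a minimal standalone sketch of the three selection rules discussed above. It is not the actual llama.cpp code: `token_info`, `pick`, and `select_rows` are hypothetical names, and the batch is assumed to be a flat list of tokens in which each token belongs to exactly one sequence.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one batch token: its sequence and its position in that sequence.
struct token_info {
    int32_t seq_id;
    int32_t pos;
};

// pos_zero: keep the row whose position is 0 (the pre-#14217 check)
// first:    keep the first row seen for each sequence
// last:     overwrite on every token, so the last row of each sequence wins
enum class pick { pos_zero, first, last };

// Returns, for each sequence, the batch row to read the pooled embedding from.
// -1 marks a sequence for which no row was selected.
std::vector<int32_t> select_rows(const std::vector<token_info> & tokens, int32_t n_seqs, pick mode) {
    std::vector<int32_t> rows(n_seqs, -1);
    for (int32_t i = 0; i < (int32_t) tokens.size(); ++i) {
        const int32_t s = tokens[i].seq_id;
        assert(s >= 0 && s < n_seqs);
        const bool take =
            (mode == pick::last) ||
            (mode == pick::first    && rows[s] < 0) ||
            (mode == pick::pos_zero && tokens[i].pos == 0);
        if (take) {
            rows[s] = i;
        }
    }
    return rows;
}
```

With tokens in the usual order (position 0 first within each sequence), `pick::first` and `pick::pos_zero` select the same rows, while the unconditional assignment behaves like `pick::last`.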

After looking into RANK a bit more, I can see that it's basically the same as CLS in terms of which input positions it's looking at. In that case, it would also make sense to merge RANK into the CLS/LAST path as well, right? The other option would be to go back to the old structure and just reintroduce a pos=0 check.

ggerganov · 2025-07-29T15:55:54Z

> After looking into RANK a bit more, I can see that it's basically the same as CLS in terms of which input positions it's looking at. In that case, it would also make sense to merge RANK into the CLS/LAST path as well, right?

I think I remember that at some point I concluded that RANK pooling is redundant, so probably you are right. If you think it's ok to merge them, then go ahead.

iamlemec · 2025-07-29T21:51:22Z

Great! Merged the RANK case into the CLS path now too.

embeddings: fix extraction of CLS pooling results

c7968a4

ggerganov approved these changes Jul 29, 2025


merge RANK pooling into CLS case for inputs

cb9ea7e

ggerganov merged commit a118d80 into ggml-org:master Jul 30, 2025
47 checks passed
