Skip to content

embeddings: fix extraction of CLS pooling results #14927

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jul 30, 2025

Conversation

iamlemec
Copy link
Collaborator

This fixes #14848. The code that computes the embedding output positions in llm_graph_input_cls::set_input was combining the CLS and RANK cases, while it makes more sense to leave RANK on its own and combine CLS and LAST.

Copy link
Member

@ggerganov ggerganov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you verified the results? I am a little doubt because in #14848 it is stated that #14217 caused the difference and before it, we were also handling CLS and RANK like we do now. So if this fix here is correct, I don't see why the results before #14217 would have been considered OK.

@iamlemec
Copy link
Collaborator Author

Tried a few different models with various pooling types and they match up to the pre-#14217 numbers. I think the crux of it is that in the CLS/RANK path

if (pos == 0) {
    data[seq_id] = s*n_seq_tokens + i;
}

got replaced with

data[seq_idx] = i;

which effectively turns CLS/RANK into LAST given how tokens are usually ordered. The behavior with this patch is that CLS is now effectively FIRST. This is slightly different from pos=0, but I don't believe this would ever make a difference, as any model using CLS is going to be non-causal and be confined to single-batch processing.

After looking into RANK a bit more, I can see that it's basically the same as CLS in terms of what input positions its looking at. In that case, it would also make sense to merge RANK into the CLS/LAST path as well, right? The other option would be to go back to the old structure and just reintroduce a pos=0 check.

@ggerganov
Copy link
Member

After looking into RANK a bit more, I can see that it's basically the same as CLS in terms of what input positions its looking at. In that case, it would also make sense to merge RANK into the CLS/LAST path as well, right?

I think I remember that at some point I concluded that RANK pooling is redundant, so probably you are right. If you think it's ok to merge them, then go ahead.

@iamlemec
Copy link
Collaborator Author

Great! Merged the RANK case into the CLS path now too.

@ggerganov ggerganov merged commit a118d80 into ggml-org:master Jul 30, 2025
47 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Eval bug: Embedding output differs significantly between b4712 and b4713
2 participants