Linear retriever top level option for normalizer #129693

Draft
wants to merge 7 commits into
base: main
from

Conversation

mridula-s109
Contributor

@mridula-s109 mridula-s109 commented Jun 19, 2025

Summary

This PR extends the rank-rrf plugin’s Linear Retriever with:

  • A top-level normalizer parameter in the simplified syntax.

Key Changes

Parsing & Builder

  • LinearRetrieverBuilder
    • Added boolean explicitNormalizer to distinguish default vs. user-supplied normalizers.
    • Parser always sets a non-null normalizer (defaults to identity) and records explicitness.
    • Validation rules
      • Require an explicit normalizer when query is provided; reject the request if none is supplied.
      • Forbid a non-identity top-level normalizer when custom retrievers are present.
    • Legacy constructors retained → full BWC.
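
The parsing and validation rules listed above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `LinearNormalizerSketch`, `ParsedNormalizer`, `parse`, `validate`, and the `"identity"` name are hypothetical stand-ins, and normalizers and retrievers are modelled as plain strings.

```java
import java.util.List;

// Hypothetical stand-ins for illustration; these are not the Elasticsearch classes.
class LinearNormalizerSketch {

    // Placeholder name for the identity (no-op) normalizer; the real constant may differ.
    static final String IDENTITY = "identity";

    /** Effective normalizer name plus whether the user supplied it. */
    static final class ParsedNormalizer {
        final String name;
        final boolean explicit;

        ParsedNormalizer(String name, boolean explicit) {
            this.name = name;
            this.explicit = explicit;
        }
    }

    /** Never yields a null normalizer: a missing value falls back to identity. */
    static ParsedNormalizer parse(String raw) {
        boolean explicit = raw != null;
        return new ParsedNormalizer(explicit ? raw : IDENTITY, explicit);
    }

    /** Applies the two validation rules from the summary. */
    static void validate(String query, List<String> retrievers, ParsedNormalizer normalizer) {
        // Rule 1: the simplified syntax (query present) needs a user-supplied normalizer.
        if (query != null && !normalizer.explicit) {
            throw new IllegalArgumentException("[normalizer] is required when [query] is set");
        }
        // Rule 2: custom retrievers may not be combined with a non-identity top-level normalizer.
        if (!retrievers.isEmpty() && !IDENTITY.equals(normalizer.name)) {
            throw new IllegalArgumentException("top-level [normalizer] cannot be combined with custom retrievers");
        }
    }
}
```

Tracking explicitness separately from the value is what lets the first rule tell a defaulted identity normalizer apart from one the user actually wrote.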

@mridula-s109 mridula-s109 self-assigned this Jun 19, 2025
@elasticsearchmachine elasticsearchmachine added the needs:triage and v9.1.0 labels Jun 19, 2025
@mridula-s109 mridula-s109 changed the title from "Search 1027 linear retriever top level option for normalizer" to "Linear retriever top level option for normalizer" Jun 19, 2025
@mridula-s109 mridula-s109 requested review from a team as code owners June 25, 2025 08:17
@mridula-s109 mridula-s109 force-pushed the SEARCH-1027-linear-retriever-top-level-option-for-normalizer branch from 6750bc8 to 4df0e99 June 25, 2025 10:28
@mridula-s109 mridula-s109 reopened this Jun 25, 2025
@mridula-s109 mridula-s109 added the >enhancement, :Search/Search, Team:Search, auto-backport, and v8.19.0 labels Jun 25, 2025
@elasticsearchmachine elasticsearchmachine removed the needs:triage label Jun 25, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@elasticsearchmachine
Collaborator

Hi @mridula-s109, I've created a changelog YAML for you.

Contributor

github-actions bot commented Jun 26, 2025

🔍 Preview links for changed docs:

🔔 The preview site may take up to 3 minutes to finish building. These links will become live once it completes.

@mridula-s109 mridula-s109 requested a review from a team June 26, 2025 12:12
@mridula-s109 mridula-s109 marked this pull request as ready for review June 26, 2025 12:12
@mridula-s109 mridula-s109 removed request for a team June 26, 2025 12:32
Member

@kderusso kderusso left a comment

Hi @mridula-s109, thanks for this work. I think we need to keep backwards compatibility for requests that do not set the top-level normalizer. Let's generate some examples of how this should work. YAML tests, which I see we haven't added yet, are a good way to go through this exercise. Thanks!

@@ -0,0 +1,5 @@
pr: 129693
summary: Linear retriever top level option for normalizer
Member

Nitpick:

Suggested change
summary: Linear retriever top level option for normalizer
summary: Add top level normalizer for linear retriever

@@ -290,6 +296,12 @@ Each entry specifies the following parameters:

* `l2_norm` : An `L2ScoreNormalizer` that normalizes scores using the L2 norm of the score values.

::::{note}
Since 9.0 the `normalizer` field must be provided at the top level of the
Member

This describes a breaking change, and I don't think the version number is correct here. We need to discuss this; we need to have backward compatibility.

- ScoreNormalizer normalizer = args[3] == null ? null : ScoreNormalizer.valueOf((String) args[3]);
+ String normalizerStr = (String) args[3];
+ boolean explicitNormalizer = normalizerStr != null;
+ ScoreNormalizer normalizer = normalizerStr == null ? IdentityScoreNormalizer.INSTANCE : ScoreNormalizer.valueOf(normalizerStr);
Member

Why did we change the logic of how to compute default normalizers here?

}

@Override
public XContentBuilder toXContent(XContentBuilder builder, Params params) throws IOException {
builder.field(RETRIEVER_FIELD.getPreferredName(), retriever);
builder.field(WEIGHT_FIELD.getPreferredName(), weight);
builder.field(NORMALIZER_FIELD.getPreferredName(), normalizer.getName());
if (normalizer != null) {
Member

Why did we add this null check?

NORMALIZER_FIELD,
ObjectParser.ValueType.STRING
);
PARSER.declareString(optionalConstructorArg(), NORMALIZER_FIELD);
Member

Why did we change the serialization here?

Contributor

@Mikep86 Mikep86 left a comment

First off, introducing a breaking change with this is not ok. I think there's a simpler way to do this with less invasive changes and no breaking changes. I see some complications with duplicate serialization of normalizer in the sub-retrievers, but we can deal with that secondarily.

@mridula-s109 mridula-s109 marked this pull request as draft June 26, 2025 12:53
@mridula-s109
Contributor Author

mridula-s109 commented Jun 26, 2025

@Mikep86 @kderusso Thanks for the comments. I will take another look at the implementation, so I am converting this back to draft until then.

Contributor

github-actions bot commented Jul 3, 2025

🔍 Preview links for changed docs

More links …

@mridula-s109 mridula-s109 force-pushed the SEARCH-1027-linear-retriever-top-level-option-for-normalizer branch from 07965d5 to d4b1ced July 3, 2025 13:41
Labels
auto-backport, >enhancement, :Search/Search, Team:Search, v8.19.0, v9.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants