Support enrich ANY mode in cross clusters query #104840

Merged
merged 15 commits into from
Jan 30, 2024

Conversation

dnhatn
Member

@dnhatn dnhatn commented Jan 27, 2024

Note to reviewers: I apologize for the size of this PR, but I have tried to minimize its scope. You can skip most of the changes except for the EnrichPolicyResolver class, which contains the main changes.

This PR enables the resolution of enrich policies from multiple clusters, involving the following steps:

  1. Calculate the policies that need to be resolved for each cluster.
  2. Send a LookupRequest to each cluster to resolve policies. Internally, a remote cluster handles the lookup in two steps:
    • Ensure the caller has permission to access the enrich policies (security tests will be added later).
    • For each enrich policy found, use IndexResolver to resolve the mappings of the concrete enrich index.
  3. For each unresolved policy, combine the lookup results to compute the actual enrich policy and mappings, depending on the enrich mode.

I considered alternative approaches, but this one requires at most one cross-cluster call per cluster.
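
For readers who want a concrete picture of the three steps, here is a minimal, self-contained sketch. It is not the code in EnrichPolicyResolver: every name in it (`EnrichResolutionSketch`, `Cluster`, `FakeCluster`, `ResolvedPolicy`, `policiesPerCluster`, `lookupAll`, `combineAny`) is hypothetical, the permission check is omitted, and the merge rule for ANY mode (the policy must be found on every cluster with the same match field, and the mappings are unioned) is an assumption made for illustration.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative sketch only; none of these names come from the Elasticsearch code base.
class EnrichResolutionSketch {

    // Hypothetical result of resolving one policy on one cluster.
    static final class ResolvedPolicy {
        final String matchField;
        final Map<String, String> mapping; // field name -> data type

        ResolvedPolicy(String matchField, Map<String, String> mapping) {
            this.matchField = matchField;
            this.mapping = mapping;
        }
    }

    // Hypothetical stand-in for a cluster that answers a single lookup request.
    interface Cluster {
        // Returns only the requested policies that exist on this cluster.
        Map<String, ResolvedPolicy> lookup(Set<String> policyNames);
    }

    // In-memory cluster that counts how many lookup calls it received.
    static final class FakeCluster implements Cluster {
        final Map<String, ResolvedPolicy> policies;
        int calls = 0;

        FakeCluster(Map<String, ResolvedPolicy> policies) {
            this.policies = policies;
        }

        @Override
        public Map<String, ResolvedPolicy> lookup(Set<String> policyNames) {
            calls++;
            Map<String, ResolvedPolicy> found = new TreeMap<>();
            for (String name : policyNames) {
                if (policies.containsKey(name)) {
                    found.put(name, policies.get(name));
                }
            }
            return found;
        }
    }

    // Step 1: decide which policies each cluster must resolve.
    // Assumption: in ANY mode every cluster is asked for every policy.
    static Map<String, Set<String>> policiesPerCluster(Set<String> clusters, Set<String> policies) {
        Map<String, Set<String>> requests = new TreeMap<>();
        for (String cluster : clusters) {
            requests.put(cluster, new TreeSet<>(policies));
        }
        return requests;
    }

    // Step 2: exactly one lookup call per cluster, regardless of the number of policies.
    static Map<String, Map<String, ResolvedPolicy>> lookupAll(
            Map<String, Set<String>> requests, Map<String, ? extends Cluster> clusters) {
        Map<String, Map<String, ResolvedPolicy>> responses = new TreeMap<>();
        for (var entry : requests.entrySet()) {
            responses.put(entry.getKey(), clusters.get(entry.getKey()).lookup(entry.getValue()));
        }
        return responses;
    }

    // Step 3: combine the per-cluster results for one policy (assumed ANY-mode rule).
    static ResolvedPolicy combineAny(String policy, Map<String, Map<String, ResolvedPolicy>> responses) {
        ResolvedPolicy merged = null;
        for (var entry : responses.entrySet()) {
            ResolvedPolicy found = entry.getValue().get(policy);
            if (found == null) {
                // Hypothetical message, not the one Elasticsearch emits.
                throw new IllegalArgumentException(
                    "enrich policy [" + policy + "] not found on cluster [" + entry.getKey() + "]");
            }
            if (merged == null) {
                merged = new ResolvedPolicy(found.matchField, new TreeMap<>(found.mapping));
            } else if (!merged.matchField.equals(found.matchField)) {
                throw new IllegalArgumentException(
                    "enrich policy [" + policy + "] has different match fields across clusters");
            } else {
                merged.mapping.putAll(found.mapping); // union of the enrich index mappings
            }
        }
        if (merged == null) {
            throw new IllegalArgumentException("no cluster responses for policy [" + policy + "]");
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Cluster> clusters = new TreeMap<>();
        clusters.put("local", new FakeCluster(Map.of("hosts", new ResolvedPolicy("ip", Map.of("os", "keyword")))));
        clusters.put("remote", new FakeCluster(Map.of("hosts", new ResolvedPolicy("ip", Map.of("vendor", "keyword")))));

        Map<String, Set<String>> requests = policiesPerCluster(clusters.keySet(), Set.of("hosts"));
        ResolvedPolicy hosts = combineAny("hosts", lookupAll(requests, clusters));
        System.out.println(hosts.matchField + " " + hosts.mapping); // prints "ip {os=keyword, vendor=keyword}"
    }
}
```

Batching all policy names into one request per cluster is what keeps the number of cross-cluster calls bounded by the number of clusters rather than by the number of ENRICH commands in the query.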

With this capability, the ENRICH mode ANY now works across clusters. I have enabled enrich in MultiClusterSpecIT and added small IT tests. Additional tests will be added in subsequent PRs.

I plan to have several follow-ups to fully support enrich in cross-cluster queries.

@dnhatn dnhatn force-pushed the enrich-any-mode branch 3 times, most recently from 909753d to bbf39a7 January 28, 2024 07:44
@elasticsearchmachine
Collaborator

Hi @dnhatn, I've created a changelog YAML for you.

@dnhatn dnhatn marked this pull request as ready for review January 29, 2024 04:34
@elasticsearchmachine elasticsearchmachine added the Team:Analytics label (Meta label for analytical engine team (ESQL/Aggs/Geo)) Jan 29, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

Contributor

@astefan astefan left a comment

Looking good. Left some minor comments.

@dnhatn dnhatn requested a review from astefan January 29, 2024 17:06
@dnhatn
Member Author

dnhatn commented Jan 29, 2024

@astefan Thanks for the detailed review. It's ready again.

Contributor

@astefan astefan left a comment

LGTM
Left a few very small comments.

Member

@costin costin left a comment

Minor comments from my side - thanks for the large suite of tests!

@@ -148,7 +148,7 @@ public void testNonExistentEnrichPolicy() throws IOException {
);
assertThat(
EntityUtils.toString(re.getResponse().getEntity()),
- containsString("unresolved enrich policy [countris], did you mean [countries]?")
+ containsString("enrich policy [countris] doesn't exist, did you mean [countries]?")
Member

As there might be a security setting (so the policy can exist but not be visible), it might be better to use "cannot find enrich policy [countries], did you mean [...]".

Member Author

It can happen in the future, but the current security model is all or nothing for enrich policies. I will look into it later, as I am working on the coordinator mode based on this PR.

@Before
public void setupHostsEnrich() {
// the hosts policy are identical on every node
Map<String, String> allHosts = Map.of(
Member

Not sure if there's value in externalizing the content to a .properties file, which can simply be read afterwards.

@@ -23,11 +23,11 @@ public static class PreAnalysis {
public static final PreAnalysis EMPTY = new PreAnalysis(emptyList(), emptyList());

public final List<TableInfo> indices;
- public final List<String> policyNames;
+ public final List<Enrich> unresolvedEnriches;
Member

Everything is unresolved in the preAnalyzer - I suggest using policies or enrichPolicies instead, or enriches.

Member Author

++, I have renamed this to enriches in 11d4d7e

Comment on lines +178 to +179
if (PlannerUtils.hasUnsupportedEnrich(physicalPlan)) {
listener.onFailure(new IllegalArgumentException("Enrich modes COORDINATOR and REMOTE are not supported yet"));
Member

This is better handled in the Verifier - no need to hold the PR for it though.

Member Author

This will be removed shortly.

@dnhatn
Member Author

dnhatn commented Jan 30, 2024

@astefan @nik9000 @costin Thanks for the reviews.

Labels
:Analytics/ES|QL (AKA ESQL), >enhancement, Team:Analytics (Meta label for analytical engine team (ESQL/Aggs/Geo)), v8.13.0
5 participants