Skip to content

Conversation

martijnvg
Copy link
Member

Both top level and inline inner hits are now covered by InnerHitBuilder.
Although there are differences between top level and inline inner hits,
they now make use of the same builder logic.

The parsing of top level inner hits slightly changed to be more readable.
Before the nested path or parent/child type had to be specified as encapsuting
json object, now these settings are simple fields. Before this was required
to allow streaming parsing of inner hits without missing contextual information.

Once some issues are fixed with inline inner hits (around multi level hierachy of inner hits),
top level inner hits will be deprecated and removed in the next major version.

Also I think some SearchParseElement implementations can be removed, but I haven't done that yet.

this.innerHitsBuilders = new HashMap<>();
}

public InnerHitsBuilder(Map<String, InnerHitBuilder> innerHitsBuilders) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this can be private.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also check for null here?

@cbuescher
Copy link
Member

@martijnvg this looks great, even though I didn't compare it completely to the way it was before the refactoring, I have the feeling this change simplifies a lot. Left a few minor comments, only thing that was causing some trouble with the tests on my side was the size shadowing issue in SearchSourceBuilder#readFrom(), but will run the whole tests again now with a local fix. Other than that, maybe it would be good to also have a basic test for InnerHitsBuilder, although that one doesn't have much logic of its own.
Its also a quiet a big PR, maybe somebody else involved in the search refactoring should take a second look after this round. I'll ask around.

@martijnvg martijnvg force-pushed the inner_hits_search_refactoring branch from d9081b5 to 1b36a9e Compare March 24, 2016 23:07
@martijnvg
Copy link
Member Author

@cbuescher Thanks for looking at this. I updated the PR based on your comment and fixed the size shadowing issue.

@dimfeld
Copy link
Contributor

dimfeld commented Mar 25, 2016

With top-level inner_hits going away, is there a recommended way to replace the functionality? My usual case here involves using a has_child in the query to look for parents with a particular value in a child, and then a top-level inner_hits to return the top few child objects of the same type, ranked by some value using a function_score.

I'm thinking that something like this will work; is there a better way?

{  query:  {  bool: {
   must: // main query with "real" has_child filter and whatever else goes here
   should: [
      { has_child: { type: childType, query: {function_score query}, inner_hits: {} } 
    ]
} } } 

@martijnvg
Copy link
Member Author

@dimfeld Yes, that should work too. Also I'm thinking of also allowing the query option inside inline inner hits (which would then overwrite the actual inner query), so that you can do something like this:

{  query:  {  bool: {
   must: [
      { has_child: { type: childType, query: {real query}, inner_hits: {query: {function_score query}} } 
    ]
} } } 

This would be better, because it would be more efficient as only one has_child query would need to be specified.

The goal with replacing inner hits is that same things can be done with inline inner hits as top level inner hits.

@martijnvg martijnvg force-pushed the inner_hits_search_refactoring branch from 1b36a9e to d453f70 Compare March 25, 2016 09:01
@dimfeld
Copy link
Contributor

dimfeld commented Mar 25, 2016

Sounds good. Thanks @martijnvg!

@cbuescher
Copy link
Member

@martijnvg thanks, changes look great and all works for me now, but maybe @colings86, @javanna or @s1monw want to take a second look at this before it gets in.

return from;
}

public InnerHitBuilder setFrom(int from) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should add validation check to these setter methods like we do in the query builders? So for this one check that the value is >= 0 and for setName() make sure the name passed in is not null.

@colings86
Copy link
Contributor

@martijnvg I left a few minor comments but otherwise it LGTM

sSource = source.toString();
} catch (Throwable e1) {
// ignore
queryShardContext.reset();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious about why this reset (and the one in the finally block) are done. I saw this pattern in other places, just trying to figure out why this is done. In case its important, maybe worth a comment?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same logic in QueryShardContext#toQuery() where we reset also twice. Personally I think a reset at the end is sufficient, but I kept the logic consistent with this method.

The reason we need to invoke reset at all here is because inner hits modifies the QueryShardContext#nestedScope, so we need to reset at the end.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation, I've been looking at resets in shard contexts recently and this helps my understanding.

@cbuescher
Copy link
Member

@martijnvg I did another quick round, left a few minor suggestions and one question for my own understanding.

@martijnvg martijnvg force-pushed the inner_hits_search_refactoring branch from d453f70 to b4f3512 Compare March 30, 2016 11:05
@martijnvg
Copy link
Member Author

@cbuescher @colings86 I've updated the PR.

@@ -202,7 +200,7 @@ public String getName() {
}

public InnerHitBuilder setName(String name) {
this.name = name;
this.name = Objects.requireNonNull(name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it valid for an inner hits builder to have no name, nestedPath and/or parentChildType set? If not then maybe we should remove the setters for these three fields and have them instead set in the constructor so that it is not possible to construct an InnerHitsBuilder that is not valid?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Name is optional, if the name isn't specified it will default to either nestedPath or parentChildType.
I'll add a constructor that accepts either a nestedPath or a parentChildType.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scratch that last line... the reason I didn't go with constructor arguments was because: the nested path / parent child type is known while parsing the xcontent for top level inner hits. After we have removed top level inner hits we can make this change easily.

(also name is just a label for reading back the inner hits in the response, lets say you have two nested queries for the same path that have inner hits enabled)

@martijnvg martijnvg force-pushed the inner_hits_search_refactoring branch from 238e86c to 7ef852b Compare March 30, 2016 13:01
@cbuescher
Copy link
Member

Thanks, looked at the last changes, LGTM.

@martijnvg martijnvg removed the review label Mar 30, 2016
@martijnvg martijnvg force-pushed the inner_hits_search_refactoring branch from 7ef852b to b25d555 Compare March 30, 2016 13:06
Both top level and inline inner hits are now covered by InnerHitBuilder.
Although there are differences between top level and inline inner hits,
they now make use of the same builder logic.

The parsing of top level inner hits slightly changed to be more readable.
Before the nested path or parent/child type had to be specified as encapsuting
json object, now these settings are simple fields. Before this was required
to allow streaming parsing of inner hits without missing contextual information.

Once some issues are fixed with inline inner hits (around multi level hierachy of inner hits),
top level inner hits will be deprecated and removed in the next major version.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>breaking-java :Search/Search Search-related issues that do not fall into other categories v5.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants