Closed
Description
Java API client version
8.14.3
Java version
17
Elasticsearch Version
8.14.3
Problem description
Hello! I'm running into a MissingRequiredPropertyException
when trying to create indices that use ngram/edge ngram tokenizers with the Java client.
Minimal code repro - settings are the same as the docs but with token_chars omitted:
String settingsJson =
"{\"settings\": {\"analysis\": {\"analyzer\": {\"my_analyzer\": {\"tokenizer\": \"my_tokenizer\"}},\"tokenizer\": {\"my_tokenizer\": {\"type\": \"ngram\",\"min_gram\": 3,\"max_gram\": 3}}}}}";
IndexSettings settings = IndexSettings.of(i -> i.withJson(new StringReader(settingsJson)));
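As a workaround, writing the documented defaults out explicitly (the ngram tokenizer docs list min_gram: 1, max_gram: 2, and token_chars: [] as defaults) gives the deserializer every property it insists on. A minimal sketch of that settings string as plain Java (my_tokenizer / my_analyzer are the names from the repro above):

```java
public class NgramDefaultsWorkaround {
    public static void main(String[] args) {
        // Same settings as the repro, but with the documented server defaults
        // (min_gram: 1, max_gram: 2, token_chars: []) spelled out explicitly
        // so NGramTokenizer's deserializer finds all the properties it requires.
        String settingsJson =
            "{\"settings\": {\"analysis\": {"
            + "\"analyzer\": {\"my_analyzer\": {\"tokenizer\": \"my_tokenizer\"}},"
            + "\"tokenizer\": {\"my_tokenizer\": {"
            + "\"type\": \"ngram\",\"min_gram\": 1,\"max_gram\": 2,\"token_chars\": []}}}}}";
        System.out.println(settingsJson);
        // This string can then be passed to
        // IndexSettings.of(i -> i.withJson(new StringReader(settingsJson)))
        // exactly as in the repro, without triggering the exception.
    }
}
```

Whether an explicit empty token_chars array behaves identically to omitting it is an assumption based on the documented default (keep all characters).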
throws
Exception in thread "main" co.elastic.clients.json.JsonpMappingException: Error deserializing co.elastic.clients.elasticsearch._types.analysis.TokenizerDefinition: co.elastic.clients.util.MissingRequiredPropertyException: Missing required property 'NGramTokenizer.tokenChars' (JSON path: settings.analysis.tokenizer.my_tokenizer) (line no=1, column no=162, offset=161)
at co.elastic.clients.json.JsonpMappingException.from0(JsonpMappingException.java:134)
at co.elastic.clients.json.JsonpMappingException.from(JsonpMappingException.java:121)
...
at co.elastic.clients.elasticsearch.indices.IndexSettings.of(IndexSettings.java:308)
at Scratch.main(Scratch.java:11)
Caused by: co.elastic.clients.util.MissingRequiredPropertyException: Missing required property 'NGramTokenizer.tokenChars'
at co.elastic.clients.util.ApiTypeHelper.requireNonNull(ApiTypeHelper.java:76)
at co.elastic.clients.util.ApiTypeHelper.unmodifiableRequired(ApiTypeHelper.java:141)
at co.elastic.clients.elasticsearch._types.analysis.NGramTokenizer.<init>(NGramTokenizer.java:79)
A similar example throws for a missing maxGram when min_gram/max_gram/token_chars are all omitted:
String settingsJson =
"{\"settings\": {\"analysis\": {\"analyzer\": {\"my_analyzer\": {\"tokenizer\": \"my_tokenizer\"}},\"tokenizer\": {\"my_tokenizer\": {\"type\": \"ngram\"}}}}}";
IndexSettings settings = IndexSettings.of(i -> i.withJson(new StringReader(settingsJson)));
Exception in thread "main" co.elastic.clients.json.JsonpMappingException: Error deserializing co.elastic.clients.elasticsearch._types.analysis.TokenizerDefinition: co.elastic.clients.util.MissingRequiredPropertyException: Missing required property 'NGramTokenizer.maxGram' (JSON path: settings.analysis.tokenizer.my_tokenizer) (line no=1, column no=134, offset=133)
at co.elastic.clients.json.JsonpMappingException.from0(JsonpMappingException.java:134)
at co.elastic.clients.json.JsonpMappingException.from(JsonpMappingException.java:121)
...
at co.elastic.clients.elasticsearch.indices.IndexSettings.of(IndexSettings.java:308)
at Scratch.main(Scratch.java:11)
Caused by: co.elastic.clients.util.MissingRequiredPropertyException: Missing required property 'NGramTokenizer.maxGram'
at co.elastic.clients.util.ApiTypeHelper.requireNonNull(ApiTypeHelper.java:76)
at co.elastic.clients.elasticsearch._types.analysis.NGramTokenizer.<init>(NGramTokenizer.java:77)
This seems to be because the spec used to generate the Java client marks min_gram, max_gram, and token_chars as required for the ngram/edge ngram tokenizers, even though the docs list defaults for all three (defaults that are also supported by the server code and Lucene).
I can also confirm that creating an index via curl without specifying min_gram / max_gram / token_chars works.
kubectl exec es8-data-0 -- curl -XPUT "https://localhost:9200/test-index" -H "Content-Type: application/json" -d '{"settings": {"analysis": {"analyzer": {"my_analyzer": {"tokenizer": "my_tokenizer"}},"tokenizer": {"my_tokenizer": {"type": "ngram"}}}}}'
returns
{"acknowledged":true,"shards_acknowledged":true,"index":"test-index"}
The same is true for "type": "edge_ngram" as well.