With the Lucene backend, we used to allow a given path to be used as both a composite object and a concrete value (string, long, etc.). For instance, if we have something like this:
myComposite.leaf=foo
myComposite=bar
Then myComposite may hold both a concrete value ("bar") and sub-fields (myComposite.leaf). One concrete case where this can happen is a property annotated with both @Field and @IndexedEmbedded:
@IndexedEmbedded(prefix = "myComposite.")
@Field(name = "myComposite")
private MyComposite myComposite;
This works fine with Lucene, because there is no such thing as a composite object: the data is flattened and stored as a big bag of key/value pairs. But with Elasticsearch, this is not the case, at least not when interacting through the APIs: composite objects have dedicated types (the object and nested datatypes), and those do not allow storing concrete values (only sub-fields). Currently though, we do not check for such cases, and when an error happens, it is rather cryptic because it comes from Elasticsearch and does not carry much context. We should make sure that we throw clear exceptions:
- when generating the mapping, we should throw an exception with a clear message when an ES property has multiple types (object and something else). Currently, the last encountered type wins, and we may end up with properties with the long datatype (for instance) that also have their own properties, which will make ES cry.
- when mapping generation is disabled, we should make sure to throw an exception anyway. We may do this at runtime (when using conflicting fields in projections, sorts or queries) or when bootstrapping, whichever is easier.
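As a rough illustration of the kind of check described above, here is a minimal sketch that detects paths used both as a concrete field and as a parent of other fields. The class FieldPathConflictChecker, its method names and the use of IllegalStateException are all hypothetical and are not part of Hibernate Search; the sketch also assumes that the full set of concrete field paths is available as plain strings at the time of the check.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, for illustration only: not an existing Hibernate Search class.
final class FieldPathConflictChecker {

    private FieldPathConflictChecker() {
    }

    /**
     * Returns, in sorted order, every path that is declared as a concrete field
     * while also being the parent of at least one other declared field.
     */
    static List<String> findConflicts(Collection<String> concreteFieldPaths) {
        Set<String> paths = new TreeSet<>(concreteFieldPaths);
        List<String> conflicts = new ArrayList<>();
        for (String path : paths) {
            // The trailing dot avoids false positives such as "myCompositeX".
            String prefix = path + ".";
            for (String other : paths) {
                if (other.startsWith(prefix)) {
                    conflicts.add(path);
                    break;
                }
            }
        }
        return conflicts;
    }

    /**
     * Fails with an explicit message instead of letting the backend report the problem.
     * A real implementation would presumably throw the project's own exception type.
     */
    static void checkNoConflicts(Collection<String> concreteFieldPaths) {
        List<String> conflicts = findConflicts(concreteFieldPaths);
        if (!conflicts.isEmpty()) {
            throw new IllegalStateException(
                    "The following paths are used both as a concrete field and as a composite object: "
                            + conflicts);
        }
    }
}
```

With the example from this ticket, calling checkNoConflicts on the paths "myComposite.leaf" and "myComposite" would fail and name myComposite in the message. The same check could in principle run either when generating the mapping or when bootstrapping with mapping generation disabled.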
The limitations have already been documented as part of HSEARCH-2396.