Yoann Rodière (
https://hibernate.atlassian.net/secure/ViewProfile.jspa?accountId=557058%...
) *created* an issue
Hibernate Search (
https://hibernate.atlassian.net/browse/HSEARCH?atlOrigin=eyJpIjoiZTU2NDc1...
) / Bug HSEARCH-3905 (
https://hibernate.atlassian.net/browse/HSEARCH-3905?atlOrigin=eyJpIjoiZTU...
) exists() predicate ignores dynamic fields among children of the targeted object field
with the Lucene backend
Issue Type: Bug
Affects Versions: 6.0.0.Beta7
Assignee: Unassigned
Components: backend-lucene
Created: 30/Apr/2020 00:36 AM
Fix Versions: 6.0.0.Beta-backlog-low-priority
Priority: Major
Reporter: Yoann Rodière (
https://hibernate.atlassian.net/secure/ViewProfile.jspa?accountId=557058%...
)
With the Lucene backend, we have no record of which dynamic fields were added to the
index before the last restart of the application; we only know of the dynamic fields
that the user has mentioned (during indexing or search) since the last restart.
When we build an exists predicate for an object field, we internally build a
boolean query with should clauses, where each clause tests whether a "leaf" field
exists. When there are dynamic fields, we don't know the full list of leaf fields, and
thus we cannot properly build the exists predicate: the dynamic fields are ignored.
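To make the failure mode concrete, here is a minimal plain-Java sketch that emulates the behavior described above without using Lucene at all. The class name, the field paths and the map-based "document" are all hypothetical; only the logic (a disjunction over the leaf fields known since the last restart) follows the description.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

class ExistsOnKnownLeaves {
    // Hypothetical stand-in for the in-memory model: the leaf fields
    // that have been mentioned since the last restart.
    static final Set<String> KNOWN_LEAVES = Set.copyOf(List.of("address.city", "address.zip"));

    // Emulates a boolean query made of "should" clauses: the object field
    // matches if at least one KNOWN leaf below it has a value in the document.
    static boolean exists(String objectField, Map<String, Object> doc) {
        String prefix = objectField + ".";
        for (String leaf : KNOWN_LEAVES) {
            if (leaf.startsWith(prefix) && doc.get(leaf) != null) {
                return true;
            }
        }
        // Leaves that are not in KNOWN_LEAVES (e.g. dynamic fields indexed
        // before the restart) are never tested, so they cannot match.
        return false;
    }
}
```

A document whose only child of "address" is a dynamic field such as "address.extra_note" would not match here, even though the object field clearly has content.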
Solution 1: persisted metamodel
-------------------------------
The most obvious solution would be to persist a list of indexed dynamic fields
*somewhere*, and read that list on bootstrap. In short, introduce a persisted metamodel
for the Lucene backend.
I'm not a fan of this approach because of the added complexity for a single feature.
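For illustration only, a persisted list of dynamic fields could look roughly like the sketch below. Everything here is an assumption: the issue does not say where the list would be stored, so a plain text file (one field path per line) is used, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

class PersistedDynamicFields {
    private final Path store;                        // hypothetical storage location
    private final Set<String> fields = new TreeSet<>();

    // On bootstrap: reload the field paths persisted by previous runs, if any.
    PersistedDynamicFields(Path store) {
        this.store = store;
        try {
            if (Files.exists(store)) {
                fields.addAll(Files.readAllLines(store));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // On indexing: remember a dynamic field path and rewrite the file when it is new.
    void register(String fieldPath) {
        if (!fields.add(fieldPath)) {
            return;
        }
        try {
            Files.write(store, fields);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    Set<String> known() {
        return Collections.unmodifiableSet(fields);
    }
}
```

Even this toy version has to deal with I/O failures and write ordering, and a real one would also need to handle concurrent writers and stale entries, which hints at the added complexity mentioned above.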
Solution 2: relaxed exists() matching rules
-------------------------------------------
A perhaps easier solution would be to relax the exists() matching rules, and declare that
exists() matches an object field if it was non-null when indexing. Basically:
* For nested object fields we would just run a match-all query (Lucene's
MatchAllDocsQuery) within the join: if there is a nested document, the field exists.
* For flattened object fields we would have to store the list of object fields added to a
given document in a specific field, and query that field. I suppose there would be an
overhead at indexing time, but we already do that for other field types; see the uses of
org.hibernate.search.backend.lucene.lowlevel.common.impl.MetadataFields#fieldNamesFieldName().
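The two rules above can be sketched in plain Java, again without Lucene. This is a simplified model under stated assumptions: the metadata field name "__field_names" is a hypothetical placeholder (the actual name would come from MetadataFields#fieldNamesFieldName()), and documents are represented as maps.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class RelaxedExists {
    // Hypothetical name of the per-document metadata field.
    static final String FIELD_NAMES = "__field_names";

    // Indexing side: besides the leaf values, record the path of every
    // flattened object field that was non-null when indexing.
    static Map<String, Object> index(Map<String, Object> leaves, Set<String> nonNullObjectFields) {
        Map<String, Object> doc = new HashMap<>(leaves);
        doc.put(FIELD_NAMES, new HashSet<>(nonNullObjectFields));
        return doc;
    }

    // Flattened object field: exists() only consults the recorded paths,
    // so it no longer needs to know the list of leaf fields.
    static boolean existsFlattened(String objectField, Map<String, Object> doc) {
        Object names = doc.get(FIELD_NAMES);
        return names instanceof Set && ((Set<?>) names).contains(objectField);
    }

    // Nested object field: the equivalent of a match-all query within the
    // join, i.e. the field exists as soon as there is one nested document.
    static boolean existsNested(List<Map<String, Object>> nestedDocs) {
        return !nestedDocs.isEmpty();
    }
}
```

Note that in this model an object field with no leaf values at all still matches, as long as the object itself was non-null when indexing.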
As an added benefit, this would immediately solve HSEARCH-3904 (
https://hibernate.atlassian.net/browse/HSEARCH-3904 ), "take into account dynamic
fields in exists() predicate on object fields" (status: Open), for the Lucene backend.
The main drawback is that the behavior would be different from that of Elasticsearch,
which only matches object fields when they have at least one non-null non-object child.
But in a way, isn't that just a limitation of Elasticsearch?