[hibernate-issues] [JIRA] (HSEARCH-3824) Automatically filter search results based on provided routing keys

Yoann Rodière (JIRA) jira at hibernate.atlassian.net
Thu Feb 20 06:46:43 EST 2020


Yoann Rodière ( https://hibernate.atlassian.net/secure/ViewProfile.jspa?accountId=557058%3A58fa1ced-171a-4c00-97e8-5d70d442cc4b ) *updated* an issue

Hibernate Search ( https://hibernate.atlassian.net/browse/HSEARCH?atlOrigin=eyJpIjoiNmU0Y2IzNDIzNTc1NGZlMTgzODk4MGM0NDhmNGMwZTIiLCJwIjoiaiJ9 ) / Improvement / HSEARCH-3824 ( https://hibernate.atlassian.net/browse/HSEARCH-3824?atlOrigin=eyJpIjoiNmU0Y2IzNDIzNTc1NGZlMTgzODk4MGM0NDhmNGMwZTIiLCJwIjoiaiJ9 ) Automatically filter search results based on provided routing keys

Change By: Yoann Rodière ( https://hibernate.atlassian.net/secure/ViewProfile.jspa?accountId=557058%3A58fa1ced-171a-4c00-97e8-5d70d442cc4b )

Currently, when routing keys are specified in a search query, we only target the shards that can actually contain documents with the given routing keys.

However, since a shard may contain documents with different routing keys, it is possible that some matching documents found in these shards actually used a different routing key.
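
To illustrate the problem, here is a minimal, self-contained sketch. It is not Hibernate Search code: the class {{ ShardCollisionDemo }}, the {{ Doc }} record and the hash-modulo routing function are all assumptions made for the example. It only shows that targeting the right shard is not the same as filtering on the routing key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of a sharded index; not the Hibernate Search API.
class ShardCollisionDemo {

    // A document remembers the routing key it was indexed with.
    record Doc(String id, String routingKey) {}

    static final int SHARD_COUNT = 2;

    // Assumed routing function: hash of the routing key, modulo the shard count.
    static int shardFor(String routingKey) {
        return Math.floorMod(routingKey.hashCode(), SHARD_COUNT);
    }

    // Distributes documents across shards according to their routing key.
    static Map<Integer, List<Doc>> index(List<Doc> docs) {
        Map<Integer, List<Doc>> shards = new HashMap<>();
        for (Doc doc : docs) {
            shards.computeIfAbsent(shardFor(doc.routingKey()), s -> new ArrayList<>()).add(doc);
        }
        return shards;
    }

    // Current behavior as described above: target the right shard, but do not filter.
    static List<Doc> searchShardOnly(Map<Integer, List<Doc>> shards, String routingKey) {
        return shards.getOrDefault(shardFor(routingKey), List.of());
    }

    public static void main(String[] args) {
        Map<Integer, List<Doc>> shards = index(List.of(
                new Doc("1", "key1"),
                new Doc("2", "key2"),
                new Doc("3", "key3")));
        // With more routing keys than shards, at least two keys must share a shard,
        // so a shard-only search may return documents indexed with another key.
        for (Doc hit : searchShardOnly(shards, "key1")) {
            System.out.println(hit);
        }
    }
}
```

With three routing keys and only two shards, at least two keys necessarily land in the same shard, so a search routed to one of them can see documents indexed with the other.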

The only reason we don't currently apply a filter automatically is performance: users defining routing keys are likely to already filter their results based on an indexed field with the same value as the routing key.

However, I don't think it would be very expensive to also create an indexed meta-field holding the routing key, and to automatically add a filter on that field for all search queries that define routing keys explicitly.
The field already exists in ES ({{ _routing }}), and it's indexed. For Lucene, we would need to add it.

Off the top of my head, here are the changes we would need. They're actually quite reasonable:

* For the Lucene backend, we'd need to index the routing key: currently it's just used for routing, not indexed.
* For the Lucene and Elasticsearch backends, we'd need to automatically add a filter to the query when routing keys are specified.
* For the Lucene and Elasticsearch backends, we'd need to offer a way to retrieve the routing key of a particular search hit... maybe? Not sure you need this. => No
* For the Lucene backend, we may want to introduce a new (default) sharding strategy where routing keys are enabled but only used as discriminators, not for actual sharding. => Not necessary, the default sharding strategy works just fine for that.
* In mapper APIs, we'll need to add a way to purge specific routing keys (e.g. {{ SearchWorkspace.purge( "key1", "key2" ) }})
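
The changes listed above can be sketched with a small in-memory model. This is only an illustration under assumptions, not the actual backend implementation: {{ RoutingKeyFilterSketch }}, its {{ Doc }} record, and the hash-modulo routing are hypothetical, and the {{ purge }} method merely mimics the shape of the {{ SearchWorkspace.purge( "key1", "key2" ) }} call suggested above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical in-memory model; names and routing function are assumptions.
class RoutingKeyFilterSketch {

    // The routing key is stored with the document, like an indexed meta-field.
    record Doc(String id, String text, String routingKey) {}

    private final int shardCount;
    private final Map<Integer, List<Doc>> shards = new HashMap<>();

    RoutingKeyFilterSketch(int shardCount) {
        this.shardCount = shardCount;
    }

    int shardFor(String routingKey) {
        return Math.floorMod(routingKey.hashCode(), shardCount);
    }

    void index(Doc doc) {
        shards.computeIfAbsent(shardFor(doc.routingKey()), s -> new ArrayList<>()).add(doc);
    }

    // When routing keys are given: target only the matching shards AND
    // automatically add a filter on the routing key meta-field.
    List<Doc> search(Predicate<Doc> userQuery, Set<String> routingKeys) {
        Predicate<Doc> query = userQuery;
        Set<Integer> targets;
        if (routingKeys.isEmpty()) {
            targets = shards.keySet();
        } else {
            targets = routingKeys.stream().map(this::shardFor).collect(Collectors.toSet());
            query = userQuery.and(doc -> routingKeys.contains(doc.routingKey()));
        }
        return targets.stream()
                .flatMap(shard -> shards.getOrDefault(shard, List.of()).stream())
                .filter(query)
                .collect(Collectors.toList());
    }

    // Purges only the documents indexed with the given routing keys.
    void purge(String... routingKeys) {
        Set<String> keys = new HashSet<>(Arrays.asList(routingKeys));
        for (String key : keys) {
            List<Doc> shard = shards.get(shardFor(key));
            if (shard != null) {
                shard.removeIf(doc -> keys.contains(doc.routingKey()));
            }
        }
    }

    public static void main(String[] args) {
        RoutingKeyFilterSketch index = new RoutingKeyFilterSketch(2);
        index.index(new Doc("1", "hello", "key1"));
        index.index(new Doc("2", "hello", "key2"));
        index.index(new Doc("3", "hello", "key3"));
        System.out.println(index.search(doc -> doc.text().equals("hello"), Set.of("key1")));
        index.purge("key1");
        System.out.println(index.search(doc -> true, Set.of()));
    }
}
```

In this model the filter is a cheap extra predicate on a field that is already stored with each document, which matches the expectation above that the automatic filter would not be very expensive.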
