|
Lucene 3.6 introduces the notion of a "Query Time Join": a way to relate Documents from different indexes in order to filter content and retrieve fields. This approach comes at a runtime cost, as an extra pass is involved in processing the query.
The idea is basically that if you search on e.g. Posts and you need the photo of the User that is part of the Post, you can keep this information separate and retrieve the User on the fly. This way, changes to fields in the User don't require reindexing all the related Posts.
http://www.searchworkings.org/blog/-/blogs/412000
Query time joining in Lucene is pretty straightforward and entirely encapsulated in JoinUtil.createJoinQuery. It requires the following arguments:
fromField. The entity field to join in the entity being queried: e.g. user.id
toField. The entity field in the related index to join on: e.g. id.
fromQuery. The query executed to collect the from terms.
fromSearcher. The searcher on which the fromQuery is executed.
multipleValuesPerDocument. Whether the fromField contains more than one value per document (multivalued field). If this option is set to false, the from terms can be collected in a more efficient manner.
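To make the extra pass concrete, here is a rough sketch of the two steps behind a query time join, written with plain Java collections instead of Lucene. It does not call JoinUtil; the class name, the method names and the sample data are all made up for illustration, and documents are modelled as simple maps. The first pass runs the fromQuery and collects the fromField values, the second pass keeps the documents whose toField matches one of those values.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical illustration only: not Lucene code and not the JoinUtil API.
public class QueryTimeJoinSketch {

    // Pass 1: run fromQuery against the "from" documents and collect
    // the distinct fromField values of every match.
    static Set<String> collectFromTerms(List<Map<String, String>> fromDocs,
                                        Predicate<Map<String, String>> fromQuery,
                                        String fromField) {
        Set<String> terms = new HashSet<>();
        for (Map<String, String> doc : fromDocs) {
            String value = doc.get(fromField);
            if (value != null && fromQuery.test(doc)) {
                terms.add(value);
            }
        }
        return terms;
    }

    // Pass 2: keep the "to" documents whose toField value is one of the collected terms.
    static List<Map<String, String>> joinTo(List<Map<String, String>> toDocs,
                                            String toField,
                                            Set<String> fromTerms) {
        List<Map<String, String>> hits = new ArrayList<>();
        for (Map<String, String> doc : toDocs) {
            if (fromTerms.contains(doc.get(toField))) {
                hits.add(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Assumed sample data: posts carry the id of their author in "user.id".
        List<Map<String, String>> posts = List.of(
            Map.of("title", "lucene joins", "user.id", "1"),
            Map.of("title", "cooking tips", "user.id", "2"),
            Map.of("title", "more lucene", "user.id", "1"));
        List<Map<String, String>> users = List.of(
            Map.of("id", "1", "name", "alice"),
            Map.of("id", "2", "name", "bob"));

        // fromQuery: posts whose title mentions "lucene"; join user.id -> id.
        Set<String> terms = collectFromTerms(
            posts, d -> d.get("title").contains("lucene"), "user.id");
        for (Map<String, String> user : joinTo(users, "id", terms)) {
            System.out.println(user.get("name")); // prints "alice"
        }
    }
}
```

In this sketch the posts play the role of the index searched by fromSearcher, and the users are the documents that end up in the result. Collecting the terms into a set is what makes the second pass a cheap membership test, at the cost of holding all from terms in memory.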
Since this doesn't require indexing changes and just affects what is returned, it can simply be implemented as an extension to the QueryBuilder. A possible syntax:
QueryBuilder qbGroup = fts.getSearchFactory().buildQueryBuilder().forEntity(DiscussionGroup.class).get();
QueryBuilder qbPost = fts.getSearchFactory().buildQueryBuilder().forEntity(Post.class).get();
// join() is the proposed extension: the nested query selects the groups to join from
Query joinQuery = qbGroup.join().onFields("user.id", "id").must(
    qbGroup.keyword().onField("title").matching("lucene").createQuery()
).createQuery();
// use the join query as a clause of the query on Post
Query query = qbPost.bool().must(joinQuery).createQuery();
I'm not sure at this point, but I believe that query time joining doesn't actually retrieve the related document, which would also be a nice feature.
|