Hi Marc,
sorry for the late reply. I forgot to answer :-(
On 5 Jan 2014, at 19:11, Marc Schipperheyn <m.schipperheyn(a)gmail.com> wrote:
Current situation
* Loading Wallposts directly from Lucene index.
* Posts reference properties that change relatively often or would have a cascading
indexing impact if they were saved within the Post document, e.g. User (photo, name, title,
link), Classified (photo, title, description), Group (photo, title, description), Comments
(User, user.name, user.link)
* Posts, Classifieds and Groups can be "Liked", which requires per-user post
processing because we want to show likes from people who are close to you
* You can be a member of a group or not, so that also requires post processing on a
per-user basis
* In addition, some of these related objects have specific properties such as Event =>
eventDate, RSVP
So, right now, we're optimizing this post processing so that the impact is as
low as possible. However, because all of these related objects are stored in different
indexes, it requires multiple hits against the indexes, which has an impact on performance.
Desired situation
Since a lot of these related objects have common properties (photo, link, title,
description) and for Post display purposes we only need a small subset of
the properties, it would be desirable to be able to query a single index in a one-pass
retrieval, querying for (object type - id).
Got you.
The idea of combining all these objects into a single index somehow
feels wrong
Why? I would, for example, assume that an application using Lucene directly often has
just one single index. I agree that it feels natural in Hibernate Search to separate
indexes per entity, but sharing an index in your use case seems reasonable as well. Of
course you get a bigger index size, which might affect search performance, but on the
other hand you now only target a single index instead of multiple ones.
This might even outweigh any performance penalty you get for having a single index.
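
To make the one-pass idea a bit more concrete, here is a minimal sketch of a combined index keyed by (object type - id). It is plain Java and only models the lookup; it is not Hibernate Search or Lucene API, and every name in it (CombinedIndexSketch, Summary, key, lookup) is hypothetical. In Hibernate Search itself, sharing an index should mostly come down to giving the entities the same index name via `@Indexed(index = ...)`, but I have not verified that for your mapping.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the idea, NOT Hibernate Search or Lucene API.
// All names here are hypothetical and exist only for illustration.
class CombinedIndexSketch {

    /** The small subset of properties needed to render a wall post. */
    static final class Summary {
        final String type;   // e.g. "User", "Group", "Classified"
        final long id;
        final String title;
        final String photo;
        final String link;

        Summary(String type, long id, String title, String photo, String link) {
            this.type = type;
            this.id = id;
            this.title = title;
            this.photo = photo;
            this.link = link;
        }
    }

    // One shared "index": every entity type lands in the same map,
    // keyed by "<type>-<id>" so ids of different types cannot collide.
    private final Map<String, Summary> index = new HashMap<>();

    static String key(String type, long id) {
        return type + "-" + id;
    }

    void add(Summary s) {
        index.put(key(s.type, s.id), s);
    }

    /** Resolves all references of a post in one pass over the single index. */
    List<Summary> lookup(List<String> keys) {
        List<Summary> hits = new ArrayList<>();
        for (String k : keys) {
            Summary s = index.get(k);
            if (s != null) {   // references that are not indexed are skipped
                hits.add(s);
            }
        }
        return hits;
    }
}
```

The point of the composite key is that a User with id 1 and a Group with id 1 can live side by side, so a post only needs one lookup call for all of its related objects instead of one query per entity index.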
and since I don't have any experience doing this, it's hard
to foresee the impact of this, including potential bugs, because this is not a common use
case.
Sure, the only way to know for sure is to actually try and run performance tests. A lot of
the required changes would be on a configuration level, so it might not be too hard to
give it a go.
If I would be able to create a limited "combined secondary
index" that would actually meet my use case.
So, this brought me to my original question: can I project properties to a separate
index? Given my new understanding of combining entities into a single index, this can
also be read as:
"Can I create two indexes based on a single entity, targeting only a subset of
properties, basically duplicating index content?"
The answer to this particular question is no. I also don't think this would be easy
to achieve with the current design of the Search codebase. You would somehow need
new/additional configuration options, and in the engine itself we would probably have to
do a fair bit of shuffling to make it work.
—Hardy