[infinispan-dev] Design change in Infinispan Query

Wed Apr 11 06:25:21 EDT 2012

On 10 Apr 2012, at 18:10, Sanne Grinovero wrote:

> Hello all,
> currently Infinispan Query is an interceptor registered on the
> specific Cache instance which has indexing enabled; each such
> interceptor does everything it needs to do solely within the scope of
> the cache it was registered in.
> 
> If you enable indexing - for example - on 3 different caches, there
> will be 3 different Hibernate Search engines started in the
> background, and they are all unaware of each other.
> 
> After some design discussions with Ales for CapeDwarf, and also to
> call attention to something that has bothered me for some time, I'd
> like to evaluate the option of having a single Hibernate Search Engine
> registered in the CacheManager, and have it shared across indexed
> caches.
> 
> Current design limitations:
> 
>  A- If they are all configured to use the same base directory to
> store indexes, and happen to have same-named indexes, they'll share
> the index without being aware of each other. This is going to break
> unless the user configures some tricky parameters, and even so
> performance won't be great: instances will lock each other out, or at
> best write in alternate turns.
>  B- The search engine isn't particularly "heavy", but it would still
> be nice to share some components and internal services.
>  C- Configuration details which need some care - like injecting a
> JGroups channel for clustering - need to be done right, isolating each
> instance (so large parts of the configuration would be quite similar
> but not totally equal).
>  D- Incoming messages into a JGroups Receiver need to be routed not
> only among indexes, but also among Engine instances. This prevents
> Query from reusing code from Hibernate Search.
> 
> Problems with a unified Hibernate Search Engine:
> 
>   1#- Isolation of types / indexes. If the same indexed class is
> stored in different (indexed) caches, they'll share the same index. Is
> it a problem? I'm tempted to consider this a good thing, but wonder if
> it would surprise some users. Would you expect that?

Alternatively you could namespace the index based on cache name?
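
A minimal sketch of what such namespacing might look like, assuming the
generated name is simply the cache name joined to the user-defined index
name. The class, the method and the "." separator are hypothetical and
are not part of Infinispan or Hibernate Search:

```java
// Hypothetical helper for illustration only; not an Infinispan or
// Hibernate Search API.
final class IndexNames {

    private IndexNames() {
    }

    /**
     * Builds an index name that is scoped to one cache by prefixing the
     * user-defined index name with the cache name.
     */
    static String namespaced(String cacheName, String userIndexName) {
        if (cacheName == null || cacheName.isEmpty()) {
            throw new IllegalArgumentException("cacheName must not be empty");
        }
        if (userIndexName == null || userIndexName.isEmpty()) {
            throw new IllegalArgumentException("userIndexName must not be empty");
        }
        // Assumed separator; a real implementation would need one that
        // cannot appear in cache names, or some form of escaping.
        return cacheName + "." + userIndexName;
    }
}
```

With this, the same indexed class stored in caches "users" and "orders"
would end up in two distinct indexes rather than one shared index.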

>   2#- Configuration format overhaul: indexing options won't be set on
> the cache section but in the global section. I'm looking forward to
> using the schema extensions anyway to provide a better configuration
> experience than the current <properties />.

+1

>   3#- Assuming 1# is fine, when a search hit is found I'd need to be
> able to figure out from which cache the value should be loaded.
>      3#A  we could have the cache name encoded in the index, as part
> of the identifier: {PK,cacheName}
>      3#B  we actually shard the index, keeping a physically separate
> index per cache. This would mean searching on the joint index view but
> extracting hits from specific indexes to keep track of "which index".
> I think we can do that but it's definitely tricky.
> 
> It's likely easier to keep indexed values from different caches in
> different indexes. That would mean rejecting 1# and altering the
> user-defined index name, for example by adding the cache name to the
> user-defined string.

This could be generated, as I suggested above?
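
If the name is generated, a shared engine would still need a way to get
from the index a hit came from back to the owning cache (point 3# above).
One possible approach, sketched under the assumption that the engine
keeps a small lookup table instead of parsing the generated name; the
class and method names below are hypothetical, not existing APIs:

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry for illustration only; not an Infinispan or
// Hibernate Search API.
final class IndexToCacheRegistry {

    // Maps each generated index name to the cache that owns it.
    private final Map<String, String> cacheByIndex = new ConcurrentHashMap<>();

    /**
     * Generates the index name for a cache and records which cache owns it.
     * Fails if another cache already produced the same generated name.
     */
    String register(String cacheName, String userIndexName) {
        String generated = cacheName + "." + userIndexName; // assumed format
        String previous = cacheByIndex.putIfAbsent(generated, cacheName);
        if (previous != null && !previous.equals(cacheName)) {
            throw new IllegalStateException("Index name " + generated
                    + " is already owned by cache " + previous);
        }
        return generated;
    }

    /** Returns the cache owning the given index, if it was registered. */
    Optional<String> cacheFor(String generatedIndexName) {
        return Optional.ofNullable(cacheByIndex.get(generatedIndexName));
    }
}
```

Keeping an explicit table avoids having to split the generated string
again, which would be ambiguous whenever a cache name or an index name
contains the separator itself.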

Sounds good though, you're right that the Hibernate Search engine could and should be shared across caches.

Cheers
Manik 
--
Manik Surtani
manik at jboss.org
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org