[infinispan-dev] Design change in Infinispan Query

Wed Jan 15 08:42:02 EST 2014

By the way, are the people asking for this feature also asking for a unified Cache API that accesses these several caches? Otherwise I don't fully understand why they want a unified query.
Do you have detailed use cases written down somewhere, so I can better understand what is really being requested?

Emmanuel

On 14 Jan 2014, at 12:59, Sanne Grinovero <sanne at infinispan.org> wrote:

> Bumping this: it was proposed again today at a face-to-face meeting.
> Apparently multiple parties have been asking to be able to run
> cross-cache queries.
> 
> Sanne
> 
> On 11 April 2012 12:47, Emmanuel Bernard <emmanuel at hibernate.org> wrote:
>> 
>> On 10 Apr 2012, at 19:10, Sanne Grinovero wrote:
>> 
>>> Hello all,
>>> currently Infinispan Query is an interceptor registering on the
>>> specific Cache instance which has indexing enabled; each such
>>> interceptor does everything it needs to do solely within the scope
>>> of the cache it was registered in.
>>> 
>>> If you enable indexing - for example - on 3 different caches, there
>>> will be 3 different Hibernate Search engines started in background,
>>> and they are all unaware of each other.
>>> 
>>> After some design discussions with Ales for CapeDwarf, and also to
>>> call attention to something that has bothered me for some time, I'd
>>> like to evaluate the option of having a single Hibernate Search
>>> engine registered in the CacheManager and shared across indexed
>>> caches.
>>> 
>>> Current design limitations:
>>> 
>>> A- If they are all configured to use the same base directory to
>>> store indexes, and happen to have same-named indexes, they'll share
>>> the index without being aware of each other. This is going to break
>>> unless the user configures some tricky parameters, and even so
>>> performance won't be great: instances will lock each other out, or at
>>> best write in alternate turns.
>>> B- The search engine isn't particularly "heavy", still it would be
>>> nice to share some components and internal services.
>>> C- Configuration details which need some care - like injecting a
>>> JGroups channel for clustering - need to be done right, isolating each
>>> instance (so large parts of the configuration would be quite similar
>>> but not totally equal).
>>> D- Incoming messages into a JGroups Receiver need to be routed not
>>> only among indexes, but also among Engine instances. This prevents
>>> Query from reusing code from Hibernate Search.
>>> 
>>> Problems with a unified Hibernate Search Engine:
>>> 
>>>  1#- Isolation of types / indexes. If the same indexed class is
>>> stored in different (indexed) caches, they'll share the same index. Is
>>> it a problem? I'm tempted to consider this a good thing, but wonder if
>>> it would surprise some users. Would you expect that?
>> 
>> I would not expect that. Uniqueness in Hibernate Search is not defined per identity but per class + provided id.
>> I can see people reusing the same class as a partial DTO and wanting to index it. I can even see people
>> using the Hibernate Search programmatic API to index the "DTO" stored in cache 2 differently from the
>> domain class stored in cache 1.
>> I concede that I am pushing the use case a bit towards bad-ish design approaches.
>> 
>>>  2#- configuration format overhaul: indexing options won't be set on
>>> the cache section but in the global section. I'm looking forward to
>>> using the schema extensions anyway to provide a better configuration
>>> experience than the current <properties />.
>>>  3#- Assuming 1# is fine, when a search hit is found I'd need to be
>>> able to figure out from which cache the value should be loaded.
>>>     3#A  we could have the cache name encoded in the index, as part
>>> of the identifier: {PK,cacheName}
>>>     3#B  we actually shard the index, keeping a physically separate
>>> index per cache. This would mean searching on the joint index view but
>>> extracting hits from specific indexes to keep track of "which index".
>>> I think we can do that but it's definitely tricky.
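
Option 3#A above can be sketched as a composite identifier that carries the cache name next to the key. This is only an illustration under assumptions: the class CacheScopedId and its methods are hypothetical, not part of Infinispan or Hibernate Search, and keys are assumed to be representable as strings. The cache name is length-prefixed so that both parts may contain any character.

```java
import java.util.Objects;

// Hypothetical helper (not an Infinispan or Hibernate Search API):
// encodes {PK, cacheName} into one index identifier and decodes it again.
final class CacheScopedId {
    private final String cacheName;
    private final String key;

    CacheScopedId(String cacheName, String key) {
        this.cacheName = Objects.requireNonNull(cacheName, "cacheName");
        this.key = Objects.requireNonNull(key, "key");
    }

    String cacheName() { return cacheName; }

    String key() { return key; }

    // Format: <length of cacheName>:<cacheName><key>
    String toIndexId() {
        return cacheName.length() + ":" + cacheName + key;
    }

    // Recovers the cache name and key from an identifier built by toIndexId().
    static CacheScopedId fromIndexId(String indexId) {
        int sep = indexId.indexOf(':');
        if (sep < 1) {
            throw new IllegalArgumentException("Missing length prefix: " + indexId);
        }
        int length = Integer.parseInt(indexId.substring(0, sep));
        int start = sep + 1;
        if (length < 0 || start + length > indexId.length()) {
            throw new IllegalArgumentException("Bad length prefix: " + indexId);
        }
        return new CacheScopedId(
                indexId.substring(start, start + length),
                indexId.substring(start + length));
    }
}
```

When a search hit comes back, decoding its identifier with fromIndexId gives the cache name to look up in the CacheManager and the key to load from that cache.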
>>> 
>>> It's likely easier to keep indexed values from different caches in
>>> different indexes. That would mean rejecting 1# and altering the
>>> user-defined index name, for example by adding the cache name to the
>>> user-defined string.
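
The alternative described in the paragraph above, adding the cache name to the user-defined index name, could look roughly like the sketch below. It is an assumption-laden illustration: PerCacheIndexNames, qualify and register are hypothetical names, and the "." separator is an arbitrary choice. Because plain concatenation can be ambiguous (cache "a" with index "b.c" and cache "a.b" with index "c" give the same string), the sketch also rejects a qualified name already owned by another cache.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical registry (not an Infinispan or Hibernate Search API):
// derives one physical index name per cache from the user-defined name.
final class PerCacheIndexNames {
    // Maps each qualified index name to the cache that owns it.
    private final Map<String, String> ownerByIndex = new LinkedHashMap<>();

    // Prefixes the user-defined index name with the owning cache name.
    static String qualify(String cacheName, String userIndexName) {
        return cacheName + "." + userIndexName;
    }

    // Records the qualified name; fails if a different cache already owns it.
    String register(String cacheName, String userIndexName) {
        String qualified = qualify(cacheName, userIndexName);
        String previous = ownerByIndex.putIfAbsent(qualified, cacheName);
        if (previous != null && !previous.equals(cacheName)) {
            throw new IllegalStateException(
                    "Index name " + qualified + " is already used by cache " + previous);
        }
        return qualified;
    }
}
```

With this scheme two caches that both index a class under the same user-defined name end up with separate indexes, which avoids the sharing described in limitation A at the cost of changing the name the user configured.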
>>> 
>>> Any comment?
>>> 
>>> Cheers,
>>> Sanne
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> infinispan-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> 
>> 