[infinispan-dev] Design change in Infinispan Query

Mircea Markus mmarkus at redhat.com
Wed Feb 5 10:53:38 EST 2014


On Feb 3, 2014, at 9:32 AM, Emmanuel Bernard <emmanuel at hibernate.org> wrote:

> Sure, searching across any cache is useful. What I was advocating is that if you search across more than one cache transparently, then you probably need to CRUD across more than one cache transparently as well. And this is not being discussed.

Not sure what you mean by CRUD over multiple caches? ATM one can run a TX over multiple caches, but I think there's something else you have in mind :-)

> 
> I have to admit that having to add a cache name to the stored elements of the index documents makes me a bit sad.

Sad because of the increased index size?

> I was already unhappy when I had to do it for class names. Renaming a cache will be a heavy operation too. 
> Sanne, if we know that we don't share the same index for different caches, can we avoid the need to store the cache name in each document?
> 
> BTW, this discussion should be in the open. 

+1

> 
> On 31 Jan 2014, at 18:04, Adrian Nistor <anistor at gmail.com> wrote:
> 
>> I think it conceptually makes sense to have one entity type per cache but this should be a good practice rather than an enforced constraint. It would be a bit late and difficult to add such a constraint now.
>> 
>> The design change we are talking about is being able to search across caches. That can easily be implemented regardless of this. We can move the SearchManager from Cache scope to CacheManager scope. Indexes are bound to types, not to caches, anyway, so same-type entities from multiple caches can end up in the same index; we just need to store an extra hidden field: the name of the originating cache. This move would also allow us to share some Lucene/hsearch resources.
>> 
>> We can easily continue to support Search.getSearchManager(cache) so old API usages continue to work. This would return a delegating/decorating SearchManager that creates queries that are automatically restricted to the scope of the given cache.
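
The delegating idea above can be sketched roughly as follows. This is only an illustration in plain Java, not the Infinispan Query or Hibernate Search API: IndexedDoc, GlobalSearch, CacheScopedSearch and forCache are hypothetical names, and an in-memory list stands in for the shared index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical index document: one per stored entity.
final class IndexedDoc {
    final String cacheName; // the "hidden field": name of the originating cache
    final String type;      // entity type the index is bound to
    final Object value;

    IndexedDoc(String cacheName, String type, Object value) {
        this.cacheName = cacheName;
        this.type = type;
        this.value = value;
    }
}

// Hypothetical CacheManager-scoped search: sees documents from every cache.
final class GlobalSearch {
    private final List<IndexedDoc> index = new ArrayList<>();

    void index(String cacheName, String type, Object value) {
        index.add(new IndexedDoc(cacheName, type, value));
    }

    // Returns the values of all documents of the given type that pass the filter.
    List<Object> query(String type, Predicate<IndexedDoc> filter) {
        List<Object> hits = new ArrayList<>();
        for (IndexedDoc doc : index) {
            if (doc.type.equals(type) && filter.test(doc)) {
                hits.add(doc.value);
            }
        }
        return hits;
    }

    // Stand-in for the old per-cache entry point: returns a scoped decorator.
    CacheScopedSearch forCache(String cacheName) {
        return new CacheScopedSearch(this, cacheName);
    }
}

// Hypothetical decorator: every query is restricted to a single cache.
final class CacheScopedSearch {
    private final GlobalSearch delegate;
    private final String cacheName;

    CacheScopedSearch(GlobalSearch delegate, String cacheName) {
        this.delegate = delegate;
        this.cacheName = cacheName;
    }

    List<Object> query(String type) {
        // The cache-name restriction is added automatically on every query.
        return delegate.query(type, doc -> doc.cacheName.equals(cacheName));
    }
}
```

Under these assumptions, code written against the per-cache view never sees entries from other caches, while the global view can query a type across all of them.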
>> 
>> Piece of cake? :)
>> 
>> 
>> 
>> On Thu, Jan 30, 2014 at 9:56 PM, Mircea Markus <mmarkus at redhat.com> wrote:
>> Curious to see your thoughts on this: it is a recurring topic and will affect the way we design things in the future in a significant way.
>> E.g. if we think (recommend) that a distinct cache should be used for each entity, then we'll need querying to work between caches. Also some cache stores can be built along these lines (e.g. for the JPA cache store we only need it to support a single entity type).
>> 
>> Begin forwarded message:
>> 
>> > On Jan 30, 2014, at 9:42 AM, Galder Zamarreño <galder at redhat.com> wrote:
>> >
>> >>
>> >> On Jan 21, 2014, at 11:52 PM, Mircea Markus <mmarkus at redhat.com> wrote:
>> >>
>> >>>
>> >>> On Jan 15, 2014, at 1:42 PM, Emmanuel Bernard <emmanuel at hibernate.org> wrote:
>> >>>
>> >>>> By the way, people looking for that feature are also asking for a unified Cache API accessing these several caches right? Otherwise I am not fully understanding why they ask for a unified query.
>> >>>> Do you have written detailed use cases somewhere for me to better understand what is really requested?
>> >>>
>> >>> IMO from a user perspective, being able to run queries spanning several caches simplifies the programming model: each cache corresponds to a single entity type, with potentially different configuration.
>> >>
>> >> Not sure if it simplifies things TBH if the configuration is the same. IMO, it just adds clutter.
>> >
>> > Not sure I follow: having a cache that contains both Cars and Persons sounds more cluttered to me. I think it's cumbersome to write any kind of querying with a heterogeneous cache, e.g. Map/Reduce tasks that need to count all the green Cars would need to be aware of Persons and ignore them. Not only is it harder to write, but it discourages code reuse and makes the code hard to maintain (if you add Pets to the same cache in the future you need to update the M/R code as well). And of course there are also different cache-based configuration options that are not immediately obvious (at design time) but will be in the future (there are more Persons than Cars, they live longer/expire differently, etc.): mixing everything together in the same cache from the beginning is a design decision that might bite you in the future.
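
To make the green-Cars example concrete, here is a rough sketch in plain Java. It uses java.util.Map as a stand-in for a cache and a simple stream instead of the real Map/Reduce API; Car, Person and GreenCarCount are hypothetical names used only for illustration.

```java
import java.util.Map;

// Hypothetical entity types.
final class Car {
    final String color;
    Car(String color) { this.color = color; }
}

final class Person {
    final String name;
    Person(String name) { this.name = name; }
}

final class GreenCarCount {
    // Mixed cache: values can be of any type, so the task has to know
    // about the other types at least well enough to skip them.
    static long inMixedCache(Map<String, Object> cache) {
        return cache.values().stream()
                .filter(v -> v instanceof Car)   // ignore Persons (and any type added later)
                .map(v -> (Car) v)
                .filter(c -> "green".equals(c.color))
                .count();
    }

    // One entity type per cache: no type checks, and adding a new
    // entity type elsewhere does not touch this code.
    static long inCarCache(Map<String, Car> cars) {
        return cars.values().stream()
                .filter(c -> "green".equals(c.color))
                .count();
    }
}
```

Both methods return the same count for the same Cars; the difference is only in how much the task needs to know about what else lives in the cache.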
>> >
>> > The way I see it - and I'm very curious to see your opinion on this - following a database analogy, the CacheManager corresponds to a Database and the Cache to a Table. Hence my thought that queries spanning multiple caches are both useful and needed (same as queries spanning multiple tables).
>> >
>> >
>> 
>> Cheers,
>> --
>> Mircea Markus
>> Infinispan lead (www.infinispan.org)
>> 
>> 
>> 
>> 
>> 

Cheers,
-- 
Mircea Markus
Infinispan lead (www.infinispan.org)
