[infinispan-dev] Design change in Infinispan Query

Mon Feb 17 12:36:39 EST 2014

On 31 Jan 2014, at 09:28, Emmanuel Bernard <emmanuel at hibernate.org> wrote:

> 
> 
>> On 30 Jan 2014, at 20:51, Mircea Markus <mmarkus at redhat.com> wrote:
>> 
>> 
>>> On Jan 30, 2014, at 9:42 AM, Galder Zamarreño <galder at redhat.com> wrote:
>>> 
>>> 
>>>> On Jan 21, 2014, at 11:52 PM, Mircea Markus <mmarkus at redhat.com> wrote:
>>>> 
>>>> 
>>>>> On Jan 15, 2014, at 1:42 PM, Emmanuel Bernard <emmanuel at hibernate.org> wrote:
>>>>> 
>>>>> By the way, people looking for that feature are also asking for a unified Cache API for accessing these several caches, right? Otherwise I do not fully understand why they ask for a unified query.
>>>>> Do you have detailed use cases written up somewhere, so I can better understand what is really being requested?
>>>> 
>>>> IMO, from a user perspective, being able to run queries spanning several caches simplifies the programming model: each cache corresponds to a single entity type, with a potentially different configuration.
>>> 
>>> Not sure if it simplifies things TBH if the configuration is the same. IMO, it just adds clutter.
>> 
>> Not sure I follow: having a cache that contains both Cars and Persons sounds more cluttered to me. I think it's cumbersome to write any kind of query against a heterogeneous cache, e.g. Map/Reduce tasks that need to count all the green Cars would need to be aware of Persons and ignore them. Not only is it harder to write, but it also discourages code reuse and makes the code hard to maintain (if you add Pets to the same cache in the future, you need to update the M/R code as well). And of course there are also per-cache configuration options whose need is not immediately obvious (at design time) but will be in the future (there are more Persons than Cars, they live longer/expire differently, etc.): mixing everything together in the same cache from the beginning is a design decision that might bite you in the future.
>> 
>> The way I see it (and I am very curious to hear your opinion on this), following a database analogy, the CacheManager corresponds to a Database and the Cache to a Table. Hence my thought that queries spanning multiple caches are both useful and needed (same as queries spanning multiple tables).
> 
> I know Sanne and you are keen to have one entity type per cache to be able to fine tune the configuration. I am a little more skeptical but I don't have strong opinions on the subject. 
> 
> However, I don't think you can forbid the case where people want to store heterogeneous types in the same cache:
> 
> - it's easy to start with
> - configuration is indeed simpler
> - when you work in the same service with cats, dogs, owners, addresses and refuges, juggling between these n Cache instances begins to be fugly, I suspect (I should write some application code to confirm)
> - people will add to the grid types unknown at configuration time. They might want a single bucket. 

+100
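
To make the trade-off discussed above concrete, here is a minimal sketch of the "count all the green Cars" example against a mixed cache and against a per-type cache. It deliberately uses plain java.util.Map as a stand-in for a cache and does not touch the Map/Reduce or Query APIs; the Car and Person classes, the key scheme and the method names are all made up for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MixedCacheSketch {
    // Hypothetical entity types, named after the examples in the thread.
    static final class Car {
        final String colour;
        Car(String colour) { this.colour = colour; }
    }

    static final class Person {
        final String name;
        Person(String name) { this.name = name; }
    }

    // Mixed cache: every value has to be type-checked before it can be inspected.
    static long countGreenCarsMixed(Map<String, Object> cache) {
        return cache.values().stream()
                .filter(v -> v instanceof Car)   // skip Persons (and any type added later)
                .map(v -> (Car) v)
                .filter(c -> "green".equals(c.colour))
                .count();
    }

    // One cache per type: the value type is known, so no filtering by class is needed.
    static long countGreenCars(Map<String, Car> cars) {
        return cars.values().stream()
                .filter(c -> "green".equals(c.colour))
                .count();
    }

    public static void main(String[] args) {
        // Assumption: a ConcurrentHashMap stands in for a cache instance.
        Map<String, Object> mixed = new ConcurrentHashMap<>();
        mixed.put("car:1", new Car("green"));
        mixed.put("car:2", new Car("red"));
        mixed.put("person:1", new Person("Ana"));

        Map<String, Car> cars = new ConcurrentHashMap<>();
        cars.put("1", new Car("green"));
        cars.put("2", new Car("red"));

        System.out.println(countGreenCarsMixed(mixed));
        System.out.println(countGreenCars(cars));
    }
}
```

Both variants give the same count. The difference is only in who has to know about the other types: in the mixed case the counting code does, while the mixed cache in exchange needs a single configuration and a single instance to pass around.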

> 
> Btw, with the distributed execution engine, it looks reasonably simple to migrate data from one cache to another. I imagine each node can also focus only on the keys for which it is the primary owner, which should limit data transfers. Am I missing something?
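
A rough sketch of the "only touch the keys you are primary owner of" idea, again with plain maps standing in for caches. The primaryOwner function below is a made-up modulo rule, not the grid's real consistent hash, and the loop over node ids simulates what would be one task per node under the distributed execution engine.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PrimaryOwnerMigrationSketch {
    // Hypothetical ownership rule: a stand-in for whatever hashing the grid really uses.
    static int primaryOwner(String key, int nodeCount) {
        return Math.floorMod(key.hashCode(), nodeCount);
    }

    // Task body for one node: copy only the entries this node is primary owner of.
    // Returns the number of entries copied.
    static int migrateLocalPrimaries(int nodeId, int nodeCount,
                                     Map<String, String> source,
                                     Map<String, String> target) {
        int moved = 0;
        for (Map.Entry<String, String> e : source.entrySet()) {
            if (primaryOwner(e.getKey(), nodeCount) == nodeId) {
                target.put(e.getKey(), e.getValue());
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        // Assumption: shared in-memory maps stand in for the source and target caches.
        Map<String, String> source = new ConcurrentHashMap<>();
        for (String k : List.of("a", "b", "c", "d", "e")) {
            source.put(k, "value-" + k);
        }
        Map<String, String> target = new ConcurrentHashMap<>();

        int nodeCount = 3;
        int total = 0;
        for (int node = 0; node < nodeCount; node++) {
            total += migrateLocalPrimaries(node, nodeCount, source, target);
        }

        // Every key has exactly one primary owner, so each entry is copied once.
        System.out.println(total == source.size() && target.equals(source));
    }
}
```

In this toy version every node sees the whole source map; in a real grid each task would iterate over its node-local data only, which is where the saving in data transfer would come from.
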

--
Galder Zamarreño
galder at redhat.com
twitter.com/galderz

Project Lead, Escalante
http://escalante.io

Engineer, Infinispan
http://infinispan.org