On 30 Jan 2014, at 20:51, Mircea Markus <mmarkus(a)redhat.com>
wrote:
> On Jan 30, 2014, at 9:42 AM, Galder Zamarreño <galder(a)redhat.com> wrote:
>
>
>> On Jan 21, 2014, at 11:52 PM, Mircea Markus <mmarkus(a)redhat.com> wrote:
>>
>>
>>> On Jan 15, 2014, at 1:42 PM, Emmanuel Bernard <emmanuel(a)hibernate.org>
wrote:
>>>
>>> By the way, people looking for that feature are also asking for a unified
Cache API accessing these several caches, right? Otherwise I do not fully understand why
they ask for a unified query.
>>> Have you written detailed use cases somewhere for me to better understand
what is really requested?
>>
>> IMO from a user perspective, being able to run queries spanning several caches
simplifies the programming model: each cache corresponds to a
single entity type, with potentially different configuration.
>
> Not sure if it simplifies things TBH if the configuration is the same. IMO, it just
adds clutter.
Not sure I follow: having a cache that contains both Cars and Persons sounds more
cluttered to me. I think it's cumbersome to write any kind of query against a
heterogeneous cache, e.g. Map/Reduce tasks that need to count all the green Cars would
need to be aware of Persons and ignore them. Not only is it harder to write, it also
discourages code reuse and makes the code hard to maintain (if you add Pets to the same
cache in the future you need to update the M/R code as well). And of course there are also
different cache-based configuration options that are not immediately obvious (at design
time) but will be in the future (there are more Persons than Cars, they live longer, expire
differently etc): mixing everything together in the same cache from the beginning is a
design decision that might bite you in the future.
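To make the green Cars example concrete, here is a minimal sketch in plain Java. It
does not use the Infinispan Map/Reduce API: java.util.Map stands in for a cache, and
the Car, Person and GreenCarCount names are made up for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration: plain Maps stand in for caches, no Infinispan API involved.
final class GreenCarCount {
    static final class Car {
        final String colour;
        Car(String colour) { this.colour = colour; }
    }

    static final class Person {
        final String name;
        Person(String name) { this.name = name; }
    }

    // Mixed cache: every task needs a type check to skip the entries it does not care about.
    static long countGreenMixed(Map<String, Object> cache) {
        long count = 0;
        for (Object value : cache.values()) {
            if (value instanceof Car && "green".equals(((Car) value).colour)) {
                count++;
            }
        }
        return count;
    }

    // One entity type per cache: the task only ever sees Cars.
    static long countGreenTyped(Map<String, Car> cars) {
        long count = 0;
        for (Car car : cars.values()) {
            if ("green".equals(car.colour)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Map<String, Object> mixed = new HashMap<>();
        mixed.put("car-1", new Car("green"));
        mixed.put("car-2", new Car("red"));
        mixed.put("person-1", new Person("Ann"));
        System.out.println(countGreenMixed(mixed));
    }
}
```

The typed variant carries no knowledge of the other entity types, so adding Pets
later would leave it untouched.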
The way I see it - and I'm very curious to see your opinion on this - following a database
analogy, the CacheManager corresponds to a Database and the Cache to a Table. Hence my
thought that queries spanning multiple caches are both useful and needed (same as queries
spanning multiple tables).
I know Sanne and you are keen to have one entity type per cache to be able to fine tune
the configuration. I am a little more skeptical but I don't have strong opinions on
the subject.
However, I don't think you can forbid the case where people want to store heterogeneous
types in the same cache:
- it's easy to start with
- configuration is indeed simpler
- when you work in the same service with cats, dogs, owners, addresses and refuges,
juggling between these n Cache instances begins to be fugly I suspect - should write some
application code to confirm
- people will add to the grid types unknown at configuration time. They might want a
single bucket.
Btw with the distributed execution engine, it looks reasonably simple to migrate data from
one cache to another. I imagine you can also focus only on the keys whose node is primary
which should limit data transfers. Am I missing something?
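A rough sketch of the migration idea, in plain Java rather than the real distributed
execution API. Everything here is an assumption made for illustration: java.util.Map
stands in for a cache, primaryOwner is a toy replacement for the consistent hash, and
migrateLocal is the work a single node would run on the keys it is primary owner of.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: not the Infinispan distributed executor API.
final class PrimaryOwnerMigration {

    // Toy stand-in for a consistent hash: index of the node that primarily owns the key.
    static int primaryOwner(String key, int nodeCount) {
        return Math.floorMod(key.hashCode(), nodeCount);
    }

    // Work for one node: move entries of the given type from source to target,
    // touching only the keys this node is primary owner of.
    static <T> int migrateLocal(int node, int nodeCount, Map<String, Object> source,
                                Map<String, T> target, Class<T> type) {
        List<String> moved = new ArrayList<>();
        for (Map.Entry<String, Object> entry : source.entrySet()) {
            boolean ownedHere = primaryOwner(entry.getKey(), nodeCount) == node;
            if (ownedHere && type.isInstance(entry.getValue())) {
                target.put(entry.getKey(), type.cast(entry.getValue()));
                moved.add(entry.getKey());
            }
        }
        // Remove after the loop so the map is not modified while iterating over it.
        for (String key : moved) {
            source.remove(key);
        }
        return moved.size();
    }

    public static void main(String[] args) {
        Map<String, Object> mixed = new HashMap<>();
        mixed.put("name-1", "Rex");
        mixed.put("age-1", 7);
        Map<String, String> names = new HashMap<>();
        int nodeCount = 2;
        for (int node = 0; node < nodeCount; node++) {
            migrateLocal(node, nodeCount, mixed, names, String.class);
        }
        System.out.println(names.keySet());
    }
}
```

Since each key has exactly one primary owner, every entry is moved at most once and
no node has to look at data it does not own.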