[infinispan-dev] Design change in Infinispan Query

Thu Mar 13 06:03:28 EDT 2014

On Mar 12, 2014, at 17:37, Galder Zamarreño <galder at redhat.com> wrote:

> 
> On 04 Mar 2014, at 19:02, Emmanuel Bernard <emmanuel at hibernate.org> wrote:
> 
>>> </snip>
>>>> 
>>>> To anecdotally answer your specific example, yes different configs for different entities is an interesting benefit but it has to outweigh the drawbacks.
>>> 
>>> Using a single cache for all the types is not practical at all :-) Just to expand my idea, people prefer using different caches for many reasons:
>>> - security: the Account cache has different security requirements than the News cache
>>> - data consistency: News is a non-transactional cache, Account requires pessimistic XA transactions
>>> - expiry: expire last year's news from the system. Not the same for Accounts
>>> - availability: I want the Accounts cache to be backed up to another site. I don't want that for the News cache
>>> - logical data grouping: mixing Accounts with News doesn't make sense. I might want to know which account appeared in the news, though.
>> 
>> These kinds of reasons remind me of why people use different databases in the RDBMS world.
>> In fact, I have had experience where literally News was a different database than Accounts.
>> 
>> But again in this model, in one database, you have many tables.
>> 
>>> 
>>>> If you have to do a map reduce for tasks as simple as age > 18, I think your system had better be prepared to run gazillions of M/R jobs.
>>> 
>>> I want to run a simple M/R job in the evening to determine who turns 18 tomorrow, to congratulate them. Once a day, not gazillions of times, and I don't need to index the age field just for that. Also when it comes to Map/Reduce, the drawback of holding all the data in a single cache is twofold:
>>> - performance: you iterate over the data that is not related to your query. 
> 
> @Mircea: when we talked about mixing up data in a cache, we said that you’d get a view of the cache, say for a particular type, and iterators, map/reduce functions, etc. would only iterate over those. Hence, you’d avoid iterating over stuff not relevant to you. 

It depends on how you implement the view cache: if it is just a filtering decorator (no state) around the actual cache, then the M/R will still iterate over all the entries and simply ignore the ones outside the view. If the view has state (the filtered keys only), the map/reduce iteration would indeed be faster.
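
To make the difference concrete, here is a rough sketch in plain Java. It is not Infinispan code: ViewCacheSketch, Scan, scanStateless, scanStateful and viewKeys are all made-up names, and a plain Map stands in for the cache. It only counts how many entries each kind of view has to touch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these names are Infinispan API.
class ViewCacheSketch {

    static final class Account { }
    static final class News { }

    /** Matching values plus the number of cache entries the scan touched. */
    static final class Scan {
        final List<Object> values = new ArrayList<>();
        int visited;
    }

    /** Stateless filtering decorator: walks every entry, skips other types. */
    static Scan scanStateless(Map<String, Object> cache, Class<?> type) {
        Scan scan = new Scan();
        for (Object value : cache.values()) {
            scan.visited++;
            if (type.isInstance(value)) {
                scan.values.add(value);
            }
        }
        return scan;
    }

    /** Stateful view: walks only the keys the view already knows about. */
    static Scan scanStateful(Map<String, Object> cache, Set<String> viewKeys) {
        Scan scan = new Scan();
        for (String key : viewKeys) {
            scan.visited++;
            Object value = cache.get(key);
            if (value != null) {
                scan.values.add(value);
            }
        }
        return scan;
    }

    /** Builds the view's key set; a real view would update it on every write. */
    static Set<String> viewKeys(Map<String, Object> cache, Class<?> type) {
        Set<String> keys = new LinkedHashSet<>();
        for (Map.Entry<String, Object> entry : cache.entrySet()) {
            if (type.isInstance(entry.getValue())) {
                keys.add(entry.getKey());
            }
        }
        return keys;
    }

    /** Two accounts mixed with three news items in one cache. */
    static Map<String, Object> sampleCache() {
        Map<String, Object> cache = new LinkedHashMap<>();
        cache.put("acc:1", new Account());
        cache.put("news:1", new News());
        cache.put("acc:2", new Account());
        cache.put("news:2", new News());
        cache.put("news:3", new News());
        return cache;
    }

    public static void main(String[] args) {
        Map<String, Object> cache = sampleCache();
        Scan stateless = scanStateless(cache, Account.class);
        Scan stateful = scanStateful(cache, viewKeys(cache, Account.class));
        System.out.println("stateless visited=" + stateless.visited
                + ", stateful visited=" + stateful.visited);
    }
}
```

Both scans return the same accounts, but the stateless one touches every entry in the cache while the stateful one touches only the view's keys. The price of the stateful view is the extra memory for the key set and the work of keeping it up to date on writes, which the sketch sidesteps by building it once up front.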

Cheers,
-- 
Mircea Markus
Infinispan lead (www.infinispan.org)