[infinispan-dev] blog on new cache store API

Shane Johnson shjohnso at redhat.com
Tue Sep 17 12:07:07 EDT 2013


Right.

Regarding the cache store, while I'm not fully aware of its implementation, one of the primary reasons to use LevelDB is that the entire key set no longer has to be in memory. That's why I asked: if the store was designed like that, and I have no idea if it is, m/r would not have the entire key set beforehand.

Shane

----- Original Message -----
From: "Vladimir Blagojevic" <vblagoje at redhat.com>
To: "Shane Johnson" <shjohnso at redhat.com>
Cc: "infinispan -Dev List" <infinispan-dev at lists.jboss.org>
Sent: Tuesday, September 17, 2013 10:59:18 AM
Subject: Re: [infinispan-dev] blog on new cache store API

On 13-09-17 11:50 AM, Shane Johnson wrote:
> Right. I'm familiar with the map/reduce process and the proposed improvements.
>
> This part of the blog threw me off:
>
> "as the map/reduce tasks now run in parallel over both the nodes in the cluster and within the same node (multiple threads)"
>
> To me, it implies that there are now multiple map threads per node. Further, I thought that the map / reduce 'working set' was limited to what was in memory. I did not realize that map / reduce would iterate over all of the data both in memory and on disk. That is good to hear, though I'm curious if it will apply to all cache stores (e.g. LevelDB) and how ISPN map / reduce handles a data set that is greater than the available memory. A lot of in-memory stores face this limitation when backed by on-disk stores. If the data is retrieved one entry at a time, I don't see how multiple threads will help. However, if it is retrieved in bulk, I can see how it might. Not entirely sure.
>
The implementation in MapReduceManagerImpl.java is cache store agnostic.
The algorithm loads all keys (pinned to that owner node) and iterates over
all values, one value at a time.
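
To make the shape of that loop concrete, here is a rough sketch of it. This is
not the MapReduceManagerImpl code: the class KeyFirstMapSketch, the methods
map and mapPhase, and the plain java.util.Map standing in for the cache store
are all hypothetical names made up for illustration, and the word-count mapper
is only a placeholder.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not taken from Infinispan.
class KeyFirstMapSketch {

    // Placeholder mapper: counts the words found in a single value.
    static void map(String value, Map<String, Integer> collector) {
        for (String word : value.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                collector.merge(word, 1, Integer::sum);
            }
        }
    }

    // 'store' is an assumed stand-in for the entries owned by this node.
    static Map<String, Integer> mapPhase(Map<String, String> store) {
        // Step 1: load the whole key set up front.
        List<String> keys = new ArrayList<>(store.keySet());

        Map<String, Integer> collector = new HashMap<>();

        // Step 2: fetch and map the values one at a time.
        for (String key : keys) {
            String value = store.get(key);
            if (value != null) {
                map(value, collector);
            }
        }
        return collector;
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();
        store.put("k1", "a b a");
        store.put("k2", "b");
        System.out.println(mapPhase(store));
    }
}
```

In this sketch only one value is held at a time, but the key list from step 1
is fully materialized before any mapping starts, which is the part that a
store holding more keys than fit in memory would run into.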

Now that we are breaking this down into details, I am not sure how
multiple threads in the cache store would help either. Mircea?

Regards,
Vladimir

