Re: [infinispan-dev] blog on new cache store API

Tuesday, 17 September 2013

On 13-09-17 11:50 AM, Shane Johnson wrote:
...
 Right. I'm familiar with the map/reduce process and the proposed
improvements.

 This part of the blog threw me off:

 "as the map/reduce tasks now run in parallel over both the nodes in the cluster and
within the same node (multiple threads)"

 To me, it implies that there are now multiple map threads per node. Further, I thought
that the map / reduce 'working set' was limited to what was in memory. I did not
realize that map / reduce would iterate over all of the data both in memory and on disk.
That is good to hear, though I'm curious if it will apply to all cache stores (e.g.
LevelDB) and how ISPN map / reduce handles a data set that is greater than the available
memory. A lot of in-memory stores face this limitation when backed by on-disk stores. If the
data is retrieved one entry at a time, I don't see how multiple threads will help.
However, if it is retrieved in bulk I can see how it might. Not entirely sure.

The implementation in MapReduceManagerImpl.java is cache store agnostic.
The algorithm loads all keys (pinned to that owner node) and iterates over
all values, one value at a time.
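
As a rough illustration of the loop described above, here is a minimal sketch. It is not the actual MapReduceManagerImpl code; the Store interface and the names keysOwnedLocally, load, and mapLocally are hypothetical and only assume that keys are collected first and values are then fetched one entry per call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.function.BiConsumer;

public class LocalMapPhaseSketch {

    /** Hypothetical stand-in for a cache store; not an Infinispan interface. */
    interface Store<K, V> {
        Set<K> keysOwnedLocally();   // all keys pinned to this owner node
        V load(K key);               // fetches a single value per call
    }

    /** Visits every locally owned entry, one value at a time, on one thread. */
    static <K, V> int mapLocally(Store<K, V> store, BiConsumer<K, V> mapper) {
        int visited = 0;
        for (K key : store.keysOwnedLocally()) {
            V value = store.load(key);   // one store round trip per key
            if (value != null) {
                mapper.accept(key, value);
                visited++;
            }
        }
        return visited;
    }

    /** Small word-count style demo over an in-memory backing map. */
    static List<String> demo() {
        final Map<String, String> backing = new TreeMap<String, String>();
        backing.put("k1", "a b");
        backing.put("k2", "c");

        Store<String, String> store = new Store<String, String>() {
            public Set<String> keysOwnedLocally() { return backing.keySet(); }
            public String load(String key) { return backing.get(key); }
        };

        final List<String> out = new ArrayList<String>();
        mapLocally(store, (key, value) -> out.add(key + "=" + value.split(" ").length));
        return out;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In a loop of this shape, each value is fetched and mapped in sequence, so extra threads inside the store would only matter if load were replaced by a bulk or concurrent fetch.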

Now that we are breaking this down into details I am not sure how 
multiple threads in cache store would help either. Mircea?

Regards,
Vladimir
