[infinispan-dev] MapReduce limitations and suggestions.

Dan Berindei dan.berindei at gmail.com
Tue Feb 18 13:41:52 EST 2014


On Tue, Feb 18, 2014 at 5:46 PM, Evangelos Vazaios <vagvaz at gmail.com> wrote:

> On 02/18/2014 05:36 PM, Vladimir Blagojevic wrote:
> > On 2/18/2014, 4:59 AM, Dan Berindei wrote:
> >>
> >> The limitation we have now is that in the reduce phase, the entire
> >> list of values for one intermediate key must be in memory at once. I
> >> think Hadoop only loads a block of intermediate values in memory at
> >> once, and can even sort the intermediate values (with a user-supplied
> >> comparison function) so that the reduce function can work on a sorted
> >> list without loading the values in memory itself.
> >>
> >>
> > Dan and others,
> >
> > This is where Sanne's idea comes into play. Why collect the entire list of
> > intermediate values for each intermediate key and then invoke reduce on
> > those values when we can invoke reduce each time a new intermediate value
> > gets inserted?
> >
> Because you can't. What you are describing is more like combining than
> reducing. If there is a combiner in the MapReduceTask, you can execute
> the combiner on a subset (in your case, 2) of the values with the same key
> and output one. But this is not always possible.
>
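
To make the combiner/reducer distinction concrete, here is a minimal sketch in
plain Java. It does not use the Infinispan API; the class and method names
(CombineVsReduce, foldIncrementally, median) are made up for illustration.
Folding values one at a time works for something associative like a sum, but
a reduce like "median" needs the whole list, and folding it pairwise gives a
different answer.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.BinaryOperator;

public class CombineVsReduce {

    // Hypothetical helper: folds values one at a time, keeping only a single
    // running result in memory. This is only equivalent to a full reduce when
    // op is associative and its output type matches its input type.
    public static <V> V foldIncrementally(Iterable<V> values, BinaryOperator<V> op) {
        V acc = null;
        for (V v : values) {
            acc = (acc == null) ? v : op.apply(acc, v);
        }
        return acc; // null if there were no values
    }

    // A reduce that needs every value at once: the median of the full list
    // cannot be derived from medians of sub-lists.
    public static double median(List<Integer> values) {
        List<Integer> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        int n = sorted.size();
        return n % 2 == 1
                ? sorted.get(n / 2)
                : (sorted.get(n / 2 - 1) + sorted.get(n / 2)) / 2.0;
    }

    public static void main(String[] args) {
        List<Integer> values = List.of(1, 2, 9, 10, 100);

        // Sum: incremental folding gives the same result as a full reduce.
        System.out.println(foldIncrementally(values, Integer::sum)); // 122

        // Median over the full list.
        System.out.println(median(values)); // 9.0

        // "Median of two values" applied pairwise as each value arrives.
        List<Double> asDoubles = List.of(1.0, 2.0, 9.0, 10.0, 100.0);
        System.out.println(foldIncrementally(asDoubles, (a, b) -> (a + b) / 2.0)); // 53.8125
    }
}
```

So invoking reduce on every insert is fine when the reduce function happens to
be usable as a combiner, but it can't be the general contract.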

In theory we could stream each intermediate value independently to the
combiner and then to the reducer's node. The reducer could then start up
immediately on that node, blocking whenever it has no more values to
process, instead of waiting for the mapping phase to finish on all the
mapping nodes. But I imagine that would be tricky to implement.
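
A rough single-JVM sketch of what I mean, with threads standing in for nodes
and a BlockingQueue standing in for the network. None of this is Infinispan
code; StreamingReduceSketch and run are invented names, and an empty Optional
is my assumed end-of-stream marker, one per mapper. The reducer here is a
running sum, which is the easy case.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;

public class StreamingReduceSketch {

    // Each inner list is the intermediate values one "mapper" emits for a single key.
    public static int run(List<List<Integer>> mapperOutputs) {
        BlockingQueue<Optional<Integer>> channel = new LinkedBlockingQueue<>();
        // One thread per mapper plus one for the reducer, so nobody starves.
        ExecutorService pool = Executors.newFixedThreadPool(mapperOutputs.size() + 1);
        try {
            // The reducer starts before any mapper has produced a value.
            Future<Integer> result = pool.submit(() -> {
                int sum = 0;
                int finishedMappers = 0;
                while (finishedMappers < mapperOutputs.size()) {
                    Optional<Integer> next = channel.take(); // blocks while nothing is available
                    if (next.isPresent()) {
                        sum += next.get();
                    } else {
                        finishedMappers++; // this mapper has no more values
                    }
                }
                return sum;
            });

            // Mappers stream values one by one instead of handing over a full list.
            for (List<Integer> output : mapperOutputs) {
                pool.submit(() -> {
                    for (Integer value : output) {
                        channel.put(Optional.of(value));
                    }
                    channel.put(Optional.empty()); // end-of-stream marker
                    return null;
                });
            }
            return result.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(List.of(List.of(1, 2, 3), List.of(10, 20)))); // 36
    }
}
```

The tricky parts in a real cluster are exactly what this leaves out: knowing
when every mapping node is done, node failures mid-stream, and reducers that
still have to buffer everything because they are not incremental.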


> > https://issues.jboss.org/browse/ISPN-3999
> >
> > Cheers,
> > Vladimir
> > _______________________________________________
> > infinispan-dev mailing list
> > infinispan-dev at lists.jboss.org
> > https://lists.jboss.org/mailman/listinfo/infinispan-dev
> >
>