On 18/Feb/2014, at 10:59 , Dan Berindei <dan.berindei(a)gmail.com> wrote:
> I think Hadoop only loads a block of intermediate values in memory at
> once, and can even sort the intermediate values (with a user-supplied comparison function)
> so that the reduce function can work on a sorted list without loading the values into memory
> itself.
Actually, Hadoop sorts on the map node; the last two map-side steps are sort and combine. Reduce
nodes fetch partitions from the map nodes and simply merge them. These partitions are
fetched incrementally, and whenever a given key's run has ended in all the partially
fetched partitions, reduce() is called for that key.
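To make the idea concrete, here is a minimal sketch (not Hadoop's actual code; the partition format and reduce signature are assumptions) of the reduce-side merge: each partition arrives as a stream of (key, value) pairs already sorted by the map node, a k-way merge interleaves them in key order, and reduce() fires for a key as soon as that key's run is complete in every stream.

```python
# Hypothetical sketch of the reduce-side merge described above.
# Each "partition" stands for the sorted (key, value) pairs fetched
# from one map node; heapq.merge streams them in global key order,
# and groupby closes a key's run once a different key appears in
# every stream, at which point reduce_fn is invoked for that key.
import heapq
from itertools import groupby
from operator import itemgetter

def merge_and_reduce(partitions, reduce_fn):
    # k-way merge of the already-sorted partition streams
    merged = heapq.merge(*partitions, key=itemgetter(0))
    # group consecutive pairs sharing a key; call reduce per key
    for key, group in groupby(merged, key=itemgetter(0)):
        yield reduce_fn(key, (value for _, value in group))

# Word-count-style usage: two sorted partitions from two map nodes.
p1 = [("a", 1), ("b", 1)]
p2 = [("a", 2), ("c", 3)]
result = list(merge_and_reduce([p1, p2], lambda k, vs: (k, sum(vs))))
# result == [("a", 3), ("b", 1), ("c", 3)]
```

Because the inputs are sorted, the merge never needs all values for a key in memory at once; it only needs the head of each partition stream.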
Cheers, MP
--
Marcelo Pasin
Université de Neuchâtel · Institut d'informatique
rue Emile-Argand 11 · Case postale 158 · 2000 Neuchâtel · Switzerland