On 2 Jan 2012, at 16:49, Vladimir Blagojevic wrote:
Hi Ondra,
On 12-01-02 9:40 AM, Ondra Nevelik wrote:
> Hi all,
> I was supposed to write an arbitrary app with Infinispan, so I wanted to rewrite one
> of my programs that is implemented using Hadoop MapReduce to have a comparison of
> performance between the two.
>
> However, the way MapReduce is done in Infinispan right now greatly limits the number
> of problems that can be solved with it: there is a reduce phase on local data on each of
> the compute nodes to decrease the amount of data transferred, and there is a
> "global" reduce after that. This means that the types of the input keys/values have
> to be the same as the output types, and that differs from the original MapReduce concept.
This is simply not true!
Vladimir, I think Ondra refers to the fact that the Reducer.reduce function
consumes and returns the same type of objects: VOut [1].
IIRC the reason behind this restrictive signature was to allow all sorts of internal
optimisations, e.g. parallelising the reduce phase.
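
To make the point about the signature concrete, here is a minimal sketch in plain Java. It does not use the Infinispan API: SameTypeReducer, SUM and twoPhase are hypothetical names made up for illustration, and the two-phase flow is only my reading of the local/global reduce Ondra describes. It shows that when reduce consumes and returns the same value type, the output of a per-node reduce can be fed straight back into the same function.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Illustration only: a stand-in for the signature under discussion,
// not the Infinispan interface itself.
class ReduceSketch {

    // Hypothetical reducer shape: consumes and returns the same value type V.
    interface SameTypeReducer<K, V> {
        V reduce(K key, Iterator<V> values);
    }

    // Sums counts for a key. It accepts raw values and already-reduced
    // partial sums alike, because both are Integers.
    static final SameTypeReducer<String, Integer> SUM = (key, values) -> {
        int total = 0;
        while (values.hasNext()) {
            total += values.next();
        }
        return total;
    };

    // Assumed flow: reduce each node's values locally, then run the same
    // reducer once more over the per-node results ("global" reduce).
    static <K, V> V twoPhase(SameTypeReducer<K, V> reducer, K key,
                             List<List<V>> perNodeValues) {
        List<V> partials = new ArrayList<>();
        for (List<V> nodeValues : perNodeValues) {
            partials.add(reducer.reduce(key, nodeValues.iterator()));
        }
        return reducer.reduce(key, partials.iterator());
    }

    public static void main(String[] args) {
        // Two nodes holding counts for the same word.
        List<List<Integer>> perNode = Arrays.asList(
                Arrays.asList(1, 1, 1),
                Arrays.asList(1, 1));
        System.out.println(twoPhase(SUM, "infinispan", perNode)); // prints 5
    }
}
```

If reduce returned a type different from what it consumes, the second call in twoPhase would not type-check, which is the limitation Ondra is pointing at.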
[1] http://bit.ly/vO0GjY