Re: [infinispan-dev] [inifnispan-dev] MapReduce enhancement

Monday, 2 January 2012

Hi Ondra,

On 12-01-02 9:40 AM, Ondra Nevelik wrote:
...
 Hi all,
 I was supposed to write an arbitrary app with Infinispan so I wanted to rewrite one of my
programs that is implemented using Hadoop MapReduce to have a comparison of performance
between the two.

 However, the way MapReduce is done in Infinispan right now greatly limits the number of
problems that can be solved with it: there is a reduce phase on local data on each of the
compute nodes to decrease the amount of data transferred, followed by a "global"
reduce. This means that the types of the input keys/values have to be the same as
the output types, which differs from the original MapReduce concept, where no such
restriction exists.
...
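
To illustrate the restriction described above, here is a minimal sketch in plain Java. It
does not use the Infinispan API: the `SameTypeReducer` interface, the `SUM` reducer and the
`reduceInTwoPhases` helper are hypothetical names invented for this example. It assumes only
what the message states, namely that the same reduce function runs first on each node's
local data and then again on the partial results, which is why its result type has to match
its input type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class TwoPhaseReduceSketch {

    // Hypothetical reducer shape: the result has the same type V as the inputs,
    // because the output of the local phase is fed back in as input of the global phase.
    interface SameTypeReducer<K, V> {
        V reduce(K key, Iterator<V> values);
    }

    // WordCount-style reducer: sums the counts seen for one key.
    static final SameTypeReducer<String, Integer> SUM = (key, values) -> {
        int total = 0;
        while (values.hasNext()) {
            total += values.next();
        }
        return total;
    };

    // Runs the reducer once per node (local phase), then once more over the
    // per-node partial results ("global" phase).
    public static int reduceInTwoPhases(String key, List<List<Integer>> perNodeValues) {
        List<Integer> partials = new ArrayList<>();
        for (List<Integer> nodeValues : perNodeValues) {
            partials.add(SUM.reduce(key, nodeValues.iterator()));
        }
        return SUM.reduce(key, partials.iterator());
    }

    public static void main(String[] args) {
        // Two simulated nodes holding counts for the same word.
        List<List<Integer>> perNode = Arrays.asList(Arrays.asList(1, 1), Arrays.asList(1));
        System.out.println(reduceInTwoPhases("word", perNode)); // prints 3
    }
}
```

A reducer that returned, say, a String from Integer inputs could not be reused for the
second phase, since that phase would then receive Strings.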

 A possible solution would be to use a "combiner function" (see [1]) instead of
the local reduce phase, so that the amount of data transferred could still be reduced where
applicable (e.g. the WordCount example would still use the reduce function as the reducer),
but it would be possible to have different input and output types. Having gone briefly
through the classes in the mapreduce package, I think there won't even be much
work needed.
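
The combiner idea can be sketched roughly as follows, again in plain Java rather than
against the Infinispan API. The `Combiner` and `Reducer` interfaces, the `run` driver and
the `wordCountLabel` example are all hypothetical and only illustrate the proposal: the
combiner keeps the intermediate value type so it can safely run on each node, while the
final reducer is free to return a different type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class CombinerSketch {

    // Hypothetical: a combiner maps values of type V to a single V, so it can run per node.
    interface Combiner<K, V> {
        V combine(K key, Iterator<V> values);
    }

    // Hypothetical: the final reducer may produce a result type R different from V.
    interface Reducer<K, V, R> {
        R reduce(K key, Iterator<V> values);
    }

    static int sum(Iterator<Integer> values) {
        int total = 0;
        while (values.hasNext()) {
            total += values.next();
        }
        return total;
    }

    // Applies the combiner to each node's local values, then reduces the combined
    // values once, as the single "global" step.
    public static <K, V, R> R run(K key, List<List<V>> perNodeValues,
                                  Combiner<K, V> combiner, Reducer<K, V, R> reducer) {
        List<V> combined = new ArrayList<>();
        for (List<V> nodeValues : perNodeValues) {
            combined.add(combiner.combine(key, nodeValues.iterator()));
        }
        return reducer.reduce(key, combined.iterator());
    }

    // Integer counts in, String report out: input and output types differ.
    public static String wordCountLabel(String word, List<List<Integer>> perNodeCounts) {
        Combiner<String, Integer> localSum = (k, vs) -> sum(vs);
        Reducer<String, Integer, String> label = (k, vs) -> k + " occurs " + sum(vs) + " times";
        return run(word, perNodeCounts, localSum, label);
    }

    public static void main(String[] args) {
        List<List<Integer>> perNode = Arrays.asList(Arrays.asList(1, 1), Arrays.asList(1));
        System.out.println(wordCountLabel("word", perNode)); // prints "word occurs 3 times"
    }
}
```

Only one combined value per key and node would cross the network in this sketch, so the
saving in transferred data is kept while the type restriction goes away.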

 What do you think? Is this idea worth implementing?

Possibly. I have not looked into the combiner function. Sanne has mentioned
it before and he might have further comments.

Regards,
Vladimir
...

 [1] part 4.1 of
http://www.mendeley.com/research/mapreducemerge-simplified-relational-dat...

 Ondrej Nevelik
 EDG QE

 Red Hat Czech s.r.o.
 Purkynova 99 612 45 Brno, Czech Republic
 mobile: +420 724 520 140

 _______________________________________________
 infinispan-dev mailing list
 infinispan-dev(a)lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev 
