As per the feedback from Ondra and Brent, the Reducer's API constraint, as described
below, seems to be too limiting for our users.
AFAIR the main reason for Reducer.reduce to consume and return the same type of object was
to allow some internal optimisations, but the Combiner, as described in the paper Ondra sent,
seems to address at least some of these performance concerns.
+1 for enhancing this.
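To make the proposal a bit more concrete, here is a minimal sketch of the shape being discussed. It is not Infinispan code: the Combiner and TypedReducer interfaces, the SumCount class and the average helper are all hypothetical names made up for illustration. The idea is that the combiner keeps the same input and output type, so it can still run locally on each node, while the global reducer is free to return a different type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical interfaces for illustration only; these are NOT the Infinispan API.
interface Combiner<K, V> {
    // Input and output share type V, so this can run locally on each node.
    V combine(K key, Iterator<V> values);
}

interface TypedReducer<K, V, R> {
    // The global reduce may return a type R that differs from V.
    R reduce(K key, Iterator<V> values);
}

// Intermediate value: a partial sum and the number of items behind it.
final class SumCount {
    final long sum;
    final long count;
    SumCount(long sum, long count) { this.sum = sum; this.count = count; }
}

class CombinerSketch {
    // Merges partial sums and counts; safe to apply on any subset of the values.
    static final Combiner<String, SumCount> COMBINER = (key, values) -> {
        long sum = 0, count = 0;
        while (values.hasNext()) {
            SumCount sc = values.next();
            sum += sc.sum;
            count += sc.count;
        }
        return new SumCount(sum, count);
    };

    // Consumes SumCount values but returns a Double (the average).
    static final TypedReducer<String, SumCount, Double> REDUCER = (key, values) -> {
        SumCount total = COMBINER.combine(key, values);
        return total.count == 0 ? 0.0 : (double) total.sum / total.count;
    };

    // Simulates one key whose mapped values are spread over several nodes.
    static double average(String key, List<List<Integer>> valuesPerNode) {
        List<SumCount> partials = new ArrayList<>();
        for (List<Integer> nodeValues : valuesPerNode) {
            List<SumCount> mapped = new ArrayList<>();
            for (int v : nodeValues) {
                mapped.add(new SumCount(v, 1)); // stand-in for the map phase output
            }
            // Local combine: only one SumCount per node would cross the network.
            partials.add(COMBINER.combine(key, mapped.iterator()));
        }
        // Global reduce over the per-node partial results.
        return REDUCER.reduce(key, partials.iterator());
    }

    public static void main(String[] args) {
        double avg = average("a", Arrays.asList(Arrays.asList(2, 4), Arrays.asList(6)));
        System.out.println(avg);
    }
}
```

With two simulated nodes holding [2, 4] and [6] for the same key, each node ships a single partial result and the global reduce returns the average as a Double, which the current same-type constraint would not allow.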
On 3 Jan 2012, at 00:48, Brent Douglas wrote:
Hi Ondra & Vladimir,
I am also very keen to see Reducer.reduce be allowed to return a different type than it
accepts, if that is possible. I was previously looking at using Infinispan's MapReduce
to replace Hazelcast to run reports in a Seam 2 app, but was put off by this and another
issue that I think has since been resolved.
Sincerely,
Brent Douglas
On Tue, Jan 3, 2012 at 1:49 AM, Vladimir Blagojevic <vblagoje(a)redhat.com> wrote:
Hi Ondra,
On 12-01-02 9:40 AM, Ondra Nevelik wrote:
> Hi all,
> I was supposed to write an arbitrary app with Infinispan so I wanted to rewrite one
> of my programs that is implemented using Hadoop MapReduce to have a comparison of
> performance between the two.
>
> However, the way MapReduce is done in Infinispan right now greatly limits the number
> of problems that can be solved with it: there is a reduce phase on local data on each of
> the compute nodes to decrease the amount of data transferred, and a
> "global" reduce after that. This means that the types of the input keys/values have
> to be the same as the output types, which differs from the original MapReduce concept.
This is simply not true!
>
> A possible solution would be to use a "combiner function" (see [1])
> instead of the local reduce phase, so that the amount of data transferred could still be
> reduced where applicable (e.g. the WordCount example would still use the reduce function as
> the combiner), but it would be possible to have different input and output types. Having
> gone briefly through the classes in the mapreduce package, I think there won't even be
> much work needed.
>
> What do you think? Is this idea worth implementing?
Possibly. I have not looked into the combiner function. Sanne has mentioned
it before and he might have further comments!
Regards,
Vladimir
>
> [1] part 4.1 of
> http://www.mendeley.com/research/mapreducemerge-simplified-relational-dat...
>
> Ondrej Nevelik
> EDG QE
>
> Red Hat Czech s.r.o.
> Purkynova 99 612 45 Brno, Czech Republic
> mobile: +420 724 520 140
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev(a)lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev