Guys,

I was looking at this again recently, and I still do not understand how a combiner could have a different interface than a Reducer! Hadoop forces the user to implement a combiner as a Reducer (see http://developer.yahoo.com/hadoop/tutorial/module4.html#functionality and http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/Job.html#setCombinerClass%28java.lang.Class%29). In addition, the original MapReduce paper does not mention any change of types.
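Just to make that concrete, here is roughly how the standard WordCount example reuses its sum Reducer as the combiner in the new API (the usual IntSumReducer shape; the point is that setCombinerClass literally takes a Reducer class, so the input and output types must line up):

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
  private final IntWritable result = new IntWritable();

  @Override
  protected void reduce(Text key, Iterable<IntWritable> values, Context context)
      throws java.io.IOException, InterruptedException {
    // sum is commutative and associative, so it is safe to run per mapper too
    int sum = 0;
    for (IntWritable v : values) {
      sum += v.get();
    }
    result.set(sum);
    context.write(key, result);
  }
}

// in the driver:
// job.setReducerClass(IntSumReducer.class);
// job.setCombinerClass(IntSumReducer.class);  // same class, same Reducer interface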

What we have admittedly done wrong is to apply the Reducer to the output of each individual Mapper without checking whether the reduce function is both commutative and associative! This can lead to incorrect results: http://philippeadjiman.com/blog/2010/01/14/hadoop-tutorial-series-issue-4-to-use-or-not-to-use-a-combiner/
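The classic counterexample from that post is averaging: the mean is not associative, so silently rerunning the reduce function per mapper changes the answer. A hypothetical AvgReducer (made-up name, just to illustrate):

import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Fine as a reducer, broken as a combiner:
public class AvgReducer extends Reducer<Text, DoubleWritable, Text, DoubleWritable> {
  @Override
  protected void reduce(Text key, Iterable<DoubleWritable> values, Context context)
      throws java.io.IOException, InterruptedException {
    double sum = 0;
    int count = 0;
    for (DoubleWritable v : values) {
      sum += v.get();
      count++;
    }
    // emitting the mean throws away the count, so partial means cannot be merged
    context.write(key, new DoubleWritable(sum / count));
  }
}

// With values {1, 2} on one mapper and {3} on another:
//   no combiner:   avg(1, 2, 3) = 2.0
//   as a combiner: avg(avg(1, 2), avg(3)) = avg(1.5, 3.0) = 2.25   <- wrong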

So yes, I am all for adding a Combiner (it should take over the optional per-mapper reduce that we run automatically now), but I do not see why we have to change the interface! A rough sketch of what I mean is below.
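Something like this (hypothetical names, just the shape): the combiner stays a plain Reducer, it simply becomes explicit and opt-in rather than implicit:

// Hypothetical sketch of the proposal (made-up setter names): keep the
// Reducer interface, just make the per-mapper pass explicit.
pipeline.setReducer(SumReducer.class);
pipeline.setCombiner(SumReducer.class);  // same Reducer class; by setting it the
                                         // user asserts the function is commutative
                                         // and associative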


Regards,
Vladimir