[infinispan-issues] [JBoss JIRA] (ISPN-4372) Map/Reduce performance is dependent on cache value size
Dan Berindei (JIRA)
issues at jboss.org
Mon Jun 9 10:34:17 EDT 2014
[ https://issues.jboss.org/browse/ISPN-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974634#comment-12974634 ]
Dan Berindei commented on ISPN-4372:
------------------------------------
I don't think it's fair to say that M/R performance depends on the input cache's value size. There is also another factor involved: {{WordCountMapperEmitPerValue}} coalesces all the occurrences of the same word in the value, so the number of intermediary values {{emit()}}ed by the mapper decreases a lot as the cache value size increases.
We should confirm that the same behaviour occurs with the basic {{WordCountMapper}}; otherwise it would be fairer to say that M/R performance depends on the number of intermediary entries emitted by the Mapper, which is to be expected.
However, it is surprising that M/R throughput doesn't continue increasing as the cache value size increases (and the number of intermediary values decreases) past 32KB. This is definitely worth investigating.
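To make the coalescing effect concrete, here is a minimal, self-contained sketch of the two emit strategies. It is not the actual {{WordCountMapper}} or {{WordCountMapperEmitPerValue}} code: {{EmitCountSketch}}, {{RecordingCollector}}, {{mapPerWord}} and {{mapPerValue}} are hypothetical names, and the collector is a plain stand-in that only counts {{emit()}} calls rather than the Infinispan {{Collector}}.

```java
import java.util.HashMap;
import java.util.Map;

public class EmitCountSketch {

    /** Hypothetical stand-in for a collector: counts emit() calls and sums counts per word. */
    static final class RecordingCollector {
        int emitCalls = 0;
        final Map<String, Integer> totals = new HashMap<>();

        void emit(String word, int count) {
            emitCalls++;
            totals.merge(word, count, Integer::sum);
        }
    }

    /** Emits (word, 1) for every occurrence, so emits grow with the number of words. */
    static void mapPerWord(String value, RecordingCollector collector) {
        for (String word : value.split("\\s+")) {
            if (!word.isEmpty()) {
                collector.emit(word, 1);
            }
        }
    }

    /** Coalesces occurrences within one value first, then emits once per distinct word. */
    static void mapPerValue(String value, RecordingCollector collector) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : value.split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            collector.emit(entry.getKey(), entry.getValue());
        }
    }

    public static void main(String[] args) {
        String value = "to be or not to be"; // 6 words, 4 distinct

        RecordingCollector perWord = new RecordingCollector();
        mapPerWord(value, perWord);

        RecordingCollector perValue = new RecordingCollector();
        mapPerValue(value, perValue);

        // Same totals per word, but fewer intermediary entries when coalescing.
        System.out.println("per-word emits: " + perWord.emitCalls);   // 6
        System.out.println("per-value emits: " + perValue.emitCalls); // 4
    }
}
```

With a fixed vocabulary, a larger value contains more repeats of each word, so the per-value strategy emits proportionally fewer intermediary entries as the value size grows, while the per-word strategy emits one entry per occurrence regardless.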
> Map/Reduce performance is dependent on cache value size
> -------------------------------------------------------
>
> Key: ISPN-4372
> URL: https://issues.jboss.org/browse/ISPN-4372
> Project: Infinispan
> Issue Type: Feature Request
> Components: Distributed Execution and Map/Reduce
> Affects Versions: 7.0.0.Alpha4
> Reporter: Alan Field
> Assignee: Dan Berindei
> Labels: performance
>
> Performance testing the Map/Reduce changes has shown that the performance improvements vary based on the size of the values in the cache. [1] Using values from 8kB to 128kB shows a large performance increase over Infinispan 6, but smaller and larger values are the same or slower than Infinispan 6.
> [1] http://blog.infinispan.org/2014/06/mapreduce-performance-improvements.html
--
This message was sent by Atlassian JIRA
(v6.2.3#6260)