[infinispan-issues] [JBoss JIRA] (ISPN-4440) Remove setMaxCollectorSize API from MapReduceTask
Vladimir Blagojevic (JIRA)
issues at jboss.org
Mon Jul 7 10:24:25 EDT 2014
[ https://issues.jboss.org/browse/ISPN-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vladimir Blagojevic resolved ISPN-4440.
---------------------------------------
Fix Version/s: 7.0.0.Alpha5
Resolution: Done
> Remove setMaxCollectorSize API from MapReduceTask
> -------------------------------------------------
>
> Key: ISPN-4440
> URL: https://issues.jboss.org/browse/ISPN-4440
> Project: Infinispan
> Issue Type: Task
> Security Level: Public(Everyone can see)
> Components: Distributed Execution and Map/Reduce
> Affects Versions: 7.0.0.Alpha4
> Reporter: Vladimir Blagojevic
> Assignee: Vladimir Blagojevic
> Priority: Minor
> Fix For: 7.0.0.Alpha5
>
>
> During the refinement of parallel execution of the M/R algorithm, we introduced a maxCollectorSize abstraction at the level of MapReduceTask. The idea was that during execution of the map/combine phase, the number of intermediate keys/values collected in a Collector could become very large. By limiting the size of the collector, intermediate keys/values are transferred to the intermediate cache in batches before the reduce phase is executed, which also avoids OutOfMemoryError issues.
> However, during the extensive performance testing phase, Alan Field, Dan Berindei and I concluded that a maxCollectorSize of 10000 entries gives the best trade-off between performance and memory use. Therefore there is no need to expose this value to MapReduceTask users.
> Having said that, there might be some use cases where holding 10000 intermediate objects with a large memory footprint leads to OOM; in such cases users should allocate more heap to MapReduceTasks. We might consider reintroducing this API should such a need arise.
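
For illustration only, the batching idea described above can be sketched roughly as follows. This is a minimal standalone sketch and not Infinispan's actual implementation: the class name BatchingCollector, the intermediateSink callback and the choice to count every emitted value as one entry are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/**
 * Hypothetical size-bounded collector (NOT the Infinispan class).
 * It buffers intermediate key/value pairs and hands them to a sink
 * in batches once maxCollectorSize entries have accumulated.
 */
class BatchingCollector<K, V> {
    private final int maxCollectorSize;
    // Assumption: stands in for "transfer to the intermediate cache".
    private final Consumer<Map<K, List<V>>> intermediateSink;
    private Map<K, List<V>> buffer = new HashMap<>();
    private int size; // number of buffered values (assumed meaning of "entries")

    BatchingCollector(int maxCollectorSize, Consumer<Map<K, List<V>>> intermediateSink) {
        if (maxCollectorSize <= 0) {
            throw new IllegalArgumentException("maxCollectorSize must be positive");
        }
        this.maxCollectorSize = maxCollectorSize;
        this.intermediateSink = intermediateSink;
    }

    /** Called by the map/combine phase for every intermediate pair. */
    void emit(K key, V value) {
        buffer.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        size++;
        if (size >= maxCollectorSize) {
            flush(); // bound memory use by shipping the batch early
        }
    }

    /** Ships whatever is buffered; call once more before the reduce phase. */
    void flush() {
        if (size == 0) {
            return;
        }
        Map<K, List<V>> batch = buffer;
        buffer = new HashMap<>(); // start a fresh buffer so the batch can be released
        size = 0;
        intermediateSink.accept(batch);
    }
}
```

With a fixed limit such as the 10000 entries mentioned above, a caller would construct the collector once per map task, call emit() for every intermediate pair, and call flush() a final time before the reduce phase starts.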
--
This message was sent by Atlassian JIRA
(v6.2.6#6264)