[infinispan-issues] [JBoss JIRA] (ISPN-4022) M/R: Run the combiner concurrently with the mapper

Vladimir Blagojevic (JIRA) issues at jboss.org
Wed Mar 5 11:45:35 EST 2014


    [ https://issues.jboss.org/browse/ISPN-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12950351#comment-12950351 ] 

Vladimir Blagojevic commented on ISPN-4022:
-------------------------------------------

[~dan.berindei] and [~mircea.markus], I think I have a really good solution: it iterates over the container's key/value pairs in parallel while combining in batches, in a staggered fashion. As Dan suggested, the algorithm works both with and without a combiner. I have enabled it for large tasks (distributed map/reduce with no input keys specified). See details at https://github.com/infinispan/infinispan/pull/2413
                
> M/R: Run the combiner concurrently with the mapper
> --------------------------------------------------
>
>                 Key: ISPN-4022
>                 URL: https://issues.jboss.org/browse/ISPN-4022
>             Project: Infinispan
>          Issue Type: Feature Request
>          Components: Core, Distributed Execution and Map/Reduce
>    Affects Versions: 6.0.1.Final
>            Reporter: Dan Berindei
>            Assignee: Vladimir Blagojevic
>             Fix For: 7.0.0.Final
>
>
> Because we only run the combiner after the mapping phase has finished, we need to keep all the results of the mapping phase in memory at once. We should split the output of the mapper into chunks and allow the combiner to process chunks while the mapper is still running, relieving some of the memory pressure. Maybe even block the mapper if there are too many chunks in flight.
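
The chunking idea in the description can be illustrated with a minimal, standalone Java sketch. This is not the code from the pull request and does not use any Infinispan API: the class name StaggeredCombineSketch, the method mapAndCombine, and the parameters chunkSize and maxChunksInFlight are all hypothetical, and a toy word count stands in for the real mapper and combiner. It assumes one mapper thread and one combiner thread connected by a bounded queue, so the mapper blocks once too many chunks are in flight.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; not Infinispan code.
final class StaggeredCombineSketch {

    // Sentinel chunk (compared by identity) that tells the combiner the mapper is done.
    private static final List<Map.Entry<String, Integer>> END = new ArrayList<>();

    static Map<String, Integer> mapAndCombine(List<String> lines, int chunkSize, int maxChunksInFlight) {
        // Bounded queue: put() blocks the mapper when maxChunksInFlight chunks are waiting.
        BlockingQueue<List<Map.Entry<String, Integer>>> chunks =
                new ArrayBlockingQueue<>(maxChunksInFlight);

        // Toy mapper: emits (word, 1) pairs and hands them over in fixed-size chunks.
        Callable<Void> mapper = () -> {
            try {
                List<Map.Entry<String, Integer>> chunk = new ArrayList<>(chunkSize);
                for (String line : lines) {
                    for (String word : line.split("\\s+")) {
                        if (word.isEmpty()) {
                            continue;
                        }
                        chunk.add(new AbstractMap.SimpleEntry<>(word, 1));
                        if (chunk.size() == chunkSize) {
                            chunks.put(chunk);
                            chunk = new ArrayList<>(chunkSize);
                        }
                    }
                }
                if (!chunk.isEmpty()) {
                    chunks.put(chunk);
                }
            } finally {
                // Always signal the end so the combiner cannot wait forever.
                chunks.put(END);
            }
            return null;
        };

        // Toy combiner: sums the values per key, one chunk at a time,
        // while the mapper is still producing later chunks.
        Callable<Map<String, Integer>> combiner = () -> {
            Map<String, Integer> combined = new HashMap<>();
            while (true) {
                List<Map.Entry<String, Integer>> chunk = chunks.take();
                if (chunk == END) {
                    return combined;
                }
                for (Map.Entry<String, Integer> e : chunk) {
                    combined.merge(e.getKey(), e.getValue(), Integer::sum);
                }
            }
        };

        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<Void> mapping = pool.submit(mapper);
            Future<Map<String, Integer>> combining = pool.submit(combiner);
            mapping.get();
            return combining.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while mapping/combining", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("mapper or combiner failed", e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }
}
```

In this sketch, the number of mapped pairs waiting in the queue is bounded by roughly chunkSize * maxChunksInFlight, plus the chunk each thread is currently working on, rather than growing with the size of the input. A call such as StaggeredCombineSketch.mapAndCombine(lines, 1000, 4) would keep at most four chunks queued between the two threads.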


