[
https://issues.jboss.org/browse/ISPN-4022?page=com.atlassian.jira.plugin....
]
Vladimir Blagojevic commented on ISPN-4022:
-------------------------------------------
[~dan.berindei] and [~mircea.markus], I thought about this a bit further and concluded
that there are additional benefits to this approach. I would call this enhancement
"staggered combine". Just as Dan suggests, we should invoke combine at certain
thresholds (say, 1K entries in a combiner) during the map phase and move intermediate
KOut/VOut values around the cluster as these thresholds are reached. The benefit is that
we not only relieve memory pressure, we also never run out of RAM storing map output in a
Collector. In addition, staggered migration of KOut/VOut to the intermediate cache should
alleviate some of the insertion stress we have observed in performance tests.
If we are able to combine this feature with ISPN-3999 (suggested by Sanne), this should be
awesome! WDYT?
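To make the "staggered combine" idea concrete, here is a minimal sketch in plain Java. It is not the Infinispan Collector API: the class name StaggeredCollector, the threshold parameter, and the sink callback (a stand-in for writes to the intermediate cache) are all hypothetical, and the threshold here counts emitted values rather than distinct keys.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;

/** Hypothetical collector for illustration only; not an Infinispan class. */
public class StaggeredCollector<K, V> {
    private final int threshold;              // number of buffered values that triggers a combine
    private final BinaryOperator<V> combiner; // merges two values emitted for the same key
    private final BiConsumer<K, V> sink;      // assumed stand-in for the intermediate cache
    private final Map<K, List<V>> buffer = new HashMap<>();
    private int buffered;

    public StaggeredCollector(int threshold, BinaryOperator<V> combiner, BiConsumer<K, V> sink) {
        this.threshold = threshold;
        this.combiner = combiner;
        this.sink = sink;
    }

    /** Called by the mapper for every KOut/VOut pair it produces. */
    public void emit(K key, V value) {
        buffer.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        if (++buffered >= threshold) {
            flush();
        }
    }

    /** Combines the buffered values per key, hands them to the sink, and frees the buffer. */
    public void flush() {
        for (Map.Entry<K, List<V>> e : buffer.entrySet()) {
            // Lists in the buffer are never empty, so the Optional always holds a value.
            V combined = e.getValue().stream().reduce(combiner).get();
            sink.accept(e.getKey(), combined);
        }
        buffer.clear();
        buffered = 0;
    }
}
```

In this sketch the mapper would call flush() once more when it finishes, and the sink receives partially combined values, so whatever sits behind it still has to merge values that arrive for the same key in different batches.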
M/R: Run the combiner concurrently with the mapper
--------------------------------------------------
Key: ISPN-4022
URL: https://issues.jboss.org/browse/ISPN-4022
Project: Infinispan
Issue Type: Feature Request
Components: Core, Distributed Execution and Map/Reduce
Affects Versions: 6.0.1.Final
Reporter: Dan Berindei
Assignee: Vladimir Blagojevic
Fix For: 7.0.0.Final
Because we only run the combiner after the mapping phase has finished, we need to keep all
the results of the mapping phase in memory at once. We should split the output of the
mapper into chunks and allow the combiner to process chunks while the mapper is still
running, relieving some of the memory pressure. Maybe even block the mapper if there are
too many chunks in flight.
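One possible shape for this, sketched in plain Java rather than against the Infinispan Map/Reduce API: the mapper hands fixed-size chunks to a bounded queue, a combiner thread drains it concurrently, and the mapper blocks once the queue is full. The class ChunkedCombine and the parameters chunkSize and maxInFlight are hypothetical names, and word counting stands in for a real mapper/combiner pair.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical illustration only; not an Infinispan class. */
public class ChunkedCombine {
    // Sentinel that tells the combiner the mapper is done; compared by identity.
    private static final List<String> END = new ArrayList<>();

    public static Map<String, Integer> countWords(List<String> words, int chunkSize, int maxInFlight) {
        // At most maxInFlight chunks may wait for the combiner at any time.
        BlockingQueue<List<String>> chunks = new ArrayBlockingQueue<>(maxInFlight);
        Map<String, Integer> combined = new ConcurrentHashMap<>();

        // Combiner: runs concurrently with the mapper and folds each chunk into the result.
        Thread combiner = new Thread(() -> {
            try {
                for (List<String> chunk = chunks.take(); chunk != END; chunk = chunks.take()) {
                    for (String word : chunk) {
                        combined.merge(word, 1, Integer::sum);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        combiner.start();

        // Mapper: runs on the calling thread and emits its output in chunks.
        try {
            List<String> chunk = new ArrayList<>(chunkSize);
            for (String word : words) {
                chunk.add(word);
                if (chunk.size() == chunkSize) {
                    chunks.put(chunk); // blocks while maxInFlight chunks are pending
                    chunk = new ArrayList<>(chunkSize);
                }
            }
            if (!chunk.isEmpty()) {
                chunks.put(chunk);
            }
            chunks.put(END);
            combiner.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while mapping", e);
        }
        return combined;
    }
}
```

With this layout the mapper never holds more than one partially filled chunk, and the queue never holds more than maxInFlight full ones, so the memory used for uncombined mapper output is bounded by roughly (maxInFlight + 1) * chunkSize entries, plus the chunk the combiner is currently processing.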