[infinispan-dev] Parallel M/R

Vladimir Blagojevic vblagoje at redhat.com
Mon Dec 9 10:21:28 EST 2013


Radim, these are very good ideas, and I think we should put them on
the roadmap.

I also like your ExecutorAllCompletionService. However, I think it will
not work in this case, because we often do not have exclusive access to
the executor service underlying ExecutorAllCompletionService. Some other
part of Infinispan might be using the same executor service, which would
break the logic of incrementing and decrementing the task count. It
seems you would need to base the counting on some id or something similar.
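
To make the concern concrete, here is a minimal sketch of one way to keep
the count independent of whoever else uses the executor: count only the
tasks submitted through a per-job wrapper, so the wrapper instance plays
the role of the "id". PerJobTracker and all of its methods are
hypothetical names for illustration; this is not the actual
ExecutorAllCompletionService code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration; not Infinispan's ExecutorAllCompletionService.
final class PerJobTracker {
    private final ExecutorService shared; // may also be used by other components
    private int pending;                  // guarded by "this"
    private Throwable firstFailure;       // guarded by "this"

    PerJobTracker(ExecutorService shared) {
        this.shared = shared;
    }

    // Counts the task before handing it to the shared executor.
    synchronized void submit(Runnable task) {
        pending++;
        try {
            shared.execute(() -> {
                try {
                    task.run();
                } catch (Throwable t) {
                    recordFailure(t);
                } finally {
                    taskDone();
                }
            });
        } catch (RuntimeException rejected) {
            taskDone(); // the task never started, so undo the count
            throw rejected;
        }
    }

    private synchronized void taskDone() {
        if (--pending == 0) {
            notifyAll();
        }
    }

    private synchronized void recordFailure(Throwable t) {
        if (firstFailure == null) {
            firstFailure = t;
        }
    }

    // Blocks until every task submitted through THIS tracker has finished.
    synchronized void awaitAll() throws InterruptedException {
        while (pending > 0) {
            wait();
        }
    }

    synchronized Throwable firstFailure() {
        return firstFailure;
    }

    // Demo: unrelated tasks share the pool but are never counted by the tracker.
    static int runDemo() {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        AtomicInteger mine = new AtomicInteger();
        PerJobTracker tracker = new PerJobTracker(pool);
        try {
            for (int i = 0; i < 50; i++) {
                pool.execute(() -> { });               // unrelated work on the same pool
                tracker.submit(mine::incrementAndGet); // work belonging to this job
            }
            tracker.awaitAll();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return mine.get();
    }

    public static void main(String[] args) {
        System.out.println("tracked tasks completed: " + runDemo());
    }
}
```

Because the counter lives in the tracker rather than in the executor,
tasks that other components push onto the same pool never touch it. The
cost is that every task of the job has to go through submit().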

On 12/9/2013, 3:09 AM, Radim Vansa wrote:
> There is one thing I really don't like about the current implementation:
> DefaultCollector. And any other collection that keeps one (or more)
> object per entry.
> We can't assume that if you double the number of objects in memory (and
> in fact, if you map each entry to a bigger object, you do that), they'd
> still fit into it. Even less so if you map the objects from the cache
> store as well.
> I believe we have to use a Collector implemented as a bounded queue, and
> start the reduction phase on the entries that have already been mapped,
> in parallel with the mapper phase. Otherwise, say hello to OOME.
>
> Cheers
>
> Radim
>
> PS: And don't keep all the futures just to check that all tasks have
> been finished - use ExecutorAllCompletionService instead.
>
>
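
For reference, the bounded Collector described in the quoted message
could look roughly like the sketch below. It is a plain-Java word count
and does not use the Infinispan Mapper/Reducer/Collector interfaces; the
class name, the wordCount method and the DONE sentinel are all made up
for illustration. The mapper blocks on put() when the queue is full, so
at most "capacity" intermediate entries are in flight while the reducer
folds them into the result.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch; not the Infinispan Collector API.
final class BoundedCollectorSketch {
    // Sentinel (compared by identity) telling the reducer that mapping is over.
    private static final Map.Entry<String, Integer> DONE =
            new SimpleImmutableEntry<>("", 0);

    static Map<String, Integer> wordCount(Iterable<String> values, int capacity) {
        BlockingQueue<Map.Entry<String, Integer>> queue =
                new ArrayBlockingQueue<>(capacity);
        Map<String, Integer> reduced = new HashMap<>();

        // Reduction runs concurrently with mapping and drains the queue.
        Thread reducer = new Thread(() -> {
            try {
                for (Map.Entry<String, Integer> e = queue.take();
                     e != DONE;
                     e = queue.take()) {
                    reduced.merge(e.getKey(), e.getValue(), Integer::sum);
                }
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        });
        reducer.start();

        try {
            // Map phase: emit (word, 1); put() blocks while the queue is full.
            for (String value : values) {
                for (String word : value.split("\\s+")) {
                    if (!word.isEmpty()) {
                        queue.put(new SimpleImmutableEntry<>(word, 1));
                    }
                }
            }
            queue.put(DONE);
            reducer.join(); // join() makes the reducer's writes visible here
        } catch (InterruptedException ex) {
            reducer.interrupt();
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
        return reduced;
    }

    public static void main(String[] args) {
        System.out.println(wordCount(java.util.Arrays.asList("a b a", "b a"), 2));
    }
}
```

The result map still holds one value per distinct key, but no longer one
object per mapped entry, which is the part that risks the OOME.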


