[infinispan-dev] Parallel M/R

Radim Vansa rvansa at redhat.com
Mon Dec 9 11:33:31 EST 2013


On 12/09/2013 04:21 PM, Vladimir Blagojevic wrote:
> Radim, these are some very good ideas. And I think we should put them on
> the roadmap.
Is there a JIRA issue where this could be tracked?
>
> Also, I like your ExecutorAllCompletionService, however, I think it will
> not work in this case as we often do not have exclusive access to the
> underlying executor service used in ExecutorAllCompletionService. There
> might be some other Infinispan part using the same executor service
> which will screw up the logic of incrementing and decrementing the task
> counts. It seems like you need to do the counting based on some id or
> something similar.
You're right - I am not really sure which thread pools M/R uses. In 
fact, multiple executions that share the persistence executor might 
already interfere with each other today. I should fix the service to 
add an ID and multiplex internally! Created [1] for that.
Shouldn't there be separate thread pools for map and reduce tasks? I 
mean two pools, so that a fully occupied mapper pool cannot prevent 
reduce tasks from running (otherwise there might be a deadlock between 
the thread pools and the bounded queue in the collector).
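
To make the two-pool idea concrete, here is a minimal sketch in plain
java.util.concurrent. It is not Infinispan code: BoundedMapReduce, the
"string length" mapper and the summing reducer are made up for
illustration. Mappers block on put() when the bounded queue is full, and
the reducers drain it from their own pool, so a saturated mapper pool
cannot starve the reduce side.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical illustration only, not part of Infinispan.
final class BoundedMapReduce {
    private static final int END = -1;        // sentinel; real lengths are >= 0
    private static final int REDUCERS = 2;

    static long sumOfLengths(List<String> entries, int queueCapacity)
            throws InterruptedException {
        // Bounded hand-off between the phases: at most queueCapacity mapped values in memory.
        BlockingQueue<Integer> mapped = new ArrayBlockingQueue<>(queueCapacity);
        ExecutorService mappers = Executors.newFixedThreadPool(2);
        ExecutorService reducers = Executors.newFixedThreadPool(REDUCERS);
        LongAdder total = new LongAdder();
        CountDownLatch mapsDone = new CountDownLatch(entries.size());
        CountDownLatch reducersDone = new CountDownLatch(REDUCERS);
        try {
            // Reducers run on their own pool and start before any mapper output exists.
            for (int i = 0; i < REDUCERS; i++) {
                reducers.execute(() -> {
                    try {
                        for (int v = mapped.take(); v != END; v = mapped.take()) {
                            total.add(v);
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        reducersDone.countDown();
                    }
                });
            }
            for (String entry : entries) {
                mappers.execute(() -> {
                    try {
                        mapped.put(entry.length()); // blocks while the queue is full
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        mapsDone.countDown();
                    }
                });
            }
            mapsDone.await();
            for (int i = 0; i < REDUCERS; i++) {
                mapped.put(END);                    // one sentinel per reducer
            }
            reducersDone.await();
            return total.sum();
        } finally {
            mappers.shutdownNow();
            reducers.shutdownNow();
        }
    }

    static long demo() {
        try {
            return sumOfLengths(Arrays.asList("a", "bb", "ccc", "dddd"), 1);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

If both phases shared a single pool, all of its threads could end up
blocked in put() with no thread left to run a reducer. Note that this
sketch still queues one Runnable per entry in the mapper pool; a real
implementation would probably submit chunks of entries instead.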

Radim

[1] https://issues.jboss.org/browse/ISPN-3800
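
For illustration only (this is not the change tracked in [1], and not
the existing ExecutorAllCompletionService): one way to keep the counting
independent of other users of the same executor is to count per
invocation, inside a small wrapper object, instead of per executor. The
class name InvocationTracker and its methods are hypothetical.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not an Infinispan class: it counts only the tasks
// submitted through this instance, whoever else uses the shared executor.
final class InvocationTracker {
    private final Executor executor;   // may be shared with unrelated callers
    private int pending;               // guarded by "this"
    private Throwable firstFailure;    // guarded by "this"

    InvocationTracker(Executor executor) {
        this.executor = executor;
    }

    /** Schedules a task and counts it against this tracker only. */
    void submit(Runnable task) {
        synchronized (this) {
            pending++;
        }
        try {
            executor.execute(() -> {
                Throwable failure = null;
                try {
                    task.run();
                } catch (Throwable t) {
                    failure = t;
                } finally {
                    taskDone(failure);
                }
            });
        } catch (RuntimeException rejected) {
            taskDone(null);            // task never ran, so undo the increment
            throw rejected;
        }
    }

    private synchronized void taskDone(Throwable failure) {
        if (failure != null && firstFailure == null) {
            firstFailure = failure;
        }
        pending--;
        notifyAll();
    }

    /** Blocks until every task submitted through this tracker has finished. */
    synchronized void awaitAll() throws InterruptedException, ExecutionException {
        while (pending > 0) {
            wait();
        }
        if (firstFailure != null) {
            throw new ExecutionException(firstFailure);
        }
    }

    /** Two invocations share one pool; returns the task count seen by the first. */
    static int demo() {
        ExecutorService shared = Executors.newFixedThreadPool(4);
        try {
            AtomicInteger first = new AtomicInteger();
            AtomicInteger second = new AtomicInteger();
            InvocationTracker a = new InvocationTracker(shared);
            InvocationTracker b = new InvocationTracker(shared);
            for (int i = 0; i < 100; i++) {
                a.submit(first::incrementAndGet);
                b.submit(second::incrementAndGet);
            }
            a.awaitAll();              // waits for a's tasks only
            int seenByA = first.get();
            b.awaitAll();
            return seenByA;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            shared.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the counter lives in the tracker rather than in the executor,
tasks scheduled by other components never touch it. The trade-off is one
small object per M/R invocation.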
>
> On 12/9/2013, 3:09 AM, Radim Vansa wrote:
>> There is one thing I really don't like about the current implementation:
>> DefaultCollector, and any other collection that keeps one (or more)
>> objects per entry.
>> We can't assume that if you double the number of objects in memory (and
>> in fact, if you map an entry to a bigger object, you do that), they'd
>> still fit into it. This is even less likely if you also map the objects
>> from the cache store.
>> I believe we have to use a Collector implemented as a bounded queue, and
>> start the reduction phase on the entries that have already been mapped,
>> in parallel with the mapper phase. Otherwise, say hello to OOME.
>>
>> Cheers
>>
>> Radim
>>
>> PS: And don't keep all the futures just to check that all tasks have
>> been finished - use ExecutorAllCompletionService instead.
>>
>>


-- 
Radim Vansa <rvansa at redhat.com>
JBoss DataGrid QA


