[infinispan-dev] Map/Reduce or other batch processing on CacheLoader stored entries

Tristan Tarrant ttarrant at redhat.com
Wed May 9 14:16:40 EDT 2012


On 05/09/2012 04:58 PM, Sanne Grinovero wrote:
> 2) Mapper and Reducer should work taking advantage of multiple cores
> even on the same node .. so not just divide & conquer across multiple
> nodes but also locally. Was this done already?
>> Mapping is done across the entire cluster. Reduction is done only on one
>> node. We want to change that soon
>> https://community.jboss.org/wiki/Infinispan60-MapReduceEnhancements
> I meant I'd need a way to process all cache entries with a single Map
> instance but taking advantage of all CPU cores of a system: The
> Mappers should be cloned and passed to different Runnables, each one
> being fed a partition of the data (and ideally the first one to finish
> should steal some work from the others).
>
I need to find my Master's thesis work (look at [1] for an overview). I
implemented a compiler for a higher-order parallel programming language
which went a bit further than map + reduce. You basically build programs
out of implicitly parallel functions, tell it the number of nodes, and
the compiler maps the functions onto the nodes, implementing "sub-reduces"
in the appropriate places. The theory is probably more interesting than
the implementation itself (C++ w/MPI), but it's still worth a read and
might inspire some new directions where we can push Infinispan's
distexec module.
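
The local parallelism requested in the quoted text (cloned mappers, each
fed a partition, with idle workers stealing work) together with the
"sub-reduce" idea could be sketched roughly as follows using the JDK's
fork/join framework. This is only an illustration under assumptions:
EntryMapper, Combiner, MapTask and THRESHOLD are hypothetical names made
up for the sketch, not Infinispan's Mapper/Reducer API, and the combiner
is assumed to be associative with an immutable identity value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

final class LocalMapReduceSketch {

    /** Hypothetical mapper contract (NOT the Infinispan Mapper interface). */
    public interface EntryMapper<K, V, R> {
        R map(K key, V value);
        EntryMapper<K, V, R> copy(); // every leaf task works on its own clone
    }

    /** Hypothetical reducer; assumed associative so partial results can be merged. */
    public interface Combiner<R> {
        R combine(R left, R right);
    }

    /** Splits the entry list recursively; the pool's work stealing balances the load. */
    static final class MapTask<K, V, R> extends RecursiveTask<R> {
        private static final int THRESHOLD = 1024; // assumed partition size
        private final List<Map.Entry<K, V>> entries;
        private final int from;
        private final int to;
        private final EntryMapper<K, V, R> mapper;
        private final Combiner<R> combiner;
        private final R identity;

        MapTask(List<Map.Entry<K, V>> entries, int from, int to,
                EntryMapper<K, V, R> mapper, Combiner<R> combiner, R identity) {
            this.entries = entries;
            this.from = from;
            this.to = to;
            this.mapper = mapper;
            this.combiner = combiner;
            this.identity = identity;
        }

        @Override
        protected R compute() {
            if (to - from <= THRESHOLD) {
                // Leaf: map one partition with a private mapper clone and
                // reduce it locally (a "sub-reduce").
                EntryMapper<K, V, R> local = mapper.copy();
                R acc = identity;
                for (int i = from; i < to; i++) {
                    Map.Entry<K, V> e = entries.get(i);
                    acc = combiner.combine(acc, local.map(e.getKey(), e.getValue()));
                }
                return acc;
            }
            int mid = (from + to) >>> 1;
            MapTask<K, V, R> left =
                new MapTask<K, V, R>(entries, from, mid, mapper, combiner, identity);
            MapTask<K, V, R> right =
                new MapTask<K, V, R>(entries, mid, to, mapper, combiner, identity);
            left.fork();                 // queued; an idle worker may steal it
            R rightResult = right.compute();
            return combiner.combine(left.join(), rightResult);
        }
    }

    /** Maps and reduces all entries of {@code data} using all available cores. */
    static <K, V, R> R mapReduce(Map<K, V> data, EntryMapper<K, V, R> mapper,
                                 Combiner<R> combiner, R identity) {
        List<Map.Entry<K, V>> entries = new ArrayList<Map.Entry<K, V>>(data.entrySet());
        ForkJoinPool pool = new ForkJoinPool(); // defaults to one worker per core
        try {
            return pool.invoke(
                new MapTask<K, V, R>(entries, 0, entries.size(), mapper, combiner, identity));
        } finally {
            pool.shutdown();
        }
    }
}
```

The sketch snapshots the entries into a list first, which is fine for an
in-memory map but would need a streaming partitioner for entries that
live only in a CacheLoader.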

Tristan

[1] http://www.dcs.ed.ac.uk/home/mic/dagstuhl/roopa/dagstuhl-talk.ps.gz