<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
On 05/09/2012 04:58 PM, Sanne Grinovero wrote:
<blockquote
cite="mid:CAFm4XO36F=EyMC77tHWiNnoW_Rwj=6rz393rsYFTEFBRkUVxUA@mail.gmail.com"
type="cite">
<pre wrap="">2) Mapper and Reducer should work taking advantage of multiple cores
even on the same node .. so not just divide &amp; conquer across multiple
nodes but also locally. Was this done already?
</pre>
<blockquote type="cite">
<pre wrap="">
Mapping is done across the entire cluster. Reduction is done only on one
node. We want to change that soon
<a class="moz-txt-link-freetext" href="https://community.jboss.org/wiki/Infinispan60-MapReduceEnhancements">https://community.jboss.org/wiki/Infinispan60-MapReduceEnhancements</a>
</pre>
</blockquote>
<pre wrap="">
I meant I'd need a way to process all cache entries with a single Map
instance but taking advantage of all CPU cores of a system: The
Mappers should be cloned and passed to different Runnables, each one
being fed a partition of the data (and ideally the first one to finish
should steal some work from the others).
</pre>
</blockquote>
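For reference, the local multi-core processing described in the quote above could be sketched roughly as follows with the JDK's fork/join framework, whose worker threads steal queued tasks from each other when they run idle.<br>
This is only a sketch under assumptions, not Infinispan's API: LocalMapSketch, PartitionTask, mapValue and countWords are hypothetical names, the "mapper" is a stateless word counter (so nothing needs to be cloned), and the cache values are modelled as a plain String array.<br>

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

public class LocalMapSketch {

    // Hypothetical stand-in for a Mapper: counts the words in one cache value.
    static long mapValue(String value) {
        String trimmed = value.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // One task per partition of the data; large partitions split themselves.
    static class PartitionTask extends RecursiveAction {
        static final int THRESHOLD = 2; // tiny on purpose, so the example really splits
        final String[] values;
        final int from;
        final int to;
        long result; // read only after the task has completed

        PartitionTask(String[] values, int from, int to) {
            this.values = values;
            this.from = from;
            this.to = to;
        }

        @Override
        protected void compute() {
            if (to - from <= THRESHOLD) {
                // Small enough: run the mapper over this partition sequentially.
                for (int i = from; i < to; i++) {
                    result += mapValue(values[i]);
                }
                return;
            }
            int mid = (from + to) >>> 1;
            PartitionTask left = new PartitionTask(values, from, mid);
            PartitionTask right = new PartitionTask(values, mid, to);
            invokeAll(left, right); // idle workers may steal either half
            result = left.result + right.result; // local partial reduce
        }
    }

    // Assumed entry point: process all values on the cores of this one node.
    static long countWords(String[] values) {
        PartitionTask root = new PartitionTask(values, 0, values.length);
        ForkJoinPool.commonPool().invoke(root);
        return root.result;
    }

    public static void main(String[] args) {
        String[] values = {"a b", "b c d", "", "e", "f g"};
        System.out.println(countWords(values));
    }
}
```

A stateful mapper would need one instance per task instead of the shared static method used here.<br>
<br>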
I need to find my Master's thesis work (look at [1] for an
overview). I implemented a compiler for a higher-order parallel
programming language that went a bit further than map + reduce. You
basically build programs out of implicitly parallel functions and tell
the compiler the number of nodes; it then maps the functions onto the
nodes, implementing "sub-reduces" in the appropriate places. The
theory is probably more interesting than the implementation itself
(C++ with MPI), but it's still worth a read and might inspire
some new directions where we can push Infinispan's distexec module.<br>
<br>
Tristan<br>
<br>
[1]
<a class="moz-txt-link-freetext" href="http://www.dcs.ed.ac.uk/home/mic/dagstuhl/roopa/dagstuhl-talk.ps.gz">http://www.dcs.ed.ac.uk/home/mic/dagstuhl/roopa/dagstuhl-talk.ps.gz</a>
</body>
</html>