Re: [infinispan-dev] Map/Reduce or other batch processing on CacheLoader stored entries

Wednesday, 9 May 2012

On 05/09/2012 04:58 PM, Sanne Grinovero wrote:
...
 2) Mapper and Reducer should work taking advantage of multiple cores
 even on the same node .. so not just divide & conquer across multiple
 nodes but also locally. Was this done already?
> Mapping is done across the entire cluster. Reduction is done only on one
> node. We want to change that soon
> https://community.jboss.org/wiki/Infinispan60-MapReduceEnhancements
 I meant I'd need a way to process all cache entries with a single Map
 instance but taking advantage of all CPU cores of a system: The
 Mappers should be cloned and passed to different Runnables, each one
 being fed a partition of the data (and ideally the first one to finish
 should steal some work from the others).

I need to find my Master's thesis work (look at [1] for an overview). I
implemented a compiler for a higher-order parallel programming language
which went a bit further than map + reduce. You basically build programs
out of implicitly parallel functions, tell it the number of nodes, and
the compiler maps the functions onto the nodes, implementing "sub-reduces"
in the appropriate places. The theory is probably more interesting than
the implementation itself (C++ with MPI), but it's probably still worth a
read and might inspire some new directions in which we can push
Infinispan's distexec module.
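
To make the local-parallelism idea from the quoted message a bit more concrete, here is a minimal sketch that uses only java.util.concurrent, not any Infinispan API. It assumes the mapper is a stateless function (so nothing needs to be cloned), uses word counting as a stand-in for the real map logic, and relies on ForkJoinPool's work stealing so that idle threads pick up pending partitions. Partial results are merged pairwise as the recursion unwinds, which is roughly what a "sub-reduce" would look like on a single node. All class and method names (LocalMapReduceSketch, PartitionTask, countWords) are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical sketch: none of these names are Infinispan APIs.
class LocalMapReduceSketch {

    // Partitions of this size or smaller are mapped sequentially by one worker.
    static final int THRESHOLD = 2;

    // One task per partition; idle workers steal forked subtasks from busy ones.
    static class PartitionTask extends RecursiveTask<Map<String, Integer>> {
        private final List<String> values;
        private final int from;
        private final int to;

        PartitionTask(List<String> values, int from, int to) {
            this.values = values;
            this.from = from;
            this.to = to;
        }

        @Override
        protected Map<String, Integer> compute() {
            if (to - from <= THRESHOLD) {
                // Small enough: run the map phase over this partition.
                Map<String, Integer> partial = new HashMap<>();
                for (int i = from; i < to; i++) {
                    map(values.get(i), partial);
                }
                return partial;
            }
            int mid = (from + to) >>> 1;
            PartitionTask left = new PartitionTask(values, from, mid);
            PartitionTask right = new PartitionTask(values, mid, to);
            left.fork();                                  // queue left half; may be stolen
            Map<String, Integer> r = right.compute();     // process right half in this thread
            Map<String, Integer> l = left.join();
            return reduce(l, r);                          // local "sub-reduce" of two partials
        }
    }

    // Stand-in mapper: emits (word, 1) for each whitespace-separated word.
    static void map(String value, Map<String, Integer> out) {
        for (String word : value.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.merge(word, 1, Integer::sum);
            }
        }
    }

    // Stand-in reducer: sums counts per key, merging b into a.
    static Map<String, Integer> reduce(Map<String, Integer> a, Map<String, Integer> b) {
        b.forEach((k, v) -> a.merge(k, v, Integer::sum));
        return a;
    }

    static Map<String, Integer> countWords(List<String> values, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            return pool.invoke(new PartitionTask(values, 0, values.size()));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("a b", "b c", "c c d");
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println(countWords(values, cores));
    }
}
```

In a real setup the list of values would be replaced by whatever iterates over the locally stored entries, and the final map produced on each node would then feed the cluster-wide reduce step.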

Tristan

[1] http://www.dcs.ed.ac.uk/home/mic/dagstuhl/roopa/dagstuhl-talk.ps.gz
