[infinispan-dev] Map/Reduce or other batch processing on CacheLoader stored entries

Wed May 9 06:10:57 EDT 2012

Hi all,
ideally I would like to build the MassIndexer for Infinispan Query on
top of Map/Reduce, as it seems it should provide everything I need
"out of the box", but technically I'd need it to support a few more
features:

1) Some resource injection in the Mapper / Reducer. I know Vladimir
has been working on a CDI approach; is there a similar approach
which could work for me when CDI is not available in the environment?
Having a reference to the Cache would be fine - I remember this was
discussed a while back, but I forgot whether any work was done on it.

2) Mapper and Reducer should take advantage of multiple cores even
on the same node, so not just divide & conquer across multiple
nodes but also locally. Was this done already?

3) I need the Map/Reduce to execute also on all entries stored in the
CacheLoader. I don't believe that's the case today, and even
if I wanted to use just the Executor, I believe the CacheLoader API
needs to provide some option to load all keys in a stream form
appropriate for batch processing: using loadAll will likely get me
into an OOM.
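
To make points 2) and 3) a bit more concrete, here is a rough sketch
of the shape I have in mind, in plain java.util.concurrent. Note that
KeyStreamingStore, processAll and the batch/thread parameters are all
made-up names for illustration; nothing here is an existing Infinispan
or CacheLoader API. The idea is only that keys are pulled lazily from
an iterator, grouped in small batches, and handed to a local thread
pool, with a bound on how many batches are in memory at once.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

final class StreamingBatchSketch {

    /** Hypothetical store view: hands out keys lazily instead of all at once. */
    public interface KeyStreamingStore<K> {
        Iterator<K> keyIterator();
    }

    /**
     * Pulls keys from the store in batches and processes them on a local pool.
     * Returns the number of keys read from the iterator.
     */
    static <K> long processAll(KeyStreamingStore<K> store, int batchSize, int threads,
                               Consumer<List<K>> worker) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        // Bounds the number of batches queued or running, so memory use stays
        // proportional to batchSize * permits, not to the size of the store.
        Semaphore inFlight = new Semaphore(threads * 2);
        AtomicReference<RuntimeException> failure = new AtomicReference<>();
        long total = 0;
        try {
            Iterator<K> keys = store.keyIterator();
            List<K> batch = new ArrayList<>(batchSize);
            // Stop reading early if a worker has already failed.
            while (keys.hasNext() && failure.get() == null) {
                batch.add(keys.next());
                total++;
                if (batch.size() == batchSize) {
                    submit(pool, inFlight, failure, batch, worker);
                    batch = new ArrayList<>(batchSize);
                }
            }
            // Flush the last, possibly partial, batch.
            if (!batch.isEmpty() && failure.get() == null) {
                submit(pool, inFlight, failure, batch, worker);
            }
        } finally {
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.HOURS);
        }
        if (failure.get() != null) {
            throw failure.get();
        }
        return total;
    }

    private static <K> void submit(ExecutorService pool, Semaphore inFlight,
                                   AtomicReference<RuntimeException> failure,
                                   List<K> batch, Consumer<List<K>> worker)
            throws InterruptedException {
        inFlight.acquire(); // blocks the reader while the workers are saturated
        pool.execute(() -> {
            try {
                worker.accept(batch);
            } catch (RuntimeException e) {
                failure.compareAndSet(null, e); // remember only the first failure
            } finally {
                inFlight.release();
            }
        });
    }
}
```

In the MassIndexer case the worker would load the values for its batch
of keys and index them; whether the iterator should hand out keys or
whole entries is an open question.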

Even if we decide to perform such jobs without Map/Reduce, in some
other form, we still need to support some way to "process all
entries". What should we suggest as the best way to iterate over and
modify all entries in the data container?

Cheers,
Sanne