[infinispan-dev] Intermediate cache in M/R API

Vladimir Blagojevic vblagoje at redhat.com
Mon Mar 17 10:58:55 EDT 2014


Guys,

We need some input on how to design the API for intermediate caches [1]. 
As you might know, one of the requirements for improving our M/R is to 
allow applications to use a custom-defined intermediate key/value cache, 
which stores the keys/values produced by the map/combine phase before 
they are reduced in the reduce phase.


Currently we have a constructor where one can specify whether to use a 
shared or a per-task intermediate cache. We now want to add an 
additional method:

usingIntermediateCache(String cacheName, String cacheConfigurationName);

that will enable the use of a custom intermediate cache.

Now, Dan, and rightly so, thought this was a bit confusing: are we 
referring to the shared or the per-task intermediate cache when using 
the above-mentioned method?

His proposal is to use, by default, a per-task intermediate cache with 
our default intermediate cache configuration, to remove the constructor 
parameter in MapReduceTask that selects a shared or non-shared cache, 
and to add configuration methods for both caches:


     usingIntermediateCache(String configName)
         - use a per-task intermediate cache with the given configuration
     usingSharedIntermediateCache(String cache)
         - use a shared cache with our default configuration
     usingSharedIntermediateCache(String cache, String configName)
         - use a shared cache with the given configuration
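
To make the three proposed methods concrete, here is a rough sketch of 
how they could resolve the shared/per-task question. This is not 
Infinispan code: IntermediateCacheSettings, DEFAULT_CONFIG and the 
accessor methods are hypothetical names made up for illustration, and 
only the three using... signatures come from the proposal above.

```java
/**
 * Hypothetical stand-in for the cache-selection part of MapReduceTask.
 * Only the three using... method signatures come from the proposal;
 * everything else is an assumption made for illustration.
 */
final class IntermediateCacheSettings {
    // Placeholder name for "our default intermediate cache configuration".
    static final String DEFAULT_CONFIG = "defaultIntermediateConfig";

    // Defaults: per-task cache, default configuration, no shared cache name.
    private boolean shared = false;
    private String sharedCacheName = null;
    private String configName = DEFAULT_CONFIG;

    /** Per-task intermediate cache with the given configuration. */
    IntermediateCacheSettings usingIntermediateCache(String configName) {
        String config = requireName(configName, "configName");
        this.shared = false;
        this.sharedCacheName = null;
        this.configName = config;
        return this;
    }

    /** Shared intermediate cache with the default configuration. */
    IntermediateCacheSettings usingSharedIntermediateCache(String cache) {
        return usingSharedIntermediateCache(cache, DEFAULT_CONFIG);
    }

    /** Shared intermediate cache with the given configuration. */
    IntermediateCacheSettings usingSharedIntermediateCache(String cache, String configName) {
        String name = requireName(cache, "cache");
        String config = requireName(configName, "configName");
        this.shared = true;
        this.sharedCacheName = name;
        this.configName = config;
        return this;
    }

    boolean isShared() { return shared; }

    /** Name of the shared cache, or null when a per-task cache is used. */
    String sharedCacheName() { return sharedCacheName; }

    String configName() { return configName; }

    private static String requireName(String value, String what) {
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException(what + " must be non-empty");
        }
        return value;
    }

    public static void main(String[] args) {
        IntermediateCacheSettings s = new IntermediateCacheSettings()
                .usingSharedIntermediateCache("wordCountTmp", "myTmpConfig");
        System.out.println(s.isShared() + " " + s.sharedCacheName() + " " + s.configName());
    }
}
```

With this shape the method name alone says which kind of cache is 
meant, and only the shared variants take a cache name.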


Note that we need a name for the shared cache because we want to enable 
applications to easily remove/inspect that cache after all M/R tasks 
sharing that intermediate cache have been executed.

What are your thoughts here?

Vladimir



[1] https://issues.jboss.org/browse/ISPN-4021

