[infinispan-dev] Distributed execution framework - API proposal(s)

Manik Surtani manik at jboss.org
Tue Jan 11 09:45:11 EST 2011


And for the distributed executor API, how's this as an alternative (suggested by Eduardo earlier)?

DistributedExecutorService extends ExecutorService [1], adds overloaded methods of submit(Callable) with submit(Callable, K...).  In keeping with the ExecutorService contract, submit() will ensure the Callable is executed just once, on any one node.  We could add a submitEverywhere(Callable) and submitEverywhere(Callable, K...) to mean execution on _all_ nodes.

Also ships with a DistributedCallable which extends Callable, adds setCache() and setEmbeddedCacheManager() setters.  If the callable is a DistributedCallable, then these setters are called before Callable.call(); otherwise just do Callable.call().

WDYT?

[1] http://download.oracle.com/javase/6/docs/api/java/util/concurrent/ExecutorService.html

On 11 Jan 2011, at 14:18, Vladimir Blagojevic wrote:

> On 11-01-11 10:28 AM, Manik Surtani wrote:
>> Hi there.
>> 
>> Lets take the 2 API sets (M/R and distributed exec) separately:
>> 
>> M/R
>> * Why does MapReduceTask extend Set?  It probably shouldn't even 
>> extend DistributedRemoteTask, since some methods such as 
>> execute(Callable) doesn't make sense for MapReduceTask.  Perhaps a 
>> common super-interface?  That alone should simplify the generics...
> 
> Leftover that I did not catch. I meant Map<K1,K2> instead of 
> Set<Map.Entry<K2, V2>>; it does not extend Set it just constraints one 
> of the type parameters :-)
> 
> Completely agree on common parent that has the common setters for 
> various task aspects.
> 
>> * Why does Mapper.map() return a Map? [1] Shouldn't it return a single 
>> transformation for a given K and V?
> 
> Yes, I thought so too but how would we then match intermediate keys for 
> reduce phase. Our plumbing has to find all the same 
> transformed/intermediate keys returned from a map phase and then invoke 
> the reduce phase.
> 
>> * Reducer.reduce()'s return type should not necessarily be the same 
>> type as the mapped transformation.  Also, does the reducer really need 
>> the key? Surely just the transformed version of each K/V pair.
> 
> Yeah, when we solve the above this should fall into place.
> 
>> * Similarly, Collator.add() should just need the address and the 
>> reduced result from each node (each node would only produce 1 result!)
>> 
>> DistExec
>> * Why do you have execute() and executeAsync() with no params?  What 
>> do these methods do?
> 
> In case user provided Factory for DistributedCallable. We have to handle 
> cases where simple no parameter constructor for callables is not 
> sufficient - hence factory. Now that I think about it maybe adding K... 
> input as a parameter to call function of DistributedCallable makes more 
> sense, or no? WDYT?
> 
>> * Not sure what DistributedTaskContext is all about.  What does read() 
>> and write() do?  Is this meant to be a "layer" in front of the cache 
>> in question?  Why not just pass in a ref to the cache?
> 
> True.
> 
>> * What does Factory do?
> 
> Creates DistributedCallables. See above.
> 
>> 
>> Almost there!  :-)
>> 
>> Cheers
>> Manik
>> 
>> [1] How many times can I say "map" in one sentence?  :)
>> 
>> 
> Thanks Manik. Almost there :-)
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

--
Manik Surtani
manik at jboss.org
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org



-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20110111/127a5e29/attachment.html 


More information about the infinispan-dev mailing list