[infinispan-dev] Distributed execution framework - API proposal(s)

Galder Zamarreño galder at redhat.com
Wed Jan 5 05:30:48 EST 2011


Something else to add here. Earlier today I was looking at GridGain and its functional way of defining map/reduce functions, and it seemed quite appealing, particularly because they have managed to do it for Java as well:
"Native Java & Scala Support" section in http://www.gridgain.com/product.html

Having contracted the Scala virus last year, this way of coding functions looks more appealing (to me) than the way they did it in 2.0, which is fairly close to what is currently being proposed.

Maybe something not for the first version, but something definitely worth keeping in mind.
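
To make the idea concrete, here is a rough sketch of what a functional style of map/reduce could look like in plain Java. It assumes a Java version with lambdas and uses a local HashMap in place of a cache; the class and method names (FunctionalMapReduceSketch, mapReduce, countWords) are made up for illustration and are neither GridGain's nor Infinispan's API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;

public class FunctionalMapReduceSketch {

    // Hypothetical helper: applies `mapper` to every value, then folds the
    // mapped results together with `reducer`, starting from `identity`.
    static <K, V, R> R mapReduce(Map<K, V> data, Function<V, R> mapper,
                                 BinaryOperator<R> reducer, R identity) {
        R result = identity;
        for (V value : data.values()) {
            result = reducer.apply(result, mapper.apply(value));
        }
        return result;
    }

    // Map: number of whitespace-separated words per document.
    // Reduce: sum of those counts.
    static int countWords(Map<String, String> docs) {
        return mapReduce(docs, text -> text.trim().split("\\s+").length,
                         Integer::sum, 0);
    }

    public static void main(String[] args) {
        // A local map stands in for a cache here (an assumption of this sketch).
        Map<String, String> docs = new HashMap<>();
        docs.put("doc1", "hello distributed world");
        docs.put("doc2", "map and reduce");
        System.out.println(countWords(docs)); // 3 + 3 = 6
    }
}
```

The map and reduce steps are passed as plain functions rather than as implementations of task interfaces, which is the part that looks appealing.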

On Dec 27, 2010, at 5:25 PM, Vladimir Blagojevic wrote:

> Hey,
> 
> I spent the last week working on concrete API proposals for the distributed 
> execution framework. I believe that we are close to finalizing the 
> proposal, and your input and feedback are important now! Here are the main 
> ideas where I think we have made progress since we last talked.
> 
> 
> Access to multiple caches during task execution
> 
> While we have agreed to allow access to multiple caches during task 
> execution, including this logic in the task API complicates it greatly. The 
> compromise I found is to focus the whole API on one specific cache but 
> allow access to other caches through the DistributedTaskContext API. The 
> focus on one specific cache and its input keys will allow us to 
> properly map task units across the Infinispan cluster by consistent hash 
> (CH) and will cover most of the use cases. DistributedTaskContext can also 
> easily be mapped to a single cache. See DistributedTask and 
> DistributedTaskContext for more details.
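
A minimal sketch of the "one primary cache, other caches through the context" idea follows. Everything in it is an assumption made for illustration: the TaskContext class, its methods, and the "prices" cache name are hypothetical and are not taken from the proposal branches, and plain in-memory maps stand in for Infinispan caches.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class TaskContextSketch {

    /** Hypothetical context: one primary cache plus lookup of other caches by name. */
    static final class TaskContext {
        private final Map<String, Integer> primary;
        private final Set<String> inputKeys;
        private final Map<String, Map<String, Integer>> otherCaches;

        TaskContext(Map<String, Integer> primary, Set<String> inputKeys,
                    Map<String, Map<String, Integer>> otherCaches) {
            this.primary = primary;
            this.inputKeys = inputKeys;
            this.otherCaches = otherCaches;
        }

        Map<String, Integer> primaryCache() { return primary; }
        Set<String> inputKeys() { return inputKeys; }
        Map<String, Integer> cache(String name) { return otherCaches.get(name); }
    }

    // A work unit: iterates only over its input keys in the primary cache,
    // but looks up a price for each key in a second cache via the context.
    static int totalValue(TaskContext ctx) {
        Map<String, Integer> prices = ctx.cache("prices");
        int total = 0;
        for (String key : ctx.inputKeys()) {
            total += ctx.primaryCache().get(key) * prices.get(key);
        }
        return total;
    }

    static int demo() {
        Map<String, Integer> quantities = new HashMap<>();
        quantities.put("apple", 2);
        quantities.put("pear", 3);
        quantities.put("plum", 5);

        Map<String, Integer> prices = new HashMap<>();
        prices.put("apple", 10);
        prices.put("pear", 20);
        prices.put("plum", 1);

        Map<String, Map<String, Integer>> others = new HashMap<>();
        others.put("prices", prices);

        // Only "apple" and "pear" are input keys for this unit.
        Set<String> keys = new HashSet<>(Arrays.asList("apple", "pear"));
        return totalValue(new TaskContext(quantities, keys, others));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 2*10 + 3*20 = 80
    }
}
```

Because the input keys belong to the primary cache only, a real implementation could use them to decide where each unit runs, while the secondary lookups stay an implementation detail of the unit.
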
> 
> 
> DistributedTask and DistributedCallable
> 
> I found it useful to separate general task characteristics from the actual 
> work/computation details. Therefore the main task characteristics are 
> specified through the DistributedTask API, and the details of the actual task 
> computation are specified through the DistributedCallable API. 
> DistributedTask specifies coarse task details (the failover policy, the 
> task splitting policy, the cancellation policy and so on), while in the 
> DistributedCallable API implementers focus on the actual details of a 
> computation/work unit.
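
The split between task characteristics and the computation itself could be sketched roughly as below. This is a local, single-JVM illustration under stated assumptions: TaskSettings, KeyedCallable, and run are hypothetical names, not the DistributedTask or DistributedCallable interfaces from the proposal, and "failover" is reduced to a simple retry count.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TaskSplitSketch {

    /** Hypothetical coarse task settings: splitting and failover policy only. */
    static final class TaskSettings {
        final int keysPerUnit;
        final int maxFailoverAttempts;

        TaskSettings(int keysPerUnit, int maxFailoverAttempts) {
            if (keysPerUnit < 1 || maxFailoverAttempts < 0) {
                throw new IllegalArgumentException("invalid task settings");
            }
            this.keysPerUnit = keysPerUnit;
            this.maxFailoverAttempts = maxFailoverAttempts;
        }
    }

    /** Hypothetical work unit: knows only how to compute over its keys. */
    interface KeyedCallable<K, T> {
        T call(List<K> keys);
    }

    // Splits the keys into units according to the settings and runs the
    // callable on each unit, retrying a failed unit up to the failover limit.
    static <K, T> List<T> run(TaskSettings settings, List<K> keys,
                              KeyedCallable<K, T> work) {
        List<T> results = new ArrayList<>();
        for (int i = 0; i < keys.size(); i += settings.keysPerUnit) {
            int end = Math.min(i + settings.keysPerUnit, keys.size());
            results.add(runUnit(settings, keys.subList(i, end), work));
        }
        return results;
    }

    private static <K, T> T runUnit(TaskSettings settings, List<K> unit,
                                    KeyedCallable<K, T> work) {
        RuntimeException last = null;
        for (int attempt = 0; attempt <= settings.maxFailoverAttempts; attempt++) {
            try {
                return work.call(unit);
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last; // every attempt failed
    }

    static List<Integer> demo() {
        TaskSettings settings = new TaskSettings(2, 1);
        List<Integer> keys = Arrays.asList(1, 2, 3, 4, 5);
        // The callable just sums the keys of its unit.
        return run(settings, keys, unit -> {
            int sum = 0;
            for (int k : unit) {
                sum += k;
            }
            return sum;
        });
    }

    public static void main(String[] args) {
        System.out.println(demo()); // units [1,2], [3,4], [5] -> [3, 7, 5]
    }
}
```

The callable never sees the splitting or retry policy, so either side can change without touching the other.
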
> 
> 
> I have updated the original document [1] to reflect the API update. You can 
> see the actual proposal in git here [2], and I have also included a 
> variation of this approach [3] that separates the map and reduce task phases 
> into separate interfaces and removes the DistributedCallable interface. I 
> have also kept Trustin's ideas in another proposal [4], since I would 
> like to include them as well if possible.
> 
> Regards,
> Vladimir
> 
> 
> [1] http://community.jboss.org/wiki/InfinispanDistributedExecutionFramework
> [2] https://github.com/vblagoje/infinispan/tree/t_ispn-39_master_prop1
> [3] https://github.com/vblagoje/infinispan/tree/t_ispn-39_master_prop2
> [4] https://github.com/vblagoje/infinispan/tree/t_ispn-39_master_prop3
> 
> 
> 

--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache



