[infinispan-dev] Distributed execution framework - API proposal(s)

Galder Zamarreño galder at redhat.com
Mon Jan 3 04:16:11 EST 2011


On Dec 27, 2010, at 5:25 PM, Vladimir Blagojevic wrote:

> Hey,
> 
> I spent the last week working on concrete API proposals for the
> distributed execution framework. I believe that we are close to
> finalizing the proposal and your input and feedback is important now!
> Here are the main ideas where I think we made progress since we last
> talked.
> 
> 
> Access to multiple caches during task execution
> 
> While we have agreed to allow access to multiple caches during task
> execution, including this logic in the task API complicates it greatly.
> The compromise I found is to focus the whole API on one specific cache
> but allow access to other caches through the DistributedTaskContext API.
> The focus on one specific cache and its input keys will allow us to
> properly CH map task units across the Infinispan cluster and will cover
> most of the use cases. DistributedTaskContext can also easily be mapped
> to a single cache. See DistributedTask and DistributedTaskContext for
> more details.

Maybe I'm reading this wrong, but are you saying that multiple caches cause problems with mapping task units to nodes in the cluster?

Or are you doing it just to avoid cluttering the API?
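
To make the question concrete, here is a rough sketch of what mapping task units by the input keys of a single cache could look like. It is only an illustration: KeyPartitioner, ownerOf and partition are made-up names, and a plain modulo hash stands in for Infinispan's actual consistent hash.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the proposal.
final class KeyPartitioner {

    // Stand-in for a consistent hash: plain modulo over the node list.
    static String ownerOf(Object key, List<String> nodes) {
        return nodes.get(Math.floorMod(key.hashCode(), nodes.size()));
    }

    // Groups the input keys of ONE cache by the node assumed to own them,
    // so that each group can become one task unit sent to that node.
    static Map<String, List<String>> partition(List<String> keys, List<String> nodes) {
        Map<String, List<String>> byNode = new TreeMap<>();
        for (String key : keys) {
            byNode.computeIfAbsent(ownerOf(key, nodes), n -> new ArrayList<>()).add(key);
        }
        return byNode;
    }
}
```

With keys coming from several caches, each cache could place the same key on a different node, which is where I could imagine the mapping getting harder.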

I think DistributedTaskContext extending CacheContainer is rather confusing, particularly when DistributedTaskContext has K, V type parameters, which are generally associated with Cache rather than CacheContainer.

Also, why is a context iterable? Does it iterate over the contents of a CacheContainer? "extends" generally means "is a". AFAIK, you'd be able to iterate over a Map or a Cache, but not over a CacheContainer.
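
As a rough sketch of the alternative, the context could hold a container instead of extending one, and iterate only over the single cache it is bound to. All names below (SimpleCacheContainer, TaskContextSketch, InMemoryContainer) are invented for illustration, and plain java.util.Map stands in for Cache; none of this is the real Infinispan API.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical stand-in for a container of named caches.
interface SimpleCacheContainer {
    <K, V> Map<K, V> getCache(String name);
}

// The context HAS a container and is bound to one cache; it is not a container itself.
final class TaskContextSketch<K, V> implements Iterable<Map.Entry<K, V>> {
    private final SimpleCacheContainer container;
    private final Map<K, V> primary;

    TaskContextSketch(SimpleCacheContainer container, String primaryCacheName) {
        this.container = container;
        this.primary = container.getCache(primaryCacheName);
    }

    // Iterating the context means iterating the entries of its one cache.
    @Override
    public Iterator<Map.Entry<K, V>> iterator() {
        return primary.entrySet().iterator();
    }

    // Other caches remain reachable through delegation.
    <K2, V2> Map<K2, V2> otherCache(String name) {
        return container.getCache(name);
    }
}

// Trivial in-memory container so the sketch can be run.
final class InMemoryContainer implements SimpleCacheContainer {
    private final Map<String, Map<?, ?>> caches = new HashMap<>();

    @Override
    @SuppressWarnings("unchecked")
    public <K, V> Map<K, V> getCache(String name) {
        return (Map<K, V>) caches.computeIfAbsent(name, n -> new HashMap<Object, Object>());
    }
}
```

Here K and V clearly belong to the bound cache, and it is obvious what iterating the context returns.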

> 
> 
> DistributedTask and DistributedCallable
> 
> I found it useful to separate general task characteristics from the
> actual work/computation details. Therefore the main task
> characteristics are specified through the DistributedTask API and the
> details of the actual task computation are specified through the
> DistributedCallable API. DistributedTask specifies coarse task details
> (the failover policy, the task splitting policy, the cancellation
> policy and so on), while in the DistributedCallable API implementers
> focus on the actual details of a computation/work unit.

Personally, I think DistributedTask has too many generics (K, V, T, R) and it's hard to read. IMO, only T and R should exist. I would also try to stick to the Callable convention, which takes a V.

I don't like to see things like this; it reminds me of EJB 2.1, where you were forced to implement a method simply to get hold of a ctx. There are much nicer ways to do things like this, if it is really necessary (see EJB3):

      @Override
      public void mapped(DistributedTaskContext<String, String> ctx) {
         this.ctx = ctx;
      }
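
For comparison, an EJB3-style approach could inject the context into an annotated field instead of forcing a callback method. This is only a sketch: the @InjectContext annotation, the Injector helper and the Context type are made up for illustration and exist neither in the proposal nor in Infinispan.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.concurrent.Callable;

// Hypothetical marker for fields the framework should populate.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface InjectContext {}

// Hypothetical, minimal stand-in for the task context.
final class Context {
    final String cacheName;
    Context(String cacheName) { this.cacheName = cacheName; }
}

// What the framework would do before invoking the task.
final class Injector {
    static void inject(Object task, Context ctx) {
        for (Field field : task.getClass().getDeclaredFields()) {
            if (field.isAnnotationPresent(InjectContext.class)
                    && field.getType().isAssignableFrom(Context.class)) {
                try {
                    field.setAccessible(true);
                    field.set(task, ctx);
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException("Cannot inject context", e);
                }
            }
        }
    }
}

// User code: no mapped(ctx) callback, just an annotated field.
final class CacheNameTask implements Callable<String> {
    @InjectContext
    private Context ctx;

    @Override
    public String call() {
        return "running against " + ctx.cacheName;
    }
}
```

The task author only declares what they need, and tasks that never touch the context don't have to implement anything.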

Looking at the example provided, it seems to me that all DistributedTaskContext is used for is to navigate the Cache contents from a user-defined callable, in which case I would limit its scope.

Finally, what is the aim of the write() methods in DTC? I don't see them in use in the given example. If they're not meant to be used by the user and are only for internal framework use, I would not leak them.
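
One way to avoid leaking them, sketched with invented names (ReadableContext, WritableContext, MapBackedContext and UserTask are hypothetical, not the proposal's types), is to hand user code a read-only view and keep the writable interface internal to the framework:

```java
import java.util.HashMap;
import java.util.Map;

// User-facing view: read access only.
interface ReadableContext<K, V> {
    V read(K key);
}

// Framework-internal view: adds write access, never handed to user code.
interface WritableContext<K, V> extends ReadableContext<K, V> {
    void write(K key, V value);
}

// Simple map-backed implementation so the sketch can be run.
final class MapBackedContext<K, V> implements WritableContext<K, V> {
    private final Map<K, V> data = new HashMap<>();

    @Override
    public V read(K key) { return data.get(key); }

    @Override
    public void write(K key, V value) { data.put(key, value); }
}

// User code only ever sees the read-only type.
final class UserTask {
    static int length(ReadableContext<String, String> ctx, String key) {
        String value = ctx.read(key);
        return value == null ? 0 : value.length();
    }
}
```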

> 
> 
> I have updated the original document [1] to reflect the API update. You
> can see the actual proposal in git here [2] and I have also included a
> variation of this approach [3] that separates the map and reduce task
> phases with separate interfaces and removes the DistributedCallable
> interface. I have also kept Trustin's ideas in another proposal [4]
> since I would like to include them as well if possible.
> 
> Regards,
> Vladimir
> 
> 
> [1] http://community.jboss.org/wiki/InfinispanDistributedExecutionFramework
> [2] https://github.com/vblagoje/infinispan/tree/t_ispn-39_master_prop1
> [3] https://github.com/vblagoje/infinispan/tree/t_ispn-39_master_prop2
> [4] https://github.com/vblagoje/infinispan/tree/t_ispn-39_master_prop3
> 
> 
> 

--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache



