Something else to add here. Earlier today I was looking at GridGain and its functional way
of defining map/reduce functions, and it seemed quite appealing, particularly because they
had managed to do it for Java as well:
"Native Java & Scala Support" section in
http://www.gridgain.com/product.html
Having contracted the Scala virus last year, this way of coding functions looks more
appealing (to me) than the way they did it in 2.0, which is fairly close to what is
currently being proposed.
Maybe something not for the first version, but something definitely worth keeping in
mind.
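
To make the idea concrete, here is a rough sketch of what a functional map/reduce
definition could look like in plain Java. It is only an illustration: the class and
method names are made up, it uses the standard java.util.function types, and it is
not GridGain's API nor anything from the current proposals.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Function;

// Hypothetical sketch: none of these names come from GridGain or Infinispan.
class FunctionalMapReduceSketch {

    // Applies the mapper to every input, then folds the mapped values with the reducer.
    static <I, O> O mapReduce(List<I> inputs, Function<I, O> mapper,
                              O identity, BinaryOperator<O> reducer) {
        O result = identity;
        for (I input : inputs) {
            result = reducer.apply(result, mapper.apply(input));
        }
        return result;
    }

    // Example job: total number of characters across a list of words.
    static int totalLength(List<String> words) {
        return mapReduce(words, String::length, 0, Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(totalLength(Arrays.asList("grid", "cache", "task")));
    }
}
```

The appeal is that the map and reduce steps are passed as small functions rather than
being spread over dedicated task classes.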
On Dec 27, 2010, at 5:25 PM, Vladimir Blagojevic wrote:
Hey,
I spent the last week working on concrete API proposals for the distributed
execution framework. I believe that we are close to finalizing the
proposal and your input and feedback is important now! Here are the main
ideas where I think we made progress since we last talked.
Access to multiple caches during task execution
While we have agreed to allow access to multiple caches during task
execution, including this logic in the task API complicates it greatly. The
compromise I found is to focus the whole API on one specific cache but
allow access to other caches through the DistributedTaskContext API. The
focus on one specific cache and its input keys will allow us to
properly CH-map task units across the Infinispan cluster and will cover most
of the use cases. DistributedTaskContext can also easily be mapped to a
single cache. See DistributedTask and DistributedTaskContext for more
details.
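
As a rough illustration of the idea above, the sketch below binds a work unit to one
primary cache and reaches a second cache through the context. The method names
getPrimaryCache and getCache(String) are assumptions made for this example, and plain
maps stand in for Infinispan caches; the actual signatures are in the linked proposals.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: method names are assumed, not taken from the proposal.
class TaskContextSketch {

    // Assumed shape: one primary cache the task is bound to, other caches looked up by name.
    interface DistributedTaskContext<K, V> {
        Map<K, V> getPrimaryCache();             // hypothetical method name
        <A, B> Map<A, B> getCache(String name);  // hypothetical method name
    }

    // Plain maps stand in for Infinispan caches.
    static final class SimpleContext<K, V> implements DistributedTaskContext<K, V> {
        private final Map<K, V> primary;
        private final Map<String, Map<?, ?>> others;

        SimpleContext(Map<K, V> primary, Map<String, Map<?, ?>> others) {
            this.primary = primary;
            this.others = others;
        }

        public Map<K, V> getPrimaryCache() {
            return primary;
        }

        @SuppressWarnings("unchecked")
        public <A, B> Map<A, B> getCache(String name) {
            return (Map<A, B>) others.get(name);
        }
    }

    // Work unit: read a quantity from the primary cache, look up its price in a second cache.
    static int orderTotal(DistributedTaskContext<String, Integer> ctx, String item) {
        Map<String, Integer> prices = ctx.getCache("prices");
        return ctx.getPrimaryCache().get(item) * prices.get(item);
    }

    // Builds two small in-memory "caches" and runs the work unit against them.
    static int demo() {
        Map<String, Integer> quantities = new HashMap<>();
        quantities.put("widget", 3);
        Map<String, Integer> prices = new HashMap<>();
        prices.put("widget", 7);
        Map<String, Map<?, ?>> others = new HashMap<>();
        others.put("prices", prices);
        return orderTotal(new SimpleContext<>(quantities, others), "widget");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```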
DistributedTask and DistributedCallable
I found it useful to separate the general task characteristics from the actual
work/computation details. Therefore the main task characteristics are
specified through the DistributedTask API and the details of the actual task
computation are specified through the DistributedCallable API.
DistributedTask specifies coarse task details (the failover policy, the
task splitting policy, the cancellation policy and so on), while in the
DistributedCallable API implementers focus on the actual details of a
computation/work unit.
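
The separation described above could be sketched roughly as follows. Everything here
is an assumption for illustration: the FailoverPolicy values, the setEnvironment
method, the timeout field and the runLocally helper are invented, and the real
interfaces are the ones in the linked proposals.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;

// Hypothetical sketch: all names and signatures below are assumed.
class TaskAndCallableSketch {

    enum FailoverPolicy { NONE, RETRY_ON_ANOTHER_NODE }  // hypothetical values

    // Work unit: only the computation, plus a hook to receive its cache and input keys.
    interface DistributedCallable<K, V, T> extends Callable<T> {
        void setEnvironment(Map<K, V> cache, List<K> inputKeys);  // hypothetical method
    }

    // Coarse task description: which callable to run and under which policies.
    static final class DistributedTask<K, V, T> {
        final DistributedCallable<K, V, T> callable;
        final FailoverPolicy failoverPolicy;
        final long timeoutMillis;

        DistributedTask(DistributedCallable<K, V, T> callable,
                        FailoverPolicy failoverPolicy, long timeoutMillis) {
            this.callable = callable;
            this.failoverPolicy = failoverPolicy;
            this.timeoutMillis = timeoutMillis;
        }
    }

    // Example computation: sum the values stored under the input keys.
    static final class SumValues implements DistributedCallable<String, Integer, Integer> {
        private Map<String, Integer> cache;
        private List<String> keys;

        public void setEnvironment(Map<String, Integer> cache, List<String> inputKeys) {
            this.cache = cache;
            this.keys = inputKeys;
        }

        public Integer call() {
            int sum = 0;
            for (String key : keys) {
                sum += cache.get(key);
            }
            return sum;
        }
    }

    // Local stand-in for the framework; the policies are only carried, not enforced here.
    static <K, V, T> T runLocally(DistributedTask<K, V, T> task, Map<K, V> cache, List<K> keys) {
        try {
            task.callable.setEnvironment(cache, keys);
            return task.callable.call();
        } catch (Exception e) {
            throw new IllegalStateException("task failed", e);
        }
    }

    static int demo() {
        Map<String, Integer> cache = new HashMap<>();
        cache.put("a", 1);
        cache.put("b", 2);
        cache.put("c", 4);
        DistributedTask<String, Integer, Integer> task =
                new DistributedTask<>(new SumValues(), FailoverPolicy.NONE, 1000L);
        return runLocally(task, cache, Arrays.asList("a", "c"));
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```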
I have updated the original document [1] to reflect the API update. You can
see the actual proposal in git here [2], and I have also included a
variation of this approach [3] that separates the map and reduce task phases
with separate interfaces and removes the DistributedCallable interface. I
have also kept Trustin's ideas in another proposal [4] since I would
like to include them as well if possible.
Regards,
Vladimir
[1]
http://community.jboss.org/wiki/InfinispanDistributedExecutionFramework
[2]
https://github.com/vblagoje/infinispan/tree/t_ispn-39_master_prop1
[3]
https://github.com/vblagoje/infinispan/tree/t_ispn-39_master_prop2
[4]
https://github.com/vblagoje/infinispan/tree/t_ispn-39_master_prop3
_______________________________________________
infinispan-dev mailing list
infinispan-dev(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache