[infinispan-dev] Dist.exec failover

Erik Salter an1310 at hotmail.com
Tue Oct 23 12:42:22 EDT 2012


Hi all,

 

There are a couple of reasons:  

- I may not want a task failover policy at all, and the current API
kind of obscures this.  At the very least, this is different from the 5.1
behavior.

- I specify keys to a task to represent the key set that will be
pessimistically acquired to eliminate RPCs.  (This relies on some Group API
magic and takes advantage of the new SyncConsistentHash.)  A random policy
defeats this purpose.  Thus, if I do want a task failover policy, I want one
that allows the task to be retried on the node that owns the representative
keys.

 

Incidentally, the failover policy obscures the original reason the task
failed.  The exception thrown to the calling node is a FailoverException.
The original reason is about 4 levels deep.  Example:

 

java.util.concurrent.ExecutionException: Failover execution failed
        at org.infinispan.distexec.DefaultExecutorService$DistributedTaskPart.failoverExecution(DefaultExecutorService.java:855)
.
Caused by: java.lang.Exception: Failover execution failed
        ... 45 more
Caused by: java.util.concurrent.ExecutionException: Failover execution failed
        at org.infinispan.distexec.DefaultExecutorService$DistributedTaskPart.failoverExecution(DefaultExecutorService.java:852)
        ... 44 more
Caused by: java.lang.Exception: Failover execution failed
        ... 45 more
Caused by: java.util.concurrent.ExecutionException: org.infinispan.util.concurrent.TimeoutException: Unable to acquire lock after [5 seconds] on key [ServiceGroupKey[edgeDeviceId=1,serviceGroupNo=101]] for requestor [GlobalTransaction:<east-dg02-61087(east)>:1203:remote]! Lock held by [GlobalTransaction:<east-dg02-61087(east)>:1198:remote]
        at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
        ... 43 more
Caused by: org.infinispan.util.concurrent.TimeoutException: Unable to acquire lock after [5 seconds] on key [ServiceGroupKey[edgeDeviceId=1,serviceGroupNo=101]] for requestor [GlobalTransaction:<east-dg02-61087(east)>:1203:remote]! Lock held by [GlobalTransaction:<east-dg02-61087(east)>:1198:remote]
        at org.infinispan.util.concurrent.locks.LockManagerImpl.lock(LockManagerImpl.java:217)
        at org.infinispan.util.concurrent.locks.LockManagerImpl.acquireLock(LockManagerImpl.java:190)

 

 

I really need the original exception for processing/reporting to my
analytics engine.
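
Until the wrapping is reduced, a caller can dig the original exception out of
the nested chain by walking getCause().  The following is only a minimal
sketch using the standard Throwable API; the class and method names
(CauseChain, rootCause, findCause) are hypothetical and not part of
Infinispan.

```java
public final class CauseChain {

    private CauseChain() {
    }

    /**
     * Returns the innermost cause of a non-null throwable,
     * or the throwable itself if it has no cause.
     */
    public static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    /**
     * Returns the first throwable in the chain (outermost first) that is an
     * instance of the given type, or null if the chain contains none.
     */
    public static <T extends Throwable> T findCause(Throwable t, Class<T> type) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (type.isInstance(c)) {
                return type.cast(c);
            }
        }
        return null;
    }
}
```

With the trace above, calling findCause with the Infinispan TimeoutException
class on the caught ExecutionException would skip the "Failover execution
failed" wrappers and hand back the lock timeout for reporting.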

 

Thanks,

 

Erik

 

From: Mircea Markus [mailto:mircea.markus at jboss.com] 
Sent: Tuesday, October 23, 2012 11:40 AM
To: infinispan -Dev List
Cc: Erik Salter
Subject: Re: [infinispan-dev] Dist.exec failover

 

 

On 22 Oct 2012, at 19:48, Vladimir Blagojevic wrote:





Hey guys,

Erik noted that we should by default have no failover policy installed,
rather than the default random policy we currently have.

The random policy tries to re-run the task in case of a cluster failure. I
imagine the user would do the same, so I am not sure why we would not add it.

Erik, would you mind commenting on the cons of having this failover in place?

 

He also noted that keys are never supplied to the failover policy, and they
might be important when deciding where to dispatch the failed-over task.

The main reason for passing the keys is to calculate the locality of the
task based on the consistent hash. We don't have a reference to the
consistent hash in the DistributedTaskFailoverPolicy, so I am not sure this
would be useful as it is.


Our current interface is:

public interface DistributedTaskFailoverPolicy {
    Address failover(Address failedExecution, List<Address> executionCandidates, Exception cause);
}

Rather than adding yet another parameter here, maybe we should make a
simple container class

public class FailoverContext {
    Address failedExecution;
    List<Address> executionCandidates;
    Exception cause;
    List<Object> inputKeys;
}

and have

public interface DistributedTaskFailoverPolicy {
    Address failover(FailoverContext context);
}
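
To make the proposal concrete, here is a rough, self-contained sketch of a
key-aware policy written against the FailoverContext shape above.  Everything
in it is an assumption for illustration: Address is a local stand-in for the
Infinispan type, the primary-owner lookup is passed in as a plain function
(the policy has no consistent hash reference of its own), and returning null
to mean "do not fail over" is a made-up convention, not current behavior.

```java
import java.util.List;
import java.util.function.Function;

public class KeyOwnerFailoverSketch {

    // Local stand-in for Infinispan's Address so the sketch compiles alone.
    static final class Address {
        final String name;

        Address(String name) {
            this.name = name;
        }

        @Override
        public String toString() {
            return name;
        }
    }

    // Container class as proposed in this thread.
    static final class FailoverContext {
        final Address failedExecution;
        final List<Address> executionCandidates;
        final Exception cause;
        final List<Object> inputKeys;

        FailoverContext(Address failedExecution, List<Address> executionCandidates,
                        Exception cause, List<Object> inputKeys) {
            this.failedExecution = failedExecution;
            this.executionCandidates = executionCandidates;
            this.cause = cause;
            this.inputKeys = inputKeys;
        }
    }

    interface DistributedTaskFailoverPolicy {
        Address failover(FailoverContext context);
    }

    // Hypothetical policy: retry on the node that owns one of the input keys.
    static final class KeyOwnerFailoverPolicy implements DistributedTaskFailoverPolicy {
        // Hypothetical lookup from key to owning node, e.g. backed by the
        // consistent hash; supplied by whoever constructs the policy.
        private final Function<Object, Address> primaryOwner;

        KeyOwnerFailoverPolicy(Function<Object, Address> primaryOwner) {
            this.primaryOwner = primaryOwner;
        }

        @Override
        public Address failover(FailoverContext context) {
            for (Object key : context.inputKeys) {
                Address owner = primaryOwner.apply(key);
                // Only pick an owner that is a valid candidate and is not
                // the node where the task just failed.
                if (owner != null
                        && !owner.equals(context.failedExecution)
                        && context.executionCandidates.contains(owner)) {
                    return owner;
                }
            }
            // Assumed convention: null means no suitable target, so the
            // caller should surface the original failure instead of retrying.
            return null;
        }
    }
}
```

A policy like this only works if inputKeys reaches the policy, which is the
point of adding the field to the context.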

WDYT?

Regards,
Vladimir
_______________________________________________
infinispan-dev mailing list
infinispan-dev at lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev

 

Cheers,

-- 
Mircea Markus

Infinispan lead (www.infinispan.org)

 





 
