[infinispan-issues] [JBoss JIRA] (ISPN-2291) Tx rollback during state transfer has stale locks

Erik Salter (JIRA) jira-events at lists.jboss.org
Sun Oct 7 22:17:03 EDT 2012


    [ https://issues.jboss.org/browse/ISPN-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12724541#comment-12724541 ] 

Erik Salter commented on ISPN-2291:
-----------------------------------

So here's the root cause -- the StaleTransactionCleanupService can remove transactions from the tx table while they still have pending lock acquisitions.  I've seen 3 cases of this during a rehash:

1.  A RemoteTransaction was registered and lock acquisition was in progress. The StaleTransactionCleanupService, running in another thread, cleaned up the tx because it had not yet acquired any locks.  Afterwards, the invocation context acquired the lock and registered it anyway.

2.  Similar to the above, but the invocation context is waiting in the interceptor on the pending locks (lockKeyAndCheckOwnership).  By the time it finishes waiting on the pending locks, the tx has been cleaned up.

3.  I've seen the tx get cleaned up AS SOON as the RemoteTransaction is registered with the txTable.

..
2012-10-06 18:15:14,249 TRACE [org.infinispan.remoting.InboundInvocationHandlerImpl] (OOB-19,erm-cluster,phl-dg3-8896(phl)) Calling perform() on LockControlCommand{cache=serviceGroup, keys=[ServiceGroupKey[edgeDeviceId=4,serviceGroupNo=401]], flags=null, unlock=false}
2012-10-06 18:15:14,249 TRACE [org.infinispan.transaction.TransactionTable] (OOB-19,erm-cluster,phl-dg3-8896(phl)) Created and registered remote transaction RemoteTransaction{modifications=[], lookedUpEntries={}, lockedKeys= null, backupKeyLocks null, missingLookedUpEntries false, tx=GlobalTransaction:<phl-dg2-51640(phl)>:8900:remote}

...
2012-10-06 18:15:14,250 TRACE [org.infinispan.transaction.StaleTransactionCleanupService] (transport-thread-20) Killing remote transaction without any local keys GlobalTransaction:<phl-dg2-51640(phl)>:8900:remote
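
To make the interleaving in the three cases above concrete, here is a minimal single-threaded sketch that replays the steps in the order the logs show them. It is not Infinispan code: the class StaleLockSketch, its fields txTable and lockOwners, and all method names are hypothetical stand-ins, and it assumes that a rollback can only release locks for a transaction it still finds in the table.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class StaleLockSketch {
    // Hypothetical stand-in for a remote transaction: only tracks its locked keys.
    static final class RemoteTx {
        final Set<String> lockedKeys = ConcurrentHashMap.newKeySet();
    }

    // Hypothetical stand-ins for the tx table and the lock table (key -> owning tx id).
    final Map<String, RemoteTx> txTable = new ConcurrentHashMap<>();
    final Map<String, String> lockOwners = new ConcurrentHashMap<>();

    RemoteTx register(String txId) {
        return txTable.computeIfAbsent(txId, id -> new RemoteTx());
    }

    // Cleanup pass: drops every transaction that holds no keys yet.
    void cleanupTransactionsWithoutKeys() {
        txTable.entrySet().removeIf(e -> e.getValue().lockedKeys.isEmpty());
    }

    // Lock acquisition that still uses the reference obtained at registration time.
    boolean acquire(String txId, RemoteTx tx, String key) {
        if (lockOwners.putIfAbsent(key, txId) != null) {
            return false; // key is already locked by some tx
        }
        tx.lockedKeys.add(key); // recorded on an object the table may no longer hold
        return true;
    }

    // Assumption: rollback releases locks only for transactions found in the table.
    void rollback(String txId) {
        RemoteTx tx = txTable.remove(txId);
        if (tx == null) {
            return; // tx unknown here, so nothing is released
        }
        for (String key : tx.lockedKeys) {
            lockOwners.remove(key, txId);
        }
    }

    // Replays: register, cleanup, late lock acquisition, rollback.
    public static boolean staleLockAfterInterleaving() {
        StaleLockSketch node = new StaleLockSketch();
        RemoteTx tx = node.register("tx-8900");
        node.cleanupTransactionsWithoutKeys(); // runs before any lock is held
        node.acquire("tx-8900", tx, "key-401"); // lock is registered anyway
        node.rollback("tx-8900");               // finds no tx, releases nothing
        return node.lockOwners.containsKey("key-401");
    }

    public static void main(String[] args) {
        System.out.println("stale lock left behind: " + staleLockAfterInterleaving());
    }
}
```

In this sketch the key stays locked after the rollback because the cleanup decision (no keys held) and the later lock registration are not made atomically with respect to each other. Whether the real fix should be for cleanup to skip transactions with an acquisition in flight, or for acquisition to re-check the table, is not something this sketch decides.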


                
> Tx rollback during state transfer has stale locks
> -------------------------------------------------
>
>                 Key: ISPN-2291
>                 URL: https://issues.jboss.org/browse/ISPN-2291
>             Project: Infinispan
>          Issue Type: Bug
>          Components: State transfer
>    Affects Versions: 5.2.0.Alpha3, 5.2.0.Beta1
>            Reporter: Erik Salter
>            Assignee: Adrian Nistor
>            Priority: Critical
>             Fix For: 5.2.0.CR1
>
>
> Stale locks were left behind when a transaction failed due to a replication timeout to its peer node.  The transaction contained grouped keys that were submitted to the primary owner.  During execution of one such transaction, there was a state transfer, which changed ownership of these keys.
> Here's an example flow:
> The task executed and started a transaction.  Since it was not the primary owner, it sent a LockControlCommand to the new owner.  This call timed out, and a rollback was issued.  The local transaction is completed, but subsequent attempts to lock this key fail.
> The logs can be found here:
> http://dl.dropbox.com/u/50401510/5.2.0.ALPHA3/lock/10.30.12.83/server.log.gz
> http://dl.dropbox.com/u/50401510/5.2.0.ALPHA3/lock/10.30.12.84/server.log.gz
> http://dl.dropbox.com/u/50401510/5.2.0.ALPHA3/lock/10.30.12.85/server.log.gz
