[infinispan-issues] [JBoss JIRA] (ISPN-3389) Forwarded transactions can remain stale after state transfer

Dan Berindei (JIRA) jira-events at lists.jboss.org
Tue Aug 27 07:58:26 EDT 2013


    [ https://issues.jboss.org/browse/ISPN-3389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799494#comment-12799494 ] 

Dan Berindei commented on ISPN-3389:
------------------------------------

A small correction: at step 3, the tx must be forwarded to the new owner *after* the lock was acquired and registered with the tx.

I managed to reproduce the issue with a unit test. I believe adding a check in StateProviderImpl to ignore txs started in the current topology should be enough to fix the problem, but we should still add the new owner to the locked nodes set, as it clarifies things.
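
To illustrate the kind of check meant here, the following is a minimal sketch only. It is not the actual StateProviderImpl code: the {{TxInfo}} class, the {{transactionsToForward}} method and the use of a plain int topology id are all hypothetical stand-ins. It assumes each tx records the topology id it was started in, and that state transfer skips any tx whose topology id is not older than the current one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model. These are NOT the real Infinispan classes.
public class TopologyFilterSketch {

    /** Stand-in for a tx table entry: an id plus the topology id the tx started in. */
    static final class TxInfo {
        final String id;
        final int topologyId;

        TxInfo(String id, int topologyId) {
            this.id = id;
            this.topologyId = topologyId;
        }
    }

    /**
     * Returns the txs that state transfer would forward to a new owner.
     * Txs started in the current topology (or later) are ignored, which is
     * the check proposed in the comment above.
     */
    static List<TxInfo> transactionsToForward(List<TxInfo> txs, int currentTopologyId) {
        List<TxInfo> result = new ArrayList<>();
        for (TxInfo tx : txs) {
            if (tx.topologyId < currentTopologyId) {
                result.add(tx);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<TxInfo> txs = new ArrayList<>();
        txs.add(new TxInfo("started-before-rebalance", 4));
        txs.add(new TxInfo("started-in-current-topology", 5));

        // Only txs from an older topology are candidates for forwarding.
        for (TxInfo tx : transactionsToForward(txs, 5)) {
            System.out.println(tx.id);
        }
    }
}
```

With this filter, the LockControlCommand tx from step 2 of the description (started with the new topology id) would never be copied to the new owner, so there would be nothing stale left behind when it is removed locally.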

Since the set of locked nodes is now complete, at some point we should try to simplify the other places where we check whether we should invoke a commit/rollback command remotely and on what nodes (e.g. {{BaseDistributionInterceptor.shouldInvokeRemoteTxCommand}}, {{TxDistributionInterceptor.getCommitNodes}}).
                
> Forwarded transactions can remain stale after state transfer
> ------------------------------------------------------------
>
>                 Key: ISPN-3389
>                 URL: https://issues.jboss.org/browse/ISPN-3389
>             Project: Infinispan
>          Issue Type: Bug
>          Components: State transfer
>    Affects Versions: 5.2.7.Final
>            Reporter: Erik Salter
>            Assignee: Dan Berindei
>            Priority: Critical
>              Labels: 5.2.x
>             Fix For: 6.0.0.CR1, 6.0.0.Final
>
>
> There is a scenario where a tx started on one node, moved during state transfer, and committed on the originating node won't be removed from the new owner's tx table.
> The chain of events is as follows:
> 1. New topology comes in as part of a view change.
> 2. Local transaction started with the new topology ID.  This transaction was started due to a LockControlCommand and has no modifications.  Also important, it only has local locks.
> 3. Tx forwarded to new owner before the local lock is acquired and registered with the transaction.
> 4. Since the tx has only local locks and no modifications, it is only removed locally.  No TxCompletion or Rollback are broadcast to the new owners.
> This key becomes unusable not due to stale locks, but because the waitForTransaction() code will see that the old tx can "potentially" lock the key.
> This easily happens with pessimistic caches, though I have seen it happen with optimistic caches (there is a delta between the transaction being created and the lock registration).


