[
https://issues.jboss.org/browse/ISPN-3389?page=com.atlassian.jira.plugin....
]
Dan Berindei commented on ISPN-3389:
------------------------------------
A small correction: at step 3, the tx must be forwarded to the new owner *after* the lock
was acquired and registered with the tx.
I managed to reproduce the issue with a unit test. I believe adding a check in
StateProviderImpl to ignore txs started in the current topology should be enough to fix
the problem, but we should still add the new owner to the locked nodes set, as it
clarifies things.
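To illustrate the kind of check meant here, a minimal sketch follows. It is not the actual StateProviderImpl code: the TxEntry class, the transactionsToTransfer method and the use of a plain int as topology id are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class TopologyFilterSketch {
    // Hypothetical stand-in for a transaction table entry; not an Infinispan class.
    static final class TxEntry {
        final String txId;
        final int topologyId; // topology in which the tx was started

        TxEntry(String txId, int topologyId) {
            this.txId = txId;
            this.topologyId = topologyId;
        }
    }

    // Returns only the transactions started before the given topology.
    // Txs started in the current topology are skipped, on the assumption
    // that their originator already knows about the new owner.
    static List<TxEntry> transactionsToTransfer(List<TxEntry> all, int currentTopologyId) {
        List<TxEntry> result = new ArrayList<>();
        for (TxEntry tx : all) {
            if (tx.topologyId < currentTopologyId) {
                result.add(tx);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<TxEntry> table = new ArrayList<>();
        table.add(new TxEntry("tx-old", 4)); // started in the previous topology
        table.add(new TxEntry("tx-new", 5)); // started in the current topology
        for (TxEntry tx : transactionsToTransfer(table, 5)) {
            System.out.println(tx.txId); // only tx-old is selected
        }
    }
}
```

With this filter, a tx started with the new topology id (step 2 of the description) would not be sent to the new owner by state transfer at all.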
Since the set of locked nodes is now complete, at some point we should try to simplify the
other places where we check whether we should invoke a commit/rollback command remotely
and on what nodes (e.g. {{BaseDistributionInterceptor.shouldInvokeRemoteTxCommand}},
{{TxDistributionInterceptor.getCommitNodes}}).
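As a rough idea of what such a simplification could look like, the sketch below derives the remote commit/rollback targets directly from the locked nodes set. It is only an illustration under that assumption: remoteCompletionTargets and the String node names are hypothetical and do not mirror the existing interceptor methods.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class CommitTargetsSketch {
    // Hypothetical helper: the nodes that need a remote commit/rollback are
    // the locked nodes other than the local one. An empty result means the
    // tx can be completed locally without any remote command.
    static Set<String> remoteCompletionTargets(Set<String> lockedNodes, String localNode) {
        Set<String> targets = new LinkedHashSet<>(lockedNodes);
        targets.remove(localNode);
        return targets;
    }

    public static void main(String[] args) {
        Set<String> locked = new LinkedHashSet<>();
        locked.add("A"); // only local locks
        System.out.println(remoteCompletionTargets(locked, "A")); // prints []

        locked.add("B"); // new owner added to the locked nodes set
        System.out.println(remoteCompletionTargets(locked, "A")); // prints [B]
    }
}
```

Once the new owner is in the set, the completion command reaches it without any extra topology-specific checks.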
Forwarded transactions can remain stale after state transfer
------------------------------------------------------------
Key: ISPN-3389
URL: https://issues.jboss.org/browse/ISPN-3389
Project: Infinispan
Issue Type: Bug
Components: State transfer
Affects Versions: 5.2.7.Final
Reporter: Erik Salter
Assignee: Dan Berindei
Priority: Critical
Labels: 5.2.x
Fix For: 6.0.0.CR1, 6.0.0.Final
There is a scenario where a tx started on one node, moved during state transfer, and
committed on the originating node won't be removed from the new owner's tx table.
The chain of events is as follows:
1. New topology comes in as part of a view change.
2. Local transaction started with the new topology ID. This transaction was started due
to a LockControlCommand and has no modifications. Also important, it only has local
locks.
3. Tx forwarded to new owner before the local lock is acquired and registered with the
transaction.
4. Since the tx has only local locks and no modifications, it is only removed locally.
No TxCompletion or Rollback command is broadcast to the new owners.
The affected key becomes unusable, not due to stale locks, but because the
waitForTransaction() code will see that the old tx can "potentially" lock the key.
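The effect can be modelled with a toy transaction table per node. This is a sketch only: Node, register, complete and mayBeLocked are made-up names for the example, not Infinispan APIs, and the forwarding is assumed to carry the key the tx may lock.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class StaleForwardedTxSketch {
    // Hypothetical model of one node's tx table: tx id -> keys the tx may lock.
    static final class Node {
        final Map<String, Set<String>> txTable = new HashMap<>();

        void register(String txId, String key) {
            txTable.computeIfAbsent(txId, id -> new HashSet<>()).add(key);
        }

        void complete(String txId) {
            txTable.remove(txId);
        }

        // A key is considered blocked while any known tx could still lock it.
        boolean mayBeLocked(String key) {
            for (Set<String> keys : txTable.values()) {
                if (keys.contains(key)) {
                    return true;
                }
            }
            return false;
        }
    }

    public static void main(String[] args) {
        Node originator = new Node();
        Node newOwner = new Node();

        originator.register("tx1", "k"); // local tx with a local lock only
        newOwner.register("tx1", "k");   // tx forwarded during state transfer
        originator.complete("tx1");      // removed locally, nothing broadcast

        System.out.println(originator.mayBeLocked("k")); // prints false
        System.out.println(newOwner.mayBeLocked("k"));   // prints true
    }
}
```

The new owner keeps reporting the key as potentially locked because nothing ever removes tx1 from its table.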
This easily happens with pessimistic caches, though I have seen it happen with optimistic
caches (there is a delta between the transaction being created and the lock registration).