Dan Berindei created ISPN-5168:
----------------------------------
Summary: Recovery: force commit on an orphan tx unlocks remote keys too soon
Key: ISPN-5168
URL: https://issues.jboss.org/browse/ISPN-5168
Project: Infinispan
Issue Type: Bug
Components: Core
Affects Versions: 7.1.0.Beta1, 7.0.3.Final
Reporter: Dan Berindei
The force commit admin operation replays the PrepareCommand on all the owners to acquire
any missing locks. However, the prepare is a no-op on remote nodes where the tx already
exists and is marked as prepared.
Later, when executing the CommitCommand, {{TxInterceptor}} sees that the existing remote
tx has an older topology id than the command and replays the PrepareCommand. If the
originator of the tx has left the cluster,
{{TxInterceptor.invokeNextInterceptorAndVerifyTransaction()}} then rolls back the tx and
unlocks all the keys. It doesn't throw an exception, so the commit still succeeds, but
the entry is updated without holding any locks.
{noformat}
10:38:51,313 TRACE (testng-OriginatorAndOwnerFailureReplicationTest:) [JGroupsTransport]
dests=[OriginatorAndOwnerFailureReplicationTest-NodeD-50040,
OriginatorAndOwnerFailureReplicationTest-NodeE-44976], command=PrepareCommand
{modifications=[PutKeyValueCommand{key=aKey, value=newValue, flags=null,
putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null},
successful=true}], onePhaseCommit=false, gtx=RecoveryAwareGlobalTransaction{xid=< 1,
64, 64,
-12-63-13-63-32-39-44-29-891-73-111-107-75-113-88-108-59-88120000000000000000000000000000000000000000000,
-12-63-13-63-32-39-44-29-891-73-111-107-75-113-88-108-59-88120000000000000000000000000000000000000000000
>, internalId=562962838323201}
GlobalTransaction:<OriginatorAndOwnerFailureReplicationTest-NodeD-50040>:2:local,
cacheName='___defaultcache', topologyId=5}, mode=SYNCHRONOUS, timeout=15000
10:38:51,319 TRACE (testng-OriginatorAndOwnerFailureReplicationTest:) [JGroupsTransport]
dests=null, command=CommitCommand {gtx=RecoveryAwareGlobalTransaction{xid=< 1, 64, 64,
-12-63-13-63-32-39-44-29-891-73-111-107-75-113-88-108-59-88120000000000000000000000000000000000000000000,
-12-63-13-63-32-39-44-29-891-73-111-107-75-113-88-108-59-88120000000000000000000000000000000000000000000
>, internalId=562962838323201}
GlobalTransaction:<OriginatorAndOwnerFailureReplicationTest-NodeD-50040>:2:local,
cacheName='___defaultcache', topologyId=5}, mode=SYNCHRONOUS_IGNORE_LEAVERS,
timeout=15000
10:38:51,322 TRACE (remote-thread-1,OriginatorAndOwnerFailureReplicationTest-NodeE:)
[TxInterceptor] Remote tx topology id 4 and command topology is 5
10:38:51,322 TRACE (remote-thread-1,OriginatorAndOwnerFailureReplicationTest-NodeE:)
[TxInterceptor] Replaying the transactions received as a result of state transfer
PrepareCommand {modifications=[PutKeyValueCommand{key=aKey, value=newValue, flags=null,
putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null},
successful=true}], onePhaseCommit=false, gtx=RecoveryAwareGlobalTransaction{xid=< 1,
64, 64,
-12-63-13-63-32-39-44-29-891-73-111-107-75-113-88-108-59-88120000000000000000000000000000000000000000000,
-12-63-13-63-32-39-44-29-891-73-111-107-75-113-88-108-59-88120000000000000000000000000000000000000000000
>, internalId=562962838323201}
GlobalTransaction:<OriginatorAndOwnerFailureReplicationTest-NodeF-60014>:2:remote,
cacheName='___defaultcache', topologyId=-1}
10:38:51,323 TRACE (remote-thread-1,OriginatorAndOwnerFailureReplicationTest-NodeE:)
[TxInterceptor] invokeNextInterceptorAndVerifyTransaction :: originatorMissing=true,
alreadyCompleted=true
10:38:51,323 TRACE (remote-thread-1,OriginatorAndOwnerFailureReplicationTest-NodeE:)
[TxInterceptor] Rolling back remote transaction RecoveryAwareGlobalTransaction{xid=< 1,
64, 64,
-12-63-13-63-32-39-44-29-891-73-111-107-75-113-88-108-59-88120000000000000000000000000000000000000000000,
-12-63-13-63-32-39-44-29-891-73-111-107-75-113-88-108-59-88120000000000000000000000000000000000000000000
>, internalId=562962838323201}
GlobalTransaction:<OriginatorAndOwnerFailureReplicationTest-NodeF-60014>:2:remote
because either already completed (true) or originator no longer in the cluster (true).
10:38:51,323 TRACE (remote-thread-1,OriginatorAndOwnerFailureReplicationTest-NodeE:)
[OwnableReentrantPerEntryLockContainer] Unlocking lock instance for key aKey
10:38:51,328 TRACE (remote-thread-1,OriginatorAndOwnerFailureReplicationTest-NodeE:)
[ReadCommittedEntry] Updating entry (key=aKey removed=false valid=true changed=true
created=true loaded=false value=newValue metadata=EmbeddedMetadata{version=null},
providedMetadata=null)
{noformat}
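The sequence described above, and visible in the trace, can be modeled with a small standalone sketch. This is only a toy model under stated assumptions: {{Owner}}, {{RemoteTx}} and {{forceCommitHoldsLock}} are hypothetical names invented for illustration, not Infinispan classes or APIs, and the real interceptor logic is more involved. It only shows why a stale topology id combined with a missing originator leaves the commit writing without a lock.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of the reported sequence. None of these types are Infinispan classes. */
final class OrphanTxCommitSketch {

    /** Hypothetical stand-in for the remote tx entry kept on an owner node. */
    static final class RemoteTx {
        final String originator;
        final int topologyId;
        boolean prepared;

        RemoteTx(String originator, int topologyId, boolean prepared) {
            this.originator = originator;
            this.topologyId = topologyId;
            this.prepared = prepared;
        }
    }

    /** Hypothetical stand-in for one owner node (NodeE in the trace). */
    static final class Owner {
        final Set<String> members = new HashSet<>();
        final Set<String> lockedKeys = new HashSet<>();
        final Map<String, String> data = new HashMap<>();
        RemoteTx tx;

        /** Replayed prepare: a no-op if the tx is already marked as prepared. */
        void prepare(String key) {
            if (tx.prepared) {
                return;
            }
            lockedKeys.add(key);
            tx.prepared = true;
        }

        /** Commit: returns whether the key lock was still held when the entry was written. */
        boolean commit(String key, String value, int commandTopologyId) {
            if (tx.topologyId < commandTopologyId) {
                // The existing remote tx is older than the command: replay the prepare...
                prepare(key);
                // ...then verify the tx. A missing originator triggers a rollback
                // that unlocks the key, and no exception is raised.
                if (!members.contains(tx.originator)) {
                    lockedKeys.remove(key);
                }
            }
            boolean lockHeld = lockedKeys.contains(key);
            data.put(key, value); // the write goes ahead either way
            return lockHeld;
        }
    }

    /** Runs the force commit (prepare replay, then commit) against one owner. */
    static boolean forceCommitHoldsLock(boolean originatorAlive) {
        Owner owner = new Owner();
        owner.members.add("NodeD");
        owner.members.add("NodeE");
        if (originatorAlive) {
            owner.members.add("NodeF");
        }
        // The tx was prepared by NodeF in topology 4 and already holds the lock on aKey.
        owner.tx = new RemoteTx("NodeF", 4, true);
        owner.lockedKeys.add("aKey");

        owner.prepare("aKey"); // replayed by the force commit, does nothing here
        return owner.commit("aKey", "newValue", 5);
    }

    public static void main(String[] args) {
        System.out.println("originator alive, lock held during write: " + forceCommitHoldsLock(true));
        System.out.println("originator gone, lock held during write: " + forceCommitHoldsLock(false));
    }
}
```

In this model the write happens while the lock is held only when the originator is still a member. With the originator gone, the rollback inside the commit path releases the lock first, which matches the "Unlocking lock instance for key aKey" line preceding the "Updating entry" line in the trace.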