[JBoss JIRA] (ISPN-5460) Prepare commands sent before the target became a member should be rejected
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-5460?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-5460:
-----------------------------------------------
Dan Berindei <dberinde(a)redhat.com> changed the Status of [bug 1221164|https://bugzilla.redhat.com/show_bug.cgi?id=1221164] from NEW to ASSIGNED
> Prepare commands sent before the target became a member should be rejected
> --------------------------------------------------------------------------
>
> Key: ISPN-5460
> URL: https://issues.jboss.org/browse/ISPN-5460
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 7.2.1.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Labels: testsuite_stability
> Fix For: 8.0.0.Alpha1
>
>
> Since ISPN-4198 was fixed, joiners ignore commands that were sent in a topology in which they were not yet members. However, the joiner still sends a {{null}} response, which the originator accepts as valid, so the tx originator does not retry the command.
> The owner may have already sent the list of transactions to the joiner before the prepare, so this sequence of events can happen:
> 1. A new topology is installed, which includes the joiner (B) in the write consistent hash. B requests the transactions from A, but doesn't receive anything because the affected keys of {{GlobalTransaction:<NodeA-40680>:99974:local}} haven't been updated yet.
> {noformat}
> 19:14:11,027 TRACE (remote-thread-NodeB-p35288-t2:) [StateConsumerImpl] Received new topology for cache resultCache2, isRebalance = false, isMember = false, topology = CacheTopology{id=0, rebalanceId=0, currentCH=ReplicatedConsistentHash{ns = 1, owners = (1)[NodeA-40680: 1]}, pendingCH=null, unionCH=null, actualMembers=[NodeA-40680]}
> 19:14:11,039 TRACE (transport-thread-NodeA-p35281-t6:) [StateTransferInterceptor] handleTxCommand for command PrepareCommand {modifications=[PutKeyValueCommand{key=rules, value=1, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}], onePhaseCommit=false, gtx=GlobalTransaction:<NodeA-40680>:99974:local, cacheName='resultCache2', topologyId=-1}, origin null
> 19:14:11,044 TRACE (transport-thread-NodeA-p35281-t1:) [StateConsumerImpl] Received new topology for cache resultCache2, isRebalance = true, isMember = true, topology = CacheTopology{id=1, rebalanceId=1, currentCH=ReplicatedConsistentHash{ns = 1, owners = (1)[NodeA-40680: 1]}, pendingCH=ReplicatedConsistentHash{ns = 1, owners = (2)[NodeA-40680: 1, NodeB-64486: 0]}, unionCH=ReplicatedConsistentHash{ns = 1, owners = (2)[NodeA-40680: 1, NodeB-64486: 0]}, actualMembers=[NodeA-40680, NodeB-64486]}
> 19:14:11,045 TRACE (remote-thread-NodeA-p35282-t4:) [StateProviderImpl] Received request for transactions from node NodeB-64486 for segments [0] of cache resultCache2 with topology id 1
> 19:14:11,045 TRACE (remote-thread-NodeA-p35282-t4:) [StateProviderImpl] Skipping transaction NodeB-64486 because the state requestor doesn't own any key
> 19:14:11,045 TRACE (remote-thread-NodeA-p35282-t4:) [CommandAwareRpcDispatcher] About to send back response SuccessfulResponse{responseValue=[]} for command StateRequestCommand{cache=resultCache2, origin=NodeB-64486, type=GET_TRANSACTIONS, topologyId=1, segments=[0]}
> 19:14:11,045 DEBUG (transport-thread-NodeB-p35287-t3:) [StateConsumerImpl] Applying 0 transactions for cache resultCache2 transferred from node NodeA-40680
> {noformat}
> 2. Node A adds the affected key and sends the prepare command to B with topology 0. B ignores the command.
> {noformat}
> 19:14:11,039 TRACE (transport-thread-NodeA-p35281-t6:) [AbstractCacheTransaction] Registering locked key: rules
> 19:14:11,047 TRACE (transport-thread-NodeA-p35281-t6:) [JGroupsTransport] dests=null, command=PrepareCommand {modifications=[PutKeyValueCommand{key=rules, value=1, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}], onePhaseCommit=false, gtx=GlobalTransaction:<NodeA-40680>:99974:local, cacheName='resultCache2', topologyId=0}, mode=SYNCHRONOUS_IGNORE_LEAVERS, timeout=15000
> 19:14:11,048 TRACE (OOB-4,NodeB-64486:) [GlobalInboundInvocationHandler] Attempting to execute CacheRpcCommand: PrepareCommand {modifications=[PutKeyValueCommand{key=rules, value=1, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}], onePhaseCommit=false, gtx=GlobalTransaction:<NodeA-40680>:99974:local, cacheName='resultCache2', topologyId=0} [sender=NodeA-40680]
> 19:14:11,048 TRACE (remote-thread-NodeB-p35288-t6:) [NonTotalOrderPerCacheInboundInvocationHandler] Ignoring command sent before the local node was a member (command topology id is 0)
> 19:14:11,048 TRACE (remote-thread-NodeB-p35288-t6:) [CommandAwareRpcDispatcher] About to send back response null for command PrepareCommand {modifications=[PutKeyValueCommand{key=rules, value=1, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}], onePhaseCommit=false, gtx=GlobalTransaction:<NodeA-40680>:99974:remote, cacheName='resultCache2', topologyId=0}
> 19:14:11,058 TRACE (transport-thread-NodeA-p35281-t6:) [RpcManagerImpl] Response(s) to PrepareCommand {modifications=[PutKeyValueCommand{key=rules, value=1, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}], onePhaseCommit=false, gtx=GlobalTransaction:<NodeA-40680>:99974:local, cacheName='resultCache2', topologyId=0} is {}
> {noformat}
> 3. A tries to commit the tx, but B throws an exception because it can't find the remote tx.
> {noformat}
> 19:14:11,066 TRACE (transport-thread-NodeA-p35281-t6:) [LocalTransaction] Adding remote locks on [NodeA-40680, NodeB-64486]. Remote locks are null
> 19:14:11,066 TRACE (transport-thread-NodeA-p35281-t6:) [TransactionCoordinator] Committing transaction GlobalTransaction:<NodeA-40680>:99974:local
> 19:14:11,066 TRACE (transport-thread-NodeA-p35281-t6:) [RpcManagerImpl] NodeA-40680 invoking CommitCommand {gtx=GlobalTransaction:<NodeA-40680>:99974:local, cacheName='resultCache2', topologyId=2} to recipient list null with options RpcOptions{timeout=15000, unit=MILLISECONDS, deliverOrder=NONE, responseFilter=null, responseMode=SYNCHRONOUS_IGNORE_LEAVERS, skipReplicationQueue=false}
> 19:14:11,066 TRACE (OOB-5,NodeB-64486:) [GlobalInboundInvocationHandler] Attempting to execute CacheRpcCommand: CommitCommand {gtx=GlobalTransaction:<NodeA-40680>:99974:local, cacheName='resultCache2', topologyId=2} [sender=NodeA-40680]
> 19:14:11,066 TRACE (remote-thread-NodeB-p35288-t6:) [NonTotalOrderPerCacheInboundInvocationHandler] Calling perform() on CommitCommand {gtx=GlobalTransaction:<NodeA-40680>:99974:remote, cacheName='resultCache2', topologyId=2}
> 19:14:11,066 TRACE (remote-thread-NodeB-p35288-t6:) [AbstractTransactionBoundaryCommand] Did not find a RemoteTransaction for GlobalTransaction:<NodeA-40680>:99974:remote
> 19:14:11,067 WARN (remote-thread-NodeB-p35288-t6:) [NonTotalOrderPerCacheInboundInvocationHandler] ISPN000071: Caught exception when handling command CommitCommand {gtx=GlobalTransaction:<NodeA-40680>:99974:remote, cacheName='resultCache2', topologyId=2}
> java.lang.IllegalStateException: Remote transaction not found: GlobalTransaction:<NodeA-40680>:99974:remote
> at org.infinispan.commands.tx.CommitCommand.invalidRemoteTxReturnValue(CommitCommand.java:54)
> at org.infinispan.commands.tx.AbstractTransactionBoundaryCommand.perform(AbstractTransactionBoundaryCommand.java:89)
> at org.infinispan.remoting.inboundhandler.BasePerCacheInboundInvocationHandler.invokePerform(BasePerCacheInboundInvocationHandler.java:84)
> at org.infinispan.remoting.inboundhandler.BaseBlockingRunnable.run(BaseBlockingRunnable.java:31)
> {noformat}
> This is causing random failures in the map/reduce tests (e.g. DistributedSharedCacheTwoNodesMapReduceTest), because map/reduce doesn't wait for all the nodes to join before inserting into the cache.
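
The three steps quoted above can be reproduced with a small toy model. This is only a sketch, not Infinispan code: the class `JoinerSketch`, the `Reply` enum and every method name are hypothetical, and the "retry after a rejection" behaviour is an assumption about what the issue title means by "rejected". It contrasts a joiner that silently ignores a stale prepare with one that rejects it.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the race in ISPN-5460. All names are hypothetical; this is not Infinispan code.
final class JoinerSketch {
    // IGNORED models the null reply that the originator accepts as valid;
    // REJECTED models a reply that makes the originator retry (an assumption).
    enum Reply { PREPARED, IGNORED, REJECTED }

    private final int firstMemberTopologyId;    // first topology in which this node is a member
    private final boolean rejectStaleCommands;  // false = behaviour described in the issue
    private final Set<String> remoteTxs = new HashSet<>();

    JoinerSketch(int firstMemberTopologyId, boolean rejectStaleCommands) {
        this.firstMemberTopologyId = firstMemberTopologyId;
        this.rejectStaleCommands = rejectStaleCommands;
    }

    // A prepare sent in a topology older than the joiner's first one never creates a remote tx.
    Reply prepare(String gtx, int commandTopologyId) {
        if (commandTopologyId < firstMemberTopologyId) {
            return rejectStaleCommands ? Reply.REJECTED : Reply.IGNORED;
        }
        remoteTxs.add(gtx);
        return Reply.PREPARED;
    }

    // Commit fails when the prepare never registered a remote tx on this node.
    void commit(String gtx) {
        if (!remoteTxs.remove(gtx)) {
            throw new IllegalStateException("Remote transaction not found: " + gtx);
        }
    }

    // Originator side: prepare with a stale topology id, retry only on REJECTED, then commit.
    static boolean runTx(JoinerSketch joiner, String gtx, int staleTopologyId, int currentTopologyId) {
        Reply reply = joiner.prepare(gtx, staleTopologyId);
        if (reply == Reply.REJECTED) {
            joiner.prepare(gtx, currentTopologyId);
        }
        try {
            joiner.commit(gtx);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String gtx = "GlobalTransaction:<NodeA>:1";
        System.out.println("stale prepare ignored:  commit ok = " + runTx(new JoinerSketch(1, false), gtx, 0, 1));
        System.out.println("stale prepare rejected: commit ok = " + runTx(new JoinerSketch(1, true), gtx, 0, 1));
    }
}
```

In this model the commit only succeeds when the joiner rejects the stale prepare, because that is the only case in which the originator sends the prepare again in a topology where the joiner is a member.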
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
9 years, 8 months
[JBoss JIRA] (ISPN-5462) Transaction prepare is not replicated to new owners during state transfer
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-5462?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-5462:
-----------------------------------------------
Dan Berindei <dberinde(a)redhat.com> changed the Status of [bug 1221166|https://bugzilla.redhat.com/show_bug.cgi?id=1221166] from NEW to ASSIGNED
> Transaction prepare is not replicated to new owners during state transfer
> -------------------------------------------------------------------------
>
> Key: ISPN-5462
> URL: https://issues.jboss.org/browse/ISPN-5462
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 7.2.1.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Labels: testsuite_stability
> Fix For: 8.0.0.Alpha1
>
>
> This is related to ISPN-5460, and I've seen it in the same map/reduce tests.
> {{TransactionTable}} updates its topology id *before* the new topology is installed in {{StateConsumerImpl}}. This means a transaction can be created with the new topology id while its prepare is still replicated only to the owners in the old topology, so the new owners never receive the prepare and the commit then fails.
> Note: without the ISPN-4546 fix, the commit would have reported success, but it wouldn't have updated the keys.
> {noformat}
> 09:25:57,749 TRACE (asyncTransportThread-0,NodeA:) [StateConsumerImpl] Received new topology for cache resultCache2, isRebalance = true, isMember = true, topology = CacheTopology{id=1, rebalanceId=1, currentCH=ReplicatedConsistentHash{ns = 1, owners = (1)[NodeA-6285: 1]}, pendingCH=ReplicatedConsistentHash{ns = 1, owners = (2)[NodeA-6285: 1, NodeB-17038: 0]}, unionCH=ReplicatedConsistentHash{ns = 1, owners = (2)[NodeA-6285: 1, NodeB-17038: 0]}, actualMembers=[NodeA-6285, NodeB-17038]}
> 09:25:57,749 TRACE (asyncTransportThread-1,NodeA:) [TransactionTable] Created a new local transaction: LocalXaTransaction{xid=null} LocalTransaction{remoteLockedNodes=null, isMarkedForRollback=false, lockedKeys=null, backupKeyLocks=null, topologyId=1, stateTransferFlag=null} org.infinispan.transaction.xa.LocalXaTransaction@3c96
> 09:25:57,749 TRACE (asyncTransportThread-1,NodeA:) [TransactionXaAdapter] end called on tx GlobalTransaction:<NodeA-6285>:15510:local(resultCache2)
> 09:25:57,749 TRACE (asyncTransportThread-1,NodeA:) [StateTransferInterceptor] handleTxCommand for command PrepareCommand {modifications=[PutKeyValueCommand{key=Boston, value=1, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}], onePhaseCommit=false, gtx=GlobalTransaction:<NodeA-6285>:15510:local, cacheName='resultCache2', topologyId=-1}, origin null
> 09:25:57,749 TRACE (asyncTransportThread-1,NodeA:) [JGroupsTransport] dests=[NodeA-6285], command=PrepareCommand {modifications=[PutKeyValueCommand{key=Boston, value=1, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}], onePhaseCommit=false, gtx=GlobalTransaction:<NodeA-6285>:15510:local, cacheName='resultCache2', topologyId=0}, mode=SYNCHRONOUS_IGNORE_LEAVERS, timeout=15000
> 09:25:57,749 TRACE (asyncTransportThread-1,NodeA:) [LocalTransaction] Adding remote locks on [NodeA-6285]. Remote locks are null
> 09:25:57,749 TRACE (asyncTransportThread-1,NodeA:) [TransactionCoordinator] Committing transaction GlobalTransaction:<NodeA-6285>:15510:local
> 09:25:57,749 TRACE (asyncTransportThread-0,NodeA:) [StateConsumerImpl] Lock State Transfer in Progress for topology ID 1
> // StateConsumerImpl's topology is updated here
> 09:25:57,749 TRACE (asyncTransportThread-0,NodeA:) [StateConsumerImpl] Start keeping track of keys for rebalance
> 09:25:57,751 TRACE (remote-thread-5,NodeB:) [StateTransferManagerImpl] Installing new cache topology CacheTopology{id=1, rebalanceId=1, currentCH=ReplicatedConsistentHash{ns = 1, owners = (1)[NodeA-6285: 1]}, pendingCH=ReplicatedConsistentHash{ns = 1, owners = (2)[NodeA-6285: 1, NodeB-17038: 0]}, unionCH=ReplicatedConsistentHash{ns = 1, owners = (2)[NodeA-6285: 1, NodeB-17038: 0]}, actualMembers=[NodeA-6285, NodeB-17038]} on cache resultCache2
> 09:25:57,752 TRACE (remote-thread-5,NodeB:) [StateConsumerImpl] Requesting transactions for segments [0] of cache resultCache2 from node NodeA-6285
> 09:25:57,752 TRACE (remote-thread-2,NodeA:) [StateProviderImpl] Skipping transaction LocalXaTransaction{xid=< formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffffc0a80a66:ab12:55504ae3:5806, node_name=1, branch_uid=0:ffffc0a80a66:ab12:55504ae3:5807, subordinatenodename=null, eis_name=0 >} LocalTransaction{remoteLockedNodes=[NodeA-6285], isMarkedForRollback=false, lockedKeys=[Boston], backupKeyLocks=null, topologyId=1, stateTransferFlag=null} org.infinispan.transaction.xa.LocalXaTransaction@3c96 as it was started in the current topology or by a leaver
> 09:25:57,753 TRACE (remote-thread-3,NodeB:) [InboundInvocationHandlerImpl] Calling perform() on CommitCommand {gtx=GlobalTransaction:<NodeA-6285>:15510:remote, cacheName='resultCache2', topologyId=1}
> 09:25:57,753 TRACE (remote-thread-3,NodeB:) [AbstractTransactionBoundaryCommand] Did not find a RemoteTransaction for GlobalTransaction:<NodeA-6285>:15510:remote, completed successfully? false
> 09:25:57,758 ERROR (remote-thread-3,NodeB:) [InboundInvocationHandlerImpl] ISPN000260: Exception executing command
> java.lang.IllegalStateException: Remote transaction not found
> {noformat}
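
The ordering problem described above can be illustrated with a small toy model. It is a sketch under stated assumptions, not Infinispan code: `TopologyOrderSketch` and all of its members are hypothetical, and the rule "forward only transactions started before the requested topology" is an assumption drawn from the log message "as it was started in the current topology or by a leaver".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of the ordering problem in ISPN-5462. All names are hypothetical; this is not Infinispan code.
final class TopologyOrderSketch {
    // Topology 0 has a single owner (A); topology 1 adds the joiner (B).
    static List<String> ownersIn(int topologyId) {
        return topologyId >= 1 ? Arrays.asList("A", "B") : Arrays.asList("A");
    }

    static final class Tx {
        final int topologyId;           // stamped from the transaction table's topology id
        final List<String> preparedOn;  // nodes that received the prepare directly

        Tx(int topologyId, List<String> preparedOn) {
            this.topologyId = topologyId;
            this.preparedOn = preparedOn;
        }
    }

    // The tx is stamped with the transaction table's topology id,
    // but the prepare is routed with the topology the state consumer has installed.
    static Tx prepare(int txTableTopologyId, int installedTopologyId) {
        return new Tx(txTableTopologyId, new ArrayList<>(ownersIn(installedTopologyId)));
    }

    // Assumed provider rule: only transactions started before the requested topology are forwarded.
    static boolean forwardedToJoiner(Tx tx, int requestTopologyId) {
        return tx.topologyId < requestTopologyId;
    }

    // B knows the tx if it got the prepare directly or received it through state transfer.
    static boolean joinerKnowsTx(Tx tx, int requestTopologyId) {
        return tx.preparedOn.contains("B") || forwardedToJoiner(tx, requestTopologyId);
    }

    public static void main(String[] args) {
        Tx early = prepare(0, 0); // both components still in topology 0
        Tx racy = prepare(1, 0);  // transaction table already in topology 1, routing still in topology 0
        Tx late = prepare(1, 1);  // both components in topology 1
        System.out.println("early tx known to B: " + joinerKnowsTx(early, 1));
        System.out.println("racy tx known to B:  " + joinerKnowsTx(racy, 1));
        System.out.println("late tx known to B:  " + joinerKnowsTx(late, 1));
    }
}
```

Only the racy transaction falls through both paths in this model: it is too new to be forwarded by state transfer, yet its prepare was routed with the old owner list, so the joiner has nothing to commit.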
--
[JBoss JIRA] (ISPN-5460) Prepare commands sent before the target became a member should be rejected
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-5460?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration updated ISPN-5460:
------------------------------------------
Bugzilla Update: Perform
Bugzilla References: https://bugzilla.redhat.com/show_bug.cgi?id=1221164
--
[JBoss JIRA] (ISPN-5462) Transaction prepare is not replicated to new owners during state transfer
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-5462?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration updated ISPN-5462:
------------------------------------------
Bugzilla Update: Perform
Bugzilla References: https://bugzilla.redhat.com/show_bug.cgi?id=1221166
--
[JBoss JIRA] (ISPN-5460) Prepare commands sent before the target became a member should be rejected
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-5460?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-5460:
-------------------------------
Status: Pull Request Sent (was: Open)
Git Pull Request: https://github.com/infinispan/infinispan/pull/3459
> Prepare commands sent before the target became a member should be rejected
> --------------------------------------------------------------------------
>
> Key: ISPN-5460
> URL: https://issues.jboss.org/browse/ISPN-5460
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 7.2.1.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Labels: testsuite_stability
> Fix For: 8.0.0.Alpha1
>
>
> Since ISPN-4198 was fixed, joiners ignore commands that were sent in a topology in which they were not yet members. However, the joiner still sends a {{null}} response, which the originator accepts as valid, so the tx originator does not retry the command.
> The owner may have already sent the list of transactions to the joiner before the prepare, so this sequence of events can happen:
> 1. A new topology is installed, which includes the joiner (B) in the write consistent hash. B requests the transactions from A, but doesn't receive anything because the affected keys of {{GlobalTransaction:<NodeA-40680>:99974:local}} haven't been updated yet.
> {noformat}
> 19:14:11,027 TRACE (remote-thread-NodeB-p35288-t2:) [StateConsumerImpl] Received new topology for cache resultCache2, isRebalance = false, isMember = false, topology = CacheTopology{id=0, rebalanceId=0, currentCH=ReplicatedConsistentHash{ns = 1, owners = (1)[NodeA-40680: 1]}, pendingCH=null, unionCH=null, actualMembers=[NodeA-40680]}
> 19:14:11,039 TRACE (transport-thread-NodeA-p35281-t6:) [StateTransferInterceptor] handleTxCommand for command PrepareCommand {modifications=[PutKeyValueCommand{key=rules, value=1, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}], onePhaseCommit=false, gtx=GlobalTransaction:<NodeA-40680>:99974:local, cacheName='resultCache2', topologyId=-1}, origin null
> 19:14:11,044 TRACE (transport-thread-NodeA-p35281-t1:) [StateConsumerImpl] Received new topology for cache resultCache2, isRebalance = true, isMember = true, topology = CacheTopology{id=1, rebalanceId=1, currentCH=ReplicatedConsistentHash{ns = 1, owners = (1)[NodeA-40680: 1]}, pendingCH=ReplicatedConsistentHash{ns = 1, owners = (2)[NodeA-40680: 1, NodeB-64486: 0]}, unionCH=ReplicatedConsistentHash{ns = 1, owners = (2)[NodeA-40680: 1, NodeB-64486: 0]}, actualMembers=[NodeA-40680, NodeB-64486]}
> 19:14:11,045 TRACE (remote-thread-NodeA-p35282-t4:) [StateProviderImpl] Received request for transactions from node NodeB-64486 for segments [0] of cache resultCache2 with topology id 1
> 19:14:11,045 TRACE (remote-thread-NodeA-p35282-t4:) [StateProviderImpl] Skipping transaction NodeB-64486 because the state requestor doesn't own any key
> 19:14:11,045 TRACE (remote-thread-NodeA-p35282-t4:) [CommandAwareRpcDispatcher] About to send back response SuccessfulResponse{responseValue=[]} for command StateRequestCommand{cache=resultCache2, origin=NodeB-64486, type=GET_TRANSACTIONS, topologyId=1, segments=[0]}
> 19:14:11,045 DEBUG (transport-thread-NodeB-p35287-t3:) [StateConsumerImpl] Applying 0 transactions for cache resultCache2 transferred from node NodeA-40680
> {noformat}
> 2. Node A adds the affected key and sends the prepare command to B with topology 0. B ignores the command.
> {noformat}
> 19:14:11,039 TRACE (transport-thread-NodeA-p35281-t6:) [AbstractCacheTransaction] Registering locked key: rules
> 19:14:11,047 TRACE (transport-thread-NodeA-p35281-t6:) [JGroupsTransport] dests=null, command=PrepareCommand {modifications=[PutKeyValueCommand{key=rules, value=1, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}], onePhaseCommit=false, gtx=GlobalTransaction:<NodeA-40680>:99974:local, cacheName='resultCache2', topologyId=0}, mode=SYNCHRONOUS_IGNORE_LEAVERS, timeout=15000
> 19:14:11,048 TRACE (OOB-4,NodeB-64486:) [GlobalInboundInvocationHandler] Attempting to execute CacheRpcCommand: PrepareCommand {modifications=[PutKeyValueCommand{key=rules, value=1, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}], onePhaseCommit=false, gtx=GlobalTransaction:<NodeA-40680>:99974:local, cacheName='resultCache2', topologyId=0} [sender=NodeA-40680]
> 19:14:11,048 TRACE (remote-thread-NodeB-p35288-t6:) [NonTotalOrderPerCacheInboundInvocationHandler] Ignoring command sent before the local node was a member (command topology id is 0)
> 19:14:11,048 TRACE (remote-thread-NodeB-p35288-t6:) [CommandAwareRpcDispatcher] About to send back response null for command PrepareCommand {modifications=[PutKeyValueCommand{key=rules, value=1, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}], onePhaseCommit=false, gtx=GlobalTransaction:<NodeA-40680>:99974:remote, cacheName='resultCache2', topologyId=0}
> 19:14:11,058 TRACE (transport-thread-NodeA-p35281-t6:) [RpcManagerImpl] Response(s) to PrepareCommand {modifications=[PutKeyValueCommand{key=rules, value=1, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}], onePhaseCommit=false, gtx=GlobalTransaction:<NodeA-40680>:99974:local, cacheName='resultCache2', topologyId=0} is {}
> {noformat}
> 3. A tries to commit the tx, but B throws an exception because it can't find the remote tx.
> {noformat}
> 19:14:11,066 TRACE (transport-thread-NodeA-p35281-t6:) [LocalTransaction] Adding remote locks on [NodeA-40680, NodeB-64486]. Remote locks are null
> 19:14:11,066 TRACE (transport-thread-NodeA-p35281-t6:) [TransactionCoordinator] Committing transaction GlobalTransaction:<NodeA-40680>:99974:local
> 19:14:11,066 TRACE (transport-thread-NodeA-p35281-t6:) [RpcManagerImpl] NodeA-40680 invoking CommitCommand {gtx=GlobalTransaction:<NodeA-40680>:99974:local, cacheName='resultCache2', topologyId=2} to recipient list null with options RpcOptions{timeout=15000, unit=MILLISECONDS, deliverOrder=NONE, responseFilter=null, responseMode=SYNCHRONOUS_IGNORE_LEAVERS, skipReplicationQueue=false}
> 19:14:11,066 TRACE (OOB-5,NodeB-64486:) [GlobalInboundInvocationHandler] Attempting to execute CacheRpcCommand: CommitCommand {gtx=GlobalTransaction:<NodeA-40680>:99974:local, cacheName='resultCache2', topologyId=2} [sender=NodeA-40680]
> 19:14:11,066 TRACE (remote-thread-NodeB-p35288-t6:) [NonTotalOrderPerCacheInboundInvocationHandler] Calling perform() on CommitCommand {gtx=GlobalTransaction:<NodeA-40680>:99974:remote, cacheName='resultCache2', topologyId=2}
> 19:14:11,066 TRACE (remote-thread-NodeB-p35288-t6:) [AbstractTransactionBoundaryCommand] Did not find a RemoteTransaction for GlobalTransaction:<NodeA-40680>:99974:remote
> 19:14:11,067 WARN (remote-thread-NodeB-p35288-t6:) [NonTotalOrderPerCacheInboundInvocationHandler] ISPN000071: Caught exception when handling command CommitCommand {gtx=GlobalTransaction:<NodeA-40680>:99974:remote, cacheName='resultCache2', topologyId=2}
> java.lang.IllegalStateException: Remote transaction not found: GlobalTransaction:<NodeA-40680>:99974:remote
> at org.infinispan.commands.tx.CommitCommand.invalidRemoteTxReturnValue(CommitCommand.java:54)
> at org.infinispan.commands.tx.AbstractTransactionBoundaryCommand.perform(AbstractTransactionBoundaryCommand.java:89)
> at org.infinispan.remoting.inboundhandler.BasePerCacheInboundInvocationHandler.invokePerform(BasePerCacheInboundInvocationHandler.java:84)
> at org.infinispan.remoting.inboundhandler.BaseBlockingRunnable.run(BaseBlockingRunnable.java:31)
> {noformat}
> This is causing random failures in the map/reduce tests (e.g. {{DistributedSharedCacheTwoNodesMapReduceTest}}), because map/reduce doesn't wait for all the nodes to join before inserting into the cache.
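A rough sketch of the sequence in ISPN-5460, as a toy model rather than Infinispan code. The class, the {{Reply}} values and the retry loop are all hypothetical names made up for illustration; the only thing taken from the report is the idea that a prepare sent before the target became a member should produce a response that makes the originator retry, instead of a valid {{null}}.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of the joiner (node B); not Infinispan code, every name is hypothetical. */
public class StalePrepareSketch {
    public enum Reply { PREPARED, IGNORED, RETRY }

    /** First topology id in which this node is a member (1 for node B in the logs). */
    private final int firstTopologyAsMember;
    /** false models the reported behaviour (ignore), true the proposed one (reject). */
    private final boolean rejectStalePrepares;
    private final Set<String> remoteTransactions = new HashSet<>();

    public StalePrepareSketch(int firstTopologyAsMember, boolean rejectStalePrepares) {
        this.firstTopologyAsMember = firstTopologyAsMember;
        this.rejectStalePrepares = rejectStalePrepares;
    }

    public Reply handlePrepare(String gtx, int commandTopologyId) {
        if (commandTopologyId < firstTopologyAsMember) {
            // The command was sent before this node became a member.
            return rejectStalePrepares ? Reply.RETRY : Reply.IGNORED;
        }
        remoteTransactions.add(gtx);
        return Reply.PREPARED;
    }

    /** Returns false when the remote transaction is unknown, as in step 3. */
    public boolean handleCommit(String gtx) {
        return remoteTransactions.remove(gtx);
    }

    /** Originator side: resend the prepare in the current topology only when asked to retry. */
    public static boolean prepareThenCommit(StalePrepareSketch joiner, String gtx,
                                            int sentTopologyId, int currentTopologyId) {
        if (joiner.handlePrepare(gtx, sentTopologyId) == Reply.RETRY) {
            joiner.handlePrepare(gtx, currentTopologyId);
        }
        return joiner.handleCommit(gtx);
    }
}
```

With {{rejectStalePrepares = false}} the originator treats the ignored prepare as a success and the commit cannot find the remote transaction. With {{true}} the prepare is resent in the newer topology and the commit finds it.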
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
9 years, 8 months
[JBoss JIRA] (ISPN-5462) Transaction prepare is not replicated to new owners during state transfer
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-5462?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-5462:
-------------------------------
Status: Pull Request Sent (was: Open)
Git Pull Request: https://github.com/infinispan/infinispan/pull/3459
> Transaction prepare is not replicated to new owners during state transfer
> -------------------------------------------------------------------------
>
> Key: ISPN-5462
> URL: https://issues.jboss.org/browse/ISPN-5462
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 7.2.1.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Labels: testsuite_stability
> Fix For: 8.0.0.Alpha1
>
>
> This is related to ISPN-5460, and I've seen it in the same map/reduce tests.
> {{TransactionTable}} updates its topology id *before* the new topology is installed in {{StateConsumerImpl}}. A transaction can therefore be created with the new topology id while its prepare is still replicated only to the owners in the old topology. The new owners never receive the prepare, so the commit fails on them.
> Note: without the ISPN-4546 fix, the commit would have reported success, but it wouldn't have updated the keys.
> {noformat}
> 09:25:57,749 TRACE (asyncTransportThread-0,NodeA:) [StateConsumerImpl] Received new topology for cache resultCache2, isRebalance = true, isMember = true, topology = CacheTopology{id=1, rebalanceId=1, currentCH=ReplicatedConsistentHash{ns = 1, owners = (1)[NodeA-6285: 1]}, pendingCH=ReplicatedConsistentHash{ns = 1, owners = (2)[NodeA-6285: 1, NodeB-17038: 0]}, unionCH=ReplicatedConsistentHash{ns = 1, owners = (2)[NodeA-6285: 1, NodeB-17038: 0]}, actualMembers=[NodeA-6285, NodeB-17038]}
> 09:25:57,749 TRACE (asyncTransportThread-1,NodeA:) [TransactionTable] Created a new local transaction: LocalXaTransaction{xid=null} LocalTransaction{remoteLockedNodes=null, isMarkedForRollback=false, lockedKeys=null, backupKeyLocks=null, topologyId=1, stateTransferFlag=null} org.infinispan.transaction.xa.LocalXaTransaction@3c96
> 09:25:57,749 TRACE (asyncTransportThread-1,NodeA:) [TransactionXaAdapter] end called on tx GlobalTransaction:<NodeA-6285>:15510:local(resultCache2)
> 09:25:57,749 TRACE (asyncTransportThread-1,NodeA:) [StateTransferInterceptor] handleTxCommand for command PrepareCommand {modifications=[PutKeyValueCommand{key=Boston, value=1, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}], onePhaseCommit=false, gtx=GlobalTransaction:<NodeA-6285>:15510:local, cacheName='resultCache2', topologyId=-1}, origin null
> 09:25:57,749 TRACE (asyncTransportThread-1,NodeA:) [JGroupsTransport] dests=[NodeA-6285], command=PrepareCommand {modifications=[PutKeyValueCommand{key=Boston, value=1, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}], onePhaseCommit=false, gtx=GlobalTransaction:<NodeA-6285>:15510:local, cacheName='resultCache2', topologyId=0}, mode=SYNCHRONOUS_IGNORE_LEAVERS, timeout=15000
> 09:25:57,749 TRACE (asyncTransportThread-1,NodeA:) [LocalTransaction] Adding remote locks on [NodeA-6285]. Remote locks are null
> 09:25:57,749 TRACE (asyncTransportThread-1,NodeA:) [TransactionCoordinator] Committing transaction GlobalTransaction:<NodeA-6285>:15510:local
> 09:25:57,749 TRACE (asyncTransportThread-0,NodeA:) [StateConsumerImpl] Lock State Transfer in Progress for topology ID 1
> // StateConsumerImpl's topology is updated here
> 09:25:57,749 TRACE (asyncTransportThread-0,NodeA:) [StateConsumerImpl] Start keeping track of keys for rebalance
> 09:25:57,751 TRACE (remote-thread-5,NodeB:) [StateTransferManagerImpl] Installing new cache topology CacheTopology{id=1, rebalanceId=1, currentCH=ReplicatedConsistentHash{ns = 1, owners = (1)[NodeA-6285: 1]}, pendingCH=ReplicatedConsistentHash{ns = 1, owners = (2)[NodeA-6285: 1, NodeB-17038: 0]}, unionCH=ReplicatedConsistentHash{ns = 1, owners = (2)[NodeA-6285: 1, NodeB-17038: 0]}, actualMembers=[NodeA-6285, NodeB-17038]} on cache resultCache2
> 09:25:57,752 TRACE (remote-thread-5,NodeB:) [StateConsumerImpl] Requesting transactions for segments [0] of cache resultCache2 from node NodeA-6285
> 09:25:57,752 TRACE (remote-thread-2,NodeA:) [StateProviderImpl] Skipping transaction LocalXaTransaction{xid=< formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffffc0a80a66:ab12:55504ae3:5806, node_name=1, branch_uid=0:ffffc0a80a66:ab12:55504ae3:5807, subordinatenodename=null, eis_name=0 >} LocalTransaction{remoteLockedNodes=[NodeA-6285], isMarkedForRollback=false, lockedKeys=[Boston], backupKeyLocks=null, topologyId=1, stateTransferFlag=null} org.infinispan.transaction.xa.LocalXaTransaction@3c96 as it was started in the current topology or by a leaver
> 09:25:57,753 TRACE (remote-thread-3,NodeB:) [InboundInvocationHandlerImpl] Calling perform() on CommitCommand {gtx=GlobalTransaction:<NodeA-6285>:15510:remote, cacheName='resultCache2', topologyId=1}
> 09:25:57,753 TRACE (remote-thread-3,NodeB:) [AbstractTransactionBoundaryCommand] Did not find a RemoteTransaction for GlobalTransaction:<NodeA-6285>:15510:remote, completed successfully? false
> 09:25:57,758 ERROR (remote-thread-3,NodeB:) [InboundInvocationHandlerImpl] ISPN000260: Exception executing command
> java.lang.IllegalStateException: Remote transaction not found
> {noformat}
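To make the ordering problem in ISPN-5462 concrete, here is a small single-threaded model of node A. It is not Infinispan code and it is not the change in the pull request: the class, fields and methods are hypothetical stand-ins for {{TransactionTable}}'s topology id and the topology installed by {{StateConsumerImpl}}, and the "skip transactions started in the requested topology" rule is only inferred from the {{StateProviderImpl}} log line above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Toy model of node A; every name here is hypothetical. */
public class TopologyOrderSketch {
    public static final class Tx {
        final int topologyId;              // id recorded when the tx was created
        final List<String> prepareTargets; // owners the prepare was sent to

        Tx(int topologyId, List<String> prepareTargets) {
            this.topologyId = topologyId;
            this.prepareTargets = prepareTargets;
        }
    }

    private int transactionTableTopologyId = 0;
    private List<String> installedOwners = Collections.singletonList("NodeA");

    void updateTransactionTable(int newId) { transactionTableTopologyId = newId; }

    void installTopology(List<String> owners) { installedOwners = owners; }

    Tx beginAndPrepare() { return new Tx(transactionTableTopologyId, installedOwners); }

    /** Transactions started in the requested topology are skipped during transfer. */
    static boolean transferredToJoiner(Tx tx, int requestTopologyId) {
        return tx.topologyId < requestTopologyId;
    }

    /** Returns true if NodeB learns about the tx, via the prepare or via state transfer. */
    public static boolean joinerKnowsTx(boolean transactionTableFirst) {
        TopologyOrderSketch nodeA = new TopologyOrderSketch();
        List<String> newOwners = Arrays.asList("NodeA", "NodeB");
        Tx tx;
        if (transactionTableFirst) {
            nodeA.updateTransactionTable(1);
            tx = nodeA.beginAndPrepare();   // created inside the window
            nodeA.installTopology(newOwners);
        } else {
            nodeA.installTopology(newOwners);
            tx = nodeA.beginAndPrepare();
            nodeA.updateTransactionTable(1);
        }
        return tx.prepareTargets.contains("NodeB") || transferredToJoiner(tx, 1);
    }
}
```

In the {{transactionTableFirst}} case the transaction carries {{topologyId=1}} but its prepare only goes to NodeA, so NodeB gets it neither from the prepare nor from the transaction transfer. The reversed order is shown only as a contrast, not as the actual fix.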
[JBoss JIRA] (ISPN-5462) Transaction prepare is not replicated to new owners during state transfer
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-5462?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-5462:
-------------------------------
Summary: Transaction prepare is not replicated to new owners during state transfer (was: A transaction's topology id can be higher than the current topology id)
[JBoss JIRA] (ISPN-5459) StateTransferManager.waitForInitialTransferToComplete can fail if the coordinator crashes
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-5459?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-5459:
-------------------------------
Status: Open (was: New)
> StateTransferManager.waitForInitialTransferToComplete can fail if the coordinator crashes
> -----------------------------------------------------------------------------------------
>
> Key: ISPN-5459
> URL: https://issues.jboss.org/browse/ISPN-5459
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 7.2.1.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Labels: testsuite_stability
> Fix For: 8.0.0.Alpha1
>
>
> {{LocalTopologyManagerImpl.isRebalancingEnabled()}} will throw a {{SuspectException}} if the coordinator crashes, preventing the cache from starting up.
> This is causing random failures in {{ClusterListenerDistTxAddListenerTest}}:
> {noformat}
> 22:23:59,439 ERROR (testng-ClusterListenerDistTxAddListenerTest:) [UnitTestTestNGListener] Test testNodeJoiningAndStateNodeDiesWithExistingClusterListener(org.infinispan.notifications.cachelistener.cluster.ClusterListenerDistTxAddListenerTest) failed.
> java.util.concurrent.ExecutionException: org.infinispan.commons.CacheException: Unable to invoke method public void org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete() throws java.lang.Exception on object of type StateTransferManagerImpl
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:202)
> at org.infinispan.notifications.cachelistener.cluster.AbstractClusterListenerDistAddListenerTest.testNodeJoiningAndStateNodeDiesWithExistingClusterListener(AbstractClusterListenerDistAddListenerTest.java:254)
> ...
> Caused by: org.infinispan.commons.CacheException: Unable to invoke method public void org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete() throws java.lang.Exception on object of type StateTransferManagerImpl
> at org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:172)
> at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:869)
> at org.infinispan.factories.AbstractComponentRegistry.invokeStartMethods(AbstractComponentRegistry.java:638)
> at org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:627)
> at org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:530)
> at org.infinispan.factories.ComponentRegistry.start(ComponentRegistry.java:218)
> at org.infinispan.cache.impl.CacheImpl.start(CacheImpl.java:850)
> at org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:599)
> at org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:554)
> at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:424)
> at org.infinispan.test.MultipleCacheManagersTest.cache(MultipleCacheManagersTest.java:366)
> at org.infinispan.notifications.cachelistener.cluster.AbstractClusterListenerDistAddListenerTest.access$100(AbstractClusterListenerDistAddListenerTest.java:32)
> at org.infinispan.notifications.cachelistener.cluster.AbstractClusterListenerDistAddListenerTest$4.call(AbstractClusterListenerDistAddListenerTest.java:237)
> at org.infinispan.notifications.cachelistener.cluster.AbstractClusterListenerDistAddListenerTest$4.call(AbstractClusterListenerDistAddListenerTest.java:234)
> at org.infinispan.test.AbstractInfinispanTest$LoggingCallable.call(AbstractInfinispanTest.java:422)
> ... 4 more
> Caused by: org.infinispan.remoting.transport.jgroups.SuspectException: Node NodeM-34961 was suspected
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:245)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:566)
> at org.infinispan.topology.LocalTopologyManagerImpl.executeOnCoordinator(LocalTopologyManagerImpl.java:501)
> at org.infinispan.topology.LocalTopologyManagerImpl.isRebalancingEnabled(LocalTopologyManagerImpl.java:445)
> at org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete(StateTransferManagerImpl.java:216)
> at sun.reflect.GeneratedMethodAccessor165.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:168)
> ... 18 more
> Caused by: SuspectedException
> at org.jgroups.blocks.MessageDispatcher.sendMessage(MessageDispatcher.java:414)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.processSingleCall(CommandAwareRpcDispatcher.java:427)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:240)
> ... 26 more
> {noformat}
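One possible way to keep a coordinator crash from aborting cache startup is to retry the coordinator call instead of propagating the first failure. The sketch below is an assumption about how that could look, not the actual fix: {{CoordinatorRetrySketch}}, {{NodeSuspectedException}} and the {{maxAttempts}} parameter are hypothetical, and the supplier is assumed to resolve the current coordinator on every call.

```java
import java.util.function.Supplier;

/** Hypothetical retry wrapper; not the Infinispan implementation. */
public class CoordinatorRetrySketch {
    /** Stand-in for the SuspectException seen in the stack trace. */
    public static class NodeSuspectedException extends RuntimeException {
        public NodeSuspectedException(String node) {
            super("Node " + node + " was suspected");
        }
    }

    /**
     * Runs the call, retrying while the coordinator is suspected.
     * Rethrows the last failure once maxAttempts is exhausted.
     */
    public static <T> T executeOnCoordinator(Supplier<T> call, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        NodeSuspectedException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (NodeSuspectedException e) {
                last = e; // coordinator crashed; the next attempt should reach the new one
            }
        }
        throw last;
    }
}
```

A caller such as {{isRebalancingEnabled()}} would wrap its remote invocation in the supplier, so a single suspected coordinator no longer fails {{waitForInitialStateTransferToComplete()}}.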
[JBoss JIRA] (ISPN-5462) A transaction's topology id can be higher than the current topology id
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-5462?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-5462:
-------------------------------
Status: Open (was: New)
> A transaction's topology id can be higher than the current topology id
> ----------------------------------------------------------------------
>
> Key: ISPN-5462
> URL: https://issues.jboss.org/browse/ISPN-5462
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 7.2.1.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Labels: testsuite_stability
> Fix For: 8.0.0.Alpha1
>
>