]
Dan Berindei updated ISPN-5137:
-------------------------------
Status: Pull Request Sent (was: Coding In Progress)
Git Pull Request:
I tried to change the way cache topology membership is computed, but that produced some
complications. In the end I changed the way orphan transactions are checked to rely on the
JGroups view instead, and made it run both on cache topology changes and on JGroups view
changes.
Transaction from new topology rolled back (but succeeds on
originator)
----------------------------------------------------------------------
Key: ISPN-5137
URL:
https://issues.jboss.org/browse/ISPN-5137
Project: Infinispan
Issue Type: Bug
Components: Core
Affects Versions: 7.1.0.Alpha1, 7.0.3.Final
Reporter: Radim Vansa
Assignee: Dan Berindei
Priority: Critical
During a merge of (edg-perf01, edg-perf02, edg-perf03) + (edg-perf04), transaction
executed on edg-perf04 in latest topology is silently rolled back and commit command is
ignored since the topology has not arrived yet.
Topologies on edg-perf04
{code}
09:06:38,233 DEBUG [org.infinispan.topology.ClusterTopologyManagerImpl]
(transport-thread-23) Updating cluster-wide current topology for cache testCache, topology
= CacheTopology{id=51, rebalanceId=12, currentCH
=DefaultConsistentHash{ns = 512, owners = (3)[edg-perf03-18534: 171+171,
edg-perf02-60481: 171+170, edg-perf01-19198: 170+171]}, pendingCH=null, unionCH=null,
actualMembers=[edg-perf03-18534, edg-perf02-60481, e
dg-perf01-19198]}, availability mode = AVAILABLE
09:06:38,248 TRACE [org.infinispan.topology.ClusterTopologyManagerImpl]
(transport-thread-21) Attempting to execute command on self:
CacheTopologyControlCommand{cache=testCache, type=CH_UPDATE, sender=edg-perf04
-59142, joinInfo=null, topologyId=51, rebalanceId=12, currentCH=DefaultConsistentHash{ns
= 512, owners = (3)[edg-perf03-18534: 171+171, edg-perf02-60481: 171+170,
edg-perf01-19198: 170+171]}, pendingCH=null, ava
ilabilityMode=AVAILABLE, actualMembers=[edg-perf03-18534, edg-perf02-60481,
edg-perf01-19198], throwable=null, viewId=19}
09:06:38,413 TRACE [org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher]
(transport-thread-23) Replication task sending
CacheTopologyControlCommand{cache=testCache, type=CH_UPDATE, sender=edg-per
f04-59142, joinInfo=null, topologyId=52, rebalanceId=12,
currentCH=DefaultConsistentHash{ns = 512, owners = (3)[edg-perf03-18534: 171+171,
edg-perf02-60481: 171+170, edg-perf01-19198: 170+171]}, pendingCH=null,
availabilityMode=AVAILABLE, actualMembers=[edg-perf03-18534, edg-perf02-60481,
edg-perf01-19198], throwable=null, viewId=19} to addresses null with response mode
GET_NONE
09:06:38,415 TRACE [org.infinispan.topology.ClusterTopologyManagerImpl]
(transport-thread-10) Attempting to execute command on self:
CacheTopologyControlCommand{cache=testCache, type=CH_UPDATE, sender=edg-perf04
-59142, joinInfo=null, topologyId=52, rebalanceId=12, currentCH=DefaultConsistentHash{ns
= 512, owners = (3)[edg-perf03-18534: 171+171, edg-perf02-60481: 171+170,
edg-perf01-19198: 170+171]}, pendingCH=null, ava
ilabilityMode=AVAILABLE, actualMembers=[edg-perf03-18534, edg-perf02-60481,
edg-perf01-19198], throwable=null, viewId=19}
09:06:38,451 TRACE [org.infinispan.topology.ClusterTopologyManagerImpl]
(transport-thread-8) Attempting to execute command on self:
CacheTopologyControlCommand{cache=testCache, type=REBALANCE_START, sender=edg-p
erf04-59142, joinInfo=null, topologyId=53, rebalanceId=13,
currentCH=DefaultConsistentHash{ns = 512, owners = (3)[edg-perf03-18534: 171+171,
edg-perf02-60481: 171+170, edg-perf01-19198: 170+171]}, pendingCH=Defa
ultConsistentHash{ns = 512, owners = (4)[edg-perf04-59142: 128+128, edg-perf03-18534:
128+128, edg-perf02-60481: 128+128, edg-perf01-19198: 128+128]}, availabilityMode=null,
actualMembers=[edg-perf04-59142, edg-
perf03-18534, edg-perf02-60481, edg-perf01-19198], throwable=null, viewId=19}
{code}
Transaction on edg-perf04 can be executed, since all owners are in current topology,
though this node itself is degraded yet:
{code}
09:06:38,545 TRACE [org.infinispan.partionhandling.impl.PartitionHandlingManagerImpl]
(DefaultStressor-5) Checking availability for key=key_00000000000022CF,
status=DEGRADED_MODE
09:06:38,545 TRACE [org.infinispan.partionhandling.impl.PartitionHandlingManagerImpl]
(DefaultStressor-5) Key key_00000000000022CF is available.
09:06:38,549 TRACE [org.infinispan.remoting.rpc.RpcManagerImpl] (DefaultStressor-5)
edg-perf04-59142 invoking PrepareCommand
{modifications=[PutKeyValueCommand{key=key_00000000000022CF, value=[35 #4: 197, 314, 331,
1101, ], flags=[SKIP_CACHE_LOAD, SKIP_REMOTE_LOOKUP], putIfAbsent=false,
valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}],
onePhaseCommit=false, gtx=GlobalTransaction:<edg-perf04-59142>:23898:local,
cacheName='testCache', topologyId=51} to recipient list [edg-perf01-19198,
edg-perf02-60481] with options RpcOptions{timeout=60000, unit=MILLISECONDS,
fifoOrder=true, totalOrder=false, responseFilter=null, responseMode=SYNCHRONOUS,
skipReplicationQueue=false}
09:06:38,780 TRACE [org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher]
(DefaultStressor-5) Responses: [sender=edg-perf01-19198, received=true, suspected=false]
[sender=edg-perf02-60481, received=true, suspected=false]
09:06:38,788 TRACE [org.infinispan.remoting.rpc.RpcManagerImpl] (DefaultStressor-5)
edg-perf04-59142 invoking CommitCommand
{gtx=GlobalTransaction:<edg-perf04-59142>:23898:local,
cacheName='testCache', topologyId=51} to recipient list [edg-perf01-19198,
edg-perf02-60481] with options RpcOptions{timeout=60000, unit=MILLISECONDS,
fifoOrder=false, totalOrder=false, responseFilter=null,
responseMode=SYNCHRONOUS_IGNORE_LEAVERS, skipReplicationQueue=false}
09:06:39,080 TRACE [org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher]
(DefaultStressor-5) Responses: [sender=edg-perf01-19198,
retval=SuccessfulResponse{responseValue=null} , received=true, suspected=false]
[sender=edg-perf02-60481, retval=SuccessfulResponse{responseValue=null} , received=true,
suspected=false]
{code}
These are topologies on edg-perf02:
{code}
09:06:38,417 TRACE [org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher]
(remote-thread-55) Attempting to execute non-CacheRpcCommand command:
CacheTopologyControlCommand{cache=testCache, type=CH_UPDATE, sender=edg-perf04-59142,
joinInfo=null, topologyId=51, rebalanceId=12, currentCH=DefaultConsistentHash{ns = 512,
owners = (3)[edg-perf03-18534: 171+171, edg-perf02-60481: 171+170, edg-perf01-19198:
170+171]}, pendingCH=null, availabilityMode=AVAILABLE, actualMembers=[edg-perf03-18534,
edg-perf02-60481, edg-perf01-19198], throwable=null, viewId=19} [sender=edg-perf04-59142]
09:06:38,467 TRACE [org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher]
(remote-thread-56) Attempting to execute non-CacheRpcCommand command:
CacheTopologyControlCommand{cache=testCache, type=CH_UPDATE, sender=edg-perf04-59142,
joinInfo=null, topologyId=52, rebalanceId=12, currentCH=DefaultConsistentHash{ns = 512,
owners = (3)[edg-perf03-18534: 171+171, edg-perf02-60481: 171+170, edg-perf01-19198:
170+171]}, pendingCH=null, availabilityMode=AVAILABLE, actualMembers=[edg-perf03-18534,
edg-perf02-60481, edg-perf01-19198], throwable=null, viewId=19} [sender=edg-perf04-59142]
09:06:38,577 TRACE [org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher]
(remote-thread-52) Attempting to execute non-CacheRpcCommand command:
CacheTopologyControlCommand{cache=testCache, type=REBALANCE_START,
sender=edg-perf04-59142, joinInfo=null, topologyId=53, rebalanceId=13,
currentCH=DefaultConsistentHash{ns = 512, owners = (3)[edg-perf03-18534: 171+171,
edg-perf02-60481: 171+170, edg-perf01-19198: 170+171]}, pendingCH=DefaultConsistentHash{ns
= 512, owners = (4)[edg-perf04-59142: 128+128, edg-perf03-18534: 128+128,
edg-perf02-60481: 128+128, edg-perf01-19198: 128+128]}, availabilityMode=null,
actualMembers=[edg-perf04-59142, edg-perf03-18534, edg-perf02-60481, edg-perf01-19198],
throwable=null, viewId=19} [sender=edg-perf04-59142]
{code}
The transaction is successfully prepared on edg-perf02 (even twice as the prepare command
is resent from edg-perf01)
{code}
09:06:38,613 TRACE [org.infinispan.remoting.InboundInvocationHandlerImpl]
(remote-thread-64) Calling perform() on PrepareCommand
{modifications=[PutKeyValueCommand{key=key_00000000000022CF, value=[35 #4: 197, 314, 331,
1101, ], flags=[SKIP_CACHE_LOAD, SKIP_REMOTE_LOOKUP], putIfAbsent=false,
valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}],
onePhaseCommit=false, gtx=GlobalTransaction:<edg-perf04-59142>:23898:remote,
cacheName='testCache', topologyId=51}
09:06:38,620 TRACE [org.infinispan.util.concurrent.locks.LockManagerImpl]
(remote-thread-64) Successfully acquired lock key_00000000000022CF!
09:06:38,626 TRACE [org.infinispan.remoting.InboundInvocationHandlerImpl]
(remote-thread-64) About to send back response null for command PrepareCommand
{modifications=[PutKeyValueCommand{key=key_00000000000022CF, value=[35 #4: 197, 314, 331,
1101, ], flags=[SKIP_CACHE_LOAD, SKIP_REMOTE_LOOKUP], putIfAbsent=false,
valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}],
onePhaseCommit=false, gtx=GlobalTransaction:<edg-perf04-59142>:23898:remote,
cacheName='testCache', topologyId=51}
09:06:38,670 TRACE [org.infinispan.remoting.InboundInvocationHandlerImpl]
(remote-thread-64) Calling perform() on PrepareCommand
{modifications=[PutKeyValueCommand{key=key_00000000000022CF, value=[35 #4: 197, 314, 331,
1101, ], flags=[SKIP_CACHE_LOAD, SKIP_REMOTE_LOOKUP], putIfAbsent=false,
valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}],
onePhaseCommit=false,
gtx=GlobalTransaction:<edg-perf04-59142>:23898:remote,
cacheName='testCache', topologyId=52} // resent from edg-perf01
09:06:38,672 TRACE [org.infinispan.util.concurrent.locks.LockManagerImpl]
(remote-thread-64) Already own lock for entry
09:06:38,675 TRACE [org.infinispan.remoting.InboundInvocationHandlerImpl]
(remote-thread-64) About to send back response null for command PrepareCommand
{modifications=[PutKeyValueCommand{key=key_00000000000022CF, value=[35 #4: 197, 314, 331,
1101, ], flags=[SKIP_CACHE_LOAD, SKIP_REMOTE_LOOKUP], putIfAbsent=false,
valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}],
onePhaseCommit=false, gtx=GlobalTransaction:<edg-perf04-59142>:23898:remote,
cacheName='testCache', topologyId=52}
{code}
However, as topology 53 is installed, this transaction is rolled back:
{code}
09:06:38,577 TRACE [org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher]
(remote-thread-52) Attempting to execute non-CacheRpcCommand command:
CacheTopologyControlCommand{cache=testCache, type=RE
BALANCE_START, sender=edg-perf04-59142, joinInfo=null, topologyId=53, rebalanceId=13,
currentCH=DefaultConsistentHash{ns = 512, owners = (3)[edg-perf03-18534: 171+171,
edg-perf02-60481: 171+170, edg-perf01-19198
: 170+171]}, pendingCH=DefaultConsistentHash{ns = 512, owners = (4)[edg-perf04-59142:
128+128, edg-perf03-18534: 128+128, edg-perf02-60481: 128+128, edg-perf01-19198:
128+128]}, availabilityMode=null, actualMemb
ers=[edg-perf04-59142, edg-perf03-18534, edg-perf02-60481, edg-perf01-19198],
throwable=null, viewId=19} [sender=edg-perf04-59142]
09:06:38,658 DEBUG [org.infinispan.topology.LocalTopologyManagerImpl] (remote-thread-52)
Installing fake cache topology CacheTopology{id=52, rebalanceId=12,
currentCH=DefaultConsistentHash{ns = 512, owners = (3)[edg-perf03-18534: 171+171,
edg-perf02-60481: 171+170, edg-perf01-19198: 170+171]}, pendingCH=null, unionCH=null,
actualMembers=null} for cache testCache
09:06:38,670 DEBUG [org.infinispan.transaction.TransactionTable] (remote-thread-52) The
originating node left the cluster for 6 remote transactions
09:06:38,675 DEBUG [org.infinispan.transaction.TransactionTable] (remote-thread-52)
Rolling back transaction GlobalTransaction:<edg-perf04-59142>:23898:remote because
originator edg-perf04-59142 left the cluster
09:06:38,675 TRACE [org.infinispan.commands.tx.AbstractTransactionBoundaryCommand]
(remote-thread-52) About to execute tx command RollbackCommand
{gtx=GlobalTransaction:<edg-perf04-59142>:23898:remote, cacheName
='testCache', topologyId=-1}
{code}
But even after that, the CommitCommand still suceeds:
{code}
09:06:38,811 TRACE [org.infinispan.remoting.InboundInvocationHandlerImpl]
(remote-thread-56) Calling perform() on CommitCommand
{gtx=GlobalTransaction:<edg-perf04-59142>:23898:remote,
cacheName='testCache', topologyId=51}
09:06:38,812 TRACE [org.infinispan.commands.tx.AbstractTransactionBoundaryCommand]
(remote-thread-56) Did not find a RemoteTransaction for
GlobalTransaction:<edg-perf04-59142>:23898:remote
09:06:38,812 TRACE [org.infinispan.remoting.InboundInvocationHandlerImpl]
(remote-thread-56) About to send back response SuccessfulResponse{responseValue=null} for
command CommitCommand {gtx=GlobalTransaction:<edg-perf04-59142>:23898:remote,
cacheName='testCache', topologyId=51}
{code}