[JBoss JIRA] (ISPN-5137) Transaction from new topology rolled back (but succeeds on originator)
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-5137?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-5137:
-------------------------------
Status: Pull Request Sent (was: Coding In Progress)
Git Pull Request: https://github.com/infinispan/infinispan/pull/3215
I tried to change the way cache topology membership is computed, but that produced some complications. In the end I changed the way orphan transactions are checked to rely on the JGroups view instead, and made it run both on cache topology changes and on JGroups view changes.
> Transaction from new topology rolled back (but succeeds on originator)
> ----------------------------------------------------------------------
>
> Key: ISPN-5137
> URL: https://issues.jboss.org/browse/ISPN-5137
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 7.1.0.Alpha1, 7.0.3.Final
> Reporter: Radim Vansa
> Assignee: Dan Berindei
> Priority: Critical
>
> During a merge of (edg-perf01, edg-perf02, edg-perf03) + (edg-perf04), transaction executed on edg-perf04 in latest topology is silently rolled back and commit command is ignored since the topology has not arrived yet.
> Topologies on edg-perf04
> {code}
> 09:06:38,233 DEBUG [org.infinispan.topology.ClusterTopologyManagerImpl] (transport-thread-23) Updating cluster-wide current topology for cache testCache, topology = CacheTopology{id=51, rebalanceId=12, currentCH
> =DefaultConsistentHash{ns = 512, owners = (3)[edg-perf03-18534: 171+171, edg-perf02-60481: 171+170, edg-perf01-19198: 170+171]}, pendingCH=null, unionCH=null, actualMembers=[edg-perf03-18534, edg-perf02-60481, e
> dg-perf01-19198]}, availability mode = AVAILABLE
> 09:06:38,248 TRACE [org.infinispan.topology.ClusterTopologyManagerImpl] (transport-thread-21) Attempting to execute command on self: CacheTopologyControlCommand{cache=testCache, type=CH_UPDATE, sender=edg-perf04
> -59142, joinInfo=null, topologyId=51, rebalanceId=12, currentCH=DefaultConsistentHash{ns = 512, owners = (3)[edg-perf03-18534: 171+171, edg-perf02-60481: 171+170, edg-perf01-19198: 170+171]}, pendingCH=null, ava
> ilabilityMode=AVAILABLE, actualMembers=[edg-perf03-18534, edg-perf02-60481, edg-perf01-19198], throwable=null, viewId=19}
> 09:06:38,413 TRACE [org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher] (transport-thread-23) Replication task sending CacheTopologyControlCommand{cache=testCache, type=CH_UPDATE, sender=edg-per
> f04-59142, joinInfo=null, topologyId=52, rebalanceId=12, currentCH=DefaultConsistentHash{ns = 512, owners = (3)[edg-perf03-18534: 171+171, edg-perf02-60481: 171+170, edg-perf01-19198: 170+171]}, pendingCH=null,
> availabilityMode=AVAILABLE, actualMembers=[edg-perf03-18534, edg-perf02-60481, edg-perf01-19198], throwable=null, viewId=19} to addresses null with response mode GET_NONE
> 09:06:38,415 TRACE [org.infinispan.topology.ClusterTopologyManagerImpl] (transport-thread-10) Attempting to execute command on self: CacheTopologyControlCommand{cache=testCache, type=CH_UPDATE, sender=edg-perf04
> -59142, joinInfo=null, topologyId=52, rebalanceId=12, currentCH=DefaultConsistentHash{ns = 512, owners = (3)[edg-perf03-18534: 171+171, edg-perf02-60481: 171+170, edg-perf01-19198: 170+171]}, pendingCH=null, ava
> ilabilityMode=AVAILABLE, actualMembers=[edg-perf03-18534, edg-perf02-60481, edg-perf01-19198], throwable=null, viewId=19}
> 09:06:38,451 TRACE [org.infinispan.topology.ClusterTopologyManagerImpl] (transport-thread-8) Attempting to execute command on self: CacheTopologyControlCommand{cache=testCache, type=REBALANCE_START, sender=edg-p
> erf04-59142, joinInfo=null, topologyId=53, rebalanceId=13, currentCH=DefaultConsistentHash{ns = 512, owners = (3)[edg-perf03-18534: 171+171, edg-perf02-60481: 171+170, edg-perf01-19198: 170+171]}, pendingCH=Defa
> ultConsistentHash{ns = 512, owners = (4)[edg-perf04-59142: 128+128, edg-perf03-18534: 128+128, edg-perf02-60481: 128+128, edg-perf01-19198: 128+128]}, availabilityMode=null, actualMembers=[edg-perf04-59142, edg-
> perf03-18534, edg-perf02-60481, edg-perf01-19198], throwable=null, viewId=19}
> {code}
> Transaction on edg-perf04 can be executed, since all owners are in current topology, though this node itself is degraded yet:
> {code}
> 09:06:38,545 TRACE [org.infinispan.partionhandling.impl.PartitionHandlingManagerImpl] (DefaultStressor-5) Checking availability for key=key_00000000000022CF, status=DEGRADED_MODE
> 09:06:38,545 TRACE [org.infinispan.partionhandling.impl.PartitionHandlingManagerImpl] (DefaultStressor-5) Key key_00000000000022CF is available.
> 09:06:38,549 TRACE [org.infinispan.remoting.rpc.RpcManagerImpl] (DefaultStressor-5) edg-perf04-59142 invoking PrepareCommand {modifications=[PutKeyValueCommand{key=key_00000000000022CF, value=[35 #4: 197, 314, 331, 1101, ], flags=[SKIP_CACHE_LOAD, SKIP_REMOTE_LOOKUP], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}], onePhaseCommit=false, gtx=GlobalTransaction:<edg-perf04-59142>:23898:local, cacheName='testCache', topologyId=51} to recipient list [edg-perf01-19198, edg-perf02-60481] with options RpcOptions{timeout=60000, unit=MILLISECONDS, fifoOrder=true, totalOrder=false, responseFilter=null, responseMode=SYNCHRONOUS, skipReplicationQueue=false}
> 09:06:38,780 TRACE [org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher] (DefaultStressor-5) Responses: [sender=edg-perf01-19198, received=true, suspected=false]
> [sender=edg-perf02-60481, received=true, suspected=false]
> 09:06:38,788 TRACE [org.infinispan.remoting.rpc.RpcManagerImpl] (DefaultStressor-5) edg-perf04-59142 invoking CommitCommand {gtx=GlobalTransaction:<edg-perf04-59142>:23898:local, cacheName='testCache', topologyId=51} to recipient list [edg-perf01-19198, edg-perf02-60481] with options RpcOptions{timeout=60000, unit=MILLISECONDS, fifoOrder=false, totalOrder=false, responseFilter=null, responseMode=SYNCHRONOUS_IGNORE_LEAVERS, skipReplicationQueue=false}
> 09:06:39,080 TRACE [org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher] (DefaultStressor-5) Responses: [sender=edg-perf01-19198, retval=SuccessfulResponse{responseValue=null} , received=true, suspected=false]
> [sender=edg-perf02-60481, retval=SuccessfulResponse{responseValue=null} , received=true, suspected=false]
> {code}
> These are topologies on edg-perf02:
> {code}
> 09:06:38,417 TRACE [org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher] (remote-thread-55) Attempting to execute non-CacheRpcCommand command: CacheTopologyControlCommand{cache=testCache, type=CH_UPDATE, sender=edg-perf04-59142, joinInfo=null, topologyId=51, rebalanceId=12, currentCH=DefaultConsistentHash{ns = 512, owners = (3)[edg-perf03-18534: 171+171, edg-perf02-60481: 171+170, edg-perf01-19198: 170+171]}, pendingCH=null, availabilityMode=AVAILABLE, actualMembers=[edg-perf03-18534, edg-perf02-60481, edg-perf01-19198], throwable=null, viewId=19} [sender=edg-perf04-59142]
> 09:06:38,467 TRACE [org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher] (remote-thread-56) Attempting to execute non-CacheRpcCommand command: CacheTopologyControlCommand{cache=testCache, type=CH_UPDATE, sender=edg-perf04-59142, joinInfo=null, topologyId=52, rebalanceId=12, currentCH=DefaultConsistentHash{ns = 512, owners = (3)[edg-perf03-18534: 171+171, edg-perf02-60481: 171+170, edg-perf01-19198: 170+171]}, pendingCH=null, availabilityMode=AVAILABLE, actualMembers=[edg-perf03-18534, edg-perf02-60481, edg-perf01-19198], throwable=null, viewId=19} [sender=edg-perf04-59142]
> 09:06:38,577 TRACE [org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher] (remote-thread-52) Attempting to execute non-CacheRpcCommand command: CacheTopologyControlCommand{cache=testCache, type=REBALANCE_START, sender=edg-perf04-59142, joinInfo=null, topologyId=53, rebalanceId=13, currentCH=DefaultConsistentHash{ns = 512, owners = (3)[edg-perf03-18534: 171+171, edg-perf02-60481: 171+170, edg-perf01-19198: 170+171]}, pendingCH=DefaultConsistentHash{ns = 512, owners = (4)[edg-perf04-59142: 128+128, edg-perf03-18534: 128+128, edg-perf02-60481: 128+128, edg-perf01-19198: 128+128]}, availabilityMode=null, actualMembers=[edg-perf04-59142, edg-perf03-18534, edg-perf02-60481, edg-perf01-19198], throwable=null, viewId=19} [sender=edg-perf04-59142]
> {code}
> The transaction is successfully prepared on edg-perf02 (even twice as the prepare command is resent from edg-perf01)
> {code}
> 09:06:38,613 TRACE [org.infinispan.remoting.InboundInvocationHandlerImpl] (remote-thread-64) Calling perform() on PrepareCommand {modifications=[PutKeyValueCommand{key=key_00000000000022CF, value=[35 #4: 197, 314, 331, 1101, ], flags=[SKIP_CACHE_LOAD, SKIP_REMOTE_LOOKUP], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}], onePhaseCommit=false, gtx=GlobalTransaction:<edg-perf04-59142>:23898:remote, cacheName='testCache', topologyId=51}
> 09:06:38,620 TRACE [org.infinispan.util.concurrent.locks.LockManagerImpl] (remote-thread-64) Successfully acquired lock key_00000000000022CF!
> 09:06:38,626 TRACE [org.infinispan.remoting.InboundInvocationHandlerImpl] (remote-thread-64) About to send back response null for command PrepareCommand {modifications=[PutKeyValueCommand{key=key_00000000000022CF, value=[35 #4: 197, 314, 331, 1101, ], flags=[SKIP_CACHE_LOAD, SKIP_REMOTE_LOOKUP], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}], onePhaseCommit=false, gtx=GlobalTransaction:<edg-perf04-59142>:23898:remote, cacheName='testCache', topologyId=51}
> 09:06:38,670 TRACE [org.infinispan.remoting.InboundInvocationHandlerImpl] (remote-thread-64) Calling perform() on PrepareCommand {modifications=[PutKeyValueCommand{key=key_00000000000022CF, value=[35 #4: 197, 314, 331, 1101, ], flags=[SKIP_CACHE_LOAD, SKIP_REMOTE_LOOKUP], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}], onePhaseCommit=false,
> gtx=GlobalTransaction:<edg-perf04-59142>:23898:remote, cacheName='testCache', topologyId=52} // resent from edg-perf01
> 09:06:38,672 TRACE [org.infinispan.util.concurrent.locks.LockManagerImpl] (remote-thread-64) Already own lock for entry
> 09:06:38,675 TRACE [org.infinispan.remoting.InboundInvocationHandlerImpl] (remote-thread-64) About to send back response null for command PrepareCommand {modifications=[PutKeyValueCommand{key=key_00000000000022CF, value=[35 #4: 197, 314, 331, 1101, ], flags=[SKIP_CACHE_LOAD, SKIP_REMOTE_LOOKUP], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}], onePhaseCommit=false, gtx=GlobalTransaction:<edg-perf04-59142>:23898:remote, cacheName='testCache', topologyId=52}
> {code}
> However, as topology 53 is installed, this transaction is rolled back:
> {code}
> 09:06:38,577 TRACE [org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher] (remote-thread-52) Attempting to execute non-CacheRpcCommand command: CacheTopologyControlCommand{cache=testCache, type=RE
> BALANCE_START, sender=edg-perf04-59142, joinInfo=null, topologyId=53, rebalanceId=13, currentCH=DefaultConsistentHash{ns = 512, owners = (3)[edg-perf03-18534: 171+171, edg-perf02-60481: 171+170, edg-perf01-19198
> : 170+171]}, pendingCH=DefaultConsistentHash{ns = 512, owners = (4)[edg-perf04-59142: 128+128, edg-perf03-18534: 128+128, edg-perf02-60481: 128+128, edg-perf01-19198: 128+128]}, availabilityMode=null, actualMemb
> ers=[edg-perf04-59142, edg-perf03-18534, edg-perf02-60481, edg-perf01-19198], throwable=null, viewId=19} [sender=edg-perf04-59142]
> 09:06:38,658 DEBUG [org.infinispan.topology.LocalTopologyManagerImpl] (remote-thread-52) Installing fake cache topology CacheTopology{id=52, rebalanceId=12, currentCH=DefaultConsistentHash{ns = 512, owners = (3)[edg-perf03-18534: 171+171, edg-perf02-60481: 171+170, edg-perf01-19198: 170+171]}, pendingCH=null, unionCH=null, actualMembers=null} for cache testCache
> 09:06:38,670 DEBUG [org.infinispan.transaction.TransactionTable] (remote-thread-52) The originating node left the cluster for 6 remote transactions
> 09:06:38,675 DEBUG [org.infinispan.transaction.TransactionTable] (remote-thread-52) Rolling back transaction GlobalTransaction:<edg-perf04-59142>:23898:remote because originator edg-perf04-59142 left the cluster
> 09:06:38,675 TRACE [org.infinispan.commands.tx.AbstractTransactionBoundaryCommand] (remote-thread-52) About to execute tx command RollbackCommand {gtx=GlobalTransaction:<edg-perf04-59142>:23898:remote, cacheName
> ='testCache', topologyId=-1}
> {code}
> But even after that, the CommitCommand still suceeds:
> {code}
> 09:06:38,811 TRACE [org.infinispan.remoting.InboundInvocationHandlerImpl] (remote-thread-56) Calling perform() on CommitCommand {gtx=GlobalTransaction:<edg-perf04-59142>:23898:remote, cacheName='testCache', topologyId=51}
> 09:06:38,812 TRACE [org.infinispan.commands.tx.AbstractTransactionBoundaryCommand] (remote-thread-56) Did not find a RemoteTransaction for GlobalTransaction:<edg-perf04-59142>:23898:remote
> 09:06:38,812 TRACE [org.infinispan.remoting.InboundInvocationHandlerImpl] (remote-thread-56) About to send back response SuccessfulResponse{responseValue=null} for command CommitCommand {gtx=GlobalTransaction:<edg-perf04-59142>:23898:remote, cacheName='testCache', topologyId=51}
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.11#6341)
11 years, 2 months
[JBoss JIRA] (ISPN-5146) all.embedded.EmbeddedAllTest.testAllEmbeddedClustered fails with AssertionError on Solaris
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-5146?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-5146:
-----------------------------------------------
Vitalii Chepeliuk <vchepeli(a)redhat.com> changed the Status of [bug 1181959|https://bugzilla.redhat.com/show_bug.cgi?id=1181959] from NEW to CLOSED
> all.embedded.EmbeddedAllTest.testAllEmbeddedClustered fails with AssertionError on Solaris
> ------------------------------------------------------------------------------------------
>
> Key: ISPN-5146
> URL: https://issues.jboss.org/browse/ISPN-5146
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Server
> Affects Versions: 7.0.2.Final, 7.1.0.Beta1
> Environment: Solaris11 Sparc, Oracle JDK7
> Reporter: Vitalii Chepeliuk
> Labels: testsuite_stability
>
> {noformat}
> Stacktrace
> java.lang.AssertionError: expected:<value1> but was:<null>
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.failNotEquals(Assert.java:743)
> at org.junit.Assert.assertEquals(Assert.java:118)
> at org.junit.Assert.assertEquals(Assert.java:144)
> at org.infinispan.all.embedded.EmbeddedAllTest.testAllEmbeddedClustered(EmbeddedAllTest.java:80)
> Standard Error
> Jan 13, 2015 5:19:04 PM org.infinispan.remoting.transport.jgroups.JGroupsTransport start
> INFO: ISPN000078: Starting JGroups Channel
> Jan 13, 2015 5:19:04 PM org.infinispan.remoting.transport.jgroups.JGroupsTransport buildChannel
> INFO: ISPN000088: Unable to use any JGroups configuration mechanisms provided in properties {}. Using default JGroups configuration!
> Jan 13, 2015 5:19:04 PM org.jgroups.logging.JDKLogImpl warn
> WARNING: JGRP000014: Discovery.num_initial_members has been deprecated: will be ignored
> Jan 13, 2015 5:19:05 PM org.jgroups.logging.JDKLogImpl warn
> WARNING: JGRP000015: the receive buffer of socket DatagramSocket was set to 20MB, but the OS only allocated 2.1MB. This might lead to performance problems. Please set your max receive buffer in the OS correctly (e.g. net.core.rmem_max on Linux)
> Jan 13, 2015 5:19:05 PM org.jgroups.logging.JDKLogImpl warn
> WARNING: JGRP000015: the receive buffer of socket MulticastSocket was set to 25MB, but the OS only allocated 2.1MB. This might lead to performance problems. Please set your max receive buffer in the OS correctly (e.g. net.core.rmem_max on Linux)
> Jan 13, 2015 5:19:20 PM org.infinispan.remoting.transport.jgroups.JGroupsTransport viewAccepted
> INFO: ISPN000094: Received new cluster view: [node2-61360|0] (1) [node2-61360]
> Jan 13, 2015 5:19:20 PM org.infinispan.remoting.transport.jgroups.JGroupsTransport startJGroupsChannelIfNeeded
> INFO: ISPN000079: Cache local address is node2-61360, physical addresses are [127.0.0.1:52001]
> {noformat}
> *JENKINS JOB:* https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/JDG/view/FUNC/job/e...
--
This message was sent by Atlassian JIRA
(v6.3.11#6341)
11 years, 2 months