Dan Berindei created ISPN-7135:
----------------------------------
Summary:
ClusterListenerReplTest.testPrimaryOwnerGoesDownBeforeBackupRaisesEvent random failures
Key: ISPN-7135
URL:
https://issues.jboss.org/browse/ISPN-7135
Project: Infinispan
Issue Type: Bug
Components: Test Suite - Core
Affects Versions: 9.0.0.Alpha4
Reporter: Dan Berindei
Assignee: William Burns
Priority: Critical
Fix For: 9.0.0.Beta1
The test assumes that because the primary owner is killed before the backups updated the
entry, the clustered listener will only receive one {{CACHE_ENTRY_CREATED}} event.
However, the crash and the following rebalance require 3 topology updates, meaning the
command could be retried a second time, after the backup already applied the new value. In
the log below, {{NodeDS}} retries with topology {{7}}, the primary {{NodeDU}} forwards it
to the backup (also {{NodeDS}}), but then sees topology {{8}} and triggers a retry. The
clustered listener on {{NodeDS}} receives both a {{CACHE_ENTRY_CREATED}} event and a
{{CACHE_ENTRY_MODIFIED}} event.
{noformat}
10:35:47,429 TRACE (ForkThread-2,Test:[]) [BaseDistributionInterceptor] Command topology
id is 5, current topology id is 5, successful? true
10:35:47,429 TRACE (ForkThread-2,Test:[]) [BaseDistributionInterceptor] I'm not the
primary owner, so sending the command to the primary owner(Test-NodeDT-11194) in order to
be forwarded
10:35:47,496 DEBUG (remote-thread-Test-NodeDS-p31418-t6:[cluster-listener])
[ClusterTopologyManagerImpl] Updating cluster-wide current topology for cache
cluster-listener, topology = CacheTopology{id=6, rebalanceId=3,
currentCH=ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeDS-11906: 133,
Test-NodeDU-38945: 123]}, pendingCH=null, unionCH=null, actualMembers=[Test-NodeDS-11906,
Test-NodeDU-38945], persistentUUIDs=[7ddc513a-2a94-4dc5-9918-7fb90a6510bb,
cd827dd6-62d9-4f78-a28e-4d10dc77a862]}, availability mode = AVAILABLE
10:35:47,502 TRACE (transport-thread-Test-NodeDS-p31420-t1:[Topology-cluster-listener])
[StateConsumerImpl] Received new topology for cache cluster-listener, isRebalance = false,
isMember = true, topology = CacheTopology{id=6, rebalanceId=3,
currentCH=PartitionerConsistentHash:ReplicatedConsistentHash{ns = 256, owners =
(2)[Test-NodeDS-11906: 133, Test-NodeDU-38945: 123]}, pendingCH=null, unionCH=null,
actualMembers=[Test-NodeDS-11906, Test-NodeDU-38945],
persistentUUIDs=[7ddc513a-2a94-4dc5-9918-7fb90a6510bb,
cd827dd6-62d9-4f78-a28e-4d10dc77a862]}
10:35:47,514 INFO (remote-thread-Test-NodeDS-p31418-t6:[cluster-listener]) [CLUSTER]
ISPN000310: Starting cluster-wide rebalance for cache cluster-listener, topology
CacheTopology{id=7, rebalanceId=4, currentCH=ReplicatedConsistentHash{ns = 256, owners =
(2)[Test-NodeDS-11906: 133, Test-NodeDU-38945: 123]},
pendingCH=ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeDS-11906: 134,
Test-NodeDU-38945: 122]}, unionCH=null, actualMembers=[Test-NodeDS-11906,
Test-NodeDU-38945], persistentUUIDs=[7ddc513a-2a94-4dc5-9918-7fb90a6510bb,
cd827dd6-62d9-4f78-a28e-4d10dc77a862]}
10:35:47,526 TRACE (transport-thread-Test-NodeDS-p31420-t2:[Topology-cluster-listener])
[StateConsumerImpl] Received new topology for cache cluster-listener, isRebalance = true,
isMember = true, topology = CacheTopology{id=7, rebalanceId=4,
currentCH=PartitionerConsistentHash:ReplicatedConsistentHash{ns = 256, owners =
(2)[Test-NodeDS-11906: 133, Test-NodeDU-38945: 123]},
pendingCH=PartitionerConsistentHash:ReplicatedConsistentHash{ns = 256, owners =
(2)[Test-NodeDS-11906: 134, Test-NodeDU-38945: 122]},
unionCH=PartitionerConsistentHash:ReplicatedConsistentHash{ns = 256, owners =
(2)[Test-NodeDS-11906: 133, Test-NodeDU-38945: 123]}, actualMembers=[Test-NodeDS-11906,
Test-NodeDU-38945], persistentUUIDs=[7ddc513a-2a94-4dc5-9918-7fb90a6510bb,
cd827dd6-62d9-4f78-a28e-4d10dc77a862]}
10:35:47,527 TRACE (transport-thread-Test-NodeDU-p31480-t5:[Topology-cluster-listener])
[StateConsumerImpl] Received new topology for cache cluster-listener, isRebalance = false,
isMember = true, topology = CacheTopology{id=6, rebalanceId=3,
currentCH=PartitionerConsistentHash:ReplicatedConsistentHash{ns = 256, owners =
(2)[Test-NodeDS-11906: 133, Test-NodeDU-38945: 123]}, pendingCH=null, unionCH=null,
actualMembers=[Test-NodeDS-11906, Test-NodeDU-38945],
persistentUUIDs=[7ddc513a-2a94-4dc5-9918-7fb90a6510bb,
cd827dd6-62d9-4f78-a28e-4d10dc77a862]}
*** 10:35:47,555 TRACE (remote-thread-Test-NodeDS-p31418-t6:[]) [StateTransferInterceptor]
Retrying command because of topology change, current topology is 7:
PutKeyValueCommand{key=MagicKey{1088/5C01D0B5/73@Test-NodeDT-11194}, value=first-value,
flags=[], putIfAbsent=false, valueMatcher=MATCH_ALWAYS,
metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null},
successful=true}
10:35:47,555 TRACE (remote-thread-Test-NodeDS-p31418-t6:[]) [BaseDistributionInterceptor]
I'm not the primary owner, so sending the command to the primary
owner(Test-NodeDU-38945) in order to be forwarded
10:35:47,558 TRACE (transport-thread-Test-NodeDU-p31480-t5:[Topology-cluster-listener])
[StateConsumerImpl] Received new topology for cache cluster-listener, isRebalance = true,
isMember = true, topology = CacheTopology{id=7, rebalanceId=4,
currentCH=PartitionerConsistentHash:ReplicatedConsistentHash{ns = 256, owners =
(2)[Test-NodeDS-11906: 133, Test-NodeDU-38945: 123]},
pendingCH=PartitionerConsistentHash:ReplicatedConsistentHash{ns = 256, owners =
(2)[Test-NodeDS-11906: 134, Test-NodeDU-38945: 122]},
unionCH=PartitionerConsistentHash:ReplicatedConsistentHash{ns = 256, owners =
(2)[Test-NodeDS-11906: 133, Test-NodeDU-38945: 123]}, actualMembers=[Test-NodeDS-11906,
Test-NodeDU-38945], persistentUUIDs=[7ddc513a-2a94-4dc5-9918-7fb90a6510bb,
cd827dd6-62d9-4f78-a28e-4d10dc77a862]}
10:35:47,600 DEBUG (remote-thread-Test-NodeDS-p31418-t5:[cluster-listener])
[ClusterTopologyManagerImpl] Updating cluster-wide current topology for cache
cluster-listener, topology = CacheTopology{id=8, rebalanceId=4,
currentCH=ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeDS-11906: 134,
Test-NodeDU-38945: 122]}, pendingCH=null, unionCH=null, actualMembers=[Test-NodeDS-11906,
Test-NodeDU-38945], persistentUUIDs=[7ddc513a-2a94-4dc5-9918-7fb90a6510bb,
cd827dd6-62d9-4f78-a28e-4d10dc77a862]}, availability mode = AVAILABLE
10:35:47,603 TRACE (remote-thread-Test-NodeDS-p31418-t6:[]) [ReadCommittedEntry] Updating
entry (key=MagicKey{1088/5C01D0B5/73@Test-NodeDT-11194} removed=false valid=true
changed=true created=true value=first-value
metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null},
providedMetadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null})
10:35:47,603 TRACE (transport-thread-Test-NodeDS-p31420-t6:[Topology-cluster-listener])
[StateConsumerImpl] Received new topology for cache cluster-listener, isRebalance = false,
isMember = true, topology = CacheTopology{id=8, rebalanceId=4,
currentCH=PartitionerConsistentHash:ReplicatedConsistentHash{ns = 256, owners =
(2)[Test-NodeDS-11906: 134, Test-NodeDU-38945: 122]}, pendingCH=null, unionCH=null,
actualMembers=[Test-NodeDS-11906, Test-NodeDU-38945],
persistentUUIDs=[7ddc513a-2a94-4dc5-9918-7fb90a6510bb,
cd827dd6-62d9-4f78-a28e-4d10dc77a862]}
10:35:47,611 DEBUG (remote-thread-Test-NodeDS-p31418-t6:[]) [Test] Adding new cluster
event EventImpl{type=CACHE_ENTRY_CREATED, pre=false, cache=Cache
'cluster-listener'@Test-NodeDS-11906,
key=MagicKey{1088/5C01D0B5/73@Test-NodeDT-11194}, value=first-value, oldValue=null,
transaction=null, originLocal=false, transactionSuccessful=false, entries=null,
created=false}
10:35:47,614 TRACE (transport-thread-Test-NodeDU-p31480-t1:[Topology-cluster-listener])
[StateConsumerImpl] Received new topology for cache cluster-listener, isRebalance = false,
isMember = true, topology = CacheTopology{id=8, rebalanceId=4,
currentCH=PartitionerConsistentHash:ReplicatedConsistentHash{ns = 256, owners =
(2)[Test-NodeDS-11906: 134, Test-NodeDU-38945: 122]}, pendingCH=null, unionCH=null,
actualMembers=[Test-NodeDS-11906, Test-NodeDU-38945],
persistentUUIDs=[7ddc513a-2a94-4dc5-9918-7fb90a6510bb,
cd827dd6-62d9-4f78-a28e-4d10dc77a862]}
10:35:47,660 TRACE (OOB-1,Test-NodeDS-11906:[]) [AsyncInvocationStage] Exception from
invocation handler
org.infinispan.statetransfer.OutdatedTopologyException: Cache topology changed while the
command was executing: expected 7, got 8
*** 10:35:47,661 TRACE (OOB-1,Test-NodeDS-11906:[]) [StateTransferInterceptor] Retrying
command because of topology change, current topology is 8:
PutKeyValueCommand{key=MagicKey{1088/5C01D0B5/73@Test-NodeDT-11194}, value=first-value,
flags=[COMMAND_RETRY], putIfAbsent=false, valueMatcher=MATCH_ALWAYS,
metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null},
successful=true}
10:35:47,661 TRACE (OOB-1,Test-NodeDS-11906:[]) [BaseDistributionInterceptor] I'm not
the primary owner, so sending the command to the primary owner(Test-NodeDU-38945) in order
to be forwarded
10:35:47,745 TRACE (remote-thread-Test-NodeDS-p31418-t6:[]) [ReadCommittedEntry] Updating
entry (key=MagicKey{1088/5C01D0B5/73@Test-NodeDT-11194} removed=false valid=true
changed=true created=false value=first-value
metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null},
providedMetadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null})
10:35:47,746 DEBUG (remote-thread-Test-NodeDS-p31418-t6:[]) [Test] Adding new cluster
event EventImpl{type=CACHE_ENTRY_MODIFIED, pre=false, cache=Cache
'cluster-listener'@Test-NodeDS-11906,
key=MagicKey{1088/5C01D0B5/73@Test-NodeDT-11194}, value=first-value, oldValue=first-value,
transaction=null, originLocal=false, transactionSuccessful=false, entries=null,
created=false}
10:35:47,771 TRACE (OOB-2,Test-NodeDU-38945:[]) [ReadCommittedEntry] Updating entry
(key=MagicKey{1088/5C01D0B5/73@Test-NodeDT-11194} removed=false valid=true changed=true
created=true value=first-value metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1,
version=null}, providedMetadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1,
version=null})
{noformat}
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)