Dan Berindei commented on ISPN-9257:
------------------------------------
The problem is that the coordinator (NodeA) is killed with a regular
{{EmbeddedCacheManager.stop()}}, and has time to rebalance the cache with members \[NodeB,
NodeC\] before stopping.
It is not a problem in the DIST_SYNC config: once the rebalance with \[NodeB, NodeC\] is
done, killing one more node doesn't start another rebalance. But a scattered cache needs
a second rebalance to assign primary owners to all the segments, and the test discards
this second rebalance instead of the \[NodeB, NodeC\] rebalance.
ClusterTopologyManagerTest.testAbruptLeaveAfterGetStatus2[SCATTERED_SYNC, tx=false] random
failures
---------------------------------------------------------------------------------------------------
Key: ISPN-9257
URL: https://issues.jboss.org/browse/ISPN-9257
Project: Infinispan
Issue Type: Bug
Components: Test Suite - Core
Affects Versions: 9.3.0.CR1
Reporter: Dan Berindei
Assignee: Dan Berindei
Priority: Minor
Labels: testsuite_stability
Fix For: 10.0.0.Final
Attachments: ISPN-8731_wrong_topology_2018-05-18_ClusterTopologyManagerTest-infinispan-core.log.gz
The test kills the coordinator NodeA, then, while NodeB is trying to recover the caches, it
also kills NodeC. The test expects NodeB to start a rebalance with 2 nodes and discards that
rebalance, in order to check that NodeB can process the 1-node rebalance first:
{noformat}
00:34:06,582 DEBUG (transport-thread-test-NodeB-p12-t6:[testCache])
[ClusterTopologyManagerTest] Discarding rebalance command CacheTopology{id=8,
phase=TRANSITORY, rebalanceId=5, currentCH=ScatteredConsistentHash{ns=256,
rebalanced=false, owners = (2)[test-NodeB-49590: 85, test-NodeC-58596: 85]},
pendingCH=ScatteredConsistentHash{ns=256, rebalanced=true, owners = (2)[test-NodeB-49590:
128, test-NodeC-58596: 128]}, unionCH=null, actualMembers=[test-NodeB-49590,
test-NodeC-58596], persistentUUIDs=[6b96414e-15d8-4350-aa3c-4fb4fc34e888,
d47dc4a9-2a95-4bb1-a83b-bb8a27c9999f]}
00:34:06,609 DEBUG (transport-thread-test-NodeB-p12-t2:[Topology-testCache])
[LocalTopologyManagerImpl] Updating local topology for cache testCache:
CacheTopology{id=9, phase=TRANSITORY, rebalanceId=5,
currentCH=ScatteredConsistentHash{ns=256, rebalanced=false, owners = (1)[test-NodeB-49590:
85]}, pendingCH=ScatteredConsistentHash{ns=256, rebalanced=false, owners =
(1)[test-NodeB-49590: 128]}, unionCH=null, actualMembers=[test-NodeB-49590],
persistentUUIDs=[6b96414e-15d8-4350-aa3c-4fb4fc34e888]}
00:34:06,609 DEBUG (transport-thread-test-NodeB-p12-t2:[Topology-testCache])
[LocalTopologyManagerImpl] Installing fake cache topology CacheTopology{id=8,
phase=NO_REBALANCE, rebalanceId=4, currentCH=ScatteredConsistentHash{ns=256,
rebalanced=false, owners = (1)[test-NodeB-49590: 85]}, pendingCH=null, unionCH=null,
actualMembers=[test-NodeB-49590], persistentUUIDs=[6b96414e-15d8-4350-aa3c-4fb4fc34e888]}
for cache testCache
{noformat}
Unfortunately, {{PreferAvailabilityStrategy}} has changed a bit and the rebalance ids
don't always match the expectations of the test, so the 1-node rebalance is sometimes
discarded instead:
{noformat}
09:46:10,530 DEBUG (transport-thread-Test-NodeB-p54539-t3:[testCache]) [Test] Discarding
rebalance command CacheTopology{id=9, phase=TRANSITORY, rebalanceId=5,
currentCH=ScatteredConsistentHash{ns=256, rebalanced=false, owners = (1)[Test-NodeB-62039:
85]}, pendingCH=ScatteredConsistentHash{ns=256, rebalanced=true, owners =
(1)[Test-NodeB-62039: 256]}, unionCH=null, actualMembers=[Test-NodeB-62039],
persistentUUIDs=[0ed7be74-4485-489b-baee-28c461c9e5de]}
{noformat}
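The mismatch can be illustrated with a small standalone sketch. It assumes the test picks the command to discard by its rebalance id, which is how the description reads but is not shown in the logs. {{RebalanceCommand}}, {{discardByRebalanceId}} and {{discardByMemberCount}} are hypothetical names invented for this example, not Infinispan classes or the actual test code. The topology ids, rebalance ids and members are taken from the two log excerpts above.

```java
import java.util.Arrays;
import java.util.List;

public class RebalanceFilterSketch {

    /** Hypothetical, simplified stand-in for a rebalance command; not an Infinispan class. */
    static final class RebalanceCommand {
        final int topologyId;
        final int rebalanceId;
        final List<String> members;

        RebalanceCommand(int topologyId, int rebalanceId, List<String> members) {
            this.topologyId = topologyId;
            this.rebalanceId = rebalanceId;
            this.members = members;
        }
    }

    /** Assumed selection rule: discard whichever command carries the expected rebalance id. */
    static boolean discardByRebalanceId(RebalanceCommand cmd, int expectedRebalanceId) {
        return cmd.rebalanceId == expectedRebalanceId;
    }

    /** Alternative rule: discard only a rebalance that still has the expected number of members. */
    static boolean discardByMemberCount(RebalanceCommand cmd, int expectedMembers) {
        return cmd.members.size() == expectedMembers;
    }

    public static void main(String[] args) {
        // Passing run: the 2-node rebalance has topology id 8 and rebalanceId 5.
        RebalanceCommand twoNode = new RebalanceCommand(8, 5, Arrays.asList("NodeB", "NodeC"));
        // Failing run: the 1-node rebalance has topology id 9 and also rebalanceId 5.
        RebalanceCommand oneNode = new RebalanceCommand(9, 5, Arrays.asList("NodeB"));

        System.out.println(discardByRebalanceId(twoNode, 5)); // true
        System.out.println(discardByRebalanceId(oneNode, 5)); // true: the wrong command matches too
        System.out.println(discardByMemberCount(twoNode, 2)); // true
        System.out.println(discardByMemberCount(oneNode, 2)); // false
    }
}
```

Matching on the member list is only one possible way to make the selection independent of rebalance id numbering; the sketch does not claim this is how the issue was fixed.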