Pedro Ruivo updated ISPN-4776:
------------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done
The topology id for the merged cache topology is not always bigger
than all the partition topology ids
------------------------------------------------------------------------------------------------------
Key: ISPN-4776
URL: https://issues.jboss.org/browse/ISPN-4776
Project: Infinispan
Issue Type: Bug
Components: Core
Affects Versions: 7.0.0.Beta2
Reporter: Dan Berindei
Assignee: Dan Berindei
Priority: Blocker
Labels: testsuite_stability
Fix For: 7.0.0.CR1
With the ISPN-4574 fix, I changed the merge algorithm to pick the partition with the most
members (both in the _stable_ topology and in the _current_ topology) instead of the
partition with the highest topology id.
However, the partition with the most members does not necessarily have the highest topology
id, so it's possible that some nodes will ignore the merged topology because they
already have a topology with a higher id installed. This happened once in
ClusterTopologyManagerTest.testClusterRecoveryAfterThreeWaySplit:
{noformat}
00:24:59,286 DEBUG (transport-thread-NodeL-p33097-t6:) [ClusterCacheStatus] Recovered 3
partition(s) for cache cache: [CacheTopology{id=8, rebalanceId=3,
currentCH=DefaultConsistentHash{ns = 60, owners = (1)[, NodeL-25322: 60+0]},
pendingCH=null, unionCH=null}, CacheTopology{id=6, rebalanceId=3,
currentCH=DefaultConsistentHash{ns = 60, owners = (2)[, NodeL-25322: 30+10, NodeN-6727:
30+10]}, pendingCH=DefaultConsistentHash{ns = 60, owners = (2)[, NodeL-25322: 30+30,
NodeN-6727: 30+30]}, unionCH=null}, CacheTopology{id=5, rebalanceId=2,
currentCH=DefaultConsistentHash{ns = 60, owners = (1)[, NodeM-12972: 60+0]},
pendingCH=null, unionCH=null}]
00:24:59,287 DEBUG (transport-thread-NodeL-p33097-t6:) [ClusterCacheStatus] Updating
topologies after merge for cache cache, current topology = CacheTopology{id=5,
rebalanceId=2, currentCH=DefaultConsistentHash{ns = 60, owners = (1)[, NodeM-12972:
60+0]}, pendingCH=null, unionCH=null}, stable topology = CacheTopology{id=4,
rebalanceId=2, currentCH=DefaultConsistentHash{ns = 60, owners = (3)[, NodeL-25322: 20+20,
NodeM-12972: 20+20, NodeN-6727: 20+20]}, pendingCH=null, unionCH=null}, availability mode
= null
00:24:59,287 DEBUG (transport-thread-NodeL-p33097-t6:) [ClusterTopologyManagerImpl]
Updating cluster-wide current topology for cache cache, topology = CacheTopology{id=5,
rebalanceId=2, currentCH=DefaultConsistentHash{ns = 60, owners = (1)[, NodeM-12972:
60+0]}, pendingCH=null, unionCH=null}, availability mode = null
00:24:59,288 TRACE (transport-thread-NodeL-p33097-t3:) [LocalTopologyManagerImpl]
Ignoring consistent hash update for cache cache, current topology is 8:
CacheTopology{id=5, rebalanceId=2, currentCH=DefaultConsistentHash{ns = 60, owners = (1)[,
NodeM-12972: 60+0]}, pendingCH=null, unionCH=null}
{noformat}
Failure logs here:
http://ci.infinispan.org/viewLog.html?buildId=12364&buildTypeId=Infin...
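
To make the failure mode concrete, here is a minimal Java sketch using the topology ids from the log above (partitions with ids 8, 6 and 5; merged topology sent with id 5). It is not Infinispan code: the class and method names are hypothetical, the "ignore" rule is only inferred from the "Ignoring consistent hash update" log line, and numbering the merged topology one above the highest partition id is just one possible way to satisfy the invariant in the issue title, not necessarily the fix that was applied.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not part of Infinispan.
public class MergedTopologyIdSketch {

    // Assumed rule, inferred from the "Ignoring consistent hash update" log line:
    // a node drops an update whose id is lower than the id it already has installed.
    static boolean isIgnored(int installedId, int incomingId) {
        return incomingId < installedId;
    }

    // One possible remedy (an assumption, not the confirmed fix):
    // give the merged topology an id above every partition's topology id.
    static int idAboveAllPartitions(List<Integer> partitionIds) {
        if (partitionIds.isEmpty()) {
            throw new IllegalArgumentException("at least one partition is required");
        }
        int max = partitionIds.get(0);
        for (int id : partitionIds) {
            max = Math.max(max, id);
        }
        return max + 1;
    }

    public static void main(String[] args) {
        // Topology ids of the three recovered partitions, taken from the log.
        List<Integer> partitionIds = Arrays.asList(8, 6, 5);
        // Id of the topology that was broadcast after the merge in the log.
        int mergedIdInLog = 5;
        int saferId = idAboveAllPartitions(partitionIds);

        for (int installedId : partitionIds) {
            System.out.println("node with installed id " + installedId
                    + ": ignores id " + mergedIdInLog + "? "
                    + isIgnored(installedId, mergedIdInLog)
                    + ", ignores id " + saferId + "? "
                    + isIgnored(installedId, saferId));
        }
    }
}
```

Under these assumptions, the nodes that had topology 8 or 6 installed drop the merged topology with id 5, while an id above all partition ids is accepted by every node.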