[infinispan-issues] [JBoss JIRA] (ISPN-7461) Cache is not rebalanced on merge
Dennis Reed (JIRA)
issues at jboss.org
Sat Feb 11 20:52:00 EST 2017
[ https://issues.jboss.org/browse/ISPN-7461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13362252#comment-13362252 ]
Dennis Reed edited comment on ISPN-7461 at 2/11/17 8:51 PM:
------------------------------------------------------------
Rebalancing is done in DefaultRebalancePolicy#updateCacheStatus, which (incorrectly) skips the rebalance after a merge because of:
if (!cacheStatus.hasJoiners() && isBalanced(cacheStatus.getCacheTopology().getCurrentCH())) {
ClusterTopologyManagerImpl#updateCacheStatusAfterMerge sets the CH to the union of all CHs, with the current merged member list as its members, which makes ClusterCacheStatus.hasJoiners return false.
isBalanced only checks that each segment has the correct number of owners, not that ownership is evenly distributed across the members.
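
To make the skip condition concrete, here is a minimal sketch of the 2-member case. It is a simplified model, not Infinispan code: the class MergeSkipSketch and all of its methods are hypothetical, a consistent hash is modelled as one owner list per segment, and it assumes that before the merge each partition's lone member owns every segment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model, NOT Infinispan's API: a consistent hash is a list of
// owner lists, one per segment, and the first owner is the primary.
final class MergeSkipSketch {

    // Assumed pre-merge state: the partition's lone member owns every segment.
    static List<List<String>> soleOwner(String node, int numSegments) {
        return Collections.nCopies(numSegments, Collections.singletonList(node));
    }

    // Per-segment union of the owners from two partitions, keeping first-seen order.
    static List<List<String>> union(List<List<String>> first, List<List<String>> second) {
        List<List<String>> merged = new ArrayList<>();
        for (int segment = 0; segment < first.size(); segment++) {
            Set<String> owners = new LinkedHashSet<>(first.get(segment));
            owners.addAll(second.get(segment));
            merged.add(new ArrayList<>(owners));
        }
        return merged;
    }

    // Assumption: a joiner is an expected member missing from the hash's member list.
    static boolean hasJoiners(List<String> expectedMembers, List<String> hashMembers) {
        return !hashMembers.containsAll(expectedMembers);
    }

    // Owner-count check only: each segment has min(numOwners, numMembers) owners.
    static boolean hasEnoughOwners(List<List<String>> hash, int numOwners, int numMembers) {
        int expected = Math.min(numOwners, numMembers);
        for (List<String> owners : hash) {
            if (owners.size() != expected) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> members = Arrays.asList("NodeB", "NodeA");
        List<List<String>> merged = union(soleOwner("NodeB", 4), soleOwner("NodeA", 4));

        // Same shape as the condition quoted above: no joiners and "balanced".
        boolean skip = !hasJoiners(members, members)
                && hasEnoughOwners(merged, 2, members.size());
        System.out.println("owners of segment 0: " + merged.get(0));
        System.out.println("rebalance skipped: " + skip);
    }
}
```

In this model every merged segment ends up with the owners [NodeB, NodeA], so both halves of the condition hold and the rebalance is skipped even though NodeB is primary owner of every segment.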
> Cache is not rebalanced on merge
> --------------------------------
>
> Key: ISPN-7461
> URL: https://issues.jboss.org/browse/ISPN-7461
> Project: Infinispan
> Issue Type: Bug
> Components: State Transfer
> Affects Versions: 5.2.7.Final
> Reporter: Dennis Reed
>
> After a cluster split and merge, the consistent hash is not balanced between the members.
> For example, in a 2-member cluster, after the merge one node will be primary owner of every segment. In a larger cluster, some nodes will not own any data.
> DefaultConsistentHash{numSegments=60, numOwners=2, members=[RehashAfterPartitionMergeTest-NodeB-49100, RehashAfterPartitionMergeTest-NodeA-11552]} -- 0: 0 1, 1: 0 1, 2: 0 1, 3: 0 1, 4: 0 1, 5: 0 1, 6: 0 1, 7: 0 1, 8: 0 1, 9: 0 1, 10: 0 1, 11: 0 1, 12: 0 1, 13: 0 1, 14: 0 1, 15: 0 1, 16: 0 1, 17: 0 1, 18: 0 1, 19: 0 1, 20: 0 1, 21: 0 1, 22: 0 1, 23: 0 1, 24: 0 1, 25: 0 1, 26: 0 1, 27: 0 1, 28: 0 1, 29: 0 1, 30: 0 1, 31: 0 1, 32: 0 1, 33: 0 1, 34: 0 1, 35: 0 1, 36: 0 1, 37: 0 1, 38: 0 1, 39: 0 1, 40: 0 1, 41: 0 1, 42: 0 1, 43: 0 1, 44: 0 1, 45: 0 1, 46: 0 1, 47: 0 1, 48: 0 1, 49: 0 1, 50: 0 1, 51: 0 1, 52: 0 1, 53: 0 1, 54: 0 1, 55: 0 1, 56: 0 1, 57: 0 1, 58: 0 1, 59: 0 1
> This is triggered consistently by the RehashAfterPartitionMergeTest test case, but the test does not catch it because it does not sufficiently check the consistent hash: it checks RebalancePolicy.isBalanced, which merely verifies that each segment has the correct number of owners, not that ownership is evenly distributed.
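
A stricter check for a test could count how many segments each member is primary owner of and compare the spread. The following is only a sketch under assumptions: PrimarySpreadSketch, its methods, and the tolerance value are hypothetical and not part of Infinispan or of RehashAfterPartitionMergeTest. Owners are member indexes, as in the DefaultConsistentHash dump above.

```java
import java.util.Arrays;

// Hypothetical helper, NOT part of Infinispan: owners are member indexes, as in
// the DefaultConsistentHash dump, and the first index per segment is the primary.
final class PrimarySpreadSketch {

    // Count the segments for which each member is the primary owner.
    static int[] primarySegmentsPerMember(int[][] segmentOwners, int numMembers) {
        int[] counts = new int[numMembers];
        for (int[] owners : segmentOwners) {
            counts[owners[0]]++;
        }
        return counts;
    }

    // Even enough if the busiest and idlest members differ by at most `tolerance`.
    static boolean isEvenlySpread(int[][] segmentOwners, int numMembers, int tolerance) {
        int[] counts = primarySegmentsPerMember(segmentOwners, numMembers);
        int min = Arrays.stream(counts).min().orElse(0);
        int max = Arrays.stream(counts).max().orElse(0);
        return max - min <= tolerance;
    }

    public static void main(String[] args) {
        int[][] afterMerge = new int[60][];   // every segment "0 1", as in the dump
        int[][] alternating = new int[60][];  // primaries alternate between members
        for (int s = 0; s < 60; s++) {
            afterMerge[s] = new int[] {0, 1};
            alternating[s] = (s % 2 == 0) ? new int[] {0, 1} : new int[] {1, 0};
        }
        System.out.println(Arrays.toString(primarySegmentsPerMember(afterMerge, 2)));
        System.out.println(isEvenlySpread(afterMerge, 2, 6));
        System.out.println(isEvenlySpread(alternating, 2, 6));
    }
}
```

For the hash in the dump, member 0 is primary for all 60 segments and member 1 for none, so this check fails where the owner-count check passes. A tolerance is used because a real hash need not be perfectly even. Counting only primaries is a simplification, and a fuller check might also count backup ownership to catch members that own no data at all.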
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)