[infinispan-issues] [JBoss JIRA] (ISPN-11634) Merge TopologyUpdateStableCommand and TopologyUpdateCommand

Mon May 25 03:07:42 EDT 2020

     [ https://issues.redhat.com/browse/ISPN-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tristan Tarrant updated ISPN-11634:
-----------------------------------
    Fix Version/s: 11.0.0.Final
                       (was: 11.0.0.CR1)


> Merge TopologyUpdateStableCommand and TopologyUpdateCommand
> -----------------------------------------------------------
>
>                 Key: ISPN-11634
>                 URL: https://issues.redhat.com/browse/ISPN-11634
>             Project: Infinispan
>          Issue Type: Enhancement
>          Components: Core
>    Affects Versions: 11.0.0.Dev04
>            Reporter: Ryan Emerson
>            Assignee: Ryan Emerson
>            Priority: Major
>             Fix For: 11.0.0.Final
>
>
> {code:java}
> 19:48:14,672 TRACE (testng-Test:[]) [TopologyManagementHelper] Attempting to execute command on self: StableTopologyUpdateCommand{cacheName='testCache', origin=Test-NodeD, currentCH=DefaultConsistentHash
> {ns=256, owners = (1)[Test-NodeE: 256+0]}
> , pendingCH=null, actualMembers=[Test-NodeE], persistentUUIDs=[72b17309-62d1-4928-abf6-88a8606ef342], rebalanceId=5, topologyId=27, viewId=7}
> 19:48:14,686 INFO (jgroups-5,Test-NodeE:[]) [CLUSTER] ISPN000094: Received new cluster view for channel org.infinispan.statetransfer.Test: [Test-NodeE|8] (1) [Test-NodeE]
> 19:48:14,690 TRACE (non-blocking-thread-Test-NodeE-p9541-t5:[Merge-8]) [PreferConsistencyStrategy] Max stable partition topology: CacheTopology{id=25, phase=NO_REBALANCE, rebalanceId=5, currentCH=DefaultConsistentHash
> {ns=256, owners = (2)[Test-NodeD: 129+127, Test-NodeE: 127+129]}
> , pendingCH=null, unionCH=null, actualMembers=[Test-NodeD, Test-NodeE], persistentUUIDs=[099401f9-c0b6-460d-8dc8-5051699e8287, 72b17309-62d1-4928-abf6-88a8606ef342]}
> 19:48:14,690 TRACE (non-blocking-thread-Test-NodeE-p9541-t5:[Merge-8]) [PreferConsistencyStrategy] Max active partition topology: CacheTopology{id=27, phase=NO_REBALANCE, rebalanceId=5, currentCH=DefaultConsistentHash
> {ns=256, owners = (1)[Test-NodeE: 256+0]}
> , pendingCH=null, unionCH=null, actualMembers=[Test-NodeE], persistentUUIDs=[72b17309-62d1-4928-abf6-88a8606ef342]}
> 19:48:14,690 DEBUG (non-blocking-thread-Test-NodeE-p9541-t3:[]) [LocalTopologyManagerImpl] Ignoring topology 27 for cache testCache from old coordinator Test-NodeD
> {code}
> The old coordinator (NodeD) sent the topology update and the stable topology update, and NodeE received both. However NodeE might ignores the stable topology update because it was processed after it got the new cluster view, with itself as the coordinator. The proper fix is to get rid of the StableTopologyUpdateCommand and add a stable=true/false field in CacheTopology or in TopologyUpdateCommand.
> There's still a small chance that the coordinator will shut down before sending the topology update command to any of the other nodes, but it's a small chance. It's much more likely that a separate stable topology update command will be ignored because it is queued by LocalTopologyManagerImpl after the regular topology update command, and the regular topology update has a lot of processing to do in StateConsumerImpl.


--
This message was sent by Atlassian Jira
(v7.13.8#713008)