[JBoss JIRA] (ISPN-8962) PreferAvailabilityStrategy: Rely less on the stable topology
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-8962?page=com.atlassian.jira.plugin.... ]
Dan Berindei reopened ISPN-8962:
--------------------------------
I changed the algorithm to prefer the topology with the higher topology id when two topologies overlap, and the topology with the most members when they are completely independent.
I kept the overlapping topologies with a lower topology id and with at least {{numOwners}} extra members for conflict resolution, because this signals that the partition with the highest topology id might have lost some values that still exist on the nodes with the lower topology id.
The problem is that this older topology has the most members and ends up being the preferred topology. It's not a big issue when conflict resolution is enabled, but when conflict resolution is disabled it means different nodes will see different values.
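The rule described above can be sketched roughly as follows. This is only an illustration, not the actual {{PreferAvailabilityStrategy}} code: the {{Topo}} class, the method names, the topology ids, and the tie-breaking are all hypothetical, and "overlapping" is assumed here to mean "sharing at least one member".

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical sketch; none of these names are Infinispan API.
class TopologyPreferenceSketch {

    // Minimal stand-in for a cache topology reported by one partition.
    static final class Topo {
        final int topologyId;
        final Set<String> members;

        Topo(int topologyId, Set<String> members) {
            this.topologyId = topologyId;
            this.members = members;
        }
    }

    // Assumption: two topologies overlap when they share at least one member.
    static boolean overlaps(Topo a, Topo b) {
        return !Collections.disjoint(a.members, b.members);
    }

    // Overlapping: prefer the higher topology id.
    // Independent: prefer the topology with more members.
    // Ties go to the first argument (an arbitrary choice for this sketch).
    static Topo prefer(Topo a, Topo b) {
        if (overlaps(a, b)) {
            return a.topologyId >= b.topologyId ? a : b;
        }
        return a.members.size() >= b.members.size() ? a : b;
    }

    // An older overlapping topology is kept for conflict resolution when it
    // has at least numOwners members that the newer topology lacks.
    static boolean keepForConflictResolution(Topo older, Topo newer, int numOwners) {
        if (!overlaps(older, newer) || older.topologyId >= newer.topologyId) {
            return false;
        }
        long extra = older.members.stream()
                .filter(m -> !newer.members.contains(m))
                .count();
        return extra >= numOwners;
    }

    public static void main(String[] args) {
        Topo older = new Topo(5, Set.of("A", "B", "C")); // ids are made up
        Topo newer = new Topo(9, Set.of("C"));
        System.out.println(prefer(older, newer).members);                 // [C]
        System.out.println(keepForConflictResolution(older, newer, 2));   // true
    }
}
```

In this sketch the older topology is kept only as a conflict resolution input. The problem described above is that a later "most members" choice among the kept topologies can still select it as the preferred one.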
This is actually shown by test {{PreferAvailabilityStrategyTest#testMerge1Paused2StableAfterLosingAnotherNode}}, but at the time I wrote it I was only trying to prove that conflict resolution would happen (if enabled).
1. Start with cluster ABC
2. A was paused and keeps the stable topology
3. B and C finished rebalancing, then B was paused
4. Now A has resumed and merges with C
5. The preferred topology should be [C] (from C), but it's [ABC] (from A)
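The steps above can be replayed in a small, self-contained Java sketch. It is not the real test and does not use any Infinispan classes: the {{Reported}} class, the selection methods, and the topology ids (1 for A's stale topology, 4 for C's) are assumptions made only to show why choosing by member count picks the wrong topology here.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical replay of the merge scenario; names and ids are illustrative.
class MergeScenarioSketch {

    // What one node reports to the merge coordinator.
    static final class Reported {
        final String sender;
        final int topologyId;
        final List<String> members;

        Reported(String sender, int topologyId, List<String> members) {
            this.sender = sender;
            this.topologyId = topologyId;
            this.members = members;
        }
    }

    // Problematic choice: the candidate with the most members wins.
    static Reported byMostMembers(List<Reported> candidates) {
        return Collections.max(candidates,
                Comparator.comparingInt((Reported r) -> r.members.size()));
    }

    // Expected choice for overlapping topologies: the highest topology id wins.
    static Reported byHighestTopologyId(List<Reported> candidates) {
        return Collections.max(candidates,
                Comparator.comparingInt((Reported r) -> r.topologyId));
    }

    public static void main(String[] args) {
        // A was paused early and still reports the original topology.
        Reported fromA = new Reported("A", 1, List.of("A", "B", "C"));
        // C went through the rebalance and B's departure, so its id is higher.
        Reported fromC = new Reported("C", 4, List.of("C"));
        List<Reported> candidates = List.of(fromA, fromC);

        System.out.println("most members:  " + byMostMembers(candidates).members);
        System.out.println("highest topId: " + byHighestTopologyId(candidates).members);
    }
}
```

With these assumed inputs the first selection returns the topology reported by A and the second returns the one reported by C, which matches the mismatch described in the last step.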
> PreferAvailabilityStrategy: Rely less on the stable topology
> ------------------------------------------------------------
>
> Key: ISPN-8962
> URL: https://issues.jboss.org/browse/ISPN-8962
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.2.0.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 9.2.2.Final, 9.3.0.Alpha1
>
>
> {{PreferAvailabilityStrategy}} checks the size of the stable topology, and only considers cache topologies that are derived from the biggest topology (in size) when picking a post-merge topology.
> Unfortunately, in some situations this algorithm fails pretty badly. If a node has a very long GC pause, when it comes back it will report the old topology *and* the old stable topology. If the rest of the cluster rebalanced in the meantime, the rest of the cluster now has both a smaller current topology and a smaller stable topology, so the stale topology reported by the paused node is the one that gets considered.
> Furthermore, the stable topology is updated asynchronously, independently of the current topology. So even if there's a split and the minority partition installs a current topology with fewer members, it may take some time for its stable topology to be updated with fewer members. In fact, it appears that when a rebalance is not needed (e.g. because the partition has a single node), the stable topology is never updated!
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)