Vittorio Rigamonti updated ISPN-10365:
--------------------------------------
Fix Version/s: 10.0.0.CR2
(was: 10.0.0.CR1)
PreferAvailabilityStrategy assertion failure
--------------------------------------------
Key: ISPN-10365
URL: https://issues.jboss.org/browse/ISPN-10365
Project: Infinispan
Issue Type: Bug
Components: Core, Test Suite - Core
Affects Versions: 10.0.0.Beta3, 9.4.15.Final
Reporter: Dan Berindei
Assignee: Dan Berindei
Priority: Major
Labels: testsuite_stability
Fix For: 10.0.0.CR2, 9.4.17.Final
This scenario happens unintentionally in {{RebalancePolicyJmxTest}}: the test waits
for the default cache to finish rebalancing before killing the coordinator, but it
does not wait for the {{CONFIG}} cache to finish as well:
* A and B are running and rebalancing is disabled, then C and D join
* Rebalancing is re-enabled, but A and B are stopped before the rebalance finishes
* C has already seen the finished rebalance ({{NO_REBALANCE}}, topology id=9), while D
is still in the {{READ_OLD_WRITE_ALL}} phase (topology id=6)
* C becomes coordinator and should recover with C's topology, but instead it hits an
assertion failure and never installs a stable topology
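The two topologies reported in the trace below can be replayed with a minimal Java sketch. The names {{MergeScenarioSketch}}, {{Topology}} and {{survivingOwners}} are hypothetical stand-ins, not Infinispan APIs; only the topology ids, phases and member lists are taken from the log. The sketch only shows why recovering from D's topology leaves no owners: every owner in its current consistent hash (A and B) has already left.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class MergeScenarioSketch {
    /** Simplified stand-in for a cache topology; NOT Infinispan's CacheTopology. */
    static final class Topology {
        final int id;
        final String phase;
        final List<String> currentOwners;
        final List<String> actualMembers;

        Topology(int id, String phase, List<String> currentOwners, List<String> actualMembers) {
            this.id = id;
            this.phase = phase;
            this.currentOwners = currentOwners;
            this.actualMembers = actualMembers;
        }
    }

    /** Returns the owners of the current CH that are still members of the cluster. */
    static List<String> survivingOwners(Topology topology, List<String> clusterMembers) {
        List<String> survivors = new ArrayList<>(topology.currentOwners);
        survivors.retainAll(clusterMembers);
        return survivors;
    }

    public static void main(String[] args) {
        // After A and B are stopped, only C and D remain.
        List<String> cluster = Arrays.asList("C", "D");

        // Topology reported by C: the rebalance has finished.
        Topology fromC = new Topology(9, "NO_REBALANCE",
                Arrays.asList("C", "D"), Arrays.asList("C", "D"));
        // Topology reported by D: still in the middle of the rebalance.
        Topology fromD = new Topology(6, "READ_OLD_WRITE_ALL",
                Arrays.asList("A", "B"), Arrays.asList("A", "B", "C", "D"));

        System.out.println(survivingOwners(fromC, cluster)); // prints [C, D]
        System.out.println(survivingOwners(fromD, cluster)); // prints []
    }
}
```

An empty result for D's topology matches the {{actualMembers=[]}} and the {{ISPN000313}} "Lost data" message in the trace, while C's topology would have kept both remaining nodes as owners.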
{noformat}
16:48:48,454 TRACE (stateTransferExecutor-thread-Test-NodeC-p49651-t6:[Merge-5])
[PreferAvailabilityStrategy] Cache org.infinispan.CONFIG keeping partition from
[Test-NodeC-27509(rack-id=r2)]: CacheTopology{id=9, phase=NO_REBALANCE, rebalanceId=3,
currentCH=ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeC-27509(rack-id=r2):
125, Test-NodeD-62603(rack-id=r2): 131]}, pendingCH=null, unionCH=null,
actualMembers=[Test-NodeC-27509(rack-id=r2), Test-NodeD-62603(rack-id=r2)],
persistentUUIDs=[59d0898c-f166-4129-9165-a22aca475286,
05d7dd0b-7cd8-464d-8adb-41fac100e8bf]}
16:48:48,454 TRACE (stateTransferExecutor-thread-Test-NodeC-p49651-t6:[Merge-5])
[PreferAvailabilityStrategy] Cache org.infinispan.CONFIG keeping partition from
[Test-NodeD-62603(rack-id=r2)]: CacheTopology{id=6, phase=READ_OLD_WRITE_ALL,
rebalanceId=3, currentCH=ReplicatedConsistentHash{ns = 256, owners =
(2)[Test-NodeA-4515(rack-id=r1): 127, Test-NodeB-42590(rack-id=r1): 129]},
pendingCH=ReplicatedConsistentHash{ns = 256, owners = (4)[Test-NodeA-4515(rack-id=r1): 63,
Test-NodeB-42590(rack-id=r1): 62, Test-NodeC-27509(rack-id=r2): 64,
Test-NodeD-62603(rack-id=r2): 67]}, unionCH=null,
actualMembers=[Test-NodeA-4515(rack-id=r1), Test-NodeB-42590(rack-id=r1),
Test-NodeC-27509(rack-id=r2), Test-NodeD-62603(rack-id=r2)],
persistentUUIDs=[e9dcc3da-07a2-4159-a8b1-94e6428011c4,
3f27ddaa-1146-483e-8473-d79a5ba347f5, 59d0898c-f166-4129-9165-a22aca475286,
05d7dd0b-7cd8-464d-8adb-41fac100e8bf]}
16:48:48,454 TRACE (stateTransferExecutor-thread-Test-NodeC-p49651-t6:[Merge-5])
[PreferAvailabilityStrategy] Cache org.infinispan.CONFIG, resolveConflicts=false,
newMembers=[Test-NodeC-27509(rack-id=r2), Test-NodeD-62603(rack-id=r2)],
possibleOwners=[Test-NodeD-62603(rack-id=r2), Test-NodeC-27509(rack-id=r2)],
preferredTopology=CacheTopology{id=6, phase=READ_OLD_WRITE_ALL, rebalanceId=3,
currentCH=ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeA-4515(rack-id=r1):
127, Test-NodeB-42590(rack-id=r1): 129]}, pendingCH=ReplicatedConsistentHash{ns = 256,
owners = (4)[Test-NodeA-4515(rack-id=r1): 63, Test-NodeB-42590(rack-id=r1): 62,
Test-NodeC-27509(rack-id=r2): 64, Test-NodeD-62603(rack-id=r2): 67]}, unionCH=null,
actualMembers=[Test-NodeA-4515(rack-id=r1), Test-NodeB-42590(rack-id=r1),
Test-NodeC-27509(rack-id=r2), Test-NodeD-62603(rack-id=r2)],
persistentUUIDs=[e9dcc3da-07a2-4159-a8b1-94e6428011c4,
3f27ddaa-1146-483e-8473-d79a5ba347f5, 59d0898c-f166-4129-9165-a22aca475286,
05d7dd0b-7cd8-464d-8adb-41fac100e8bf]}, mergeTopologyId=10
16:48:48,454 WARN (stateTransferExecutor-thread-Test-NodeC-p49651-t6:[Merge-5])
[PreferAvailabilityStrategy] ISPN000517: Ignoring cache topology from
[Test-NodeC-27509(rack-id=r2)] during merge: CacheTopology{id=9, phase=NO_REBALANCE,
rebalanceId=3, currentCH=ReplicatedConsistentHash{ns = 256, owners =
(2)[Test-NodeC-27509(rack-id=r2): 125, Test-NodeD-62603(rack-id=r2): 131]},
pendingCH=null, unionCH=null, actualMembers=[Test-NodeC-27509(rack-id=r2),
Test-NodeD-62603(rack-id=r2)], persistentUUIDs=[59d0898c-f166-4129-9165-a22aca475286,
05d7dd0b-7cd8-464d-8adb-41fac100e8bf]}
16:48:48,454 DEBUG (stateTransferExecutor-thread-Test-NodeC-p49651-t6:[Merge-5])
[CLUSTER] ISPN000521: Cache org.infinispan.CONFIG recovered after merge with topology =
CacheTopology{id=10, phase=NO_REBALANCE, rebalanceId=4,
currentCH=ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeA-4515(rack-id=r1):
127, Test-NodeB-42590(rack-id=r1): 129]}, pendingCH=null, unionCH=null, actualMembers=[],
persistentUUIDs=[]}, availability mode null
16:48:48,454 FATAL (stateTransferExecutor-thread-Test-NodeC-p49651-t6:[Merge-5])
[CLUSTER] [Context=org.infinispan.CONFIG] ISPN000313: Lost data because of abrupt leavers
[Test-NodeA-4515(rack-id=r1), Test-NodeB-42590(rack-id=r1), Test-NodeC-27509(rack-id=r2),
Test-NodeD-62603(rack-id=r2)]
16:48:48,455 ERROR (stateTransferExecutor-thread-Test-NodeC-p49651-t6:[Merge-5])
[LimitedExecutor] Exception in task
java.lang.AssertionError: null
at
org.infinispan.partitionhandling.impl.PreferAvailabilityStrategy.onPartitionMerge(PreferAvailabilityStrategy.java:217)
~[classes/:?]
at
org.infinispan.topology.ClusterCacheStatus.doMergePartitions(ClusterCacheStatus.java:647)
~[classes/:?]
at
org.infinispan.topology.ClusterTopologyManagerImpl.lambda$recoverClusterStatus$4(ClusterTopologyManagerImpl.java:500)
~[classes/:?]
at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:175)
[classes/:?]
{noformat}
Later, when the test re-enables rebalancing, the missing (null) stable topology causes a
{{NullPointerException}} and the test fails:
{noformat}
16:48:49,349 DEBUG (testng-Test:[null]) [ClusterCacheStatus] ISPN000519: Updating stable
topology for cache org.infinispan.CONFIG, topology null
16:48:49,349 WARN (testng-Test:[null]) [CacheTopologyControlCommand] ISPN000071: Caught
exception when handling command CacheTopologyControlCommand{cache=null,
type=POLICY_ENABLE, sender=Test-NodeC-27509(rack-id=r2), joinInfo=null, topologyId=0,
rebalanceId=0, currentCH=null, pendingCH=null, availabilityMode=null, phase=null,
actualMembers=null, throwable=null, viewId=5}
java.lang.NullPointerException: null
at
org.infinispan.topology.CacheTopologyControlCommand.<init>(CacheTopologyControlCommand.java:147)
~[classes/:?]
at
org.infinispan.topology.ClusterTopologyManagerImpl.broadcastStableTopologyUpdate(ClusterTopologyManagerImpl.java:659)
~[classes/:?]
at
org.infinispan.topology.ClusterCacheStatus.startQueuedRebalance(ClusterCacheStatus.java:806)
~[classes/:?]
at
java.util.concurrent.ConcurrentHashMap$ValuesView.forEach(ConcurrentHashMap.java:4772)
~[?:?]
at
org.infinispan.topology.ClusterTopologyManagerImpl.setRebalancingEnabled(ClusterTopologyManagerImpl.java:702)
~[classes/:?]
at
org.infinispan.topology.ClusterTopologyManagerImpl.setRebalancingEnabled(ClusterTopologyManagerImpl.java:682)
~[classes/:?]
at
org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:215)
~[classes/:?]
at
org.infinispan.topology.CacheTopologyControlCommand.invokeAsync(CacheTopologyControlCommand.java:163)
[classes/:?]
at org.infinispan.commands.ReplicableCommand.invoke(ReplicableCommand.java:44)
[classes/:?]
at
org.infinispan.topology.LocalTopologyManagerImpl.executeOnClusterSync(LocalTopologyManagerImpl.java:752)
[classes/:?]
at
org.infinispan.topology.LocalTopologyManagerImpl.setCacheRebalancingEnabled(LocalTopologyManagerImpl.java:623)
[classes/:?]
at
org.infinispan.topology.LocalTopologyManagerImpl.setRebalancingEnabled(LocalTopologyManagerImpl.java:581)
[classes/:?]
16:48:49,355 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed:
org.infinispan.statetransfer.RebalancePolicyJmxTest.testJoinAndLeaveWithRebalanceSuspendedAwaitingInitialTransfer[DIST_SYNC]
javax.management.MBeanException: Error invoking setter for attribute rebalancingEnabled
at org.infinispan.jmx.ResourceDMBean.setNamedAttribute(ResourceDMBean.java:358)
~[classes/:?]
at org.infinispan.jmx.ResourceDMBean.setAttribute(ResourceDMBean.java:216)
~[classes/:?]
at
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.setAttribute(DefaultMBeanServerInterceptor.java:736)
~[?:?]
at com.sun.jmx.mbeanserver.JmxMBeanServer.setAttribute(JmxMBeanServer.java:739) ~[?:?]
at
org.infinispan.statetransfer.RebalancePolicyJmxTest.doTest(RebalancePolicyJmxTest.java:163)
~[test-classes/:?]
at
org.infinispan.statetransfer.RebalancePolicyJmxTest.testJoinAndLeaveWithRebalanceSuspendedAwaitingInitialTransfer(RebalancePolicyJmxTest.java:44)
~[test-classes/:?]
Caused by: java.lang.reflect.InvocationTargetException
at jdk.internal.reflect.GeneratedMethodAccessor495.invoke(Unknown Source) ~[?:?]
at
jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
~[?:?]
at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
at
org.infinispan.jmx.ResourceDMBean$InvokableSetterBasedMBeanAttributeInfo.invoke(ResourceDMBean.java:422)
~[classes/:?]
at org.infinispan.jmx.ResourceDMBean.setNamedAttribute(ResourceDMBean.java:355)
~[classes/:?]
... 28 more
Caused by: org.infinispan.commons.CacheException: Unsuccessful local response
at
org.infinispan.topology.LocalTopologyManagerImpl.executeOnClusterSync(LocalTopologyManagerImpl.java:757)
~[classes/:?]
at
org.infinispan.topology.LocalTopologyManagerImpl.setCacheRebalancingEnabled(LocalTopologyManagerImpl.java:623)
~[classes/:?]
at
org.infinispan.topology.LocalTopologyManagerImpl.setRebalancingEnabled(LocalTopologyManagerImpl.java:581)
~[classes/:?]
at jdk.internal.reflect.GeneratedMethodAccessor495.invoke(Unknown Source) ~[?:?]
at
jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
~[?:?]
at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
at
org.infinispan.jmx.ResourceDMBean$InvokableSetterBasedMBeanAttributeInfo.invoke(ResourceDMBean.java:422)
~[classes/:?]
at org.infinispan.jmx.ResourceDMBean.setNamedAttribute(ResourceDMBean.java:355)
~[classes/:?]
... 28 more
{noformat}
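The second failure is a follow-on effect: the stable topology is null, and the trace shows the {{NullPointerException}} thrown from the command constructor called by {{broadcastStableTopologyUpdate}}. The sketch below illustrates that pattern with hypothetical classes ({{StableTopologyGuardSketch}}, {{Topology}}, {{UpdateCommand}}); it assumes the constructor dereferences its topology argument, which the stack trace suggests but does not prove. It is not the fix applied in Infinispan, where the underlying problem is the earlier assertion failure during the merge.

```java
import java.util.Optional;

final class StableTopologyGuardSketch {
    /** Simplified stand-in for a cache topology; NOT Infinispan's CacheTopology. */
    static final class Topology {
        final int id;

        Topology(int id) {
            this.id = id;
        }
    }

    /** Hypothetical command that copies fields out of the topology it is given. */
    static final class UpdateCommand {
        final int topologyId;

        UpdateCommand(Topology topology) {
            // Throws NullPointerException when topology is null.
            this.topologyId = topology.id;
        }
    }

    /** Builds the update command only when a stable topology exists. */
    static Optional<UpdateCommand> stableTopologyUpdate(Topology stableTopology) {
        if (stableTopology == null) {
            return Optional.empty();
        }
        return Optional.of(new UpdateCommand(stableTopology));
    }

    public static void main(String[] args) {
        System.out.println(stableTopologyUpdate(null).isPresent());             // prints false
        System.out.println(stableTopologyUpdate(new Topology(10)).isPresent()); // prints true
    }
}
```

A guard like this would only hide the symptom; the cache would still have no stable topology after the merge.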