[JBoss JIRA] (ISPN-9257) ClusterTopologyManagerTest.testAbruptLeaveAfterGetStatus2[SCATTERED_SYNC, tx=false] random failures
by Dan Berindei (Jira)
[ https://issues.jboss.org/browse/ISPN-9257?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-9257:
-------------------------------
Fix Version/s: 10.0.0.Beta4, 9.4.16.Final (was: 10.0.0.Final)
> ClusterTopologyManagerTest.testAbruptLeaveAfterGetStatus2[SCATTERED_SYNC, tx=false] random failures
> ---------------------------------------------------------------------------------------------------
>
> Key: ISPN-9257
> URL: https://issues.jboss.org/browse/ISPN-9257
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Affects Versions: 9.3.0.CR1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Minor
> Labels: testsuite_stability
> Fix For: 10.0.0.Beta4, 9.4.16.Final
>
> Attachments: ISPN-8731_wrong_topology_2018-05-18_ClusterTopologyManagerTest-infinispan-core.log.gz
>
>
> The test kills the coordinator NodeA and then, while NodeB is trying to recover the caches, also kills NodeC. It expects NodeB to start a rebalance with 2 nodes and discards that rebalance command, in order to test that NodeB can process the 1-node rebalance first:
> {noformat}
> 00:34:06,582 DEBUG (transport-thread-test-NodeB-p12-t6:[testCache]) [ClusterTopologyManagerTest] Discarding rebalance command CacheTopology{id=8, phase=TRANSITORY, rebalanceId=5, currentCH=ScatteredConsistentHash{ns=256, rebalanced=false, owners = (2)[test-NodeB-49590: 85, test-NodeC-58596: 85]}, pendingCH=ScatteredConsistentHash{ns=256, rebalanced=true, owners = (2)[test-NodeB-49590: 128, test-NodeC-58596: 128]}, unionCH=null, actualMembers=[test-NodeB-49590, test-NodeC-58596], persistentUUIDs=[6b96414e-15d8-4350-aa3c-4fb4fc34e888, d47dc4a9-2a95-4bb1-a83b-bb8a27c9999f]}
> 00:34:06,609 DEBUG (transport-thread-test-NodeB-p12-t2:[Topology-testCache]) [LocalTopologyManagerImpl] Updating local topology for cache testCache: CacheTopology{id=9, phase=TRANSITORY, rebalanceId=5, currentCH=ScatteredConsistentHash{ns=256, rebalanced=false, owners = (1)[test-NodeB-49590: 85]}, pendingCH=ScatteredConsistentHash{ns=256, rebalanced=false, owners = (1)[test-NodeB-49590: 128]}, unionCH=null, actualMembers=[test-NodeB-49590], persistentUUIDs=[6b96414e-15d8-4350-aa3c-4fb4fc34e888]}
> 00:34:06,609 DEBUG (transport-thread-test-NodeB-p12-t2:[Topology-testCache]) [LocalTopologyManagerImpl] Installing fake cache topology CacheTopology{id=8, phase=NO_REBALANCE, rebalanceId=4, currentCH=ScatteredConsistentHash{ns=256, rebalanced=false, owners = (1)[test-NodeB-49590: 85]}, pendingCH=null, unionCH=null, actualMembers=[test-NodeB-49590], persistentUUIDs=[6b96414e-15d8-4350-aa3c-4fb4fc34e888]} for cache testCache
> {noformat}
> Unfortunately {{PreferAvailabilityStrategy}} has changed a bit and the rebalance ids don't always match the expectations of the test, so that the 1-node rebalance is discarded instead:
> {noformat}
> 09:46:10,530 DEBUG (transport-thread-Test-NodeB-p54539-t3:[testCache]) [Test] Discarding rebalance command CacheTopology{id=9, phase=TRANSITORY, rebalanceId=5, currentCH=ScatteredConsistentHash{ns=256, rebalanced=false, owners = (1)[Test-NodeB-62039: 85]}, pendingCH=ScatteredConsistentHash{ns=256, rebalanced=true, owners = (1)[Test-NodeB-62039: 256]}, unionCH=null, actualMembers=[Test-NodeB-62039], persistentUUIDs=[0ed7be74-4485-489b-baee-28c461c9e5de]}
> {noformat}
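The mismatch can be illustrated with a small sketch. The names below (`TopologyUpdate`, `DiscardFilters`, `byRebalanceId`, `byMemberCount`) are hypothetical and are not taken from the Infinispan test. The sketch only assumes that the test picks the command to discard by its rebalance id, and contrasts that with a filter keyed on the member count, which may be less sensitive to how rebalance ids are assigned.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for the CacheTopology fields that matter here.
final class TopologyUpdate {
    final int topologyId;
    final int rebalanceId;
    final List<String> actualMembers;

    TopologyUpdate(int topologyId, int rebalanceId, String... actualMembers) {
        this.topologyId = topologyId;
        this.rebalanceId = rebalanceId;
        this.actualMembers = Arrays.asList(actualMembers);
    }
}

final class DiscardFilters {
    // Assumed behaviour of the failing test: discard any command with the expected rebalance id.
    static Predicate<TopologyUpdate> byRebalanceId(int expectedRebalanceId) {
        return update -> update.rebalanceId == expectedRebalanceId;
    }

    // Alternative: discard only commands whose topology has the given number of members.
    static Predicate<TopologyUpdate> byMemberCount(int memberCount) {
        return update -> update.actualMembers.size() == memberCount;
    }
}
```

With the values from the second log excerpt (`id=9`, `rebalanceId=5`, one member), `byRebalanceId(5)` matches the 1-node rebalance, whereas `byMemberCount(2)` matches only the 2-node rebalance from the first excerpt.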
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
[JBoss JIRA] (ISPN-8329) ClusterTopologyManagerTest.testAbruptLeaveAfterGetStatus2 random failures with scattered cache
by Dan Berindei (Jira)
[ https://issues.jboss.org/browse/ISPN-8329?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-8329:
-------------------------------
Fix Version/s: 10.0.0.Beta4, 9.4.16.Final
> ClusterTopologyManagerTest.testAbruptLeaveAfterGetStatus2 random failures with scattered cache
> ----------------------------------------------------------------------------------------------
>
> Key: ISPN-8329
> URL: https://issues.jboss.org/browse/ISPN-8329
> Project: Infinispan
> Issue Type: Enhancement
> Components: Core
> Reporter: Tristan Tarrant
> Assignee: Dan Berindei
> Priority: Major
> Labels: testsuite_stability
> Fix For: 10.0.0.Beta4, 9.4.16.Final
>
>
> Error Message
> Timed out waiting for rebalancing to complete on node ClusterTopologyManagerTest[SCATTERED_SYNC, tx=false]-NodeB-47665, current topology is CacheTopology{id=8, rebalanceId=4, currentCH=PartitionerConsistentHash:ScatteredConsistentHash{ns=256, rebalanced=false, owners = (1)[ClusterTopologyManagerTest[SCATTERED_SYNC, tx=false]-NodeB-47665: 85]}, pendingCH=null, unionCH=null, phase=NO_REBALANCE, actualMembers=[ClusterTopologyManagerTest[SCATTERED_SYNC, tx=false]-NodeB-47665], persistentUUIDs=[63b3a997-f229-475b-a14c-9c892f608ba0]}. rebalanceInProgress=false, currentChIsBalanced=false
> Stacktrace
> java.lang.RuntimeException: Timed out waiting for rebalancing to complete on node ClusterTopologyManagerTest[SCATTERED_SYNC, tx=false]-NodeB-47665, current topology is CacheTopology{id=8, rebalanceId=4, currentCH=PartitionerConsistentHash:ScatteredConsistentHash{ns=256, rebalanced=false, owners = (1)[ClusterTopologyManagerTest[SCATTERED_SYNC, tx=false]-NodeB-47665: 85]}, pendingCH=null, unionCH=null, phase=NO_REBALANCE, actualMembers=[ClusterTopologyManagerTest[SCATTERED_SYNC, tx=false]-NodeB-47665], persistentUUIDs=[63b3a997-f229-475b-a14c-9c892f608ba0]}. rebalanceInProgress=false, currentChIsBalanced=false
> at org.infinispan.test.TestingUtil.waitForNoRebalance(TestingUtil.java:386)
> at org.infinispan.statetransfer.ClusterTopologyManagerTest.testAbruptLeaveAfterGetStatus2(ClusterTopologyManagerTest.java:430)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> ... Removed 16 stack frames
>
[JBoss JIRA] (ISPN-10363) LazyInitializingExecutorService is not thread-safe
by Dan Berindei (Jira)
[ https://issues.jboss.org/browse/ISPN-10363?page=com.atlassian.jira.plugin... ]
Dan Berindei updated ISPN-10363:
--------------------------------
Status: Pull Request Sent (was: Open)
Git Pull Request: https://github.com/infinispan/infinispan/pull/7096
> LazyInitializingExecutorService is not thread-safe
> --------------------------------------------------
>
> Key: ISPN-10363
> URL: https://issues.jboss.org/browse/ISPN-10363
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Test Suite - Core
> Affects Versions: 10.0.0.Beta3, 9.4.15.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Labels: testsuite_stability
> Fix For: 10.0.0.Beta4
>
>
> {{LazyInitializingExecutorService.shutdown()}} is not synchronized, so it may miss the executor created on another thread.
> Normally the time between the first use and stop is long enough for this to be a non-issue, but it does cause random thread leaks in {{PersistenceManagerTest}}:
> {noformat}
> 17:57:06,421 DEBUG (testng-PersistenceManagerTest:[]) [DefaultCacheManager] Started cache manager PersistenceManagerTest-NodeB on null
> 17:57:06,423 INFO (testng-PersistenceManagerTest:[]) [TestSuiteProgress] Test starting: org.infinispan.persistence.PersistenceManagerTest.testProcessAfterStop
> 17:57:06,446 INFO (testng-PersistenceManagerTest:[]) [TestSuiteProgress] Test succeeded: org.infinispan.persistence.PersistenceManagerTest.testProcessAfterStop
> 17:57:06,447 DEBUG (testng-PersistenceManagerTest:[]) [DefaultCacheManager] Stopped cache manager PersistenceManagerTest-NodeB
> 17:58:38,062 WARN (main:[]) [ThreadLeakChecker] Possible leaked thread:
> "async-thread-PersistenceManagerTest-NodeB-p54399-t1" daemon prio=5 tid=0x2ab4c nid=NA waiting
> java.lang.Thread.State: WAITING
> at java.base@11.0.3/jdk.internal.misc.Unsafe.park(Native Method)
> at java.base@11.0.3/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
> at java.base@11.0.3/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2081)
> at java.base@11.0.3/java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:433)
> at java.base@11.0.3/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1054)
> at java.base@11.0.3/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1114)
> at java.base@11.0.3/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base@11.0.3/java.lang.Thread.run(Thread.java:834)
> {noformat}
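The race described above can be avoided by guarding both the lazy creation and the shutdown with the same lock. The following is a minimal sketch under that assumption; `LazyExecutorSketch` and its methods are hypothetical names and do not reproduce the actual `LazyInitializingExecutorService` code or the change in the pull request.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch: a lazily created executor whose creation and shutdown share one monitor.
final class LazyExecutorSketch {
    private final Supplier<ExecutorService> factory;
    private ExecutorService delegate; // guarded by "this"
    private boolean stopped;          // guarded by "this"

    LazyExecutorSketch(Supplier<ExecutorService> factory) {
        this.factory = factory;
    }

    // Creates the delegate on first use; refuses to create one after shutdown().
    synchronized void execute(Runnable task) {
        if (stopped) {
            throw new RejectedExecutionException("executor already shut down");
        }
        if (delegate == null) {
            delegate = factory.get();
        }
        delegate.execute(task);
    }

    // Because this takes the same monitor as execute(), it cannot miss a delegate
    // that another thread has just created.
    synchronized void shutdown() {
        stopped = true;
        if (delegate != null) {
            delegate.shutdown();
        }
    }

    // Waits outside the monitor so that other callers are not blocked.
    boolean awaitTermination(long timeout, TimeUnit unit) {
        ExecutorService current;
        synchronized (this) {
            current = delegate;
        }
        if (current == null) {
            return true; // nothing was ever started
        }
        try {
            return current.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The `stopped` flag is a design choice of this sketch: without it, a task submitted after `shutdown()` on a never-initialized instance would create a new executor that nobody stops.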
[JBoss JIRA] (ISPN-10363) LazyInitializingExecutorService is not thread-safe
by Dan Berindei (Jira)
[ https://issues.jboss.org/browse/ISPN-10363?page=com.atlassian.jira.plugin... ]
Dan Berindei updated ISPN-10363:
--------------------------------
Summary: LazyInitializingExecutorService is not thread-safe (was: LazyInitializingExecutorService.shutdown() is not thread-safe)
> LazyInitializingExecutorService is not thread-safe
> --------------------------------------------------
>
> Key: ISPN-10363
> URL: https://issues.jboss.org/browse/ISPN-10363
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Test Suite - Core
> Affects Versions: 10.0.0.Beta3, 9.4.15.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Labels: testsuite_stability
> Fix For: 10.0.0.Beta4
>
>
[JBoss JIRA] (ISPN-10365) PreferAvailabilityStrategy assertion failure
by Dan Berindei (Jira)
Dan Berindei created ISPN-10365:
-----------------------------------
Summary: PreferAvailabilityStrategy assertion failure
Key: ISPN-10365
URL: https://issues.jboss.org/browse/ISPN-10365
Project: Infinispan
Issue Type: Bug
Components: Core, Test Suite - Core
Affects Versions: 9.4.15.Final, 10.0.0.Beta3
Reporter: Dan Berindei
Assignee: Dan Berindei
Fix For: 10.0.0.Beta4, 9.4.16.Final
This scenario happens unintentionally in {{RebalancePolicyJmxTest}}, because the test waits for the default cache to finish rebalancing before killing the coordinator but does not wait for the {{CONFIG}} cache:
* A and B are running, rebalancing is disabled, then C and D join
* Re-enable rebalance, but stop B and A before the rebalance is done
* C sees the finished rebalance, D still sees the READ_OLD_WRITE_ALL phase
* C becomes coordinator and should recover with C's topology, but instead hits an assertion failure and doesn't install a stable topology
{noformat}
16:48:48,454 TRACE (stateTransferExecutor-thread-Test-NodeC-p49651-t6:[Merge-5]) [PreferAvailabilityStrategy] Cache org.infinispan.CONFIG keeping partition from [Test-NodeC-27509(rack-id=r2)]: CacheTopology{id=9, phase=NO_REBALANCE, rebalanceId=3, currentCH=ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeC-27509(rack-id=r2): 125, Test-NodeD-62603(rack-id=r2): 131]}, pendingCH=null, unionCH=null, actualMembers=[Test-NodeC-27509(rack-id=r2), Test-NodeD-62603(rack-id=r2)], persistentUUIDs=[59d0898c-f166-4129-9165-a22aca475286, 05d7dd0b-7cd8-464d-8adb-41fac100e8bf]}
16:48:48,454 TRACE (stateTransferExecutor-thread-Test-NodeC-p49651-t6:[Merge-5]) [PreferAvailabilityStrategy] Cache org.infinispan.CONFIG keeping partition from [Test-NodeD-62603(rack-id=r2)]: CacheTopology{id=6, phase=READ_OLD_WRITE_ALL, rebalanceId=3, currentCH=ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeA-4515(rack-id=r1): 127, Test-NodeB-42590(rack-id=r1): 129]}, pendingCH=ReplicatedConsistentHash{ns = 256, owners = (4)[Test-NodeA-4515(rack-id=r1): 63, Test-NodeB-42590(rack-id=r1): 62, Test-NodeC-27509(rack-id=r2): 64, Test-NodeD-62603(rack-id=r2): 67]}, unionCH=null, actualMembers=[Test-NodeA-4515(rack-id=r1), Test-NodeB-42590(rack-id=r1), Test-NodeC-27509(rack-id=r2), Test-NodeD-62603(rack-id=r2)], persistentUUIDs=[e9dcc3da-07a2-4159-a8b1-94e6428011c4, 3f27ddaa-1146-483e-8473-d79a5ba347f5, 59d0898c-f166-4129-9165-a22aca475286, 05d7dd0b-7cd8-464d-8adb-41fac100e8bf]}
16:48:48,454 TRACE (stateTransferExecutor-thread-Test-NodeC-p49651-t6:[Merge-5]) [PreferAvailabilityStrategy] Cache org.infinispan.CONFIG, resolveConflicts=false, newMembers=[Test-NodeC-27509(rack-id=r2), Test-NodeD-62603(rack-id=r2)], possibleOwners=[Test-NodeD-62603(rack-id=r2), Test-NodeC-27509(rack-id=r2)], preferredTopology=CacheTopology{id=6, phase=READ_OLD_WRITE_ALL, rebalanceId=3, currentCH=ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeA-4515(rack-id=r1): 127, Test-NodeB-42590(rack-id=r1): 129]}, pendingCH=ReplicatedConsistentHash{ns = 256, owners = (4)[Test-NodeA-4515(rack-id=r1): 63, Test-NodeB-42590(rack-id=r1): 62, Test-NodeC-27509(rack-id=r2): 64, Test-NodeD-62603(rack-id=r2): 67]}, unionCH=null, actualMembers=[Test-NodeA-4515(rack-id=r1), Test-NodeB-42590(rack-id=r1), Test-NodeC-27509(rack-id=r2), Test-NodeD-62603(rack-id=r2)], persistentUUIDs=[e9dcc3da-07a2-4159-a8b1-94e6428011c4, 3f27ddaa-1146-483e-8473-d79a5ba347f5, 59d0898c-f166-4129-9165-a22aca475286, 05d7dd0b-7cd8-464d-8adb-41fac100e8bf]}, mergeTopologyId=10
16:48:48,454 WARN (stateTransferExecutor-thread-Test-NodeC-p49651-t6:[Merge-5]) [PreferAvailabilityStrategy] ISPN000517: Ignoring cache topology from [Test-NodeC-27509(rack-id=r2)] during merge: CacheTopology{id=9, phase=NO_REBALANCE, rebalanceId=3, currentCH=ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeC-27509(rack-id=r2): 125, Test-NodeD-62603(rack-id=r2): 131]}, pendingCH=null, unionCH=null, actualMembers=[Test-NodeC-27509(rack-id=r2), Test-NodeD-62603(rack-id=r2)], persistentUUIDs=[59d0898c-f166-4129-9165-a22aca475286, 05d7dd0b-7cd8-464d-8adb-41fac100e8bf]}
16:48:48,454 DEBUG (stateTransferExecutor-thread-Test-NodeC-p49651-t6:[Merge-5]) [CLUSTER] ISPN000521: Cache org.infinispan.CONFIG recovered after merge with topology = CacheTopology{id=10, phase=NO_REBALANCE, rebalanceId=4, currentCH=ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeA-4515(rack-id=r1): 127, Test-NodeB-42590(rack-id=r1): 129]}, pendingCH=null, unionCH=null, actualMembers=[], persistentUUIDs=[]}, availability mode null
16:48:48,454 FATAL (stateTransferExecutor-thread-Test-NodeC-p49651-t6:[Merge-5]) [CLUSTER] [Context=org.infinispan.CONFIG] ISPN000313: Lost data because of abrupt leavers [Test-NodeA-4515(rack-id=r1), Test-NodeB-42590(rack-id=r1), Test-NodeC-27509(rack-id=r2), Test-NodeD-62603(rack-id=r2)]
16:48:48,455 ERROR (stateTransferExecutor-thread-Test-NodeC-p49651-t6:[Merge-5]) [LimitedExecutor] Exception in task
java.lang.AssertionError: null
at org.infinispan.partitionhandling.impl.PreferAvailabilityStrategy.onPartitionMerge(PreferAvailabilityStrategy.java:217) ~[classes/:?]
at org.infinispan.topology.ClusterCacheStatus.doMergePartitions(ClusterCacheStatus.java:647) ~[classes/:?]
at org.infinispan.topology.ClusterTopologyManagerImpl.lambda$recoverClusterStatus$4(ClusterTopologyManagerImpl.java:500) ~[classes/:?]
at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:175) [classes/:?]
{noformat}
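The log suggests why the recovered topology ends up with {{actualMembers=[]}}: the preferred topology (id=6) lists only NodeA and NodeB as current owners, and neither of them is in {{newMembers}}. The sketch below reproduces just that intersection with the node names from the log (rack ids omitted). {{MergeOwnersSketch}} and {{survivingOwners}} are hypothetical names, not part of {{PreferAvailabilityStrategy}}, and the real merge logic considers more than this.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: which owners of a candidate topology are still in the cluster?
final class MergeOwnersSketch {
    static List<String> survivingOwners(List<String> currentOwners, List<String> newMembers) {
        List<String> survivors = new ArrayList<>(currentOwners);
        survivors.retainAll(newMembers); // keep only owners that are still members
        return survivors;
    }
}
```

For D's topology (owners NodeA and NodeB) and the new members NodeC and NodeD, the result is an empty list, which is consistent with the "Lost data because of abrupt leavers" message. For C's topology (owners NodeC and NodeD) both owners survive.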
Eventually the missing stable topology makes the test fail:
{noformat}
16:48:49,349 DEBUG (testng-Test:[null]) [ClusterCacheStatus] ISPN000519: Updating stable topology for cache org.infinispan.CONFIG, topology null
16:48:49,349 WARN (testng-Test:[null]) [CacheTopologyControlCommand] ISPN000071: Caught exception when handling command CacheTopologyControlCommand{cache=null, type=POLICY_ENABLE, sender=Test-NodeC-27509(rack-id=r2), joinInfo=null, topologyId=0, rebalanceId=0, currentCH=null, pendingCH=null, availabilityMode=null, phase=null, actualMembers=null, throwable=null, viewId=5}
java.lang.NullPointerException: null
at org.infinispan.topology.CacheTopologyControlCommand.<init>(CacheTopologyControlCommand.java:147) ~[classes/:?]
at org.infinispan.topology.ClusterTopologyManagerImpl.broadcastStableTopologyUpdate(ClusterTopologyManagerImpl.java:659) ~[classes/:?]
at org.infinispan.topology.ClusterCacheStatus.startQueuedRebalance(ClusterCacheStatus.java:806) ~[classes/:?]
at java.util.concurrent.ConcurrentHashMap$ValuesView.forEach(ConcurrentHashMap.java:4772) ~[?:?]
at org.infinispan.topology.ClusterTopologyManagerImpl.setRebalancingEnabled(ClusterTopologyManagerImpl.java:702) ~[classes/:?]
at org.infinispan.topology.ClusterTopologyManagerImpl.setRebalancingEnabled(ClusterTopologyManagerImpl.java:682) ~[classes/:?]
at org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:215) ~[classes/:?]
at org.infinispan.topology.CacheTopologyControlCommand.invokeAsync(CacheTopologyControlCommand.java:163) [classes/:?]
at org.infinispan.commands.ReplicableCommand.invoke(ReplicableCommand.java:44) [classes/:?]
at org.infinispan.topology.LocalTopologyManagerImpl.executeOnClusterSync(LocalTopologyManagerImpl.java:752) [classes/:?]
at org.infinispan.topology.LocalTopologyManagerImpl.setCacheRebalancingEnabled(LocalTopologyManagerImpl.java:623) [classes/:?]
at org.infinispan.topology.LocalTopologyManagerImpl.setRebalancingEnabled(LocalTopologyManagerImpl.java:581) [classes/:?]
16:48:49,355 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.statetransfer.RebalancePolicyJmxTest.testJoinAndLeaveWithRebalanceSuspendedAwaitingInitialTransfer[DIST_SYNC]
javax.management.MBeanException: Error invoking setter for attribute rebalancingEnabled
at org.infinispan.jmx.ResourceDMBean.setNamedAttribute(ResourceDMBean.java:358) ~[classes/:?]
at org.infinispan.jmx.ResourceDMBean.setAttribute(ResourceDMBean.java:216) ~[classes/:?]
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.setAttribute(DefaultMBeanServerInterceptor.java:736) ~[?:?]
at com.sun.jmx.mbeanserver.JmxMBeanServer.setAttribute(JmxMBeanServer.java:739) ~[?:?]
at org.infinispan.statetransfer.RebalancePolicyJmxTest.doTest(RebalancePolicyJmxTest.java:163) ~[test-classes/:?]
at org.infinispan.statetransfer.RebalancePolicyJmxTest.testJoinAndLeaveWithRebalanceSuspendedAwaitingInitialTransfer(RebalancePolicyJmxTest.java:44) ~[test-classes/:?]
Caused by: java.lang.reflect.InvocationTargetException
at jdk.internal.reflect.GeneratedMethodAccessor495.invoke(Unknown Source) ~[?:?]
at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
at org.infinispan.jmx.ResourceDMBean$InvokableSetterBasedMBeanAttributeInfo.invoke(ResourceDMBean.java:422) ~[classes/:?]
at org.infinispan.jmx.ResourceDMBean.setNamedAttribute(ResourceDMBean.java:355) ~[classes/:?]
... 28 more
Caused by: org.infinispan.commons.CacheException: Unsuccessful local response
at org.infinispan.topology.LocalTopologyManagerImpl.executeOnClusterSync(LocalTopologyManagerImpl.java:757) ~[classes/:?]
at org.infinispan.topology.LocalTopologyManagerImpl.setCacheRebalancingEnabled(LocalTopologyManagerImpl.java:623) ~[classes/:?]
at org.infinispan.topology.LocalTopologyManagerImpl.setRebalancingEnabled(LocalTopologyManagerImpl.java:581) ~[classes/:?]
at jdk.internal.reflect.GeneratedMethodAccessor495.invoke(Unknown Source) ~[?:?]
at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
at org.infinispan.jmx.ResourceDMBean$InvokableSetterBasedMBeanAttributeInfo.invoke(ResourceDMBean.java:422) ~[classes/:?]
at org.infinispan.jmx.ResourceDMBean.setNamedAttribute(ResourceDMBean.java:355) ~[classes/:?]
... 28 more
{noformat}
[JBoss JIRA] (ISPN-10365) PreferAvailabilityStrategy assertion failure
by Dan Berindei (Jira)
[ https://issues.jboss.org/browse/ISPN-10365?page=com.atlassian.jira.plugin... ]
Dan Berindei updated ISPN-10365:
--------------------------------
Status: Open (was: New)
> PreferAvailabilityStrategy assertion failure
> --------------------------------------------
>
> Key: ISPN-10365
> URL: https://issues.jboss.org/browse/ISPN-10365
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Test Suite - Core
> Affects Versions: 10.0.0.Beta3, 9.4.15.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Labels: testsuite_stability
> Fix For: 10.0.0.Beta4, 9.4.16.Final
>
>
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
[JBoss JIRA] (IPROTO-101) Enum values with same name in different classes clash
by Ryan Emerson (Jira)
Ryan Emerson created IPROTO-101:
-----------------------------------
Summary: Enum values with same name in different classes clash
Key: IPROTO-101
URL: https://issues.jboss.org/browse/IPROTO-101
Project: Infinispan ProtoStream
Issue Type: Bug
Affects Versions: 4.3.0.Alpha6
Reporter: Ryan Emerson
{code:java}
@ProtoName("TakeSiteOfflineResponse")
public enum TakeSiteOfflineResponse {
    @ProtoEnumValue(number = 1)
    NO_SUCH_SITE,
    @ProtoEnumValue(number = 2)
    ALREADY_OFFLINE,
    @ProtoEnumValue(number = 3)
    TAKEN_OFFLINE
}

@ProtoName("BringSiteOnlineResponse")
public enum BringSiteOnlineResponse {
    @ProtoEnumValue(number = 1)
    NO_SUCH_SITE,
    @ProtoEnumValue(number = 2)
    ALREADY_ONLINE,
    @ProtoEnumValue(number = 3)
    BROUGHT_ONLINE
}
{code}
Results in:
{code:java}
Enum value org.infinispan.test.TakeSiteOfflineResponse.NO_SUCH_SITE clashes with enum value org.infinispan.test.BringSiteOnlineResponse.NO_SUCH_SITE
{code}
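As a hedged aid for triage, the sketch below lists the constant names that the two enums from the report have in common. It is plain Java and does not call ProtoStream or reproduce its check. The class name EnumNameOverlapSketch and the helpers names and sharedNames are hypothetical, and the assumption that the clash is driven by the shared constant name alone is taken from the error message, not from the ProtoStream source.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

public class EnumNameOverlapSketch {
    // Copies of the enums from the report, without the ProtoStream annotations.
    enum TakeSiteOfflineResponse { NO_SUCH_SITE, ALREADY_OFFLINE, TAKEN_OFFLINE }
    enum BringSiteOnlineResponse { NO_SUCH_SITE, ALREADY_ONLINE, BROUGHT_ONLINE }

    // Collects the constant names of one enum type into a sorted set.
    static Set<String> names(Class<? extends Enum<?>> type) {
        return Arrays.stream(type.getEnumConstants())
                .map(e -> e.name())
                .collect(Collectors.toCollection(TreeSet::new));
    }

    // Returns the constant names that appear in both enum types.
    static Set<String> sharedNames(Class<? extends Enum<?>> a, Class<? extends Enum<?>> b) {
        Set<String> shared = names(a);
        shared.retainAll(names(b));
        return shared;
    }

    public static void main(String[] args) {
        // Assumption: a name shared across enums is what triggers the reported clash.
        System.out.println(sharedNames(TakeSiteOfflineResponse.class, BringSiteOnlineResponse.class));
    }
}
```

For the two enums above, the only shared name is NO_SUCH_SITE, which matches the constant named in the error message.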
[JBoss JIRA] (IPROTO-101) Enum values with same name in different classes clash
by Ryan Emerson (Jira)
[ https://issues.jboss.org/browse/IPROTO-101?page=com.atlassian.jira.plugin... ]
Ryan Emerson reassigned IPROTO-101:
-----------------------------------
Assignee: Nistor Adrian
> Enum values with same name in different classes clash
> -----------------------------------------------------
>
> Key: IPROTO-101
> URL: https://issues.jboss.org/browse/IPROTO-101
> Project: Infinispan ProtoStream
> Issue Type: Bug
> Affects Versions: 4.3.0.Alpha6
> Reporter: Ryan Emerson
> Assignee: Nistor Adrian
> Priority: Major
>
> {code:java}
> @ProtoName("TakeSiteOfflineResponse")
> public enum TakeSiteOfflineResponse {
>     @ProtoEnumValue(number = 1)
>     NO_SUCH_SITE,
>     @ProtoEnumValue(number = 2)
>     ALREADY_OFFLINE,
>     @ProtoEnumValue(number = 3)
>     TAKEN_OFFLINE
> }
>
> @ProtoName("BringSiteOnlineResponse")
> public enum BringSiteOnlineResponse {
>     @ProtoEnumValue(number = 1)
>     NO_SUCH_SITE,
>     @ProtoEnumValue(number = 2)
>     ALREADY_ONLINE,
>     @ProtoEnumValue(number = 3)
>     BROUGHT_ONLINE
> }
> {code}
> Results in:
> {code:java}
> Enum value org.infinispan.test.TakeSiteOfflineResponse.NO_SUCH_SITE clashes with enum value org.infinispan.test.BringSiteOnlineResponse.NO_SUCH_SITE
> {code}
[JBoss JIRA] (ISPN-9077) NullPointerException when trying to recover cache
by Dan Berindei (Jira)
[ https://issues.jboss.org/browse/ISPN-9077?page=com.atlassian.jira.plugin.... ]
Dan Berindei resolved ISPN-9077.
--------------------------------
Fix Version/s: 9.3.0.Final
9.2.2.Final
Resolution: Done
Fixed with ISPN-8962 by ignoring {{null}} cache topologies from nodes which haven't finished joining.
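The general idea of ignoring null entries before a sort can be sketched as follows. This is a minimal illustration, not the actual ISPN-8962 change: the Topology class, the sortedNonNull helper and the newest-first ordering are all hypothetical stand-ins. It only shows why a comparator that dereferences its arguments fails on a null element inside a sorted stream, and how filtering first avoids that.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class NullTopologyFilterSketch {
    // Hypothetical stand-in for a cache topology; only the id matters here.
    static final class Topology {
        final int id;
        Topology(int id) { this.id = id; }
    }

    // Drops null entries (assumed to come from nodes that have not finished
    // joining), then sorts the rest by id, highest first.
    static List<Topology> sortedNonNull(List<Topology> reported) {
        return reported.stream()
                .filter(Objects::nonNull)
                .sorted(Comparator.comparingInt((Topology t) -> t.id).reversed())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Without the filter, the comparator would hit the null element
        // and throw a NullPointerException during the sort.
        List<Topology> reported = Arrays.asList(new Topology(7), null, new Topology(9));
        for (Topology t : sortedNonNull(reported)) {
            System.out.println(t.id);
        }
    }
}
```

Filtering before sorting keeps the comparator simple. A null-tolerant comparator such as Comparator.nullsLast would also avoid the exception, but it would leave the null entries in the result for later code to handle.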
> NullPointerException when trying to recover cache
> -------------------------------------------------
>
> Key: ISPN-9077
> URL: https://issues.jboss.org/browse/ISPN-9077
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.2.0.Final
> Reporter: Johno Crawford
> Priority: Major
> Fix For: 9.3.0.Final, 9.2.2.Final
>
>
> {code:java}
> 2018-04-13 08:34:35,065 ERROR [transport-thread-x-service-2-p4-t20] (org.infinispan.topology.ClusterCacheStatus) ISPN000228: Failed to recover cache xx state after the current node became the coordinator
> java.lang.NullPointerException: null
> at org.infinispan.partitionhandling.impl.PreferAvailabilityStrategy.lambda$static$0(PreferAvailabilityStrategy.java:33) ~[infinispan-core-9.2.0.Final.jar:9.2.0.Final]
> at java.util.TimSort.countRunAndMakeAscending(TimSort.java:360) ~[?:1.8.0_144]
> at java.util.TimSort.sort(TimSort.java:220) ~[?:1.8.0_144]
> at java.util.Arrays.sort(Arrays.java:1512) ~[?:1.8.0_144]
> at java.util.stream.SortedOps$SizedRefSortingSink.end(SortedOps.java:348) ~[?:1.8.0_144]
> at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) ~[?:1.8.0_144]
> at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) ~[?:1.8.0_144]
> at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) ~[?:1.8.0_144]
> at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:1.8.0_144]
> at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) ~[?:1.8.0_144]
> at org.infinispan.partitionhandling.impl.PreferAvailabilityStrategy.onPartitionMerge(PreferAvailabilityStrategy.java:120) ~[infinispan-core-9.2.0.Final.jar:9.2.0.Final]
> at org.infinispan.topology.ClusterCacheStatus.doMergePartitions(ClusterCacheStatus.java:597) ~[infinispan-core-9.2.0.Final.jar:9.2.0.Final]
> at org.infinispan.topology.ClusterTopologyManagerImpl.lambda$recoverClusterStatus$6(ClusterTopologyManagerImpl.java:519) ~[infinispan-core-9.2.0.Final.jar:9.2.0.Final]
> at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:144) [infinispan-core-9.2.0.Final.jar:9.2.0.Final]
> at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:33) [infinispan-core-9.2.0.Final.jar:9.2.0.Final]
> at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:174) [infinispan-core-9.2.0.Final.jar:9.2.0.Final]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_144]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_144]
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
> {code}