November 2018 - infinispan-issues

[JBoss JIRA] (ISPN-9512) *TxPartitionAndMerge*Test tests hang during teardown

by Diego Lovison (Jira)

[ https://issues.jboss.org/browse/ISPN-9512?page=com.atlassian.jira.plugin.... ]

Diego Lovison updated ISPN-9512:
--------------------------------
    Labels: testsuite_stability (was: on-hold testsuite_stability)

> *TxPartitionAndMerge*Test tests hang during teardown
> ----------------------------------------------------
>
> Key: ISPN-9512
> URL: https://issues.jboss.org/browse/ISPN-9512
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Labels: testsuite_stability
> Fix For: 9.4.0.CR3
>
> Attachments: master_20180913-1119_PessimisticTxPartitionAndMergeDuringRollbackTest-infinispan-core.log.gz, threaddump-org_infinispan_partitionhandling_PessimisticTxPartitionAndMergeDuringRollbackTest_clearContent-2018-09-13-13828.log
>
>
> Not sure what changed recently, but the thread dumps show a state transfer executor thread blocked waiting for a clustered listeners response. The stack includes two instances of {{ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution()}}, which suggests that at some point all the state transfer executor threads (6) and async transport threads (4) were busy, and the transport thread pool queue (10) was also full.
> {noformat}
> "stateTransferExecutor-thread-PessimisticTxPartitionAndMergeDuringRollbackTest-NodeC-p57758-t1" #192601 daemon prio=5 os_prio=0 tid=0x00007f7094031800 nid=0x5b27 waiting on condition [0x00007f70190ce000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for <0x00000000d470b0f8> (a java.util.concurrent.CompletableFuture$Signaller)
>     at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>     at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1695)
>     at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
>     at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1775)
>     at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
>     at org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:93)
>     at org.infinispan.remoting.rpc.RpcManagerImpl.blocking(RpcManagerImpl.java:262)
>     at org.infinispan.statetransfer.StateConsumerImpl.getClusterListeners(StateConsumerImpl.java:895)
>     at org.infinispan.statetransfer.StateConsumerImpl.fetchClusterListeners(StateConsumerImpl.java:453)
>     at org.infinispan.statetransfer.StateConsumerImpl.onTopologyUpdate(StateConsumerImpl.java:309)
>     at org.infinispan.statetransfer.StateTransferManagerImpl.doTopologyUpdate(StateTransferManagerImpl.java:197)
>     at org.infinispan.statetransfer.StateTransferManagerImpl.access$000(StateTransferManagerImpl.java:54)
>     at org.infinispan.statetransfer.StateTransferManagerImpl$1.rebalance(StateTransferManagerImpl.java:117)
>     at org.infinispan.topology.LocalTopologyManagerImpl.doHandleRebalance(LocalTopologyManagerImpl.java:517)
>     - locked <0x00000000cc304f88> (a org.infinispan.topology.LocalCacheStatus)
>     at org.infinispan.topology.LocalTopologyManagerImpl.lambda$handleRebalance$3(LocalTopologyManagerImpl.java:475)
>     at org.infinispan.topology.LocalTopologyManagerImpl$$Lambda$429/1368424830.run(Unknown Source)
>     at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:175)
>     at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:37)
>     at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:227)
>     at java.util.concurrent.ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution(ThreadPoolExecutor.java:2038)
>     at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
>     at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
>     at org.infinispan.executors.LazyInitializingExecutorService.execute(LazyInitializingExecutorService.java:121)
>     at org.infinispan.executors.LimitedExecutor.tryExecute(LimitedExecutor.java:151)
>     at org.infinispan.executors.LimitedExecutor.executeInternal(LimitedExecutor.java:118)
>     at org.infinispan.executors.LimitedExecutor.execute(LimitedExecutor.java:108)
>     at org.infinispan.topology.LocalTopologyManagerImpl.handleRebalance(LocalTopologyManagerImpl.java:473)
>     at org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:199)
>     at org.infinispan.topology.CacheTopologyControlCommand.invokeAsync(CacheTopologyControlCommand.java:160)
>     at org.infinispan.commands.ReplicableCommand.invoke(ReplicableCommand.java:44)
>     at org.infinispan.topology.ClusterTopologyManagerImpl.lambda$executeOnClusterAsync$5(ClusterTopologyManagerImpl.java:600)
>     at org.infinispan.topology.ClusterTopologyManagerImpl$$Lambda$304/909965247.run(Unknown Source)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution(ThreadPoolExecutor.java:2038)
>     at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
>     at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
>     at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
>     at org.infinispan.executors.LazyInitializingExecutorService.submit(LazyInitializingExecutorService.java:91)
>     at org.infinispan.topology.ClusterTopologyManagerImpl.executeOnClusterAsync(ClusterTopologyManagerImpl.java:596)
>     at org.infinispan.topology.ClusterTopologyManagerImpl.broadcastRebalanceStart(ClusterTopologyManagerImpl.java:437)
>     at org.infinispan.topology.ClusterCacheStatus.startQueuedRebalance(ClusterCacheStatus.java:903)
>     - locked <0x00000000cc305138> (a org.infinispan.topology.ClusterCacheStatus)
>     at org.infinispan.topology.ClusterCacheStatus.queueRebalance(ClusterCacheStatus.java:140)
>     - locked <0x00000000cc305138> (a org.infinispan.topology.ClusterCacheStatus)
>     at org.infinispan.partitionhandling.impl.PreferConsistencyStrategy.updateMembersAndRebalance(PreferConsistencyStrategy.java:299)
>     at org.infinispan.partitionhandling.impl.PreferConsistencyStrategy.onPartitionMerge(PreferConsistencyStrategy.java:245)
>     at org.infinispan.topology.ClusterCacheStatus.doMergePartitions(ClusterCacheStatus.java:642)
>     - locked <0x00000000cc305138> (a org.infinispan.topology.ClusterCacheStatus)
>     at org.infinispan.topology.ClusterTopologyManagerImpl.lambda$recoverClusterStatus$4(ClusterTopologyManagerImpl.java:494)
>     at org.infinispan.topology.ClusterTopologyManagerImpl$$Lambda$578/46555845.run(Unknown Source)
>     at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:175)
>     at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:37)
>     at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:227)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> {noformat}
> All partition and merge tests seem to be affected: PessimisticTxPartitionAndMergeDuringPrepareTest, PessimisticTxPartitionAndMergeDuringRollbackTest, PessimisticTxPartitionAndMergeDuringRuntimeTest, OptimisticTxPartitionAndMergeDuringCommitTest, OptimisticTxPartitionAndMergeDuringPrepareTest, and OptimisticTxPartitionAndMergeDuringRollbackTest.

--
This message was sent by Atlassian Jira (v7.12.1#712002)
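
The ISPN-9512 description infers pool saturation from the two {{CallerRunsPolicy.rejectedExecution()}} frames. A minimal stand-alone sketch of that JDK behaviour follows. The pool size (1 thread) and queue size (1 slot) are made up for brevity and are not the test suite's configuration, and the class and method names are hypothetical:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Illustrative only: pool and queue sizes are invented, not Infinispan's settings.
final class CallerRunsDemo {

    /** Saturates a 1-thread pool with a 1-slot queue; returns the thread that ran the rejected task. */
    static String threadThatRanRejectedTask() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1),
                new ThreadPoolExecutor.CallerRunsPolicy());
        CountDownLatch workerBusy = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        AtomicReference<String> ranOn = new AtomicReference<>();
        try {
            // Task 1 occupies the only worker thread until released.
            pool.execute(() -> {
                workerBusy.countDown();
                awaitQuietly(release);
            });
            awaitQuietly(workerBusy);
            // Task 2 fills the single queue slot.
            pool.execute(() -> { });
            // Task 3 is rejected, so CallerRunsPolicy runs it on the submitting thread.
            pool.execute(() -> ranOn.set(Thread.currentThread().getName()));
            return ranOn.get();
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("caller thread:        " + Thread.currentThread().getName());
        System.out.println("rejected task ran on: " + threadThatRanRejectedTask());
    }
}
```

With this policy a rejected task runs synchronously inside the submitter's `execute()` call, which is why frames from several logically separate tasks can end up stacked in a single thread dump entry. If the inlined task blocks, as the RPC wait in the dump appears to, the submitting thread is held for the same duration.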


[JBoss JIRA] (ISPN-9512) *TxPartitionAndMerge*Test tests hang during teardown

by Diego Lovison (Jira)

[ https://issues.jboss.org/browse/ISPN-9512?page=com.atlassian.jira.plugin.... ]

Diego Lovison closed ISPN-9512.
-------------------------------

> *TxPartitionAndMerge*Test tests hang during teardown
> ----------------------------------------------------
>
> Key: ISPN-9512
> URL: https://issues.jboss.org/browse/ISPN-9512
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Labels: testsuite_stability
> Fix For: 9.4.0.CR3
>
> Attachments: master_20180913-1119_PessimisticTxPartitionAndMergeDuringRollbackTest-infinispan-core.log.gz, threaddump-org_infinispan_partitionhandling_PessimisticTxPartitionAndMergeDuringRollbackTest_clearContent-2018-09-13-13828.log
>
>
> Not sure what changed recently, but the thread dumps show a state transfer executor thread blocked waiting for a clustered listeners response. The stack includes two instances of {{ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution()}}, which suggests that at some point all the state transfer executor threads (6) and async transport threads (4) were busy, and the transport thread pool queue (10) was also full.
> {noformat}
> "stateTransferExecutor-thread-PessimisticTxPartitionAndMergeDuringRollbackTest-NodeC-p57758-t1" #192601 daemon prio=5 os_prio=0 tid=0x00007f7094031800 nid=0x5b27 waiting on condition [0x00007f70190ce000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for <0x00000000d470b0f8> (a java.util.concurrent.CompletableFuture$Signaller)
>     at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>     at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1695)
>     at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
>     at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1775)
>     at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
>     at org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:93)
>     at org.infinispan.remoting.rpc.RpcManagerImpl.blocking(RpcManagerImpl.java:262)
>     at org.infinispan.statetransfer.StateConsumerImpl.getClusterListeners(StateConsumerImpl.java:895)
>     at org.infinispan.statetransfer.StateConsumerImpl.fetchClusterListeners(StateConsumerImpl.java:453)
>     at org.infinispan.statetransfer.StateConsumerImpl.onTopologyUpdate(StateConsumerImpl.java:309)
>     at org.infinispan.statetransfer.StateTransferManagerImpl.doTopologyUpdate(StateTransferManagerImpl.java:197)
>     at org.infinispan.statetransfer.StateTransferManagerImpl.access$000(StateTransferManagerImpl.java:54)
>     at org.infinispan.statetransfer.StateTransferManagerImpl$1.rebalance(StateTransferManagerImpl.java:117)
>     at org.infinispan.topology.LocalTopologyManagerImpl.doHandleRebalance(LocalTopologyManagerImpl.java:517)
>     - locked <0x00000000cc304f88> (a org.infinispan.topology.LocalCacheStatus)
>     at org.infinispan.topology.LocalTopologyManagerImpl.lambda$handleRebalance$3(LocalTopologyManagerImpl.java:475)
>     at org.infinispan.topology.LocalTopologyManagerImpl$$Lambda$429/1368424830.run(Unknown Source)
>     at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:175)
>     at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:37)
>     at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:227)
>     at java.util.concurrent.ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution(ThreadPoolExecutor.java:2038)
>     at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
>     at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
>     at org.infinispan.executors.LazyInitializingExecutorService.execute(LazyInitializingExecutorService.java:121)
>     at org.infinispan.executors.LimitedExecutor.tryExecute(LimitedExecutor.java:151)
>     at org.infinispan.executors.LimitedExecutor.executeInternal(LimitedExecutor.java:118)
>     at org.infinispan.executors.LimitedExecutor.execute(LimitedExecutor.java:108)
>     at org.infinispan.topology.LocalTopologyManagerImpl.handleRebalance(LocalTopologyManagerImpl.java:473)
>     at org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:199)
>     at org.infinispan.topology.CacheTopologyControlCommand.invokeAsync(CacheTopologyControlCommand.java:160)
>     at org.infinispan.commands.ReplicableCommand.invoke(ReplicableCommand.java:44)
>     at org.infinispan.topology.ClusterTopologyManagerImpl.lambda$executeOnClusterAsync$5(ClusterTopologyManagerImpl.java:600)
>     at org.infinispan.topology.ClusterTopologyManagerImpl$$Lambda$304/909965247.run(Unknown Source)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution(ThreadPoolExecutor.java:2038)
>     at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
>     at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
>     at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
>     at org.infinispan.executors.LazyInitializingExecutorService.submit(LazyInitializingExecutorService.java:91)
>     at org.infinispan.topology.ClusterTopologyManagerImpl.executeOnClusterAsync(ClusterTopologyManagerImpl.java:596)
>     at org.infinispan.topology.ClusterTopologyManagerImpl.broadcastRebalanceStart(ClusterTopologyManagerImpl.java:437)
>     at org.infinispan.topology.ClusterCacheStatus.startQueuedRebalance(ClusterCacheStatus.java:903)
>     - locked <0x00000000cc305138> (a org.infinispan.topology.ClusterCacheStatus)
>     at org.infinispan.topology.ClusterCacheStatus.queueRebalance(ClusterCacheStatus.java:140)
>     - locked <0x00000000cc305138> (a org.infinispan.topology.ClusterCacheStatus)
>     at org.infinispan.partitionhandling.impl.PreferConsistencyStrategy.updateMembersAndRebalance(PreferConsistencyStrategy.java:299)
>     at org.infinispan.partitionhandling.impl.PreferConsistencyStrategy.onPartitionMerge(PreferConsistencyStrategy.java:245)
>     at org.infinispan.topology.ClusterCacheStatus.doMergePartitions(ClusterCacheStatus.java:642)
>     - locked <0x00000000cc305138> (a org.infinispan.topology.ClusterCacheStatus)
>     at org.infinispan.topology.ClusterTopologyManagerImpl.lambda$recoverClusterStatus$4(ClusterTopologyManagerImpl.java:494)
>     at org.infinispan.topology.ClusterTopologyManagerImpl$$Lambda$578/46555845.run(Unknown Source)
>     at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:175)
>     at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:37)
>     at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:227)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> {noformat}
> All partition and merge tests seem to be affected: PessimisticTxPartitionAndMergeDuringPrepareTest, PessimisticTxPartitionAndMergeDuringRollbackTest, PessimisticTxPartitionAndMergeDuringRuntimeTest, OptimisticTxPartitionAndMergeDuringCommitTest, OptimisticTxPartitionAndMergeDuringPrepareTest, and OptimisticTxPartitionAndMergeDuringRollbackTest.

--
This message was sent by Atlassian Jira (v7.12.1#712002)


[JBoss JIRA] (ISPN-9746) HotRod decoder should release allocated buffers

by Dan Berindei (Jira)

[ https://issues.jboss.org/browse/ISPN-9746?page=com.atlassian.jira.plugin.... ]

Dan Berindei updated ISPN-9746:
-------------------------------
    Affects Version/s: 10.0.0.Alpha2

> HotRod decoder should release allocated buffers
> -----------------------------------------------
>
> Key: ISPN-9746
> URL: https://issues.jboss.org/browse/ISPN-9746
> Project: Infinispan
> Issue Type: Bug
> Components: Server
> Affects Versions: 9.4.1.Final, 10.0.0.Alpha2
> Reporter: Dan Berindei
> Priority: Major
>
> {noformat}
> 19:09:07,279 ERROR [io.netty.util.ResourceLeakDetector] (HotRod-ServerIO-6-1) LEAK: ByteBuf.release() was not called before it's garbage-collected. See http://netty.io/wiki/reference-counted-objects.html for more information.
> Recent access records:
> Created at:
>     io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:331)
>     io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:185)
>     io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:176)
>     io.netty.buffer.AbstractByteBufAllocator.buffer(AbstractByteBufAllocator.java:113)
>     org.infinispan.server.hotrod.HotRodDecoder.switch3(HotRodDecoder.java:1940)
>     org.infinispan.server.hotrod.HotRodDecoder.switch1_0(HotRodDecoder.java:156)
>     org.infinispan.server.hotrod.HotRodDecoder.decode(HotRodDecoder.java:143)
> {noformat}

--
This message was sent by Atlassian Jira (v7.12.1#712002)


[JBoss JIRA] (ISPN-9760) HotRodPipeTest.testPipeRequests random failures

by Dan Berindei (Jira)

[ https://issues.jboss.org/browse/ISPN-9760?page=com.atlassian.jira.plugin.... ]

Dan Berindei updated ISPN-9760:
-------------------------------
    Affects Version/s: 10.0.0.Alpha1

> HotRodPipeTest.testPipeRequests random failures
> -----------------------------------------------
>
> Key: ISPN-9760
> URL: https://issues.jboss.org/browse/ISPN-9760
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Server
> Affects Versions: 9.4.2.Final, 10.0.0.Alpha1
> Reporter: Dan Berindei
> Priority: Major
> Labels: testsuite_stability
>
> On my machine, with trace logging enabled, I sometimes get
> {noformat}
> 10:43:23,658 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.server.hotrod.test.HotRodPipeTest.testPipeRequests
> java.lang.AssertionError: expected:<10000>, got:<4668>
>     at org.testng.AssertJUnit.fail(AssertJUnit.java:59) ~[testng-6.9.9.jar:?]
>     at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:199) ~[infinispan-core-9.4.2-SNAPSHOT-tests.jar:9.4.2-SNAPSHOT]
>     at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:178) ~[infinispan-core-9.4.2-SNAPSHOT-tests.jar:9.4.2-SNAPSHOT]
>     at org.infinispan.test.AbstractInfinispanTest.eventuallyEquals(AbstractInfinispanTest.java:168) ~[infinispan-core-9.4.2-SNAPSHOT-tests.jar:9.4.2-SNAPSHOT]
>     at org.infinispan.server.hotrod.test.HotRodPipeTest.testPipeRequests(HotRodPipeTest.java:65) ~[test-classes/:?]
> {noformat}
> There are some older failures in CI that look like this
> {noformat}
> java.lang.AssertionError:
>     at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:249)
>     at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:231)
>     at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:207)
>     at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:385)
>     at org.infinispan.server.hotrod.test.HotRodPipeTest.testPipeRequests(HotRodPipeTest.java:65)
> {noformat}
> https://ci.infinispan.org/job/Infinispan/job/master/862/testReport/junit/...

--
This message was sent by Atlassian Jira (v7.12.1#712002)


[JBoss JIRA] (ISPN-9760) HotRodPipeTest.testPipeRequests random failures

by Dan Berindei (Jira)

Dan Berindei created ISPN-9760:
----------------------------------

Summary: HotRodPipeTest.testPipeRequests random failures
Key: ISPN-9760
URL: https://issues.jboss.org/browse/ISPN-9760
Project: Infinispan
Issue Type: Bug
Components: Test Suite - Server
Affects Versions: 9.4.2.Final
Reporter: Dan Berindei

On my machine, with trace logging enabled, I sometimes get
{noformat}
10:43:23,658 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.server.hotrod.test.HotRodPipeTest.testPipeRequests
java.lang.AssertionError: expected:<10000>, got:<4668>
    at org.testng.AssertJUnit.fail(AssertJUnit.java:59) ~[testng-6.9.9.jar:?]
    at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:199) ~[infinispan-core-9.4.2-SNAPSHOT-tests.jar:9.4.2-SNAPSHOT]
    at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:178) ~[infinispan-core-9.4.2-SNAPSHOT-tests.jar:9.4.2-SNAPSHOT]
    at org.infinispan.test.AbstractInfinispanTest.eventuallyEquals(AbstractInfinispanTest.java:168) ~[infinispan-core-9.4.2-SNAPSHOT-tests.jar:9.4.2-SNAPSHOT]
    at org.infinispan.server.hotrod.test.HotRodPipeTest.testPipeRequests(HotRodPipeTest.java:65) ~[test-classes/:?]
{noformat}
There are some older failures in CI that look like this
{noformat}
java.lang.AssertionError:
    at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:249)
    at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:231)
    at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:207)
    at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:385)
    at org.infinispan.server.hotrod.test.HotRodPipeTest.testPipeRequests(HotRodPipeTest.java:65)
{noformat}
https://ci.infinispan.org/job/Infinispan/job/master/862/testReport/junit/...

--
This message was sent by Atlassian Jira (v7.12.1#712002)


[JBoss JIRA] (ISPN-9759) Hot Rod server non-hash aware topology updates can include non-members

by Dan Berindei (Jira)

[ https://issues.jboss.org/browse/ISPN-9759?page=com.atlassian.jira.plugin.... ]

Dan Berindei updated ISPN-9759:
-------------------------------
    Description:
When sending a non-hash aware topology update, the server includes all the servers that are present in the topology cache. This includes both servers that don't have the cache running yet (if the cache was started dynamically) and servers that are shutting down or already shut down (because a node doesn't remove itself from the address cache before shutting down).

When a node shuts down, the remaining nodes eventually see the view change and remove the stopped server from the address cache, but likely after sending a topology update with the new topology id to the clients. In cases where a rebalance is not necessary (e.g. replicated caches, or a single node is alive), a corrected topology update is never sent to the client.

This is causing random failures in {{HotRod10ReplicationTest.testReplicatedPutWithTopologyChanges}}, {{HotRodReplicationTest.testReplicatedPutWithTopologyChanges}} and {{HotRod12ReplicationTest.testReplicatedPutWithTopologyChanges}}.

I saw it on a pull request build first, but I see it (and its subclasses) has been randomly failing in master as well. I am able to reliably reproduce the failure on my machine if I add a small delay in {{CrashedMemberDetectorListener.detectCrashedMember()}}.
{noformat}
16:24:18,288 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.server.hotrod.HotRod12ReplicationTest.testReplicatedPutWithTopologyChanges
java.lang.AssertionError: expected:<3> but was:<2>
    at org.testng.AssertJUnit.fail(AssertJUnit.java:59) ~[testng-6.9.9.jar:?]
    at org.testng.AssertJUnit.failNotEquals(AssertJUnit.java:364) ~[testng-6.9.9.jar:?]
    at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:80) ~[testng-6.9.9.jar:?]
    at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:245) ~[testng-6.9.9.jar:?]
    at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:252) ~[testng-6.9.9.jar:?]
    at org.infinispan.server.hotrod.HotRodReplicationTest.testReplicatedPutWithTopologyChanges(HotRodReplicationTest.java:145) ~[test-classes/:?]
{noformat}
https://ci.infinispan.org/job/Infinispan/job/master/878/

  was:
When sending a non-hash aware topology update, the server includes all the servers that are present in the topology cache. This includes both servers that don't have the cache running yet (if the cache was started dynamically) and servers that are shutting down or already shut down (because a node doesn't remove itself from the address cache before shutting down).

When a node shuts down, the remaining nodes eventually see the view change and remove the stopped server from the address cache, but likely after sending a topology update with the new topology id to the clients. In cases where a rebalance is not necessary (e.g. replicated caches, or a single node is alive), a corrected topology update is never sent to the client.

This is causing random failures in {{HotRod10ReplicationTest.testReplicatedPutWithTopologyChanges}}, {{HotRodReplicationTest.testReplicatedPutWithTopologyChanges}} and {{HotRod12ReplicationTest.testReplicatedPutWithTopologyChanges}}.

I saw it on a pull request build first, but I see it (and its subclasses) has been randomly failing in master as well. I am able to reliably reproduce the failure on my machine if I add a small delay in {{CrashedMemberDetectorListener.detectCrashedMember()}}.
{noformat}
16:24:18,288 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.server.hotrod.HotRod12ReplicationTest.testReplicatedPutWithTopologyChanges
java.lang.AssertionError: expected:<3> but was:<2>
    at org.testng.AssertJUnit.fail(AssertJUnit.java:59) ~[testng-6.9.9.jar:?]
    at org.testng.AssertJUnit.failNotEquals(AssertJUnit.java:364) ~[testng-6.9.9.jar:?]
    at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:80) ~[testng-6.9.9.jar:?]
    at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:245) ~[testng-6.9.9.jar:?]
    at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:252) ~[testng-6.9.9.jar:?]
    at org.infinispan.server.hotrod.HotRodReplicationTest.testReplicatedPutWithTopologyChanges(HotRodReplicationTest.java:145) ~[test-classes/:?]
{noformat}

> Hot Rod server non-hash aware topology updates can include non-members
> ----------------------------------------------------------------------
>
> Key: ISPN-9759
> URL: https://issues.jboss.org/browse/ISPN-9759
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Server
> Affects Versions: 9.4.2.Final
> Reporter: Dan Berindei
> Priority: Critical
> Labels: testsuite_stability
> Fix For: 10.0.0.Alpha2
>
> Attachments: HotRod12ReplicationTest.log.gz
>
>
> When sending a non-hash aware topology update, the server includes all the servers that are present in the topology cache. This includes both servers that don't have the cache running yet (if the cache was started dynamically) and servers that are shutting down or already shut down (because a node doesn't remove itself from the address cache before shutting down).
> When a node shuts down, the remaining nodes eventually see the view change and remove the stopped server from the address cache, but likely after sending a topology update with the new topology id to the clients. In cases where a rebalance is not necessary (e.g. replicated caches, or a single node is alive), a corrected topology update is never sent to the client.
> This is causing random failures in {{HotRod10ReplicationTest.testReplicatedPutWithTopologyChanges}}, {{HotRodReplicationTest.testReplicatedPutWithTopologyChanges}} and {{HotRod12ReplicationTest.testReplicatedPutWithTopologyChanges}}.
> I saw it on a pull request build first, but I see it (and its subclasses) has been randomly failing in master as well. I am able to reliably reproduce the failure on my machine if I add a small delay in {{CrashedMemberDetectorListener.detectCrashedMember()}}.
> {noformat}
> 16:24:18,288 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.server.hotrod.HotRod12ReplicationTest.testReplicatedPutWithTopologyChanges
> java.lang.AssertionError: expected:<3> but was:<2>
>     at org.testng.AssertJUnit.fail(AssertJUnit.java:59) ~[testng-6.9.9.jar:?]
>     at org.testng.AssertJUnit.failNotEquals(AssertJUnit.java:364) ~[testng-6.9.9.jar:?]
>     at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:80) ~[testng-6.9.9.jar:?]
>     at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:245) ~[testng-6.9.9.jar:?]
>     at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:252) ~[testng-6.9.9.jar:?]
>     at org.infinispan.server.hotrod.HotRodReplicationTest.testReplicatedPutWithTopologyChanges(HotRodReplicationTest.java:145) ~[test-classes/:?]
> {noformat}
> https://ci.infinispan.org/job/Infinispan/job/master/878/

--
This message was sent by Atlassian Jira (v7.12.1#712002)
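
The ISPN-9759 description contrasts two sets: the entries of the address cache, and the nodes that actually run the cache. A rough sketch of that difference follows, under loud assumptions: the class, the method names, the node names and the endpoints are all hypothetical, and filtering by membership is only one conceivable way to state the invariant, not necessarily the fix that was applied.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical model: keys are cluster node names, values are Hot Rod endpoints.
final class TopologyUpdateSketch {

    /** Behaviour described in the issue: every address cache entry is sent to clients. */
    static List<String> unfilteredServers(Map<String, String> addressCache) {
        return new ArrayList<>(addressCache.values());
    }

    /** Assumed alternative: send only servers whose node is a current cache member. */
    static List<String> memberServers(Map<String, String> addressCache, Set<String> cacheMembers) {
        return addressCache.entrySet().stream()
                .filter(e -> cacheMembers.contains(e.getKey()))
                .map(Map.Entry::getValue)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, String> addressCache = new LinkedHashMap<>();
        addressCache.put("NodeA", "127.0.0.1:11222");
        addressCache.put("NodeB", "127.0.0.1:11223");
        // NodeC has stopped but has not yet been removed from the address cache.
        addressCache.put("NodeC", "127.0.0.1:11224");
        Set<String> members = new HashSet<>(Arrays.asList("NodeA", "NodeB"));

        System.out.println("unfiltered: " + unfilteredServers(addressCache));
        System.out.println("members:    " + memberServers(addressCache, members));
    }
}
```

In this model the unfiltered list still advertises the stopped node, which matches the window the description points at between a node stopping and its address cache entry being removed.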


[JBoss JIRA] (ISPN-9759) Hot Rod server non-hash aware topology updates can include non-members

by Dan Berindei (Jira)

[ https://issues.jboss.org/browse/ISPN-9759?page=com.atlassian.jira.plugin.... ]

Dan Berindei updated ISPN-9759:
-------------------------------
    Attachment: HotRod12ReplicationTest.log.gz

> Hot Rod server non-hash aware topology updates can include non-members
> ----------------------------------------------------------------------
>
> Key: ISPN-9759
> URL: https://issues.jboss.org/browse/ISPN-9759
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Server
> Affects Versions: 9.4.2.Final
> Reporter: Dan Berindei
> Priority: Critical
> Labels: testsuite_stability
> Fix For: 10.0.0.Alpha2
>
> Attachments: HotRod12ReplicationTest.log.gz
>
>
> When sending a non-hash aware topology update, the server includes all the servers that are present in the topology cache. This includes both servers that don't have the cache running yet (if the cache was started dynamically) and servers that are shutting down or already shut down (because a node doesn't remove itself from the address cache before shutting down).
> When a node shuts down, the remaining nodes eventually see the view change and remove the stopped server from the address cache, but likely after sending a topology update with the new topology id to the clients. In cases where a rebalance is not necessary (e.g. replicated caches, or a single node is alive), a corrected topology update is never sent to the client.
> This is causing random failures in {{HotRod10ReplicationTest.testReplicatedPutWithTopologyChanges}}, {{HotRodReplicationTest.testReplicatedPutWithTopologyChanges}} and {{HotRod12ReplicationTest.testReplicatedPutWithTopologyChanges}}.
> I saw it on a pull request build first, but I see it (and its subclasses) has been randomly failing in master as well. I am able to reliably reproduce the failure on my machine if I add a small delay in {{CrashedMemberDetectorListener.detectCrashedMember()}}.
> {noformat}
> 16:24:18,288 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.server.hotrod.HotRod12ReplicationTest.testReplicatedPutWithTopologyChanges
> java.lang.AssertionError: expected:<3> but was:<2>
>     at org.testng.AssertJUnit.fail(AssertJUnit.java:59) ~[testng-6.9.9.jar:?]
>     at org.testng.AssertJUnit.failNotEquals(AssertJUnit.java:364) ~[testng-6.9.9.jar:?]
>     at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:80) ~[testng-6.9.9.jar:?]
>     at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:245) ~[testng-6.9.9.jar:?]
>     at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:252) ~[testng-6.9.9.jar:?]
>     at org.infinispan.server.hotrod.HotRodReplicationTest.testReplicatedPutWithTopologyChanges(HotRodReplicationTest.java:145) ~[test-classes/:?]
> {noformat}

--
This message was sent by Atlassian Jira (v7.12.1#712002)


[JBoss JIRA] (ISPN-9759) Hot Rod server non-hash aware topology updates can include non-members

by Dan Berindei (Jira)

Dan Berindei created ISPN-9759:
----------------------------------

Summary: Hot Rod server non-hash aware topology updates can include non-members
Key: ISPN-9759
URL: https://issues.jboss.org/browse/ISPN-9759
Project: Infinispan
Issue Type: Bug
Components: Test Suite - Server
Affects Versions: 9.4.2.Final
Reporter: Dan Berindei
Fix For: 10.0.0.Alpha2

When sending a non-hash aware topology update, the server includes all the servers that are present in the topology cache. This includes both servers that don't have the cache running yet (if the cache was started dynamically) and servers that are shutting down or already shut down (because a node doesn't remove itself from the address cache before shutting down).

When a node shuts down, the remaining nodes eventually see the view change and remove the stopped server from the address cache, but likely after sending a topology update with the new topology id to the clients. In cases where a rebalance is not necessary (e.g. replicated caches, or a single node is alive), a corrected topology update is never sent to the client.

This is causing random failures in {{HotRod10ReplicationTest.testReplicatedPutWithTopologyChanges}}, {{HotRodReplicationTest.testReplicatedPutWithTopologyChanges}} and {{HotRod12ReplicationTest.testReplicatedPutWithTopologyChanges}}.

I saw it on a pull request build first, but I see it (and its subclasses) has been randomly failing in master as well. I am able to reliably reproduce the failure on my machine if I add a small delay in {{CrashedMemberDetectorListener.detectCrashedMember()}}.
{noformat}
16:24:18,288 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.server.hotrod.HotRod12ReplicationTest.testReplicatedPutWithTopologyChanges
java.lang.AssertionError: expected:<3> but was:<2>
    at org.testng.AssertJUnit.fail(AssertJUnit.java:59) ~[testng-6.9.9.jar:?]
    at org.testng.AssertJUnit.failNotEquals(AssertJUnit.java:364) ~[testng-6.9.9.jar:?]
    at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:80) ~[testng-6.9.9.jar:?]
    at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:245) ~[testng-6.9.9.jar:?]
    at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:252) ~[testng-6.9.9.jar:?]
    at org.infinispan.server.hotrod.HotRodReplicationTest.testReplicatedPutWithTopologyChanges(HotRodReplicationTest.java:145) ~[test-classes/:?]
{noformat}

--
This message was sent by Atlassian Jira (v7.12.1#712002)


[JBoss JIRA] (ISPN-6158) Move interceptors to private package

by Dan Berindei (Jira)

[ https://issues.jboss.org/browse/ISPN-6158?page=com.atlassian.jira.plugin.... ]

Dan Berindei resolved ISPN-6158.
--------------------------------
    Fix Version/s: 9.1.0.Final (was: 9.4.3.Final)
    Resolution: Done

Done with ISPN-5467

> Move interceptors to private package
> ------------------------------------
>
> Key: ISPN-6158
> URL: https://issues.jboss.org/browse/ISPN-6158
> Project: Infinispan
> Issue Type: Enhancement
> Components: Core
> Reporter: Tristan Tarrant
> Assignee: Dan Berindei
> Priority: Major
> Fix For: 9.1.0.Final
>
>

--
This message was sent by Atlassian Jira (v7.12.1#712002)


[JBoss JIRA] (ISPN-5575) Shared write-behind store can read stale entries on joiner

by Dan Berindei (Jira)

[ https://issues.jboss.org/browse/ISPN-5575?page=com.atlassian.jira.plugin.... ]

Dan Berindei reassigned ISPN-5575:
----------------------------------
    Assignee: (was: Dan Berindei)

> Shared write-behind store can read stale entries on joiner
> ----------------------------------------------------------
>
> Key: ISPN-5575
> URL: https://issues.jboss.org/browse/ISPN-5575
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Loaders and Stores
> Affects Versions: 8.0.0.Alpha2, 7.2.3.Final
> Reporter: Dan Berindei
> Priority: Major
> Fix For: 9.4.3.Final
>
>
> The AsyncCacheWriter modification queue is not sent with state transfer when the store is shared. A joiner can then read from the shared store a stale version of entries that have updates in the modification queue but are no longer in memory (because they were either removed explicitly, or evicted).

--
This message was sent by Atlassian Jira (v7.12.1#712002)
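
The sequence in the ISPN-5575 description can be walked through with a toy model. This is a deliberately simplified sketch and not Infinispan code: the `Node` class, the plain maps standing in for memory and for the shared store, and the explicit `flush()` call are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Toy model of a node with an in-memory map and a write-behind queue in front of a shared store.
final class WriteBehindStaleReadSketch {

    static final class Node {
        final Map<String, String> memory = new HashMap<>();
        final Queue<String[]> pendingWrites = new ArrayDeque<>();
        final Map<String, String> sharedStore;

        Node(Map<String, String> sharedStore) {
            this.sharedStore = sharedStore;
        }

        void put(String key, String value) {
            memory.put(key, value);
            pendingWrites.add(new String[] {key, value}); // written to the store later
        }

        void flush() {
            String[] write;
            while ((write = pendingWrites.poll()) != null) {
                sharedStore.put(write[0], write[1]);
            }
        }

        void evict(String key) {
            memory.remove(key);
        }

        /** Reads memory first and falls back to the shared store; never consults another node's queue. */
        String get(String key) {
            String value = memory.get(key);
            return value != null ? value : sharedStore.get(key);
        }
    }

    /** Returns what the joiner reads before and after the owner's queue is flushed. */
    static String[] scenario() {
        Map<String, String> store = new HashMap<>();
        Node owner = new Node(store);
        owner.put("k", "v1");
        owner.flush();        // v1 reaches the shared store
        owner.put("k", "v2"); // v2 sits only in memory and in the queue
        owner.evict("k");     // v2 now exists only in the queue

        Node joiner = new Node(store);
        joiner.memory.putAll(owner.memory); // "state transfer" of in-memory entries only
        String beforeFlush = joiner.get("k");
        owner.flush();
        String afterFlush = joiner.get("k");
        return new String[] {beforeFlush, afterFlush};
    }

    public static void main(String[] args) {
        String[] result = scenario();
        System.out.println("joiner read before flush: " + result[0]);
        System.out.println("joiner read after flush:  " + result[1]);
    }
}
```

In this model the joiner reads the old value until the owner's queue drains, because the queued update is neither transferred to the joiner nor yet visible in the shared store.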

