[JBoss JIRA] (ISPN-9512) *TxPartitionAndMerge*Test tests hang during teardown
by Diego Lovison (Jira)
[ https://issues.jboss.org/browse/ISPN-9512?page=com.atlassian.jira.plugin.... ]
Diego Lovison updated ISPN-9512:
--------------------------------
Labels: testsuite_stability (was: on-hold testsuite_stability)
> *TxPartitionAndMerge*Test tests hang during teardown
> ----------------------------------------------------
>
> Key: ISPN-9512
> URL: https://issues.jboss.org/browse/ISPN-9512
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Labels: testsuite_stability
> Fix For: 9.4.0.CR3
>
> Attachments: master_20180913-1119_PessimisticTxPartitionAndMergeDuringRollbackTest-infinispan-core.log.gz, threaddump-org_infinispan_partitionhandling_PessimisticTxPartitionAndMergeDuringRollbackTest_clearContent-2018-09-13-13828.log
>
>
> Not sure what changed recently, but the thread dumps show a state transfer executor thread blocked waiting for a clustered listeners response. The stack includes two instances of {{ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution()}}, which suggests that at some point all the state transfer executor threads (6) and async transport threads (4) were busy, and the transport thread pool queue (10) was also full.
> {noformat}
> "stateTransferExecutor-thread-PessimisticTxPartitionAndMergeDuringRollbackTest-NodeC-p57758-t1" #192601 daemon prio=5 os_prio=0 tid=0x00007f7094031800 nid=0x5b27 waiting on condition [0x00007f70190ce000]
> java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x00000000d470b0f8> (a java.util.concurrent.CompletableFuture$Signaller)
> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1695)
> at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
> at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1775)
> at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
> at org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:93)
> at org.infinispan.remoting.rpc.RpcManagerImpl.blocking(RpcManagerImpl.java:262)
> at org.infinispan.statetransfer.StateConsumerImpl.getClusterListeners(StateConsumerImpl.java:895)
> at org.infinispan.statetransfer.StateConsumerImpl.fetchClusterListeners(StateConsumerImpl.java:453)
> at org.infinispan.statetransfer.StateConsumerImpl.onTopologyUpdate(StateConsumerImpl.java:309)
> at org.infinispan.statetransfer.StateTransferManagerImpl.doTopologyUpdate(StateTransferManagerImpl.java:197)
> at org.infinispan.statetransfer.StateTransferManagerImpl.access$000(StateTransferManagerImpl.java:54)
> at org.infinispan.statetransfer.StateTransferManagerImpl$1.rebalance(StateTransferManagerImpl.java:117)
> at org.infinispan.topology.LocalTopologyManagerImpl.doHandleRebalance(LocalTopologyManagerImpl.java:517)
> - locked <0x00000000cc304f88> (a org.infinispan.topology.LocalCacheStatus)
> at org.infinispan.topology.LocalTopologyManagerImpl.lambda$handleRebalance$3(LocalTopologyManagerImpl.java:475)
> at org.infinispan.topology.LocalTopologyManagerImpl$$Lambda$429/1368424830.run(Unknown Source)
> at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:175)
> at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:37)
> at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:227)
> at java.util.concurrent.ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution(ThreadPoolExecutor.java:2038)
> at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
> at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
> at org.infinispan.executors.LazyInitializingExecutorService.execute(LazyInitializingExecutorService.java:121)
> at org.infinispan.executors.LimitedExecutor.tryExecute(LimitedExecutor.java:151)
> at org.infinispan.executors.LimitedExecutor.executeInternal(LimitedExecutor.java:118)
> at org.infinispan.executors.LimitedExecutor.execute(LimitedExecutor.java:108)
> at org.infinispan.topology.LocalTopologyManagerImpl.handleRebalance(LocalTopologyManagerImpl.java:473)
> at org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:199)
> at org.infinispan.topology.CacheTopologyControlCommand.invokeAsync(CacheTopologyControlCommand.java:160)
> at org.infinispan.commands.ReplicableCommand.invoke(ReplicableCommand.java:44)
> at org.infinispan.topology.ClusterTopologyManagerImpl.lambda$executeOnClusterAsync$5(ClusterTopologyManagerImpl.java:600)
> at org.infinispan.topology.ClusterTopologyManagerImpl$$Lambda$304/909965247.run(Unknown Source)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution(ThreadPoolExecutor.java:2038)
> at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
> at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
> at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
> at org.infinispan.executors.LazyInitializingExecutorService.submit(LazyInitializingExecutorService.java:91)
> at org.infinispan.topology.ClusterTopologyManagerImpl.executeOnClusterAsync(ClusterTopologyManagerImpl.java:596)
> at org.infinispan.topology.ClusterTopologyManagerImpl.broadcastRebalanceStart(ClusterTopologyManagerImpl.java:437)
> at org.infinispan.topology.ClusterCacheStatus.startQueuedRebalance(ClusterCacheStatus.java:903)
> - locked <0x00000000cc305138> (a org.infinispan.topology.ClusterCacheStatus)
> at org.infinispan.topology.ClusterCacheStatus.queueRebalance(ClusterCacheStatus.java:140)
> - locked <0x00000000cc305138> (a org.infinispan.topology.ClusterCacheStatus)
> at org.infinispan.partitionhandling.impl.PreferConsistencyStrategy.updateMembersAndRebalance(PreferConsistencyStrategy.java:299)
> at org.infinispan.partitionhandling.impl.PreferConsistencyStrategy.onPartitionMerge(PreferConsistencyStrategy.java:245)
> at org.infinispan.topology.ClusterCacheStatus.doMergePartitions(ClusterCacheStatus.java:642)
> - locked <0x00000000cc305138> (a org.infinispan.topology.ClusterCacheStatus)
> at org.infinispan.topology.ClusterTopologyManagerImpl.lambda$recoverClusterStatus$4(ClusterTopologyManagerImpl.java:494)
> at org.infinispan.topology.ClusterTopologyManagerImpl$$Lambda$578/46555845.run(Unknown Source)
> at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:175)
> at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:37)
> at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:227)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> All partition and merge tests seem to be affected: PessimisticTxPartitionAndMergeDuringPrepareTest, PessimisticTxPartitionAndMergeDuringRollbackTest, PessimisticTxPartitionAndMergeDuringRuntimeTest, OptimisticTxPartitionAndMergeDuringCommitTest, OptimisticTxPartitionAndMergeDuringPrepareTest, and OptimisticTxPartitionAndMergeDuringRollbackTest.
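The two {{CallerRunsPolicy.rejectedExecution()}} frames matter because a rejected task runs synchronously on the thread that submitted it, so a blocking task also blocks its submitter. The following is a minimal, self-contained sketch of that JDK behaviour only. The pool sizes, the {{CallerRunsDemo}} class and the {{pool-worker}} thread name are illustrative assumptions, not Infinispan's executor configuration.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CallerRunsDemo {

    /** Runs three tasks on a 1-thread pool with a 1-slot queue and records where each one ran. */
    static List<String> runDemo() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1),
                r -> new Thread(r, "pool-worker"),
                new ThreadPoolExecutor.CallerRunsPolicy());
        List<String> ranOn = new CopyOnWriteArrayList<>();
        Thread caller = Thread.currentThread();
        CountDownLatch release = new CountDownLatch(1);
        Runnable record = () ->
                ranOn.add(Thread.currentThread() == caller ? "caller" : "pool-worker");

        // Task 1 occupies the only worker thread until the latch is released.
        pool.execute(() -> { awaitQuietly(release); record.run(); });
        // Task 2 fills the single queue slot.
        pool.execute(record);
        // Task 3 is rejected; CallerRunsPolicy runs it synchronously on this thread.
        pool.execute(record);

        release.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ranOn;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // The rejected task finishes first because it ran inline on the submitter.
        System.out.println(runDemo());
    }
}
```

If the third task had waited on a remote response instead of returning immediately, the submitting thread would have been stuck in {{execute()}} for the whole wait, which is the shape of the stack above.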
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
7 years, 4 months
[JBoss JIRA] (ISPN-9512) *TxPartitionAndMerge*Test tests hang during teardown
by Diego Lovison (Jira)
[ https://issues.jboss.org/browse/ISPN-9512?page=com.atlassian.jira.plugin.... ]
Diego Lovison closed ISPN-9512.
-------------------------------
[JBoss JIRA] (ISPN-9746) HotRod decoder should release allocated buffers
by Dan Berindei (Jira)
[ https://issues.jboss.org/browse/ISPN-9746?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-9746:
-------------------------------
Affects Version/s: 10.0.0.Alpha2
> HotRod decoder should release allocated buffers
> -----------------------------------------------
>
> Key: ISPN-9746
> URL: https://issues.jboss.org/browse/ISPN-9746
> Project: Infinispan
> Issue Type: Bug
> Components: Server
> Affects Versions: 9.4.1.Final, 10.0.0.Alpha2
> Reporter: Dan Berindei
> Priority: Major
>
> {noformat}
> 19:09:07,279 ERROR [io.netty.util.ResourceLeakDetector] (HotRod-ServerIO-6-1) LEAK: ByteBuf.release() was not called before it's garbage-collected. See http://netty.io/wiki/reference-counted-objects.html for more information.
> Recent access records:
> Created at:
> io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:331)
> io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:185)
> io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:176)
> io.netty.buffer.AbstractByteBufAllocator.buffer(AbstractByteBufAllocator.java:113)
> org.infinispan.server.hotrod.HotRodDecoder.switch3(HotRodDecoder.java:1940)
> org.infinispan.server.hotrod.HotRodDecoder.switch1_0(HotRodDecoder.java:156)
> org.infinispan.server.hotrod.HotRodDecoder.decode(HotRodDecoder.java:143)
> {noformat}
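The usual remedy for this kind of leak report is to release a temporary buffer on every exit path, typically in a {{finally}} block. The sketch below shows that pattern in plain Java. {{CountedBuffer}} and {{decodeOnce()}} are hypothetical stand-ins for a reference-counted buffer and a decode step; this is not the {{HotRodDecoder}} code and does not use the Netty API.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ReleaseSketch {

    /** Hypothetical stand-in for a reference-counted buffer. */
    static final class CountedBuffer {
        private final AtomicInteger refCnt = new AtomicInteger(1);

        int refCnt() {
            return refCnt.get();
        }

        /** Decrements the count; returns true once the count reaches zero. */
        boolean release() {
            int remaining = refCnt.decrementAndGet();
            if (remaining < 0) {
                throw new IllegalStateException("released more often than retained");
            }
            return remaining == 0;
        }
    }

    /**
     * Hypothetical decode step: allocates a temporary buffer and releases it
     * whether or not decoding fails. The buffer is returned only so that the
     * caller can inspect its final reference count.
     */
    static CountedBuffer decodeOnce(boolean failMidway) {
        CountedBuffer tmp = new CountedBuffer();
        try {
            if (failMidway) {
                throw new IllegalArgumentException("malformed request");
            }
            // ... read the request from tmp here ...
        } catch (IllegalArgumentException e) {
            // Swallowed only to keep the sketch short.
        } finally {
            tmp.release(); // runs on both the success and the failure path
        }
        return tmp;
    }

    public static void main(String[] args) {
        System.out.println(decodeOnce(false).refCnt());
        System.out.println(decodeOnce(true).refCnt());
    }
}
```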
[JBoss JIRA] (ISPN-9760) HotRodPipeTest.testPipeRequests random failures
by Dan Berindei (Jira)
[ https://issues.jboss.org/browse/ISPN-9760?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-9760:
-------------------------------
Affects Version/s: 10.0.0.Alpha1
> HotRodPipeTest.testPipeRequests random failures
> -----------------------------------------------
>
> Key: ISPN-9760
> URL: https://issues.jboss.org/browse/ISPN-9760
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Server
> Affects Versions: 9.4.2.Final, 10.0.0.Alpha1
> Reporter: Dan Berindei
> Priority: Major
> Labels: testsuite_stability
>
> On my machine, with trace logging enabled, I sometimes get
> {noformat}
> 10:43:23,658 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.server.hotrod.test.HotRodPipeTest.testPipeRequests
> java.lang.AssertionError: expected:<10000>, got:<4668>
> at org.testng.AssertJUnit.fail(AssertJUnit.java:59) ~[testng-6.9.9.jar:?]
> at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:199) ~[infinispan-core-9.4.2-SNAPSHOT-tests.jar:9.4.2-SNAPSHOT]
> at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:178) ~[infinispan-core-9.4.2-SNAPSHOT-tests.jar:9.4.2-SNAPSHOT]
> at org.infinispan.test.AbstractInfinispanTest.eventuallyEquals(AbstractInfinispanTest.java:168) ~[infinispan-core-9.4.2-SNAPSHOT-tests.jar:9.4.2-SNAPSHOT]
> at org.infinispan.server.hotrod.test.HotRodPipeTest.testPipeRequests(HotRodPipeTest.java:65) ~[test-classes/:?]
> {noformat}
> There are some older failures in CI that look like this
> {noformat}
> java.lang.AssertionError:
> at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:249)
> at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:231)
> at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:207)
> at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:385)
> at org.infinispan.server.hotrod.test.HotRodPipeTest.testPipeRequests(HotRodPipeTest.java:65)
> {noformat}
> https://ci.infinispan.org/job/Infinispan/job/master/862/testReport/junit/...
[JBoss JIRA] (ISPN-9760) HotRodPipeTest.testPipeRequests random failures
by Dan Berindei (Jira)
Dan Berindei created ISPN-9760:
----------------------------------
Summary: HotRodPipeTest.testPipeRequests random failures
Key: ISPN-9760
URL: https://issues.jboss.org/browse/ISPN-9760
Project: Infinispan
Issue Type: Bug
Components: Test Suite - Server
Affects Versions: 9.4.2.Final
Reporter: Dan Berindei
On my machine, with trace logging enabled, I sometimes get
{noformat}
10:43:23,658 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.server.hotrod.test.HotRodPipeTest.testPipeRequests
java.lang.AssertionError: expected:<10000>, got:<4668>
at org.testng.AssertJUnit.fail(AssertJUnit.java:59) ~[testng-6.9.9.jar:?]
at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:199) ~[infinispan-core-9.4.2-SNAPSHOT-tests.jar:9.4.2-SNAPSHOT]
at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:178) ~[infinispan-core-9.4.2-SNAPSHOT-tests.jar:9.4.2-SNAPSHOT]
at org.infinispan.test.AbstractInfinispanTest.eventuallyEquals(AbstractInfinispanTest.java:168) ~[infinispan-core-9.4.2-SNAPSHOT-tests.jar:9.4.2-SNAPSHOT]
at org.infinispan.server.hotrod.test.HotRodPipeTest.testPipeRequests(HotRodPipeTest.java:65) ~[test-classes/:?]
{noformat}
There are some older failures in CI that look like this
{noformat}
java.lang.AssertionError:
at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:249)
at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:231)
at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:207)
at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:385)
at org.infinispan.server.hotrod.test.HotRodPipeTest.testPipeRequests(HotRodPipeTest.java:65)
{noformat}
https://ci.infinispan.org/job/Infinispan/job/master/862/testReport/junit/...
[JBoss JIRA] (ISPN-9759) Hot Rod server non-hash aware topology updates can include non-members
by Dan Berindei (Jira)
[ https://issues.jboss.org/browse/ISPN-9759?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-9759:
-------------------------------
Description:
When sending a non-hash aware topology update, the server includes all the servers that are present in the topology cache. This includes both servers that don't have the cache running yet (if the cache was started dynamically) and servers that are shutting down or have already shut down (because a node doesn't remove itself from the address cache before shutting down).
When a node shuts down, the remaining nodes eventually see the view change and remove the stopped server from the address cache, but likely only after sending a topology update with the new topology id to the clients. In cases where a rebalance is not necessary (e.g. replicated caches, or when only a single node is alive), a corrected topology update is never sent to the client.
This is causing random failures in {{HotRod10ReplicationTest.testReplicatedPutWithTopologyChanges}}, {{HotRodReplicationTest.testReplicatedPutWithTopologyChanges}}, and {{HotRod12ReplicationTest.testReplicatedPutWithTopologyChanges}}.
I first saw it on a pull request build, but the test (and its subclasses) has been failing randomly in master as well. I am able to reliably reproduce the failure on my machine if I add a small delay in {{CrashedMemberDetectorListener.detectCrashedMember()}}.
{noformat}
16:24:18,288 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.server.hotrod.HotRod12ReplicationTest.testReplicatedPutWithTopologyChanges
java.lang.AssertionError: expected:<3> but was:<2>
at org.testng.AssertJUnit.fail(AssertJUnit.java:59) ~[testng-6.9.9.jar:?]
at org.testng.AssertJUnit.failNotEquals(AssertJUnit.java:364) ~[testng-6.9.9.jar:?]
at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:80) ~[testng-6.9.9.jar:?]
at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:245) ~[testng-6.9.9.jar:?]
at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:252) ~[testng-6.9.9.jar:?]
at org.infinispan.server.hotrod.HotRodReplicationTest.testReplicatedPutWithTopologyChanges(HotRodReplicationTest.java:145) ~[test-classes/:?]
{noformat}
https://ci.infinispan.org/job/Infinispan/job/master/878/
> Hot Rod server non-hash aware topology updates can include non-members
> ----------------------------------------------------------------------
>
> Key: ISPN-9759
> URL: https://issues.jboss.org/browse/ISPN-9759
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Server
> Affects Versions: 9.4.2.Final
> Reporter: Dan Berindei
> Priority: Critical
> Labels: testsuite_stability
> Fix For: 10.0.0.Alpha2
>
> Attachments: HotRod12ReplicationTest.log.gz
>
>
> When sending a non-hash aware topology update, the server includes all the servers that are present in the topology cache. This includes both servers that don't have the cache running yet (if the cache was started dynamically) and servers that are shutting down or have already shut down (because a node doesn't remove itself from the address cache before shutting down).
> When a node shuts down, the remaining nodes eventually see the view change and remove the stopped server from the address cache, but likely only after sending a topology update with the new topology id to the clients. In cases where a rebalance is not necessary (e.g. replicated caches, or when only a single node is alive), a corrected topology update is never sent to the client.
> This is causing random failures in {{HotRod10ReplicationTest.testReplicatedPutWithTopologyChanges}}, {{HotRodReplicationTest.testReplicatedPutWithTopologyChanges}}, and {{HotRod12ReplicationTest.testReplicatedPutWithTopologyChanges}}.
> I first saw it on a pull request build, but the test (and its subclasses) has been failing randomly in master as well. I am able to reliably reproduce the failure on my machine if I add a small delay in {{CrashedMemberDetectorListener.detectCrashedMember()}}.
> {noformat}
> 16:24:18,288 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.server.hotrod.HotRod12ReplicationTest.testReplicatedPutWithTopologyChanges
> java.lang.AssertionError: expected:<3> but was:<2>
> at org.testng.AssertJUnit.fail(AssertJUnit.java:59) ~[testng-6.9.9.jar:?]
> at org.testng.AssertJUnit.failNotEquals(AssertJUnit.java:364) ~[testng-6.9.9.jar:?]
> at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:80) ~[testng-6.9.9.jar:?]
> at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:245) ~[testng-6.9.9.jar:?]
> at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:252) ~[testng-6.9.9.jar:?]
> at org.infinispan.server.hotrod.HotRodReplicationTest.testReplicatedPutWithTopologyChanges(HotRodReplicationTest.java:145) ~[test-classes/:?]
> {noformat}
> https://ci.infinispan.org/job/Infinispan/job/master/878/
[JBoss JIRA] (ISPN-9759) Hot Rod server non-hash aware topology updates can include non-members
by Dan Berindei (Jira)
[ https://issues.jboss.org/browse/ISPN-9759?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-9759:
-------------------------------
Attachment: HotRod12ReplicationTest.log.gz
> Hot Rod server non-hash aware topology updates can include non-members
> ----------------------------------------------------------------------
>
> Key: ISPN-9759
> URL: https://issues.jboss.org/browse/ISPN-9759
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Server
> Affects Versions: 9.4.2.Final
> Reporter: Dan Berindei
> Priority: Critical
> Labels: testsuite_stability
> Fix For: 10.0.0.Alpha2
>
> Attachments: HotRod12ReplicationTest.log.gz
>
>
> When sending a non-hash aware topology update, the server includes all the servers that are present in the topology cache. This includes both servers that don't have the cache running yet (if the cache was started dynamically) and servers that are shutting down or have already shut down (because a node doesn't remove itself from the address cache before shutting down).
> When a node shuts down, the remaining nodes eventually see the view change and remove the stopped server from the address cache, but likely only after sending a topology update with the new topology id to the clients. In cases where a rebalance is not necessary (e.g. replicated caches, or when only a single node is alive), a corrected topology update is never sent to the client.
> This is causing random failures in {{HotRod10ReplicationTest.testReplicatedPutWithTopologyChanges}}, {{HotRodReplicationTest.testReplicatedPutWithTopologyChanges}}, and {{HotRod12ReplicationTest.testReplicatedPutWithTopologyChanges}}.
> I first saw it on a pull request build, but the test (and its subclasses) has been failing randomly in master as well. I am able to reliably reproduce the failure on my machine if I add a small delay in {{CrashedMemberDetectorListener.detectCrashedMember()}}.
> {noformat}
> 16:24:18,288 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.server.hotrod.HotRod12ReplicationTest.testReplicatedPutWithTopologyChanges
> java.lang.AssertionError: expected:<3> but was:<2>
> at org.testng.AssertJUnit.fail(AssertJUnit.java:59) ~[testng-6.9.9.jar:?]
> at org.testng.AssertJUnit.failNotEquals(AssertJUnit.java:364) ~[testng-6.9.9.jar:?]
> at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:80) ~[testng-6.9.9.jar:?]
> at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:245) ~[testng-6.9.9.jar:?]
> at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:252) ~[testng-6.9.9.jar:?]
> at org.infinispan.server.hotrod.HotRodReplicationTest.testReplicatedPutWithTopologyChanges(HotRodReplicationTest.java:145) ~[test-classes/:?]
> {noformat}
[JBoss JIRA] (ISPN-9759) Hot Rod server non-hash aware topology updates can include non-members
by Dan Berindei (Jira)
Dan Berindei created ISPN-9759:
----------------------------------
Summary: Hot Rod server non-hash aware topology updates can include non-members
Key: ISPN-9759
URL: https://issues.jboss.org/browse/ISPN-9759
Project: Infinispan
Issue Type: Bug
Components: Test Suite - Server
Affects Versions: 9.4.2.Final
Reporter: Dan Berindei
Fix For: 10.0.0.Alpha2
When sending a non-hash aware topology update, the server includes all the servers that are present in the topology cache. This includes both servers that don't have the cache running yet (if the cache was started dynamically) and servers that are shutting down or have already shut down (because a node doesn't remove itself from the address cache before shutting down).
When a node shuts down, the remaining nodes eventually see the view change and remove the stopped server from the address cache, but likely only after sending a topology update with the new topology id to the clients. In cases where a rebalance is not necessary (e.g. replicated caches, or when only a single node is alive), a corrected topology update is never sent to the client.
This is causing random failures in {{HotRod10ReplicationTest.testReplicatedPutWithTopologyChanges}}, {{HotRodReplicationTest.testReplicatedPutWithTopologyChanges}}, and {{HotRod12ReplicationTest.testReplicatedPutWithTopologyChanges}}.
I first saw it on a pull request build, but the test (and its subclasses) has been failing randomly in master as well. I am able to reliably reproduce the failure on my machine if I add a small delay in {{CrashedMemberDetectorListener.detectCrashedMember()}}.
{noformat}
16:24:18,288 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.server.hotrod.HotRod12ReplicationTest.testReplicatedPutWithTopologyChanges
java.lang.AssertionError: expected:<3> but was:<2>
at org.testng.AssertJUnit.fail(AssertJUnit.java:59) ~[testng-6.9.9.jar:?]
at org.testng.AssertJUnit.failNotEquals(AssertJUnit.java:364) ~[testng-6.9.9.jar:?]
at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:80) ~[testng-6.9.9.jar:?]
at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:245) ~[testng-6.9.9.jar:?]
at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:252) ~[testng-6.9.9.jar:?]
at org.infinispan.server.hotrod.HotRodReplicationTest.testReplicatedPutWithTopologyChanges(HotRodReplicationTest.java:145) ~[test-classes/:?]
{noformat}
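One possible way to avoid advertising non-members, sketched here only as an illustration and not necessarily the fix that was adopted, is to intersect the address cache with the current members of the cache before building the update. The class name, the method name and the use of plain strings for node addresses and endpoints are all assumptions made for this example.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class TopologyFilterSketch {

    /**
     * Keeps only the address-cache entries whose node is a current member of the cache.
     *
     * @param addressCache hypothetical map from cluster node name to Hot Rod endpoint
     * @param cacheMembers nodes that currently have the cache running
     */
    static Map<String, String> membersOnly(Map<String, String> addressCache,
                                           Set<String> cacheMembers) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : addressCache.entrySet()) {
            if (cacheMembers.contains(entry.getKey())) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> addressCache = new LinkedHashMap<>();
        addressCache.put("NodeA", "127.0.0.1:11222");
        addressCache.put("NodeB", "127.0.0.1:11232");
        addressCache.put("NodeC", "127.0.0.1:11242"); // stopped, but not yet removed

        Set<String> cacheMembers = new HashSet<>();
        cacheMembers.add("NodeA");
        cacheMembers.add("NodeB");

        // Only NodeA and NodeB would be advertised to clients.
        System.out.println(membersOnly(addressCache, cacheMembers));
    }
}
```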
[JBoss JIRA] (ISPN-5575) Shared write-behind store can read stale entries on joiner
by Dan Berindei (Jira)
[ https://issues.jboss.org/browse/ISPN-5575?page=com.atlassian.jira.plugin.... ]
Dan Berindei reassigned ISPN-5575:
----------------------------------
Assignee: (was: Dan Berindei)
> Shared write-behind store can read stale entries on joiner
> ----------------------------------------------------------
>
> Key: ISPN-5575
> URL: https://issues.jboss.org/browse/ISPN-5575
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Loaders and Stores
> Affects Versions: 8.0.0.Alpha2, 7.2.3.Final
> Reporter: Dan Berindei
> Priority: Major
> Fix For: 9.4.3.Final
>
>
> The AsyncCacheWriter modification queue is not sent with state transfer when the store is shared. A joiner can then read stale versions of entries from the shared store when those entries have updates pending in the modification queue but are no longer in memory (because they were either removed explicitly or evicted).
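The race can be modelled with plain collections. This is a simplified sketch under stated assumptions: the maps and the queue stand in for the shared store, the owner's in-memory container and the write-behind modification queue, and none of the names below are Infinispan APIs.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

public class WriteBehindStaleReadSketch {

    /**
     * Simulates a joiner reading key "k" after state transfer.
     *
     * @param flushBeforeJoin whether the pending modifications reach the shared store first
     * @return the value the joiner observes
     */
    static String joinerRead(boolean flushBeforeJoin) {
        Map<String, String> sharedStore = new HashMap<>();
        Map<String, String> ownerMemory = new HashMap<>();
        Queue<String[]> modificationQueue = new ArrayDeque<>();

        sharedStore.put("k", "v1");                        // old value, already persisted
        ownerMemory.put("k", "v2");                        // update applied in memory
        modificationQueue.add(new String[] {"k", "v2"});   // write-behind, not yet persisted
        ownerMemory.remove("k");                           // entry evicted from memory

        if (flushBeforeJoin) {
            // Drain the queue so the store catches up before the joiner reads.
            String[] mod;
            while ((mod = modificationQueue.poll()) != null) {
                sharedStore.put(mod[0], mod[1]);
            }
        }

        // State transfer copies in-memory entries only; the queue is not included.
        Map<String, String> joinerMemory = new HashMap<>(ownerMemory);

        // The joiner misses in memory and falls back to the shared store.
        String value = joinerMemory.get("k");
        return value != null ? value : sharedStore.get("k");
    }

    public static void main(String[] args) {
        System.out.println(joinerRead(false)); // stale read
        System.out.println(joinerRead(true));  // up-to-date read
    }
}
```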