[JBoss JIRA] (ISPN-9508) org.infinispan.persistence.spi.PersistenceException: java.lang.NullPointerException
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-9508?page=com.atlassian.jira.plugin.... ]
Dan Berindei edited comment on ISPN-9508 at 9/13/18 7:11 AM:
-------------------------------------------------------------
{noformat}
2018-09-12 17:00:07,753 ERROR [org.infinispan.interceptors.impl.InvocationContextInterceptor] (jgroups-91,dr-opsdb01) ISPN000136: Error executing command PutMapCommand, writing keys [WrappedByteArray{bytes=[B0x4A0B373930393931..[13], hashCode=-1783032400}, WrappedByteArray{bytes=[B0x4A0B373936313030..[13], hashCode=-224759146}, WrappedByteArray{bytes=[B0x4A0B373936333936..[13], hashCode=-1470273455}, WrappedByteArray{bytes=[B0x4A0B373931333533..[13], hashCode=-358389069}, WrappedByteArray{bytes=[B0x4A0B373930333234..[13], hashCode=-1187497688}, WrappedByteArray{bytes=[B0x4A0B373936343036..[13], hashCode=873985031}, WrappedByteArray{bytes=[B0x4A0B373936323331..[13], hashCode=-87622955}, WrappedByteArray{bytes=[B0x4A0B373930333134..[13], hashCode=-2072230806}...<5001 other elements>]: java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to java.util.HashMap$TreeNode
at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1827)
at java.util.HashMap$TreeNode.treeify(HashMap.java:1944)
at java.util.HashMap.treeifyBin(HashMap.java:771)
at java.util.HashMap.putVal(HashMap.java:643)
at java.util.HashMap.put(HashMap.java:611)
at org.infinispan.context.impl.NonTxInvocationContext.putLookedUpEntry(NonTxInvocationContext.java:49)
at org.infinispan.container.impl.EntryFactoryImpl.wrapExternalEntry(EntryFactoryImpl.java:173)
at org.infinispan.interceptors.distribution.BaseDistributionInterceptor.wrapRemoteEntry(BaseDistributionInterceptor.java:233)
at org.infinispan.interceptors.distribution.BaseDistributionInterceptor.lambda$remoteGet$1(BaseDistributionInterceptor.java:210)
at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:656)
at java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:632)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
at org.infinispan.remoting.transport.AbstractRequest.complete(AbstractRequest.java:67)
at org.infinispan.remoting.transport.impl.MultiTargetRequest.onResponse(MultiTargetRequest.java:102)
at org.infinispan.remoting.transport.jgroups.StaggeredRequest.onResponse(StaggeredRequest.java:50)
at org.infinispan.remoting.transport.impl.RequestRepository.addResponse(RequestRepository.java:52)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processResponse(JGroupsTransport.java:1370)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processMessage(JGroupsTransport.java:1273)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.access$300(JGroupsTransport.java:125)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport$ChannelCallbacks.up(JGroupsTransport.java:1418)
at org.jgroups.JChannel.up(JChannel.java:816)
{noformat}
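The {{java.util.HashMap$Node cannot be cast to java.util.HashMap$TreeNode}} failure is the classic signature of a plain {{HashMap}} being mutated by several threads without synchronization, here through {{NonTxInvocationContext.putLookedUpEntry()}}. A minimal, self-contained sketch of that general failure mode (illustrative only, not code from this report; the corruption is non-deterministic, so it may take several runs to hit this exact exception):
{noformat}
import java.util.HashMap;
import java.util.Map;

public class HashMapTreeifyRace {
    // Keys with heavily colliding hash codes force long bins that HashMap
    // tries to convert into red-black trees (treeify), the code path that
    // throws the ClassCastException when a bin is corrupted by a data race.
    static final class Key {
        final int id;
        Key(int id) { this.id = id; }
        @Override public int hashCode() { return id % 16; }
        @Override public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).id == id;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Map<Key, Integer> map = new HashMap<>();
        Runnable writer = () -> {
            for (int i = 0; i < 100_000; i++) {
                map.put(new Key(i), i); // unsynchronized concurrent put
            }
        };
        Thread t1 = new Thread(writer);
        Thread t2 = new Thread(writer);
        t1.start(); t2.start();
        t1.join(); t2.join();
    }
}
{noformat}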
was (Author: schernolyas):
2018-09-12 17:00:07,753 ERROR [org.infinispan.interceptors.impl.InvocationContextInterceptor] (jgroups-91,dr-opsdb01) ISPN000136: Error executing command PutMapCommand, writing keys [WrappedByteArray{bytes=[B0x4A0B373930393931..[13], hashCode=-1783032400}, WrappedByteArray{bytes=[B0x4A0B373936313030..[13], hashCode=-224759146}, WrappedByteArray{bytes=[B0x4A0B373936333936..[13], hashCode=-1470273455}, WrappedByteArray{bytes=[B0x4A0B373931333533..[13], hashCode=-358389069}, WrappedByteArray{bytes=[B0x4A0B373930333234..[13], hashCode=-1187497688}, WrappedByteArray{bytes=[B0x4A0B373936343036..[13], hashCode=873985031}, WrappedByteArray{bytes=[B0x4A0B373936323331..[13], hashCode=-87622955}, WrappedByteArray{bytes=[B0x4A0B373930333134..[13], hashCode=-2072230806}...<5001 other elements>]: java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to java.util.HashMap$TreeNode
at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1827)
at java.util.HashMap$TreeNode.treeify(HashMap.java:1944)
at java.util.HashMap.treeifyBin(HashMap.java:771)
at java.util.HashMap.putVal(HashMap.java:643)
at java.util.HashMap.put(HashMap.java:611)
at org.infinispan.context.impl.NonTxInvocationContext.putLookedUpEntry(NonTxInvocationContext.java:49)
at org.infinispan.container.impl.EntryFactoryImpl.wrapExternalEntry(EntryFactoryImpl.java:173)
at org.infinispan.interceptors.distribution.BaseDistributionInterceptor.wrapRemoteEntry(BaseDistributionInterceptor.java:233)
at org.infinispan.interceptors.distribution.BaseDistributionInterceptor.lambda$remoteGet$1(BaseDistributionInterceptor.java:210)
at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:656)
at java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:632)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
at org.infinispan.remoting.transport.AbstractRequest.complete(AbstractRequest.java:67)
at org.infinispan.remoting.transport.impl.MultiTargetRequest.onResponse(MultiTargetRequest.java:102)
at org.infinispan.remoting.transport.jgroups.StaggeredRequest.onResponse(StaggeredRequest.java:50)
at org.infinispan.remoting.transport.impl.RequestRepository.addResponse(RequestRepository.java:52)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processResponse(JGroupsTransport.java:1370)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processMessage(JGroupsTransport.java:1273)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.access$300(JGroupsTransport.java:125)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport$ChannelCallbacks.up(JGroupsTransport.java:1418)
at org.jgroups.JChannel.up(JChannel.java:816)
> org.infinispan.persistence.spi.PersistenceException: java.lang.NullPointerException
> -----------------------------------------------------------------------------------
>
> Key: ISPN-9508
> URL: https://issues.jboss.org/browse/ISPN-9508
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 9.3.1.Final
> Reporter: Sergey Chernolyas
> Attachments: hang_infinispan931.out, hs_err_pid17420.log, server1.log, server2.log, server2.log.2018-09-11
>
>
> Cache can't start.
> See exception:
> [org.infinispan.remoting.inboundhandler.NonTotalOrderPerCacheInboundInvocationHandler] (jgroups-30,dr-opsdb01) ISPN000071: Caught exception when handling command SingleRpcCommand{cacheName='SEGMENTS', command=PutMapCommand
> ....
> flags=[IGNORE_RETURN_VALUES], metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=NumericVersion{version=844429225112147}}, isForwarded=true}}: org.infinispan.persistence.spi.PersistenceException: java.lang.NullPointerException
> at org.infinispan.persistence.rocksdb.RocksDBStore.writeBatch(RocksDBStore.java:412)
> at org.infinispan.persistence.manager.PersistenceManagerImpl.lambda$writeBatchToAllNonTxStores$17(PersistenceManagerImpl.java:604)
> at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
> at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
> at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
> at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374)
> at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
> at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
> at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
> at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
> at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
> at org.infinispan.persistence.manager.PersistenceManagerImpl.writeBatchToAllNonTxStores(PersistenceManagerImpl.java:604)
> at org.infinispan.interceptors.impl.CacheWriterInterceptor.processIterableBatch(CacheWriterInterceptor.java:265)
> at org.infinispan.interceptors.impl.DistCacheWriterInterceptor.handlePutMapCommandReturn(DistCacheWriterInterceptor.java:93)
> at org.infinispan.interceptors.InvocationSuccessAction.apply(InvocationSuccessAction.java:22)
> at org.infinispan.interceptors.impl.QueueAsyncInvocationStage.invokeQueuedHandlers(QueueAsyncInvocationStage.java:118)
> at org.infinispan.interceptors.impl.QueueAsyncInvocationStage.accept(QueueAsyncInvocationStage.java:81)
> at org.infinispan.interceptors.impl.QueueAsyncInvocationStage.accept(QueueAsyncInvocationStage.java:30)
> at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
> at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
> at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
> at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
> at org.infinispan.remoting.transport.AbstractRequest.complete(AbstractRequest.java:67)
> at org.infinispan.remoting.transport.impl.MultiTargetRequest.onResponse(MultiTargetRequest.java:102)
> at org.infinispan.remoting.transport.jgroups.StaggeredRequest.onResponse(StaggeredRequest.java:50)
> at org.infinispan.remoting.transport.impl.RequestRepository.addResponse(RequestRepository.java:52)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processResponse(JGroupsTransport.java:1370)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processMessage(JGroupsTransport.java:1273)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.access$300(JGroupsTransport.java:125)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport$ChannelCallbacks.up(JGroupsTransport.java:1418)
> at org.jgroups.JChannel.up(JChannel.java:816)
> at org.jgroups.fork.ForkProtocolStack.up(ForkProtocolStack.java:134)
> at org.jgroups.stack.Protocol.up(Protocol.java:340)
> at org.jgroups.protocols.FORK.up(FORK.java:134)
> at org.jgroups.protocols.FRAG3.up(FRAG3.java:171)
> at org.jgroups.protocols.FlowControl.up(FlowControl.java:343)
> at org.jgroups.protocols.pbcast.GMS.up(GMS.java:865)
> at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:240)
> at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1003)
> at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:729)
> at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:384)
> at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:600)
> at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:119)
> at org.jgroups.protocols.FD_ALL.up(FD_ALL.java:199)
> at org.jgroups.protocols.FD.up(FD.java:212)
> at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:252)
> at org.jgroups.protocols.Discovery.up(Discovery.java:267)
> at org.jgroups.protocols.TP.passMessageUp(TP.java:1248)
> at org.jgroups.util.SubmitToThreadPool$SingleMessageHandler.run(SubmitToThreadPool.java:87)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
[JBoss JIRA] (ISPN-9512) *TxPartitionAndMerge*Test tests hang during teardown
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-9512?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-9512:
-------------------------------
Status: Pull Request Sent (was: Open)
Git Pull Request: https://github.com/infinispan/infinispan/pull/6252
> *TxPartitionAndMerge*Test tests hang during teardown
> ----------------------------------------------------
>
> Key: ISPN-9512
> URL: https://issues.jboss.org/browse/ISPN-9512
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Labels: testsuite_stability
> Fix For: 9.4.0.CR3
>
> Attachments: master_20180913-1119_PessimisticTxPartitionAndMergeDuringRollbackTest-infinispan-core.log.gz, threaddump-org_infinispan_partitionhandling_PessimisticTxPartitionAndMergeDuringRollbackTest_clearContent-2018-09-13-13828.log
>
>
> Not sure what changed recently, but the thread dumps show a state transfer executor thread blocked waiting for a clustered listeners response. The stack includes two instances of {{ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution()}}, which suggests that at some point all the state transfer executor threads (6) and async transport threads (4) were busy, and the transport thread pool queue (10) was also full.
> {noformat}
> "stateTransferExecutor-thread-PessimisticTxPartitionAndMergeDuringRollbackTest-NodeC-p57758-t1" #192601 daemon prio=5 os_prio=0 tid=0x00007f7094031800 nid=0x5b27 waiting on condition [0x00007f70190ce000]
> java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x00000000d470b0f8> (a java.util.concurrent.CompletableFuture$Signaller)
> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1695)
> at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
> at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1775)
> at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
> at org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:93)
> at org.infinispan.remoting.rpc.RpcManagerImpl.blocking(RpcManagerImpl.java:262)
> at org.infinispan.statetransfer.StateConsumerImpl.getClusterListeners(StateConsumerImpl.java:895)
> at org.infinispan.statetransfer.StateConsumerImpl.fetchClusterListeners(StateConsumerImpl.java:453)
> at org.infinispan.statetransfer.StateConsumerImpl.onTopologyUpdate(StateConsumerImpl.java:309)
> at org.infinispan.statetransfer.StateTransferManagerImpl.doTopologyUpdate(StateTransferManagerImpl.java:197)
> at org.infinispan.statetransfer.StateTransferManagerImpl.access$000(StateTransferManagerImpl.java:54)
> at org.infinispan.statetransfer.StateTransferManagerImpl$1.rebalance(StateTransferManagerImpl.java:117)
> at org.infinispan.topology.LocalTopologyManagerImpl.doHandleRebalance(LocalTopologyManagerImpl.java:517)
> - locked <0x00000000cc304f88> (a org.infinispan.topology.LocalCacheStatus)
> at org.infinispan.topology.LocalTopologyManagerImpl.lambda$handleRebalance$3(LocalTopologyManagerImpl.java:475)
> at org.infinispan.topology.LocalTopologyManagerImpl$$Lambda$429/1368424830.run(Unknown Source)
> at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:175)
> at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:37)
> at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:227)
> at java.util.concurrent.ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution(ThreadPoolExecutor.java:2038)
> at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
> at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
> at org.infinispan.executors.LazyInitializingExecutorService.execute(LazyInitializingExecutorService.java:121)
> at org.infinispan.executors.LimitedExecutor.tryExecute(LimitedExecutor.java:151)
> at org.infinispan.executors.LimitedExecutor.executeInternal(LimitedExecutor.java:118)
> at org.infinispan.executors.LimitedExecutor.execute(LimitedExecutor.java:108)
> at org.infinispan.topology.LocalTopologyManagerImpl.handleRebalance(LocalTopologyManagerImpl.java:473)
> at org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:199)
> at org.infinispan.topology.CacheTopologyControlCommand.invokeAsync(CacheTopologyControlCommand.java:160)
> at org.infinispan.commands.ReplicableCommand.invoke(ReplicableCommand.java:44)
> at org.infinispan.topology.ClusterTopologyManagerImpl.lambda$executeOnClusterAsync$5(ClusterTopologyManagerImpl.java:600)
> at org.infinispan.topology.ClusterTopologyManagerImpl$$Lambda$304/909965247.run(Unknown Source)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution(ThreadPoolExecutor.java:2038)
> at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
> at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
> at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
> at org.infinispan.executors.LazyInitializingExecutorService.submit(LazyInitializingExecutorService.java:91)
> at org.infinispan.topology.ClusterTopologyManagerImpl.executeOnClusterAsync(ClusterTopologyManagerImpl.java:596)
> at org.infinispan.topology.ClusterTopologyManagerImpl.broadcastRebalanceStart(ClusterTopologyManagerImpl.java:437)
> at org.infinispan.topology.ClusterCacheStatus.startQueuedRebalance(ClusterCacheStatus.java:903)
> - locked <0x00000000cc305138> (a org.infinispan.topology.ClusterCacheStatus)
> at org.infinispan.topology.ClusterCacheStatus.queueRebalance(ClusterCacheStatus.java:140)
> - locked <0x00000000cc305138> (a org.infinispan.topology.ClusterCacheStatus)
> at org.infinispan.partitionhandling.impl.PreferConsistencyStrategy.updateMembersAndRebalance(PreferConsistencyStrategy.java:299)
> at org.infinispan.partitionhandling.impl.PreferConsistencyStrategy.onPartitionMerge(PreferConsistencyStrategy.java:245)
> at org.infinispan.topology.ClusterCacheStatus.doMergePartitions(ClusterCacheStatus.java:642)
> - locked <0x00000000cc305138> (a org.infinispan.topology.ClusterCacheStatus)
> at org.infinispan.topology.ClusterTopologyManagerImpl.lambda$recoverClusterStatus$4(ClusterTopologyManagerImpl.java:494)
> at org.infinispan.topology.ClusterTopologyManagerImpl$$Lambda$578/46555845.run(Unknown Source)
> at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:175)
> at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:37)
> at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:227)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> All partition and merge tests seem to be affected: PessimisticTxPartitionAndMergeDuringPrepareTest, PessimisticTxPartitionAndMergeDuringRollbackTest, PessimisticTxPartitionAndMergeDuringRuntimeTest, OptimisticTxPartitionAndMergeDuringCommitTest, OptimisticTxPartitionAndMergeDuringPrepareTest, and OptimisticTxPartitionAndMergeDuringRollbackTest.
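A generic sketch (not Infinispan code) of why the nested {{ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution()}} frames matter: once a pool and its queue are full, a rejected task runs inline on the submitting thread, so any blocking wait inside the task now blocks that thread too, which is exactly the situation of the parked state transfer thread above:
{noformat}
import java.util.concurrent.*;

public class CallerRunsNesting {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = new ThreadPoolExecutor(
              1, 1, 0L, TimeUnit.MILLISECONDS,
              new ArrayBlockingQueue<>(1),               // tiny queue, easy to saturate
              new ThreadPoolExecutor.CallerRunsPolicy());
        CompletableFuture<Void> response = new CompletableFuture<>();
        pool.execute(() -> sleep(1_000));                // occupies the only worker
        pool.execute(() -> sleep(1_000));                // fills the queue
        // Rejected, so CallerRunsPolicy runs it inline: the *submitting* thread
        // now parks in response.get(), just like the thread dump above.
        pool.execute(() -> {
            try { response.get(5, TimeUnit.SECONDS); } catch (Exception ignored) {}
        });
        pool.shutdown();
    }

    static void sleep(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException ignored) {}
    }
}
{noformat}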
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
[JBoss JIRA] (ISPN-9483) TEST_PING doesn't trigger merge after JGroups 4.0.13 upgrade
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-9483?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-9483:
-------------------------------
Sprint: Sprint 9.4.0.CR3
> TEST_PING doesn't trigger merge after JGroups 4.0.13 upgrade
> ------------------------------------------------------------
>
> Key: ISPN-9483
> URL: https://issues.jboss.org/browse/ISPN-9483
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Affects Versions: 9.4.0.CR1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Labels: testsuite_stability
> Fix For: 9.4.0.CR3
>
>
> In JGroups 4.0.13.Final, {{MERGE3}} started using the {{ASYNC_DISCOVERY_EVENT}} to find other members. {{TEST_PING}} doesn't handle the event correctly, at least when trace logging is enabled, and the merge never happens.
> {{Discovery}} should handle the new event automatically, but it only works if the discovery protocol actively sends out {{GET_MBRS_REQ}} messages and receives {{GET_MBRS_RSP}} messages from other members. {{TEST_PING}} doesn't receive any {{GET_MBRS_RSP}} messages, so {{Discovery.addResponse()}} is never called.
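A hypothetical sketch of the shape of a fix, not the actual {{TEST_PING}} code: since no {{GET_MBRS_RSP}} messages ever arrive over the wire, the protocol has to feed {{PingData}} into the {{Responses}} object itself when discovery runs. The {{REGISTRY}} map below is an assumed stand-in for {{TEST_PING}}'s in-VM registry of running instances:
{noformat}
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.jgroups.Address;
import org.jgroups.PhysicalAddress;
import org.jgroups.protocols.Discovery;
import org.jgroups.protocols.PingData;
import org.jgroups.util.Responses;

public class TEST_PING_SKETCH extends Discovery {
    // Assumed stand-in for TEST_PING's static registry of discovery
    // instances running in the same JVM (hypothetical, for illustration).
    static final Map<Address, PhysicalAddress> REGISTRY = new ConcurrentHashMap<>();

    @Override
    public boolean isDynamic() {
        return true;
    }

    @Override
    public void findMembers(List<Address> members, boolean initialDiscovery,
                            Responses responses) {
        // Synthesize the responses that would normally come back as
        // GET_MBRS_RSP messages, so Discovery's event handling can complete.
        REGISTRY.forEach((addr, phys) ->
              responses.addResponse(new PingData(addr, true, addr.toString(), phys), false));
        responses.done();
    }
}
{noformat}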
> This causes failures in all the tests that split the cluster and heal it, but for some reason CI isn't reporting the failures:
> {noformat}
> [OK: 70, KO: 1, SKIP: 0] Test failed: org.infinispan.distribution.rehash.RehashAfterPartitionMergeTest.testCachePartition[DIST_SYNC]
> java.lang.RuntimeException: Timed out before caches had changed views ([[RehashAfterPartitionMergeTest[DIST_SYNC]-NodeB-45390], [RehashAfterPartitionMergeTest[DIST_SYNC]-NodeD-46782]]) to contain 2 members
> at org.infinispan.test.TestingUtil.blockUntilViewsChanged(TestingUtil.java:761)
> at org.infinispan.test.TestingUtil.blockUntilViewsChanged(TestingUtil.java:743)
> at org.infinispan.distribution.rehash.RehashAfterPartitionMergeTest.testCachePartition(RehashAfterPartitionMergeTest.java:67)
> {noformat}
> https://ci.infinispan.org/job/Infinispan/job/master/808/consoleFull
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
[JBoss JIRA] (ISPN-9496) Some xsite tests hang during teardown
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-9496?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-9496:
-------------------------------
Sprint: Sprint 9.4.0.CR3
> Some xsite tests hang during teardown
> -------------------------------------
>
> Key: ISPN-9496
> URL: https://issues.jboss.org/browse/ISPN-9496
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Affects Versions: 9.4.0.CR1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Labels: testsuite_stability
> Fix For: 9.4.0.CR3
>
>
> {noformat}
> Test org.infinispan.xsite.statetransfer.failures.RetryMechanismTest.clearContent has been running for more than 300 seconds. Interrupting the test thread and dumping thread stacks of the test suite process and its children.
> Test org.infinispan.xsite.CacheOperationsTest.destroy has been running for more than 300 seconds. Interrupting the test thread and dumping thread stacks of the test suite process and its children.
> ...
> Killed processes 16913
> The forked VM terminated without properly saying goodbye. VM crash or System.exit called?
> Error occurred in starting fork, check output in log
> Process Exit Code: 143
> Crashed tests:
> org.infinispan.eviction.impl.ExceptionEvictionTest
> org.infinispan.statetransfer.ClusterTopologyManagerTest
> org.infinispan.stream.LocalStreamOffHeapTest
> {noformat}
> The timeouts are very likely caused by the JGRP-2277 changes. Most of our tests run without any FD* protocol to avoid creating an extra socket + thread, so when the coordinator leaves, the 2nd node *must* receive the leave message from the coordinator or it will never install a view with itself as the coordinator.
> This dependency still existed before JGRP-2277, but it appears the view message sent by the coordinator before leaving was somehow more likely to reach the 2nd node than the new leave message.
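For illustration, a minimal programmatic JGroups stack in the spirit of that test configuration (a sketch, not the actual test stack): with no {{FD_SOCK}}/{{FD_ALL}} present, the surviving node only ever learns about the coordinator's departure from the leave message itself.
{noformat}
import org.jgroups.JChannel;
import org.jgroups.protocols.MPING;
import org.jgroups.protocols.TCP;
import org.jgroups.protocols.UNICAST3;
import org.jgroups.protocols.pbcast.GMS;
import org.jgroups.protocols.pbcast.NAKACK2;
import org.jgroups.protocols.pbcast.STABLE;

public class NoFailureDetectionStack {
    public static void main(String[] args) throws Exception {
        // No FD* protocols: saves a socket and a thread per node, but if the
        // coordinator's leave message is lost there is no suspicion mechanism,
        // so the second node never installs a view with itself as coordinator.
        JChannel ch = new JChannel(
              new TCP(),
              new MPING(),
              new NAKACK2(),
              new UNICAST3(),
              new STABLE(),
              new GMS());
        ch.connect("test-cluster");
        ch.close();
    }
}
{noformat}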
> The "crashed tests" list only includes tests that we know take a very long time to run, so I am assuming they're not relevant. Unfortunately, the mechanism for interrupting long-running tests still isn't working as it should: the thread dumps are not included in the artifacts.
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
[JBoss JIRA] (ISPN-9488) Jenkins cleanup script can delete the current build's directory
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-9488?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-9488:
-------------------------------
Sprint: Sprint 9.4.0.CR3
> Jenkins cleanup script can delete the current build's directory
> ---------------------------------------------------------------
>
> Key: ISPN-9488
> URL: https://issues.jboss.org/browse/ISPN-9488
> Project: Infinispan
> Issue Type: Bug
> Components: CI
> Affects Versions: 9.4.0.CR1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Labels: testsuite_failure
> Fix For: 9.4.0.CR3
>
>
> Our {{Jenkinsfile}} runs {{cleanup.sh}} (provisioned on each agent via Ansible) to make room for the new build. The idea is to keep the checked-out sources after a build, so that the next build of the same branch is faster; {{cleanup.sh}} only deletes the workspace directories of old builds if there is less than 10GB of free space.
> There is a problem, however: the agent may have less than 10GB of free space after deleting all the old workspace directories, and {{cleanup.sh}} will happily delete the current build's workspace to make more room. Obviously, the build fails afterwards:
> {noformat}
> ERROR: missing workspace /home/infinispan/workspace/Infinispan_PR-6236-6TTBGFU5OA5XZXKEPJRZI245GOIWTNAUH3HC5M6B36G25UNJPTCA on rhos-infinispan-slave-4.localdomain
> {noformat}
> Unfortunately we also use {{returnOutput: true}} when running the cleanup script, so it's not obvious who is deleting the build directory.
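For illustration, the missing guard in Java form (a sketch only; the real {{cleanup.sh}} is a shell script, and the paths below are assumptions): delete the oldest workspaces first, never the current build's, and stop as soon as the free-space threshold is met.
{noformat}
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class WorkspaceCleanup {
    static final long MIN_FREE = 10L * 1024 * 1024 * 1024; // 10GB threshold

    public static void main(String[] args) throws IOException {
        Path workspaces = Paths.get(args[0]);               // e.g. /home/infinispan/workspace
        Path current = Paths.get(args[1]).toAbsolutePath(); // the running build's workspace
        List<Path> oldestFirst;
        try (Stream<Path> s = Files.list(workspaces)) {
            oldestFirst = s.filter(Files::isDirectory)
                  .filter(p -> !p.toAbsolutePath().equals(current)) // the missing guard
                  .sorted(Comparator.comparingLong(p -> p.toFile().lastModified()))
                  .collect(Collectors.toList());
        }
        for (Path dir : oldestFirst) {
            if (workspaces.toFile().getUsableSpace() >= MIN_FREE) {
                break; // enough room; keep remaining checkouts for faster rebuilds
            }
            deleteRecursively(dir);
        }
    }

    static void deleteRecursively(Path dir) throws IOException {
        try (Stream<Path> s = Files.walk(dir)) {
            s.sorted(Comparator.reverseOrder()).map(Path::toFile).forEach(File::delete);
        }
    }
}
{noformat}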
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
[JBoss JIRA] (ISPN-9485) RoundRobinBalancingStrategy always starts from server 0
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-9485?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-9485:
-------------------------------
Sprint: Sprint 9.4.0.CR3
> RoundRobinBalancingStrategy always starts from server 0
> -------------------------------------------------------
>
> Key: ISPN-9485
> URL: https://issues.jboss.org/browse/ISPN-9485
> Project: Infinispan
> Issue Type: Bug
> Components: Hot Rod
> Affects Versions: 9.4.0.CR1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 9.4.0.CR3
>
>
> {{RoundRobinBalancingStrategy}} always starts from server 0, and resets back to 0 if a server topology update has fewer servers. This means that if N clients start and immediately add a listener (e.g. for a near cache), all N client listeners will be attached to the same server.
> We should instead pick a random starting server on every server topology update, so that near cache listeners end up attached to random servers.
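A sketch of the suggested behavior (illustrative only, not the actual Infinispan class):
{noformat}
import java.util.concurrent.ThreadLocalRandom;

public class RandomizedRoundRobin<T> {
    private T[] servers;
    private int index;

    // On every server topology update, start from a random server instead of
    // resetting to 0, so N freshly started clients spread their listeners.
    public synchronized void setServers(T[] newServers) {
        this.servers = newServers;
        this.index = ThreadLocalRandom.current().nextInt(newServers.length);
    }

    public synchronized T nextServer() {
        T server = servers[index];
        index = (index + 1) % servers.length;
        return server;
    }
}
{noformat}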
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
[JBoss JIRA] (ISPN-9501) AbstractCacheStream.performOperationRehashAware() can hang
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-9501?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-9501:
-------------------------------
Sprint: Sprint 9.4.0.CR3
> AbstractCacheStream.performOperationRehashAware() can hang
> ----------------------------------------------------------
>
> Key: ISPN-9501
> URL: https://issues.jboss.org/browse/ISPN-9501
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.4.0.CR1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 9.4.0.CR3, 9.3.4.Final
>
>
> There are actually 2 different issues:
> # {{segmentsToProcess}} reuses the {{ConcurrentSmallIntSet}} instance stored in {{remoteResults.lostSegments}}, so when {{remoteResults.lostSegments}} is cleared, {{segmentsToProcess}} is cleared as well (see the sketch after this list).
> # Because of ISPN-9500, {{segmentsToProcess.isEmpty()}} keeps returning {{false}}, so {{performOperationRehashAware()}} keeps waiting for a newer topology.
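Issue 1 is plain reference aliasing; a self-contained illustration (generic sketch, not the Infinispan code, using {{HashSet}} in place of {{ConcurrentSmallIntSet}}):
{noformat}
import java.util.HashSet;
import java.util.Set;

public class AliasedSegments {
    public static void main(String[] args) {
        Set<Integer> lostSegments = new HashSet<>();
        lostSegments.add(1);
        lostSegments.add(2);

        Set<Integer> segmentsToProcess = lostSegments;   // aliased: same instance
        lostSegments.clear();
        System.out.println(segmentsToProcess.isEmpty()); // true: cleared as well

        lostSegments.add(3);
        Set<Integer> safeCopy = new HashSet<>(lostSegments); // defensive copy
        lostSegments.clear();
        System.out.println(safeCopy.isEmpty());          // false: independent
    }
}
{noformat}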
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
[JBoss JIRA] (ISPN-9512) *TxPartitionAndMerge*Test tests hang during teardown
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-9512?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-9512:
-------------------------------
Sprint: Sprint 9.4.0.CR3
> *TxPartitionAndMerge*Test tests hang during teardown
> ----------------------------------------------------
>
> Key: ISPN-9512
> URL: https://issues.jboss.org/browse/ISPN-9512
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Labels: testsuite_stability
> Fix For: 9.4.0.CR3
>
> Attachments: master_20180913-1119_PessimisticTxPartitionAndMergeDuringRollbackTest-infinispan-core.log.gz, threaddump-org_infinispan_partitionhandling_PessimisticTxPartitionAndMergeDuringRollbackTest_clearContent-2018-09-13-13828.log
>
>
> Not sure what changed recently, but the thread dumps show a state transfer executor thread blocked waiting for a clustered listeners response. The stack includes two instances of {{ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution()}}, which suggests that at some point all the state transfer executor threads (6) and async transport threads (4) were busy, and the transport thread pool queue (10) was also full.
> {noformat}
> "stateTransferExecutor-thread-PessimisticTxPartitionAndMergeDuringRollbackTest-NodeC-p57758-t1" #192601 daemon prio=5 os_prio=0 tid=0x00007f7094031800 nid=0x5b27 waiting on condition [0x00007f70190ce000]
> java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x00000000d470b0f8> (a java.util.concurrent.CompletableFuture$Signaller)
> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1695)
> at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
> at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1775)
> at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
> at org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:93)
> at org.infinispan.remoting.rpc.RpcManagerImpl.blocking(RpcManagerImpl.java:262)
> at org.infinispan.statetransfer.StateConsumerImpl.getClusterListeners(StateConsumerImpl.java:895)
> at org.infinispan.statetransfer.StateConsumerImpl.fetchClusterListeners(StateConsumerImpl.java:453)
> at org.infinispan.statetransfer.StateConsumerImpl.onTopologyUpdate(StateConsumerImpl.java:309)
> at org.infinispan.statetransfer.StateTransferManagerImpl.doTopologyUpdate(StateTransferManagerImpl.java:197)
> at org.infinispan.statetransfer.StateTransferManagerImpl.access$000(StateTransferManagerImpl.java:54)
> at org.infinispan.statetransfer.StateTransferManagerImpl$1.rebalance(StateTransferManagerImpl.java:117)
> at org.infinispan.topology.LocalTopologyManagerImpl.doHandleRebalance(LocalTopologyManagerImpl.java:517)
> - locked <0x00000000cc304f88> (a org.infinispan.topology.LocalCacheStatus)
> at org.infinispan.topology.LocalTopologyManagerImpl.lambda$handleRebalance$3(LocalTopologyManagerImpl.java:475)
> at org.infinispan.topology.LocalTopologyManagerImpl$$Lambda$429/1368424830.run(Unknown Source)
> at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:175)
> at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:37)
> at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:227)
> at java.util.concurrent.ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution(ThreadPoolExecutor.java:2038)
> at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
> at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
> at org.infinispan.executors.LazyInitializingExecutorService.execute(LazyInitializingExecutorService.java:121)
> at org.infinispan.executors.LimitedExecutor.tryExecute(LimitedExecutor.java:151)
> at org.infinispan.executors.LimitedExecutor.executeInternal(LimitedExecutor.java:118)
> at org.infinispan.executors.LimitedExecutor.execute(LimitedExecutor.java:108)
> at org.infinispan.topology.LocalTopologyManagerImpl.handleRebalance(LocalTopologyManagerImpl.java:473)
> at org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:199)
> at org.infinispan.topology.CacheTopologyControlCommand.invokeAsync(CacheTopologyControlCommand.java:160)
> at org.infinispan.commands.ReplicableCommand.invoke(ReplicableCommand.java:44)
> at org.infinispan.topology.ClusterTopologyManagerImpl.lambda$executeOnClusterAsync$5(ClusterTopologyManagerImpl.java:600)
> at org.infinispan.topology.ClusterTopologyManagerImpl$$Lambda$304/909965247.run(Unknown Source)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution(ThreadPoolExecutor.java:2038)
> at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
> at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
> at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
> at org.infinispan.executors.LazyInitializingExecutorService.submit(LazyInitializingExecutorService.java:91)
> at org.infinispan.topology.ClusterTopologyManagerImpl.executeOnClusterAsync(ClusterTopologyManagerImpl.java:596)
> at org.infinispan.topology.ClusterTopologyManagerImpl.broadcastRebalanceStart(ClusterTopologyManagerImpl.java:437)
> at org.infinispan.topology.ClusterCacheStatus.startQueuedRebalance(ClusterCacheStatus.java:903)
> - locked <0x00000000cc305138> (a org.infinispan.topology.ClusterCacheStatus)
> at org.infinispan.topology.ClusterCacheStatus.queueRebalance(ClusterCacheStatus.java:140)
> - locked <0x00000000cc305138> (a org.infinispan.topology.ClusterCacheStatus)
> at org.infinispan.partitionhandling.impl.PreferConsistencyStrategy.updateMembersAndRebalance(PreferConsistencyStrategy.java:299)
> at org.infinispan.partitionhandling.impl.PreferConsistencyStrategy.onPartitionMerge(PreferConsistencyStrategy.java:245)
> at org.infinispan.topology.ClusterCacheStatus.doMergePartitions(ClusterCacheStatus.java:642)
> - locked <0x00000000cc305138> (a org.infinispan.topology.ClusterCacheStatus)
> at org.infinispan.topology.ClusterTopologyManagerImpl.lambda$recoverClusterStatus$4(ClusterTopologyManagerImpl.java:494)
> at org.infinispan.topology.ClusterTopologyManagerImpl$$Lambda$578/46555845.run(Unknown Source)
> at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:175)
> at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:37)
> at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:227)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> All partition and merge tests seem to be affected: PessimisticTxPartitionAndMergeDuringPrepareTest, PessimisticTxPartitionAndMergeDuringRollbackTest, PessimisticTxPartitionAndMergeDuringRuntimeTest, OptimisticTxPartitionAndMergeDuringCommitTest, OptimisticTxPartitionAndMergeDuringPrepareTest, and OptimisticTxPartitionAndMergeDuringRollbackTest.
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)