[JBoss JIRA] (ISPN-9508) org.infinispan.persistence.spi.PersistenceException: java.lang.NullPointerException
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-9508?page=com.atlassian.jira.plugin.... ]
Dan Berindei edited comment on ISPN-9508 at 9/13/18 7:11 AM:
-------------------------------------------------------------
{noformat}
2018-09-12 17:00:07,753 ERROR [org.infinispan.interceptors.impl.InvocationContextInterceptor] (jgroups-91,dr-opsdb01) ISPN000136: Error executing command PutMapCommand, writing keys [WrappedByteArray{bytes=[B0x4A0B373930393931..[13], hashCode=-1783032400}, WrappedByteArray{bytes=[B0x4A0B373936313030..[13], hashCode=-224759146}, WrappedByteArray{bytes=[B0x4A0B373936333936..[13], hashCode=-1470273455}, WrappedByteArray{bytes=[B0x4A0B373931333533..[13], hashCode=-358389069}, WrappedByteArray{bytes=[B0x4A0B373930333234..[13], hashCode=-1187497688}, WrappedByteArray{bytes=[B0x4A0B373936343036..[13], hashCode=873985031}, WrappedByteArray{bytes=[B0x4A0B373936323331..[13], hashCode=-87622955}, WrappedByteArray{bytes=[B0x4A0B373930333134..[13], hashCode=-2072230806}...<5001 other elements>]: java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to java.util.HashMap$TreeNode
at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1827)
at java.util.HashMap$TreeNode.treeify(HashMap.java:1944)
at java.util.HashMap.treeifyBin(HashMap.java:771)
at java.util.HashMap.putVal(HashMap.java:643)
at java.util.HashMap.put(HashMap.java:611)
at org.infinispan.context.impl.NonTxInvocationContext.putLookedUpEntry(NonTxInvocationContext.java:49)
at org.infinispan.container.impl.EntryFactoryImpl.wrapExternalEntry(EntryFactoryImpl.java:173)
at org.infinispan.interceptors.distribution.BaseDistributionInterceptor.wrapRemoteEntry(BaseDistributionInterceptor.java:233)
at org.infinispan.interceptors.distribution.BaseDistributionInterceptor.lambda$remoteGet$1(BaseDistributionInterceptor.java:210)
at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:656)
at java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:632)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
at org.infinispan.remoting.transport.AbstractRequest.complete(AbstractRequest.java:67)
at org.infinispan.remoting.transport.impl.MultiTargetRequest.onResponse(MultiTargetRequest.java:102)
at org.infinispan.remoting.transport.jgroups.StaggeredRequest.onResponse(StaggeredRequest.java:50)
at org.infinispan.remoting.transport.impl.RequestRepository.addResponse(RequestRepository.java:52)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processResponse(JGroupsTransport.java:1370)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processMessage(JGroupsTransport.java:1273)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.access$300(JGroupsTransport.java:125)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport$ChannelCallbacks.up(JGroupsTransport.java:1418)
at org.jgroups.JChannel.up(JChannel.java:816)
{noformat}
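The {{java.util.HashMap$Node cannot be cast to java.util.HashMap$TreeNode}} failure is the classic signature of a plain {{HashMap}} being mutated by several threads without synchronization, here through {{NonTxInvocationContext.putLookedUpEntry()}}. A minimal, self-contained sketch of that general failure mode (illustrative only, not code from this report; the corruption is non-deterministic, so it may take several runs to hit this exact exception):
{noformat}
import java.util.HashMap;
import java.util.Map;

public class HashMapTreeifyRace {
    // Keys with heavily colliding hash codes force long bins that HashMap
    // tries to convert into red-black trees (treeify), the code path that
    // throws the ClassCastException when a bin is corrupted by a data race.
    static final class Key {
        final int id;
        Key(int id) { this.id = id; }
        @Override public int hashCode() { return id % 16; }
        @Override public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).id == id;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Map<Key, Integer> map = new HashMap<>();
        Runnable writer = () -> {
            for (int i = 0; i < 100_000; i++) {
                map.put(new Key(i), i); // unsynchronized concurrent put
            }
        };
        Thread t1 = new Thread(writer);
        Thread t2 = new Thread(writer);
        t1.start(); t2.start();
        t1.join(); t2.join();
    }
}
{noformat}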
was (Author: schernolyas):
2018-09-12 17:00:07,753 ERROR [org.infinispan.interceptors.impl.InvocationContextInterceptor] (jgroups-91,dr-opsdb01) ISPN000136: Error executing command PutMapCommand, writing keys [WrappedByteArray{bytes=[B0x4A0B373930393931..[13], hashCode=-1783032400}, WrappedByteArray{bytes=[B0x4A0B373936313030..[13], hashCode=-224759146}, WrappedByteArray{bytes=[B0x4A0B373936333936..[13], hashCode=-1470273455}, WrappedByteArray{bytes=[B0x4A0B373931333533..[13], hashCode=-358389069}, WrappedByteArray{bytes=[B0x4A0B373930333234..[13], hashCode=-1187497688}, WrappedByteArray{bytes=[B0x4A0B373936343036..[13], hashCode=873985031}, WrappedByteArray{bytes=[B0x4A0B373936323331..[13], hashCode=-87622955}, WrappedByteArray{bytes=[B0x4A0B373930333134..[13], hashCode=-2072230806}...<5001 other elements>]: java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to java.util.HashMap$TreeNode
at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1827)
at java.util.HashMap$TreeNode.treeify(HashMap.java:1944)
at java.util.HashMap.treeifyBin(HashMap.java:771)
at java.util.HashMap.putVal(HashMap.java:643)
at java.util.HashMap.put(HashMap.java:611)
at org.infinispan.context.impl.NonTxInvocationContext.putLookedUpEntry(NonTxInvocationContext.java:49)
at org.infinispan.container.impl.EntryFactoryImpl.wrapExternalEntry(EntryFactoryImpl.java:173)
at org.infinispan.interceptors.distribution.BaseDistributionInterceptor.wrapRemoteEntry(BaseDistributionInterceptor.java:233)
at org.infinispan.interceptors.distribution.BaseDistributionInterceptor.lambda$remoteGet$1(BaseDistributionInterceptor.java:210)
at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:656)
at java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:632)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
at org.infinispan.remoting.transport.AbstractRequest.complete(AbstractRequest.java:67)
at org.infinispan.remoting.transport.impl.MultiTargetRequest.onResponse(MultiTargetRequest.java:102)
at org.infinispan.remoting.transport.jgroups.StaggeredRequest.onResponse(StaggeredRequest.java:50)
at org.infinispan.remoting.transport.impl.RequestRepository.addResponse(RequestRepository.java:52)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processResponse(JGroupsTransport.java:1370)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processMessage(JGroupsTransport.java:1273)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.access$300(JGroupsTransport.java:125)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport$ChannelCallbacks.up(JGroupsTransport.java:1418)
at org.jgroups.JChannel.up(JChannel.java:816)
> org.infinispan.persistence.spi.PersistenceException: java.lang.NullPointerException
> -----------------------------------------------------------------------------------
>
> Key: ISPN-9508
> URL: https://issues.jboss.org/browse/ISPN-9508
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 9.3.1.Final
> Reporter: Sergey Chernolyas
> Attachments: hang_infinispan931.out, hs_err_pid17420.log, server1.log, server2.log, server2.log.2018-09-11
>
>
> Cache can't start.
> See exception:
> [org.infinispan.remoting.inboundhandler.NonTotalOrderPerCacheInboundInvocationHandler] (jgroups-30,dr-opsdb01) ISPN000071: Caught exception when handling command SingleRpcCommand{cacheName='SEGMENTS', command=PutMapCommand
> ....
> flags=[IGNORE_RETURN_VALUES], metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=NumericVersion{version=844429225112147}}, isForwarded=true}}: org.infinispan.persistence.spi.PersistenceException: java.lang.NullPointerException
> at org.infinispan.persistence.rocksdb.RocksDBStore.writeBatch(RocksDBStore.java:412)
> at org.infinispan.persistence.manager.PersistenceManagerImpl.lambda$writeBatchToAllNonTxStores$17(PersistenceManagerImpl.java:604)
> at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
> at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
> at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
> at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374)
> at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
> at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
> at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
> at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
> at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
> at org.infinispan.persistence.manager.PersistenceManagerImpl.writeBatchToAllNonTxStores(PersistenceManagerImpl.java:604)
> at org.infinispan.interceptors.impl.CacheWriterInterceptor.processIterableBatch(CacheWriterInterceptor.java:265)
> at org.infinispan.interceptors.impl.DistCacheWriterInterceptor.handlePutMapCommandReturn(DistCacheWriterInterceptor.java:93)
> at org.infinispan.interceptors.InvocationSuccessAction.apply(InvocationSuccessAction.java:22)
> at org.infinispan.interceptors.impl.QueueAsyncInvocationStage.invokeQueuedHandlers(QueueAsyncInvocationStage.java:118)
> at org.infinispan.interceptors.impl.QueueAsyncInvocationStage.accept(QueueAsyncInvocationStage.java:81)
> at org.infinispan.interceptors.impl.QueueAsyncInvocationStage.accept(QueueAsyncInvocationStage.java:30)
> at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
> at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
> at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
> at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
> at org.infinispan.remoting.transport.AbstractRequest.complete(AbstractRequest.java:67)
> at org.infinispan.remoting.transport.impl.MultiTargetRequest.onResponse(MultiTargetRequest.java:102)
> at org.infinispan.remoting.transport.jgroups.StaggeredRequest.onResponse(StaggeredRequest.java:50)
> at org.infinispan.remoting.transport.impl.RequestRepository.addResponse(RequestRepository.java:52)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processResponse(JGroupsTransport.java:1370)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processMessage(JGroupsTransport.java:1273)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.access$300(JGroupsTransport.java:125)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport$ChannelCallbacks.up(JGroupsTransport.java:1418)
> at org.jgroups.JChannel.up(JChannel.java:816)
> at org.jgroups.fork.ForkProtocolStack.up(ForkProtocolStack.java:134)
> at org.jgroups.stack.Protocol.up(Protocol.java:340)
> at org.jgroups.protocols.FORK.up(FORK.java:134)
> at org.jgroups.protocols.FRAG3.up(FRAG3.java:171)
> at org.jgroups.protocols.FlowControl.up(FlowControl.java:343)
> at org.jgroups.protocols.pbcast.GMS.up(GMS.java:865)
> at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:240)
> at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1003)
> at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:729)
> at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:384)
> at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:600)
> at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:119)
> at org.jgroups.protocols.FD_ALL.up(FD_ALL.java:199)
> at org.jgroups.protocols.FD.up(FD.java:212)
> at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:252)
> at org.jgroups.protocols.Discovery.up(Discovery.java:267)
> at org.jgroups.protocols.TP.passMessageUp(TP.java:1248)
> at org.jgroups.util.SubmitToThreadPool$SingleMessageHandler.run(SubmitToThreadPool.java:87)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
[JBoss JIRA] (ISPN-9512) *TxPartitionAndMerge*Test tests hang during teardown
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-9512?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-9512:
-------------------------------
Status: Pull Request Sent (was: Open)
Git Pull Request: https://github.com/infinispan/infinispan/pull/6252
> *TxPartitionAndMerge*Test tests hang during teardown
> ----------------------------------------------------
>
> Key: ISPN-9512
> URL: https://issues.jboss.org/browse/ISPN-9512
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Labels: testsuite_stability
> Fix For: 9.4.0.CR3
>
> Attachments: master_20180913-1119_PessimisticTxPartitionAndMergeDuringRollbackTest-infinispan-core.log.gz, threaddump-org_infinispan_partitionhandling_PessimisticTxPartitionAndMergeDuringRollbackTest_clearContent-2018-09-13-13828.log
>
>
> Not sure what changed recently, but the thread dumps show a state transfer executor thread blocked waiting for a clustered listeners response. The stack includes two instances of {{ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution()}}, which suggests that at some point all the state transfer executor threads (6) and async transport threads (4) were busy, and the transport thread pool queue (10) was also full.
> {noformat}
> "stateTransferExecutor-thread-PessimisticTxPartitionAndMergeDuringRollbackTest-NodeC-p57758-t1" #192601 daemon prio=5 os_prio=0 tid=0x00007f7094031800 nid=0x5b27 waiting on condition [0x00007f70190ce000]
> java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x00000000d470b0f8> (a java.util.concurrent.CompletableFuture$Signaller)
> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1695)
> at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
> at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1775)
> at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
> at org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:93)
> at org.infinispan.remoting.rpc.RpcManagerImpl.blocking(RpcManagerImpl.java:262)
> at org.infinispan.statetransfer.StateConsumerImpl.getClusterListeners(StateConsumerImpl.java:895)
> at org.infinispan.statetransfer.StateConsumerImpl.fetchClusterListeners(StateConsumerImpl.java:453)
> at org.infinispan.statetransfer.StateConsumerImpl.onTopologyUpdate(StateConsumerImpl.java:309)
> at org.infinispan.statetransfer.StateTransferManagerImpl.doTopologyUpdate(StateTransferManagerImpl.java:197)
> at org.infinispan.statetransfer.StateTransferManagerImpl.access$000(StateTransferManagerImpl.java:54)
> at org.infinispan.statetransfer.StateTransferManagerImpl$1.rebalance(StateTransferManagerImpl.java:117)
> at org.infinispan.topology.LocalTopologyManagerImpl.doHandleRebalance(LocalTopologyManagerImpl.java:517)
> - locked <0x00000000cc304f88> (a org.infinispan.topology.LocalCacheStatus)
> at org.infinispan.topology.LocalTopologyManagerImpl.lambda$handleRebalance$3(LocalTopologyManagerImpl.java:475)
> at org.infinispan.topology.LocalTopologyManagerImpl$$Lambda$429/1368424830.run(Unknown Source)
> at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:175)
> at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:37)
> at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:227)
> at java.util.concurrent.ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution(ThreadPoolExecutor.java:2038)
> at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
> at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
> at org.infinispan.executors.LazyInitializingExecutorService.execute(LazyInitializingExecutorService.java:121)
> at org.infinispan.executors.LimitedExecutor.tryExecute(LimitedExecutor.java:151)
> at org.infinispan.executors.LimitedExecutor.executeInternal(LimitedExecutor.java:118)
> at org.infinispan.executors.LimitedExecutor.execute(LimitedExecutor.java:108)
> at org.infinispan.topology.LocalTopologyManagerImpl.handleRebalance(LocalTopologyManagerImpl.java:473)
> at org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:199)
> at org.infinispan.topology.CacheTopologyControlCommand.invokeAsync(CacheTopologyControlCommand.java:160)
> at org.infinispan.commands.ReplicableCommand.invoke(ReplicableCommand.java:44)
> at org.infinispan.topology.ClusterTopologyManagerImpl.lambda$executeOnClusterAsync$5(ClusterTopologyManagerImpl.java:600)
> at org.infinispan.topology.ClusterTopologyManagerImpl$$Lambda$304/909965247.run(Unknown Source)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution(ThreadPoolExecutor.java:2038)
> at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
> at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
> at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
> at org.infinispan.executors.LazyInitializingExecutorService.submit(LazyInitializingExecutorService.java:91)
> at org.infinispan.topology.ClusterTopologyManagerImpl.executeOnClusterAsync(ClusterTopologyManagerImpl.java:596)
> at org.infinispan.topology.ClusterTopologyManagerImpl.broadcastRebalanceStart(ClusterTopologyManagerImpl.java:437)
> at org.infinispan.topology.ClusterCacheStatus.startQueuedRebalance(ClusterCacheStatus.java:903)
> - locked <0x00000000cc305138> (a org.infinispan.topology.ClusterCacheStatus)
> at org.infinispan.topology.ClusterCacheStatus.queueRebalance(ClusterCacheStatus.java:140)
> - locked <0x00000000cc305138> (a org.infinispan.topology.ClusterCacheStatus)
> at org.infinispan.partitionhandling.impl.PreferConsistencyStrategy.updateMembersAndRebalance(PreferConsistencyStrategy.java:299)
> at org.infinispan.partitionhandling.impl.PreferConsistencyStrategy.onPartitionMerge(PreferConsistencyStrategy.java:245)
> at org.infinispan.topology.ClusterCacheStatus.doMergePartitions(ClusterCacheStatus.java:642)
> - locked <0x00000000cc305138> (a org.infinispan.topology.ClusterCacheStatus)
> at org.infinispan.topology.ClusterTopologyManagerImpl.lambda$recoverClusterStatus$4(ClusterTopologyManagerImpl.java:494)
> at org.infinispan.topology.ClusterTopologyManagerImpl$$Lambda$578/46555845.run(Unknown Source)
> at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:175)
> at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:37)
> at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:227)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> All partition and merge tests seem to be affected: PessimisticTxPartitionAndMergeDuringPrepareTest, PessimisticTxPartitionAndMergeDuringRollbackTest, PessimisticTxPartitionAndMergeDuringRuntimeTest, OptimisticTxPartitionAndMergeDuringCommitTest, OptimisticTxPartitionAndMergeDuringPrepareTest, and OptimisticTxPartitionAndMergeDuringRollbackTest.
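A generic sketch (not Infinispan code) of why the nested {{ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution()}} frames matter: once a pool and its queue are full, a rejected task runs inline on the submitting thread, so any blocking wait inside the task now blocks that thread too, which is exactly the situation of the parked state transfer thread above:
{noformat}
import java.util.concurrent.*;

public class CallerRunsNesting {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = new ThreadPoolExecutor(
              1, 1, 0L, TimeUnit.MILLISECONDS,
              new ArrayBlockingQueue<>(1),               // tiny queue, easy to saturate
              new ThreadPoolExecutor.CallerRunsPolicy());
        CompletableFuture<Void> response = new CompletableFuture<>();
        pool.execute(() -> sleep(1_000));                // occupies the only worker
        pool.execute(() -> sleep(1_000));                // fills the queue
        // Rejected, so CallerRunsPolicy runs it inline: the *submitting* thread
        // now parks in response.get(), just like the thread dump above.
        pool.execute(() -> {
            try { response.get(5, TimeUnit.SECONDS); } catch (Exception ignored) {}
        });
        pool.shutdown();
    }

    static void sleep(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException ignored) {}
    }
}
{noformat}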
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
[JBoss JIRA] (ISPN-9483) TEST_PING doesn't trigger merge after JGroups 4.0.13 upgrade
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-9483?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-9483:
-------------------------------
Sprint: Sprint 9.4.0.CR3
> TEST_PING doesn't trigger merge after JGroups 4.0.13 upgrade
> ------------------------------------------------------------
>
> Key: ISPN-9483
> URL: https://issues.jboss.org/browse/ISPN-9483
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Affects Versions: 9.4.0.CR1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Labels: testsuite_stability
> Fix For: 9.4.0.CR3
>
>
> In JGroups 4.0.13.Final, {{MERGE3}} started using the {{ASYNC_DISCOVERY_EVENT}} to find other members. {{TEST_PING}} doesn't handle the event correctly, at least when trace logging is enabled, and the merge never happens.
> {{Discovery}} should handle the new event automatically, but it only works if the discovery protocol actively sends out {{GET_MBRS_REQ}} messages and receives {{GET_MBRS_RSP}} messages from other members. {{TEST_PING}} doesn't receive any {{GET_MBRS_RSP}} messages, so {{Discovery.addResponse()}} is never called.
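A hypothetical sketch of the shape of a fix, not the actual {{TEST_PING}} code: since no {{GET_MBRS_RSP}} messages ever arrive over the wire, the protocol has to feed {{PingData}} into the {{Responses}} object itself when discovery runs. The {{REGISTRY}} map below is an assumed stand-in for {{TEST_PING}}'s in-VM registry of running instances:
{noformat}
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.jgroups.Address;
import org.jgroups.PhysicalAddress;
import org.jgroups.protocols.Discovery;
import org.jgroups.protocols.PingData;
import org.jgroups.util.Responses;

public class TEST_PING_SKETCH extends Discovery {
    // Assumed stand-in for TEST_PING's static registry of discovery
    // instances running in the same JVM (hypothetical, for illustration).
    static final Map<Address, PhysicalAddress> REGISTRY = new ConcurrentHashMap<>();

    @Override
    public boolean isDynamic() {
        return true;
    }

    @Override
    public void findMembers(List<Address> members, boolean initialDiscovery,
                            Responses responses) {
        // Synthesize the responses that would normally come back as
        // GET_MBRS_RSP messages, so Discovery's event handling can complete.
        REGISTRY.forEach((addr, phys) ->
              responses.addResponse(new PingData(addr, true, addr.toString(), phys), false));
        responses.done();
    }
}
{noformat}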
> This causes failures in all the tests that split the cluster and heal it, but for some reason CI isn't reporting the failures:
> {noformat}
> [OK: 70, KO: 1, SKIP: 0] Test failed: org.infinispan.distribution.rehash.RehashAfterPartitionMergeTest.testCachePartition[DIST_SYNC]
> java.lang.RuntimeException: Timed out before caches had changed views ([[RehashAfterPartitionMergeTest[DIST_SYNC]-NodeB-45390], [RehashAfterPartitionMergeTest[DIST_SYNC]-NodeD-46782]]) to contain 2 members
> at org.infinispan.test.TestingUtil.blockUntilViewsChanged(TestingUtil.java:761)
> at org.infinispan.test.TestingUtil.blockUntilViewsChanged(TestingUtil.java:743)
> at org.infinispan.distribution.rehash.RehashAfterPartitionMergeTest.testCachePartition(RehashAfterPartitionMergeTest.java:67)
> {noformat}
> https://ci.infinispan.org/job/Infinispan/job/master/808/consoleFull
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
[JBoss JIRA] (ISPN-9496) Some xsite tests hang during teardown
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-9496?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-9496:
-------------------------------
Sprint: Sprint 9.4.0.CR3
> Some xsite tests hang during teardown
> -------------------------------------
>
> Key: ISPN-9496
> URL: https://issues.jboss.org/browse/ISPN-9496
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Affects Versions: 9.4.0.CR1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Labels: testsuite_stability
> Fix For: 9.4.0.CR3
>
>
> {noformat}
> Test org.infinispan.xsite.statetransfer.failures.RetryMechanismTest.clearContent has been running for more than 300 seconds. Interrupting the test thread and dumping thread stacks of the test suite process and its children.
> Test org.infinispan.xsite.CacheOperationsTest.destroy has been running for more than 300 seconds. Interrupting the test thread and dumping thread stacks of the test suite process and its children.
> ...
> Killed processes 16913
> The forked VM terminated without properly saying goodbye. VM crash or System.exit called?
> Error occurred in starting fork, check output in log
> Process Exit Code: 143
> Crashed tests:
> org.infinispan.eviction.impl.ExceptionEvictionTest
> org.infinispan.statetransfer.ClusterTopologyManagerTest
> org.infinispan.stream.LocalStreamOffHeapTest
> {noformat}
> The timeouts are very likely caused by the JGRP-2277 changes. Most of our tests run without any FD* protocol to avoid creating an extra socket + thread, so when the coordinator leaves, the 2nd node *must* receive the leave message from the coordinator or it will never install a view with itself as the coordinator.
> This dependency still existed before JGRP-2277, but it appears the view message sent by the coordinator before leaving was somehow more likely to reach the 2nd node than the new leave message.
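For illustration, a minimal programmatic JGroups stack in the spirit of that test configuration (a sketch, not the actual test stack): with no {{FD_SOCK}}/{{FD_ALL}} present, the surviving node only ever learns about the coordinator's departure from the leave message itself.
{noformat}
import org.jgroups.JChannel;
import org.jgroups.protocols.MPING;
import org.jgroups.protocols.TCP;
import org.jgroups.protocols.UNICAST3;
import org.jgroups.protocols.pbcast.GMS;
import org.jgroups.protocols.pbcast.NAKACK2;
import org.jgroups.protocols.pbcast.STABLE;

public class NoFailureDetectionStack {
    public static void main(String[] args) throws Exception {
        // No FD* protocols: saves a socket and a thread per node, but if the
        // coordinator's leave message is lost there is no suspicion mechanism,
        // so the second node never installs a view with itself as coordinator.
        JChannel ch = new JChannel(
              new TCP(),
              new MPING(),
              new NAKACK2(),
              new UNICAST3(),
              new STABLE(),
              new GMS());
        ch.connect("test-cluster");
        ch.close();
    }
}
{noformat}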
> The "crashed tests" list only includes tests that we know take a very long time to run, so I am assuming they're not relevant. Unfortunately, the mechanism for interrupting long-running tests still isn't working as it should: the thread dumps are not included in the artifacts.
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
[JBoss JIRA] (ISPN-9488) Jenkins cleanup script can delete the current build's directory
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-9488?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-9488:
-------------------------------
Sprint: Sprint 9.4.0.CR3
> Jenkins cleanup script can delete the current build's directory
> ---------------------------------------------------------------
>
> Key: ISPN-9488
> URL: https://issues.jboss.org/browse/ISPN-9488
> Project: Infinispan
> Issue Type: Bug
> Components: CI
> Affects Versions: 9.4.0.CR1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Labels: testsuite_failure
> Fix For: 9.4.0.CR3
>
>
> Our {{Jenkinsfile}} runs {{cleanup.sh}} (provisioned on each agent via Ansible) to make room for the new build. The idea is to keep the checked-out sources after a build, so that the next build of the same branch is faster; {{cleanup.sh}} only deletes the workspace directories of old builds if there is less than 10GB of free space.
> There is a problem, however: the agent may have less than 10GB of free space after deleting all the old workspace directories, and {{cleanup.sh}} will happily delete the current build's workspace to make more room. Obviously, the build fails afterwards:
> {noformat}
> ERROR: missing workspace /home/infinispan/workspace/Infinispan_PR-6236-6TTBGFU5OA5XZXKEPJRZI245GOIWTNAUH3HC5M6B36G25UNJPTCA on rhos-infinispan-slave-4.localdomain
> {noformat}
> Unfortunately we also use {{returnOutput: true}} when running the cleanup script, so it's not obvious who is deleting the build directory.
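For illustration, the missing guard in Java form (a sketch only; the real {{cleanup.sh}} is a shell script, and the paths below are assumptions): delete the oldest workspaces first, never the current build's, and stop as soon as the free-space threshold is met.
{noformat}
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class WorkspaceCleanup {
    static final long MIN_FREE = 10L * 1024 * 1024 * 1024; // 10GB threshold

    public static void main(String[] args) throws IOException {
        Path workspaces = Paths.get(args[0]);               // e.g. /home/infinispan/workspace
        Path current = Paths.get(args[1]).toAbsolutePath(); // the running build's workspace
        List<Path> oldestFirst;
        try (Stream<Path> s = Files.list(workspaces)) {
            oldestFirst = s.filter(Files::isDirectory)
                  .filter(p -> !p.toAbsolutePath().equals(current)) // the missing guard
                  .sorted(Comparator.comparingLong(p -> p.toFile().lastModified()))
                  .collect(Collectors.toList());
        }
        for (Path dir : oldestFirst) {
            if (workspaces.toFile().getUsableSpace() >= MIN_FREE) {
                break; // enough room; keep remaining checkouts for faster rebuilds
            }
            deleteRecursively(dir);
        }
    }

    static void deleteRecursively(Path dir) throws IOException {
        try (Stream<Path> s = Files.walk(dir)) {
            s.sorted(Comparator.reverseOrder()).map(Path::toFile).forEach(File::delete);
        }
    }
}
{noformat}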
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
[JBoss JIRA] (ISPN-9485) RoundRobinBalancingStrategy always starts from server 0
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-9485?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-9485:
-------------------------------
Sprint: Sprint 9.4.0.CR3
> RoundRobinBalancingStrategy always starts from server 0
> -------------------------------------------------------
>
> Key: ISPN-9485
> URL: https://issues.jboss.org/browse/ISPN-9485
> Project: Infinispan
> Issue Type: Bug
> Components: Hot Rod
> Affects Versions: 9.4.0.CR1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 9.4.0.CR3
>
>
> {{RoundRobinBalancingStrategy}} always starts from server 0, and resets back to 0 if a server topology update has fewer servers. This means that if N clients start and immediately add a listener (e.g. for a near cache), all N client listeners will be attached to the same server.
> We should instead pick a random starting server on every server topology update, so that near cache listeners end up attached to random servers.
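A sketch of the suggested behavior (illustrative only, not the actual Infinispan class):
{noformat}
import java.util.concurrent.ThreadLocalRandom;

public class RandomizedRoundRobin<T> {
    private T[] servers;
    private int index;

    // On every server topology update, start from a random server instead of
    // resetting to 0, so N freshly started clients spread their listeners.
    public synchronized void setServers(T[] newServers) {
        this.servers = newServers;
        this.index = ThreadLocalRandom.current().nextInt(newServers.length);
    }

    public synchronized T nextServer() {
        T server = servers[index];
        index = (index + 1) % servers.length;
        return server;
    }
}
{noformat}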
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
[JBoss JIRA] (ISPN-9501) AbstractCacheStream.performOperationRehashAware() can hang
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-9501?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-9501:
-------------------------------
Sprint: Sprint 9.4.0.CR3
> AbstractCacheStream.performOperationRehashAware() can hang
> ----------------------------------------------------------
>
> Key: ISPN-9501
> URL: https://issues.jboss.org/browse/ISPN-9501
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.4.0.CR1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 9.4.0.CR3, 9.3.4.Final
>
>
> There are actually 2 different issues:
> # {{segmentsToProcess}} reuses the {{ConcurrentSmallIntSet}} instance stored in {{remoteResults.lostSegments}}, so when {{remoteResults.lostSegments}} is cleared, {{segmentsToProcess}} is cleared as well (see the sketch after this list).
> # Because of ISPN-9500, {{segmentsToProcess.isEmpty()}} keeps returning {{false}}, so {{performOperationRehashAware()}} keeps waiting for a newer topology.
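Issue 1 is plain reference aliasing; a self-contained illustration (generic sketch, not the Infinispan code, using {{HashSet}} in place of {{ConcurrentSmallIntSet}}):
{noformat}
import java.util.HashSet;
import java.util.Set;

public class AliasedSegments {
    public static void main(String[] args) {
        Set<Integer> lostSegments = new HashSet<>();
        lostSegments.add(1);
        lostSegments.add(2);

        Set<Integer> segmentsToProcess = lostSegments;   // aliased: same instance
        lostSegments.clear();
        System.out.println(segmentsToProcess.isEmpty()); // true: cleared as well

        lostSegments.add(3);
        Set<Integer> safeCopy = new HashSet<>(lostSegments); // defensive copy
        lostSegments.clear();
        System.out.println(safeCopy.isEmpty());          // false: independent
    }
}
{noformat}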
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
[JBoss JIRA] (ISPN-9512) *TxPartitionAndMerge*Test tests hang during teardown
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-9512?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-9512:
-------------------------------
Sprint: Sprint 9.4.0.CR3
> *TxPartitionAndMerge*Test tests hang during teardown
> ----------------------------------------------------
>
> Key: ISPN-9512
> URL: https://issues.jboss.org/browse/ISPN-9512
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Labels: testsuite_stability
> Fix For: 9.4.0.CR3
>
> Attachments: master_20180913-1119_PessimisticTxPartitionAndMergeDuringRollbackTest-infinispan-core.log.gz, threaddump-org_infinispan_partitionhandling_PessimisticTxPartitionAndMergeDuringRollbackTest_clearContent-2018-09-13-13828.log
>
>
> Not sure what changed recently, but the thread dumps show a state transfer executor thread blocked waiting for a clustered listeners response. The stack includes two instances of {{ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution()}}, which suggests that at some point all the state transfer executor threads (6) and async transport threads (4) were busy, and the transport thread pool queue (10) was also full.
> {noformat}
> "stateTransferExecutor-thread-PessimisticTxPartitionAndMergeDuringRollbackTest-NodeC-p57758-t1" #192601 daemon prio=5 os_prio=0 tid=0x00007f7094031800 nid=0x5b27 waiting on condition [0x00007f70190ce000]
> java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x00000000d470b0f8> (a java.util.concurrent.CompletableFuture$Signaller)
> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1695)
> at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
> at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1775)
> at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
> at org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:93)
> at org.infinispan.remoting.rpc.RpcManagerImpl.blocking(RpcManagerImpl.java:262)
> at org.infinispan.statetransfer.StateConsumerImpl.getClusterListeners(StateConsumerImpl.java:895)
> at org.infinispan.statetransfer.StateConsumerImpl.fetchClusterListeners(StateConsumerImpl.java:453)
> at org.infinispan.statetransfer.StateConsumerImpl.onTopologyUpdate(StateConsumerImpl.java:309)
> at org.infinispan.statetransfer.StateTransferManagerImpl.doTopologyUpdate(StateTransferManagerImpl.java:197)
> at org.infinispan.statetransfer.StateTransferManagerImpl.access$000(StateTransferManagerImpl.java:54)
> at org.infinispan.statetransfer.StateTransferManagerImpl$1.rebalance(StateTransferManagerImpl.java:117)
> at org.infinispan.topology.LocalTopologyManagerImpl.doHandleRebalance(LocalTopologyManagerImpl.java:517)
> - locked <0x00000000cc304f88> (a org.infinispan.topology.LocalCacheStatus)
> at org.infinispan.topology.LocalTopologyManagerImpl.lambda$handleRebalance$3(LocalTopologyManagerImpl.java:475)
> at org.infinispan.topology.LocalTopologyManagerImpl$$Lambda$429/1368424830.run(Unknown Source)
> at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:175)
> at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:37)
> at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:227)
> at java.util.concurrent.ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution(ThreadPoolExecutor.java:2038)
> at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
> at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
> at org.infinispan.executors.LazyInitializingExecutorService.execute(LazyInitializingExecutorService.java:121)
> at org.infinispan.executors.LimitedExecutor.tryExecute(LimitedExecutor.java:151)
> at org.infinispan.executors.LimitedExecutor.executeInternal(LimitedExecutor.java:118)
> at org.infinispan.executors.LimitedExecutor.execute(LimitedExecutor.java:108)
> at org.infinispan.topology.LocalTopologyManagerImpl.handleRebalance(LocalTopologyManagerImpl.java:473)
> at org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:199)
> at org.infinispan.topology.CacheTopologyControlCommand.invokeAsync(CacheTopologyControlCommand.java:160)
> at org.infinispan.commands.ReplicableCommand.invoke(ReplicableCommand.java:44)
> at org.infinispan.topology.ClusterTopologyManagerImpl.lambda$executeOnClusterAsync$5(ClusterTopologyManagerImpl.java:600)
> at org.infinispan.topology.ClusterTopologyManagerImpl$$Lambda$304/909965247.run(Unknown Source)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution(ThreadPoolExecutor.java:2038)
> at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
> at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
> at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
> at org.infinispan.executors.LazyInitializingExecutorService.submit(LazyInitializingExecutorService.java:91)
> at org.infinispan.topology.ClusterTopologyManagerImpl.executeOnClusterAsync(ClusterTopologyManagerImpl.java:596)
> at org.infinispan.topology.ClusterTopologyManagerImpl.broadcastRebalanceStart(ClusterTopologyManagerImpl.java:437)
> at org.infinispan.topology.ClusterCacheStatus.startQueuedRebalance(ClusterCacheStatus.java:903)
> - locked <0x00000000cc305138> (a org.infinispan.topology.ClusterCacheStatus)
> at org.infinispan.topology.ClusterCacheStatus.queueRebalance(ClusterCacheStatus.java:140)
> - locked <0x00000000cc305138> (a org.infinispan.topology.ClusterCacheStatus)
> at org.infinispan.partitionhandling.impl.PreferConsistencyStrategy.updateMembersAndRebalance(PreferConsistencyStrategy.java:299)
> at org.infinispan.partitionhandling.impl.PreferConsistencyStrategy.onPartitionMerge(PreferConsistencyStrategy.java:245)
> at org.infinispan.topology.ClusterCacheStatus.doMergePartitions(ClusterCacheStatus.java:642)
> - locked <0x00000000cc305138> (a org.infinispan.topology.ClusterCacheStatus)
> at org.infinispan.topology.ClusterTopologyManagerImpl.lambda$recoverClusterStatus$4(ClusterTopologyManagerImpl.java:494)
> at org.infinispan.topology.ClusterTopologyManagerImpl$$Lambda$578/46555845.run(Unknown Source)
> at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:175)
> at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:37)
> at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:227)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> All partition and merge tests seem to be affected: PessimisticTxPartitionAndMergeDuringPrepareTest, PessimisticTxPartitionAndMergeDuringRollbackTest, PessimisticTxPartitionAndMergeDuringRuntimeTest, OptimisticTxPartitionAndMergeDuringCommitTest, OptimisticTxPartitionAndMergeDuringPrepareTest, and OptimisticTxPartitionAndMergeDuringRollbackTest.
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)