[infinispan-dev] 9.2 EmbeddedCacheManager blocked at shutdown

Thomas SEGISMONT tsegismont at gmail.com
Mon Apr 23 04:10:24 EDT 2018


Hi Will,

I will create the JIRA and provide the TRACE level logs as soon as possible.

Thanks for the update.


2018-04-20 20:10 GMT+02:00 William Burns <mudokonman at gmail.com>:

> On Fri, Apr 20, 2018 at 9:43 AM Thomas SEGISMONT <tsegismont at gmail.com>
> wrote:
>
>> I tried our test suite on a slower machine (iMac from 2011). It passes
>> consistently there.
>>
>> On my laptop, I keep seeing this from time to time (in different tests):
>>
>> 2018-04-19T19:53:09.513 WARN [Context=org.infinispan.LOCKS] ISPN000320: After merge (or coordinator change), cache still hasn't recovered a majority of members and must stay in degraded mode. Current members are [sombrero-19385], lost members are [sombrero-42917], stable members are [sombrero-42917, sombrero-19385]
>>
>
> I would expect the nodes to leave gracefully, which shouldn't cause a
> merge. I am not sure how your test is producing that. Can you capture a
> TRACE log and open a JIRA for it?
>
> However, if there is a merge and you go down to a single node, that node
> will always be in DEGRADED mode when partition handling is enabled. This
> is because a single node does not have a simple majority, as described in
> http://infinispan.org/docs/stable/user_guide/user_guide.html#partition_handling
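
For reference, partition handling is configured per cache; a minimal sketch of the kind of setup under discussion (illustrative only, not the actual vertx-infinispan configuration):

    import org.infinispan.configuration.cache.CacheMode;
    import org.infinispan.configuration.cache.Configuration;
    import org.infinispan.configuration.cache.ConfigurationBuilder;
    import org.infinispan.partitionhandling.PartitionHandling;

    public class PartitionHandlingConfigExample {
        // Builds a distributed cache configuration with partition handling enabled.
        static Configuration partitionHandlingConfig() {
            return new ConfigurationBuilder()
                  .clustering()
                     .cacheMode(CacheMode.DIST_SYNC)
                     .partitionHandling()
                        // A partition that cannot see a simple majority of the last
                        // stable topology stays in DEGRADED mode, which is why the
                        // lone survivor of a two-node cluster logs ISPN000320.
                        .whenSplit(PartitionHandling.DENY_READ_WRITES)
                  .build();
        }
    }

With DENY_READ_WRITES a degraded partition rejects operations on keys it no longer fully owns until the missing members return, which matches the behaviour described above.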
>
>
>>
>> It happens when we shut down nodes one after the other (even when waiting
>> for the cluster status to be "healthy" plus an extra 2 seconds).
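
In code, the sequence above is roughly the following; a minimal sketch that assumes the EmbeddedCacheManager Health API (it is not the actual vert.x-infinispan test code):

    import java.util.concurrent.TimeUnit;
    import org.infinispan.health.HealthStatus;
    import org.infinispan.manager.EmbeddedCacheManager;

    public class GracefulStopExample {
        // Polls the cluster health until it reports HEALTHY, waits a little
        // longer, then stops the manager; stop() is where the thread blocks.
        static void stopWhenHealthy(EmbeddedCacheManager cacheManager) throws InterruptedException {
            while (cacheManager.getHealth().getClusterHealth().getHealthStatus() != HealthStatus.HEALTHY) {
                TimeUnit.MILLISECONDS.sleep(100);
            }
            TimeUnit.SECONDS.sleep(2); // the "extra 2 seconds" mentioned above
            cacheManager.stop();
        }
    }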
>>
>> After that, the nodes remain blocked in DefaultCacheManager.stop:
>>
>> 2018-04-19T19:49:29.242 WARNING Thread Thread[vert.x-worker-thread-5,5,main] has been blocked for 60774 ms, time limit is 60000
>> io.vertx.core.VertxException: Thread blocked
>>     at sun.misc.Unsafe.park(Native Method)
>>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>>     at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1693)
>>     at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
>>     at java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1729)
>>     at java.util.concurrent.CompletableFuture.join(CompletableFuture.java:1934)
>>     at org.infinispan.manager.DefaultCacheManager.terminate(DefaultCacheManager.java:688)
>>     at org.infinispan.manager.DefaultCacheManager.stopCaches(DefaultCacheManager.java:734)
>>     at org.infinispan.manager.DefaultCacheManager.stop(DefaultCacheManager.java:711)
>>     at io.vertx.ext.cluster.infinispan.InfinispanClusterManager.lambda$leave$5(InfinispanClusterManager.java:285)
>>     at io.vertx.ext.cluster.infinispan.InfinispanClusterManager$$Lambda$421/578931659.handle(Unknown Source)
>>     at io.vertx.core.impl.ContextImpl.lambda$executeBlocking$1(ContextImpl.java:265)
>>     at io.vertx.core.impl.ContextImpl$$Lambda$27/1330754528.run(Unknown Source)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>     at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>>     at java.lang.Thread.run(Thread.java:748)
>>
>
> This looks like the exact issue that Radim mentioned with
> https://issues.jboss.org/browse/ISPN-8859.
>
>
>>
>>
>>
>> 2018-04-18 17:00 GMT+02:00 Thomas SEGISMONT <tsegismont at gmail.com>:
>>
>>> So here's the suppressed exception with the circular reference:
>>>
>>> [stateTransferExecutor-thread--p221-t33] 2018-04-18T16:15:06.662+02:00 WARN [org.infinispan.statetransfer.InboundTransferTask] ISPN000210: Failed to request state of cache __vertx.subs from node sombrero-25286, segments {0}
>>> org.infinispan.remoting.transport.jgroups.SuspectException: ISPN000400: Node sombrero-25286 was suspected
>>>     at org.infinispan.remoting.transport.ResponseCollectors.remoteNodeSuspected(ResponseCollectors.java:33)
>>>     at org.infinispan.remoting.transport.impl.SingleResponseCollector.targetNotFound(SingleResponseCollector.java:31)
>>>     at org.infinispan.remoting.transport.impl.SingleResponseCollector.targetNotFound(SingleResponseCollector.java:17)
>>>     at org.infinispan.remoting.transport.ValidSingleResponseCollector.addResponse(ValidSingleResponseCollector.java:23)
>>>     at org.infinispan.remoting.transport.impl.SingleTargetRequest.receiveResponse(SingleTargetRequest.java:51)
>>>     at org.infinispan.remoting.transport.impl.SingleTargetRequest.onNewView(SingleTargetRequest.java:42)
>>>     at org.infinispan.remoting.transport.jgroups.JGroupsTransport.addRequest(JGroupsTransport.java:921)
>>>     at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeCommand(JGroupsTransport.java:815)
>>>     at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeCommand(JGroupsTransport.java:123)
>>>     at org.infinispan.remoting.rpc.RpcManagerImpl.invokeCommand(RpcManagerImpl.java:138)
>>>     at org.infinispan.statetransfer.InboundTransferTask.startTransfer(InboundTransferTask.java:134)
>>>     at org.infinispan.statetransfer.InboundTransferTask.requestSegments(InboundTransferTask.java:113)
>>>     at org.infinispan.conflict.impl.StateReceiverImpl$SegmentRequest.lambda$requestState$2(StateReceiverImpl.java:164)
>>>     at org.infinispan.executors.LimitedExecutor.lambda$executeAsync$1(LimitedExecutor.java:101)
>>>     at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:144)
>>>     at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:33)
>>>     at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:174)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>     at java.lang.Thread.run(Thread.java:748)
>>>     Suppressed: java.util.concurrent.ExecutionException: org.infinispan.remoting.transport.jgroups.SuspectException: ISPN000400: Node sombrero-25286 was suspected
>>>         at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
>>>         at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
>>>         at org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:82)
>>>         at org.infinispan.remoting.rpc.RpcManagerImpl.blocking(RpcManagerImpl.java:260)
>>>         ... 10 more
>>>     [CIRCULAR REFERENCE:org.infinispan.remoting.transport.jgroups.SuspectException: ISPN000400: Node sombrero-25286 was suspected]
>>>
>>> It does not happen with 9.2.0.Final and prevents using ISPN embedded
>>> with logback. Do you want me to file an issue?
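
For context, the cause/suppressed cycle that trips logback's ThrowableProxy is easy to reproduce in isolation; a minimal illustration (hypothetical class name, not taken from the test suite):

    public class CircularSuppressedDemo {
        public static void main(String[] args) {
            RuntimeException root = new RuntimeException("ISPN000400-style root cause");
            RuntimeException wrapper = new RuntimeException("ExecutionException-style wrapper", root);
            // Cycle: root --suppressed--> wrapper --cause--> root
            root.addSuppressed(wrapper);

            // The JDK walker keeps a "dejaVu" set and prints
            // [CIRCULAR REFERENCE:...] instead of recursing forever.
            root.printStackTrace();

            // A proxy that recursively copies causes and suppressed exceptions
            // without such cycle detection overflows the stack, which is what
            // the repeated ThrowableProxy.<init> frames in the log below show.
        }
    }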
>>>
>>> 2018-04-18 11:45 GMT+02:00 Thomas SEGISMONT <tsegismont at gmail.com>:
>>>
>>>> Hi folks,
>>>>
>>>> Sorry I've been busy on other things and couldn't get back to you
>>>> earlier.
>>>>
>>>> I tried running the vertx-infinispan test suite with 9.2.1.Final today.
>>>> There are still some problems, but I can't say which ones yet because I hit:
>>>> https://jira.qos.ch/browse/LOGBACK-1027
>>>>
>>>> We use logback for test logs and all I get is:
>>>>
>>>> 2018-04-18 11:37:46,678 [stateTransferExecutor-thread--p4453-t24] ERROR o.i.executors.LimitedExecutor - Exception in task
>>>> java.lang.StackOverflowError: null
>>>>     at ch.qos.logback.classic.spi.ThrowableProxy.<init>(ThrowableProxy.java:54)
>>>>     at ch.qos.logback.classic.spi.ThrowableProxy.<init>(ThrowableProxy.java:60)
>>>>     at ch.qos.logback.classic.spi.ThrowableProxy.<init>(ThrowableProxy.java:72)
>>>>     at ch.qos.logback.classic.spi.ThrowableProxy.<init>(ThrowableProxy.java:60)
>>>>     at ch.qos.logback.classic.spi.ThrowableProxy.<init>(ThrowableProxy.java:72)
>>>>     at ch.qos.logback.classic.spi.ThrowableProxy.<init>(ThrowableProxy.java:60)
>>>>     at ch.qos.logback.classic.spi.ThrowableProxy.<init>(ThrowableProxy.java:72)
>>>>     at ch.qos.logback.classic.spi.ThrowableProxy.<init>(ThrowableProxy.java:60)
>>>>     at ch.qos.logback.classic.spi.ThrowableProxy.<init>(ThrowableProxy.java:72)
>>>>     at ch.qos.logback.classic.spi.ThrowableProxy.<init>(ThrowableProxy.java:60)
>>>>     at ch.qos.logback.classic.spi.ThrowableProxy.<init>(ThrowableProxy.java:72)
>>>> ... and so on and so forth
>>>>
>>>> I will run the suite again without logback and tell you what the actual
>>>> problem is.
>>>>
>>>> Regards,
>>>> Thomas
>>>>
>>>> 2018-03-27 11:15 GMT+02:00 Pedro Ruivo <pedro at infinispan.org>:
>>>>
>>>>> JIRA: https://issues.jboss.org/browse/ISPN-8994
>>>>>
>>>>> On 27-03-2018 10:08, Pedro Ruivo wrote:
>>>>> >
>>>>> >
>>>>> > On 27-03-2018 09:03, Sebastian Laskawiec wrote:
>>>>> >> At the moment, the cluster health status checker enumerates all
>>>>> >> caches in the cache manager [1] and checks whether those caches are
>>>>> >> running and not in degraded mode [2].
>>>>> >>
>>>>> >> I'm not sure how counter caches have been implemented. One thing is
>>>>> >> for sure - they should be taken into account in this loop [3].
>>>>> >
>>>>> > The private caches aren't listed by CacheManager.getCacheNames(). We
>>>>> > have to check them via InternalCacheRegistry.getInternalCacheNames().
>>>>> >
>>>>> > I'll open a JIRA if you don't mind :)
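
A minimal sketch of what that check could take into account, assuming InternalCacheRegistry can be looked up through the global component registry (not the actual ClusterHealthImpl code):

    import java.util.LinkedHashSet;
    import java.util.Set;
    import org.infinispan.manager.EmbeddedCacheManager;
    import org.infinispan.registry.InternalCacheRegistry;

    public class HealthCacheNamesExample {
        // Collects the user-visible cache names plus the internal/private ones,
        // so a health check does not miss caches such as the counter caches.
        static Set<String> cachesToCheck(EmbeddedCacheManager cacheManager) {
            Set<String> names = new LinkedHashSet<>(cacheManager.getCacheNames());
            InternalCacheRegistry internalCaches = cacheManager.getGlobalComponentRegistry()
                  .getComponent(InternalCacheRegistry.class);
            names.addAll(internalCaches.getInternalCacheNames());
            return names;
        }
    }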
>>>>> >
>>>>> >>
>>>>> >> [1]
>>>>> >> https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/health/impl/ClusterHealthImpl.java#L22
>>>>> >>
>>>>> >> [2]
>>>>> >> https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/health/impl/CacheHealthImpl.java#L25
>>>>> >>
>>>>> >> [3]
>>>>> >> https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/health/impl/ClusterHealthImpl.java#L23-L24
>>>>>
>>>>
>>>>
>>>
>
>
>