[infinispan-issues] [JBoss JIRA] (ISPN-10238) RemoteCacheManager.stop() hangs if a client thread is waiting for a server response

Tristan Tarrant (Jira) issues at jboss.org
Sun Sep 15 08:48:03 EDT 2019


     [ https://issues.jboss.org/browse/ISPN-10238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tristan Tarrant updated ISPN-10238:
-----------------------------------
    Fix Version/s: 10.0.0.CR3
                       (was: 10.0.0.CR2)


> RemoteCacheManager.stop() hangs if a client thread is waiting for a server response
> -----------------------------------------------------------------------------------
>
>                 Key: ISPN-10238
>                 URL: https://issues.jboss.org/browse/ISPN-10238
>             Project: Infinispan
>          Issue Type: Bug
>          Components: Server, Test Suite - Server
>    Affects Versions: 10.0.0.Beta3, 9.4.14.Final
>            Reporter: Dan Berindei
>            Priority: Major
>              Labels: testsuite_stability
>             Fix For: 10.0.0.CR3
>
>
> One of our integration tests performs a blocking {{RemoteCache.size()}} operation on the thread that completed another asynchronous operation (a {{HotRod-client-async-pool}} thread):
> {code:title=EvictionIT}
>       CompletableFuture res = rc.putAllAsync(entries);
>       res.thenRun(() -> assertEquals(3, rc.size()));
> {code}
> The test then finishes, but doesn't stop the {{RemoteCacheManager}}. When I changed the test to stop the {{RemoteCacheManager}}, the test started hanging:
> {noformat}
> "HotRod-client-async-pool-139-1" #2880 daemon prio=5 os_prio=0 cpu=434.56ms elapsed=1621.24s tid=0x00007f43a6b99800 nid=0x19c0 waiting on condition  [0x00007f42ec9fd000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
> 	at jdk.internal.misc.Unsafe.park(java.base@11.0.3/Native Method)
> 	- parking to wait for  <0x00000000d3321350> (a java.util.concurrent.CompletableFuture$Signaller)
> 	at java.util.concurrent.locks.LockSupport.parkNanos(java.base@11.0.3/LockSupport.java:234)
> 	at java.util.concurrent.CompletableFuture$Signaller.block(java.base@11.0.3/CompletableFuture.java:1798)
> 	at java.util.concurrent.ForkJoinPool.managedBlock(java.base@11.0.3/ForkJoinPool.java:3128)
> 	at java.util.concurrent.CompletableFuture.timedGet(java.base@11.0.3/CompletableFuture.java:1868)
> 	at java.util.concurrent.CompletableFuture.get(java.base@11.0.3/CompletableFuture.java:2021)
> 	at org.infinispan.client.hotrod.impl.Util.await(Util.java:46)
> 	at org.infinispan.client.hotrod.impl.RemoteCacheImpl.size(RemoteCacheImpl.java:307)
> 	at org.infinispan.server.test.eviction.EvictionIT.lambda$testPutAllAsyncEviction$0(EvictionIT.java:73)
> 	at org.infinispan.server.test.eviction.EvictionIT$$Lambda$347/0x000000010074a440.run(Unknown Source)
> 	at java.util.concurrent.CompletableFuture$UniRun.tryFire(java.base@11.0.3/CompletableFuture.java:783)
> 	at java.util.concurrent.CompletableFuture.postComplete(java.base@11.0.3/CompletableFuture.java:506)
> 	at java.util.concurrent.CompletableFuture.complete(java.base@11.0.3/CompletableFuture.java:2073)
> 	at org.infinispan.client.hotrod.impl.operations.HotRodOperation.complete(HotRodOperation.java:162)
> 	at org.infinispan.client.hotrod.impl.operations.PutAllOperation.acceptResponse(PutAllOperation.java:83)
> 	at org.infinispan.client.hotrod.impl.transport.netty.HeaderDecoder.decode(HeaderDecoder.java:144)
> 	at org.infinispan.client.hotrod.impl.transport.netty.HintedReplayingDecoder.callDecode(HintedReplayingDecoder.java:94)
> 	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:278)
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
> 	at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
> 	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434)
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
> 	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965)
> 	at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:799)
> 	at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:421)
> 	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:321)
> 	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:897)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.3/ThreadPoolExecutor.java:1128)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.3/ThreadPoolExecutor.java:628)
> 	at java.lang.Thread.run(java.base@11.0.3/Thread.java:834)
>    Locked ownable synchronizers:
> 	- <0x00000000ca248c30> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> "main" #1 prio=5 os_prio=0 cpu=37300.10ms elapsed=2911.99s tid=0x00007f43a4023000 nid=0x37f7 in Object.wait()  [0x00007f43a9c21000]
>    java.lang.Thread.State: WAITING (on object monitor)
> 	at java.lang.Object.wait(java.base@11.0.3/Native Method)
> 	- waiting on <no object reference available>
> 	at java.lang.Object.wait(java.base@11.0.3/Object.java:328)
> 	at io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:231)
> 	- waiting to re-lock in wait() <0x00000000ca174af8> (a io.netty.util.concurrent.DefaultPromise)
> 	at io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:33)
> 	at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:32)
> 	at org.infinispan.client.hotrod.impl.transport.netty.ChannelFactory.destroy(ChannelFactory.java:216)
> 	at org.infinispan.client.hotrod.RemoteCacheManager.stop(RemoteCacheManager.java:365)
> 	at org.infinispan.client.hotrod.RemoteCacheManager.close(RemoteCacheManager.java:513)
> 	at org.infinispan.commons.junit.ClassResource.lambda$new$0(ClassResource.java:24)
> 	at org.infinispan.commons.junit.ClassResource$$Lambda$286/0x0000000100573040.accept(Unknown Source)
> 	at org.infinispan.commons.junit.ClassResource.after(ClassResource.java:41)
> 	at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:50)
> 	at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> 	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> 	at org.jboss.arquillian.junit.Arquillian.run(Arquillian.java:167)
> 	at org.junit.runners.Suite.runChild(Suite.java:128)
> 	at org.junit.runners.Suite.runChild(Suite.java:27)
> 	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> 	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> 	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> 	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> 	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> 	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> 	at org.apache.maven.surefire.junitcore.JUnitCore.run(JUnitCore.java:55)
> 	at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.createRequestAndRun(JUnitCoreWrapper.java:137)
> 	at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.executeEager(JUnitCoreWrapper.java:107)
> 	at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:83)
> 	at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:75)
> 	at org.apache.maven.surefire.junitcore.JUnitCoreProvider.invoke(JUnitCoreProvider.java:158)
> 	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
> 	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
> 	at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
> 	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
>    Locked ownable synchronizers:
> 	- None
> {noformat}
> {{HotRod-client-async-pool}} threads should never run blocking cache operations, but we need to do more than just change the test:
> * We need an asynchronous alternative to {{RemoteCache.size()}}.
> * Currently, blocking operations like {{size()}} wait up to 1 day for a response from the server; they should use a much shorter (and configurable) timeout.
> * {{RemoteCacheManager.stop()}} should have a timeout as well, but more importantly it should cancel any pending operations.
> * We should consider running all application code on a separate thread pool.
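
The points above can be sketched with plain java.util.concurrent, without the Hot Rod client. Everything in this sketch is an assumption made for illustration: CLIENT_POOL is a single-threaded stand-in for a HotRod-client-async-pool thread, STORE stands in for the remote cache, and putAllAsync/sizeAsync/sizeBlocking are hypothetical helpers (in particular, sizeAsync is the asynchronous alternative the issue asks for, not an existing API). It shows two ways to avoid blocking on the completing thread, and how a bounded wait turns the hang into a quick failure.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class AsyncCallbackSketch {
    // Assumption: one daemon thread that completes every "client" operation.
    static final ExecutorService CLIENT_POOL = daemonPool("client-async-pool");
    // Assumption: a separate pool for application callbacks that may block.
    static final ExecutorService APP_POOL = daemonPool("app-callbacks");
    // Stand-in for the remote cache contents.
    static final Map<String, String> STORE = new ConcurrentHashMap<>();

    static ExecutorService daemonPool(String name) {
        return Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, name);
            t.setDaemon(true);
            return t;
        });
    }

    // Hypothetical stand-in for an async put-all; completes on CLIENT_POOL.
    static CompletableFuture<Void> putAllAsync(Map<String, String> entries) {
        return CompletableFuture.runAsync(() -> STORE.putAll(entries), CLIENT_POOL);
    }

    // Hypothetical asynchronous size(); also served by CLIENT_POOL.
    static CompletableFuture<Integer> sizeAsync() {
        return CompletableFuture.supplyAsync(STORE::size, CLIENT_POOL);
    }

    // Blocking size() with a bounded, caller-supplied wait.
    static int sizeBlocking(long timeout, TimeUnit unit) {
        try {
            return sizeAsync().get(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new CompletionException(e);
        } catch (ExecutionException | TimeoutException e) {
            throw new CompletionException(e);
        }
    }

    // Option 1: chain the follow-up asynchronously; the callback never blocks.
    static CompletableFuture<Integer> putThenSizeComposed(Map<String, String> entries) {
        return putAllAsync(entries).thenCompose(ignored -> sizeAsync());
    }

    // Option 2: keep the blocking call, but run the callback on APP_POOL.
    static CompletableFuture<Integer> putThenSizeOffloaded(Map<String, String> entries) {
        return putAllAsync(entries)
                .thenApplyAsync(ignored -> sizeBlocking(5, TimeUnit.SECONDS), APP_POOL);
    }

    // Anti-pattern: block on CLIENT_POOL for a response that only CLIENT_POOL
    // can deliver. The bounded wait makes it fail after 200 ms instead of hanging.
    static boolean blockingOnClientThreadTimesOut() {
        try {
            CompletableFuture
                    .supplyAsync(() -> sizeBlocking(200, TimeUnit.MILLISECONDS), CLIENT_POOL)
                    .join();
            return false;
        } catch (CompletionException e) {
            return e.getCause() instanceof TimeoutException;
        }
    }

    public static void main(String[] args) {
        Map<String, String> entries = Map.of("k1", "v1", "k2", "v2", "k3", "v3");
        System.out.println("composed size:  " + putThenSizeComposed(entries).join());
        System.out.println("offloaded size: " + putThenSizeOffloaded(entries).join());
        System.out.println("blocking on client thread timed out: "
                + blockingOnClientThreadTimesOut());
    }
}
```

With a single completing thread, the anti-pattern can never succeed: the size task is queued behind the callback that is waiting for it. Composing with thenCompose avoids the wait entirely, while thenApplyAsync with a dedicated executor keeps blocking code off the completing thread. Neither addresses cancelling pending operations on stop, which would need support in the client itself.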


