[infinispan-issues] [JBoss JIRA] (ISPN-5420) Thread pools are depleted by ClusterTopologyManagerImpl.waitForView() and causing deadlock

RH Bugzilla Integration (JIRA) issues at jboss.org
Tue Jun 23 08:25:39 EDT 2015


    [ https://issues.jboss.org/browse/ISPN-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082700#comment-13082700 ] 

RH Bugzilla Integration commented on ISPN-5420:
-----------------------------------------------

Dave Stahl <dstahl at redhat.com> changed the Status of [bug 1208429|https://bugzilla.redhat.com/show_bug.cgi?id=1208429] from VERIFIED to CLOSED

> Thread pools are depleted by ClusterTopologyManagerImpl.waitForView() and causing deadlock
> ------------------------------------------------------------------------------------------
>
>                 Key: ISPN-5420
>                 URL: https://issues.jboss.org/browse/ISPN-5420
>             Project: Infinispan
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 6.0.2.Final, 7.1.1.Final
>            Reporter: Dan Berindei
>            Assignee: Dan Berindei
>            Priority: Critical
>             Fix For: 8.0.0.Alpha1
>
>
> The join process was designed in the idea that a node would start its caches in sequential order, so {{ClusterTopologyManager.waitForView()}} would block at most once for each joining node. However, WildFly actually starts {{2 * Runtime.availableProcessors()}} caches in parallel, and this can be a problem when the machine has a lot of cores and multiple nodes.
> {{ClustertopologyManager.handleClusterView()}} only updates the {{viewId}} after it updated the cache topologies of each cache AND after it confirmed the availability of all the nodes with a {{POLICY_GET_STATUS}} RPC. This RPC can block, and it's very easy for the remote-executor thread pool on the coordinator to become overloades with threads like this:
> {noformat}
> "remote-thread-172" daemon prio=10 tid=0x00007f0cc48c0000 nid=0x28ca4 in Object.wait() [0x00007f0c5f25b000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at org.infinispan.topology.ClusterTopologyManagerImpl.waitForView(ClusterTopologyManagerImpl.java:357)
>         - locked <0x00000000ff3bd900> (a java.lang.Object)
>         at org.infinispan.topology.ClusterTopologyManagerImpl.handleJoin(ClusterTopologyManagerImpl.java:123)
>         at org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:162)
>         at org.infinispan.topology.CacheTopologyControlCommand.perform(CacheTopologyControlCommand.java:144)
>         at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher$4.run(CommandAwareRpcDispatcher.java:276)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


More information about the infinispan-issues mailing list