[infinispan-issues] [JBoss JIRA] (ISPN-5420) Thread pools are depleted by ClusterTopologyManagerImpl.waitForView() and causing deadlock
Dan Berindei (JIRA)
issues at jboss.org
Wed Apr 29 10:58:46 EDT 2015
[ https://issues.jboss.org/browse/ISPN-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dan Berindei updated ISPN-5420:
-------------------------------
Priority: Critical (was: Major)
> Thread pools are depleted by ClusterTopologyManagerImpl.waitForView() and causing deadlock
> ------------------------------------------------------------------------------------------
>
> Key: ISPN-5420
> URL: https://issues.jboss.org/browse/ISPN-5420
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 6.0.2.Final, 7.1.1.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 7.2.0.Final
>
>
> The join process was designed in the idea that a node would start its caches in sequential order, so {{ClusterTopologyManager.waitForView()}} would block at most once for each joining node. However, WildFly actually starts {{2 * Runtime.availableProcessors()}} caches in parallel, and this can be a problem when the machine has a lot of cores and multiple nodes.
> {{ClustertopologyManager.handleClusterView()}} only updates the {{viewId}} after it updated the cache topologies of each cache AND after it confirmed the availability of all the nodes with a {{POLICY_GET_STATUS}} RPC. This RPC can block, and it's very easy for the remote-executor thread pool on the coordinator to become overloades with threads like this:
> {noformat}
> "remote-thread-172" daemon prio=10 tid=0x00007f0cc48c0000 nid=0x28ca4 in Object.wait() [0x00007f0c5f25b000]
> java.lang.Thread.State: TIMED_WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at org.infinispan.topology.ClusterTopologyManagerImpl.waitForView(ClusterTopologyManagerImpl.java:357)
> - locked <0x00000000ff3bd900> (a java.lang.Object)
> at org.infinispan.topology.ClusterTopologyManagerImpl.handleJoin(ClusterTopologyManagerImpl.java:123)
> at org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:162)
> at org.infinispan.topology.CacheTopologyControlCommand.perform(CacheTopologyControlCommand.java:144)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher$4.run(CommandAwareRpcDispatcher.java:276)
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
More information about the infinispan-issues
mailing list