]
RH Bugzilla Integration commented on ISPN-5420:
-----------------------------------------------
Alan Field <afield(a)redhat.com> changed the Status of [bug
Thread pools are depleted by ClusterTopologyManagerImpl.waitForView()
and causing deadlock
------------------------------------------------------------------------------------------
Key: ISPN-5420
URL: https://issues.jboss.org/browse/ISPN-5420
Project: Infinispan
Issue Type: Bug
Components: Core
Affects Versions: 6.0.2.Final, 7.1.1.Final
Reporter: Dan Berindei
Assignee: Dan Berindei
Priority: Critical
Fix For: 8.0.0.Alpha1
The join process was designed on the assumption that a node would start its caches
sequentially, so {{ClusterTopologyManager.waitForView()}} would block at most once for
each joining node. However, WildFly actually starts {{2 * Runtime.availableProcessors()}}
caches in parallel, which becomes a problem when the machines have many cores and the
cluster has multiple nodes.
{{ClusterTopologyManager.handleClusterView()}} only updates the {{viewId}} after it
has updated the cache topologies of each cache AND confirmed the availability of all
the nodes with a {{POLICY_GET_STATUS}} RPC. This RPC can block, and it's very easy for
the remote-executor thread pool on the coordinator to become overloaded with threads like
this:
{noformat}
"remote-thread-172" daemon prio=10 tid=0x00007f0cc48c0000 nid=0x28ca4 in Object.wait() [0x00007f0c5f25b000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at org.infinispan.topology.ClusterTopologyManagerImpl.waitForView(ClusterTopologyManagerImpl.java:357)
        - locked <0x00000000ff3bd900> (a java.lang.Object)
        at org.infinispan.topology.ClusterTopologyManagerImpl.handleJoin(ClusterTopologyManagerImpl.java:123)
        at org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:162)
        at org.infinispan.topology.CacheTopologyControlCommand.perform(CacheTopologyControlCommand.java:144)
        at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher$4.run(CommandAwareRpcDispatcher.java:276)
{noformat}
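The depletion pattern in the stack trace can be reproduced in isolation. The following is a minimal sketch, not Infinispan code: the class {{PoolDepletionSketch}}, its methods and the timeout values are hypothetical names chosen for illustration. It assumes only what the description states, namely that join handlers block in a timed wait for a view id, and that the work which would advance the view id competes for the same bounded thread pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names come from Infinispan.
class PoolDepletionSketch {
    private final Object viewLock = new Object();
    private int viewId = 0;

    // Timed wait on a monitor, similar in shape to the TIMED_WAITING frame above.
    // Returns true if the expected view was installed before the timeout.
    boolean waitForView(int expectedViewId, long timeoutMillis) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        synchronized (viewLock) {
            while (viewId < expectedViewId) {
                long remaining = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (remaining <= 0) {
                    return false;
                }
                viewLock.wait(remaining);
            }
            return true;
        }
    }

    void updateView(int newViewId) {
        synchronized (viewLock) {
            viewId = newViewId;
            viewLock.notifyAll();
        }
    }

    // Submits `joins` blocking handlers, then the view update, to one fixed pool.
    // Returns how many handlers timed out waiting for the view.
    static int timedOutJoins(int poolSize, int joins, long timeoutMillis) throws Exception {
        PoolDepletionSketch sketch = new PoolDepletionSketch();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            Callable<Boolean> join = () -> sketch.waitForView(1, timeoutMillis);
            List<Future<Boolean>> results = new ArrayList<>();
            for (int i = 0; i < joins; i++) {
                results.add(pool.submit(join));
            }
            // The update is queued behind the joins; if every pool thread is
            // blocked in waitForView(), it cannot run until they time out.
            Runnable update = () -> sketch.updateView(1);
            Future<?> updateDone = pool.submit(update);
            int timedOut = 0;
            for (Future<Boolean> result : results) {
                if (!result.get()) {
                    timedOut++;
                }
            }
            updateDone.get();
            return timedOut;
        } finally {
            pool.shutdownNow();
        }
    }
}
```

With a pool of 2 threads and 2 join handlers, both handlers occupy the pool until their timeout expires and the update only runs afterwards. With a pool of 3 threads, the update gets a free thread and neither handler times out. In this sketch the handlers eventually time out; if a timed-out handler were retried on the same pool, the stall could repeat indefinitely, which is one way to read the "deadlock" in the issue title.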