Yong Hao Gao created JBMESSAGING-1912:
-----------------------------------------
Summary: Closing a clustered connection that is undergoing failover can cause deadlock
Key: JBMESSAGING-1912
URL: https://issues.jboss.org/browse/JBMESSAGING-1912
Project: JBoss Messaging
Issue Type: Bug
Components: JMS Clustering
Affects Versions: 1.4.8.SP6, 1.4.0.SP3.CP14
Reporter: Yong Hao Gao
Assignee: Yong Hao Gao
Fix For: 1.4.0.SP3.CP15, 1.4.8.SP7
When a clustered connection is closed while it is in the middle of a failover, a
deadlock may occur. Here are the details:
First, closing the connection causes ConnectionAspect.handleClose() to be invoked. Its
finally block executes the following:
   ConsolidatedRemotingConnectionListener l =
      remotingConnection.removeConnectionListener();
   if (l != null)
   {
      l.clear();
   }
Note that l.clear() is a synchronized method.
If a connection failure occurs just before the above code is reached (for example, the
server node crashes or the network link breaks), JBM detects it and starts the failover
process for the connection. This includes calling FailoverValve2.close() to block any
further calls until failover has finished. The following code is executed:
   while (count > 0)
   {
      try
      {
         wait();
      }
      catch (InterruptedException ignore)
      {
      }
   }
where 'count' is the number of ongoing method invocations. The purpose of the
above code is to wait for ongoing method calls (if any) to finish before closing the valve.
Note that the failover process runs in a separate thread from the connection-closing
thread. By the time it reaches the above code, the failover thread is holding the lock of a
ConsolidatedRemotingConnectionListener object.
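The valve behaviour described above can be sketched roughly as follows. This is a minimal illustration and not the actual FailoverValve2 source: the class name SimpleValve, the methods enter(), leave() and open(), and the 'closed' flag are all hypothetical. Only the wait loop in close() mirrors the snippet quoted in this report, and unlike that snippet the sketch propagates InterruptedException instead of ignoring it.

```java
/** Hypothetical, simplified valve; NOT the real FailoverValve2. */
class SimpleValve {
    private int count;        // number of in-flight invocations
    private boolean closed;   // assumed flag: true while failover is in progress

    /** Called before each invocation; blocks while the valve is closed. */
    public synchronized void enter() throws InterruptedException {
        while (closed) {
            wait();
        }
        count++;
    }

    /** Called after each invocation has finished. */
    public synchronized void leave() {
        count--;
        notifyAll(); // wake a closer that is waiting for count to reach zero
    }

    /** Blocks new callers, then waits for in-flight invocations to drain. */
    public synchronized void close() throws InterruptedException {
        closed = true;
        while (count > 0) {
            wait(); // releases only the valve's monitor, no other lock
        }
    }

    /** Reopens the valve once failover has finished. */
    public synchronized void open() {
        closed = false;
        notifyAll();
    }

    public synchronized int getCount() { return count; }

    public synchronized boolean isClosed() { return closed; }
}
```

The point to notice is that wait() releases only the monitor of the valve itself. Any other lock the calling thread already holds stays held for as long as close() is waiting.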
In this scenario 'count' is not zero, because the connection-closing call has
already incremented it by 1. The failover thread therefore waits for the count to
drop to zero. But because it holds the ConsolidatedRemotingConnectionListener
lock, the connection-closing thread is stuck at clear(), which requires that same
lock. The close can never finish, so the count is never decremented to zero.
In turn, the failover thread waits forever and never releases the
ConsolidatedRemotingConnectionListener lock. The result is a deadlock.
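The interleaving can be reproduced in isolation with a small sketch. Everything in it is hypothetical and stands in for the JBM classes: the 'listener' object plays the role of the ConsolidatedRemotingConnectionListener monitor, the 'valve' object plays FailoverValve2, and two latches force the problematic ordering. So that the example terminates, the failover side uses a bounded wait, whereas the code in this report waits with no timeout.

```java
import java.util.concurrent.CountDownLatch;

/** Hypothetical reproduction of the lock ordering; NOT JBM code. */
class DeadlockSketch {
    private final Object valve = new Object();     // stands in for FailoverValve2
    private final Object listener = new Object();  // stands in for the listener
    private final CountDownLatch counted = new CountDownLatch(1);
    private final CountDownLatch listenerHeld = new CountDownLatch(1);
    private int count;                             // guarded by 'valve'

    /** Failover thread: takes the listener lock, then waits on the valve. */
    boolean failureDetected(long timeoutMillis) throws InterruptedException {
        counted.await();                    // the close call is already in flight
        synchronized (listener) {           // like handleConnectionException()
            listenerHeld.countDown();
            synchronized (valve) {          // like FailoverValve2.close()
                long deadline = System.currentTimeMillis() + timeoutMillis;
                while (count > 0) {
                    long left = deadline - System.currentTimeMillis();
                    if (left <= 0) {
                        return false;       // an unbounded wait() would hang here
                    }
                    valve.wait(left);
                }
                return true;
            }
        }
    }

    /** Closing thread: is counted by the valve, then needs the listener lock. */
    void close() throws InterruptedException {
        synchronized (valve) { count++; }   // invocation passes through the valve
        counted.countDown();
        try {
            listenerHeld.await();           // failure is detected at this moment
        } finally {
            synchronized (listener) {
                // like l.clear(): blocked while the failover thread holds the lock
            }
            synchronized (valve) { count--; valve.notifyAll(); }
        }
    }

    /** Returns true if the valve drained, false if the bounded wait timed out. */
    static boolean run(long timeoutMillis) throws InterruptedException {
        DeadlockSketch s = new DeadlockSketch();
        Thread closer = new Thread(() -> {
            try {
                s.close();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        closer.start();
        boolean drained = s.failureDetected(timeoutMillis);
        closer.join();
        return drained;
    }
}
```

Calling DeadlockSketch.run(200) returns false: the count cannot reach zero while the failover side holds the listener lock, and the closing thread only makes progress after the timed wait gives up and that lock is released.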
Here is the thread dump showing the deadlock:
"Thread-3471" daemon prio=6 tid=0x0d1f6800 nid=0x1aa8 in Object.wait() [0x2d0ef000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:485)
    at org.jboss.jms.client.FailoverValve2.close(FailoverValve2.java:145)
    - locked <0xa9733128> (a org.jboss.jms.client.FailoverValve2)
    at org.jboss.jms.client.FailoverCommandCenter.failureDetected(FailoverCommandCenter.java:92)
    at org.jboss.jms.client.container.ConnectionFailureListener.handleConnectionException(ConnectionFailureListener.java:62)
    at org.jboss.jms.client.remoting.ConsolidatedRemotingConnectionListener.handleConnectionException(ConsolidatedRemotingConnectionListener.java:84)
    - locked <0xa9707f50> (a org.jboss.jms.client.remoting.ConsolidatedRemotingConnectionListener)
    at org.jboss.remoting.ConnectionValidator$3.run(ConnectionValidator.java:524)
   Locked ownable synchronizers:
    - None
"WorkManager(2)-32" daemon prio=6 tid=0x0bdd7c00 nid=0xfbc waiting for monitor entry [0x3166f000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.jboss.jms.client.remoting.ConsolidatedRemotingConnectionListener.clear(ConsolidatedRemotingConnectionListener.java:185)
    - waiting to lock <0xa9707f50> (a org.jboss.jms.client.remoting.ConsolidatedRemotingConnectionListener)
    at org.jboss.jms.client.container.ConnectionAspect.handleClose(ConnectionAspect.java:186)
    at sun.reflect.GeneratedMethodAccessor775.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.jboss.aop.advice.PerInstanceAdvice.invoke(PerInstanceAdvice.java:122)
    at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:102)
    at org.jboss.jms.client.container.FailoverValveInterceptor.invoke(FailoverValveInterceptor.java:114)
    at org.jboss.aop.advice.PerInstanceInterceptor.invoke(PerInstanceInterceptor.java:86)
    at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:102)
    at org.jboss.jms.client.container.ClosedInterceptor.invoke(ClosedInterceptor.java:172)
    at org.jboss.aop.advice.PerInstanceInterceptor.invoke(PerInstanceInterceptor.java:86)
    at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:102)
    at org.jboss.jms.client.delegate.ClientConnectionDelegate.close(ClientConnectionDelegate.java)
    at org.jboss.jms.client.JBossConnection.close(JBossConnection.java:132)