Yong Hao Gao created JBMESSAGING-1918:
-----------------------------------------
Summary: Failover may take a long time blocking on sucker connection retry
Key: JBMESSAGING-1918
URL: https://issues.jboss.org/browse/JBMESSAGING-1918
Project: JBoss Messaging
Issue Type: Bug
Components: JMS Clustering
Affects Versions: 1.4.8.SP6, 1.4.0.SP3.CP14
Reporter: Yong Hao Gao
Assignee: Yong Hao Gao
Fix For: 1.4.0.SP3.CP15, 1.4.8.SP7
When a node leaves the cluster, its failover node detects that the sucker connection is
broken and tries to clean up and retry. The cleanup and retry methods may take a long time
to finish because the connection is already broken and the cleanup still attempts remote
calls (which eventually fail). The methods
org.jboss.messaging.core.impl.clusterconnection.ClusterConnectionManager.ConnectionInfo.cleanupConnection()
and
org.jboss.messaging.core.impl.clusterconnection.ClusterConnectionManager.ConnectionInfo.retryConnection()
are both synchronized on ConnectionInfo.
At the same time, the failover node performs failover for the departed node, which
includes removing suckers from the node. The method
org.jboss.messaging.core.impl.clusterconnection.ClusterConnectionManager.ConnectionInfo.removeSucker(String)
synchronizes on ConnectionInfo too.
As a result, removeSucker() can wait a long time for the lock while cleanupConnection() or
retryConnection() holds it, making the failover process very slow in some situations.
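
The contention described above can be reproduced in isolation. The following is a minimal sketch, not the actual JBoss Messaging code: the nested ConnectionInfo class is a hypothetical stand-in for ClusterConnectionManager.ConnectionInfo, Thread.sleep() stands in for the remote calls that are slow to fail on a broken connection, and the helper measureRemoveSuckerWaitMillis() and the queue name are invented for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class SuckerLockContention {

    /** Hypothetical stand-in for ClusterConnectionManager.ConnectionInfo. */
    static class ConnectionInfo {
        private final long cleanupMillis;
        private final CountDownLatch cleanupStarted = new CountDownLatch(1);

        ConnectionInfo(long cleanupMillis) {
            this.cleanupMillis = cleanupMillis;
        }

        /** Holds the ConnectionInfo monitor for the whole (slow) cleanup. */
        synchronized void cleanupConnection() throws InterruptedException {
            cleanupStarted.countDown();  // signal that the monitor is now held
            Thread.sleep(cleanupMillis); // assumption: simulates remote calls that are slow to fail
        }

        /** Fast local work, but it needs the same monitor as cleanupConnection(). */
        synchronized void removeSucker(String queueName) {
            // Local bookkeeping only; nothing slow happens here.
        }

        void awaitCleanupStarted() throws InterruptedException {
            cleanupStarted.await();
        }
    }

    /** Returns how long removeSucker() was blocked behind cleanupConnection(). */
    static long measureRemoveSuckerWaitMillis(long cleanupMillis) {
        ConnectionInfo info = new ConnectionInfo(cleanupMillis);
        Thread cleaner = new Thread(() -> {
            try {
                info.cleanupConnection();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        cleaner.start();
        try {
            info.awaitCleanupStarted();         // the cleanup thread owns the lock from here on
            long start = System.nanoTime();
            info.removeSucker("queue.example"); // the failover path blocks on the same lock
            long waited = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            cleaner.join();
            return waited;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        long waited = measureRemoveSuckerWaitMillis(500);
        System.out.println("removeSucker() blocked for about " + waited + " ms");
    }
}
```

In this sketch removeSucker() does no slow work of its own, yet it cannot return until the simulated cleanup releases the monitor, which is the delay the failover path sees.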