JMS client hangs at FailoverValve2.close() when database hangs
--------------------------------------------------------------
Key: JBMESSAGING-1856
URL: https://issues.jboss.org/browse/JBMESSAGING-1856
Project: JBoss Messaging
Issue Type: Bug
Components: JMS Clustering
Affects Versions: 1.4.7.GA
Reporter: Takayoshi Kimura
The JMS client hangs in the following situation:
* Clustered setup (JMS remote client -> clustered JMS servers -> database)
* The client looks up ClusteredConnectionFactory, consumes a message in a transacted session,
and then calls session.rollback()
* From this point on, the database hangs for a long time
* The client calls conn.close() but gets no response from the server because of the database hang
* The client gets a socket timeout while the database is still hung
* The client hangs in FailoverValve2#close()
The JMS client socket timeout is 300 sec, and the PostgreSQL JDBC connection timeout is about
15 min by default.
If the JMS server detects the database failure and returns a response to the client before the
client times out, the client does not hang. So the workaround is to configure database failure
detection properly at the JDBC level (-ds.xml) to avoid a long hang.
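As a rough illustration of the workaround, the fragment below sets a socket read timeout on the PostgreSQL JDBC connection so that a hung database surfaces as an error on the server before the 300 sec client timeout expires. It is only a sketch under assumptions: the JNDI name, host, and database name are hypothetical placeholders, the datasource type is assumed to be local-tx-datasource, and 60 is an arbitrary value chosen to be well below 300. socketTimeout is a PostgreSQL JDBC driver connection parameter given in seconds.

```xml
<!-- Hypothetical postgres-ds.xml fragment; names and values are placeholders. -->
<datasources>
  <local-tx-datasource>
    <jndi-name>DefaultDS</jndi-name>
    <!-- socketTimeout (seconds) makes blocked socket reads fail instead of waiting indefinitely -->
    <connection-url>jdbc:postgresql://dbhost:5432/messaging?socketTimeout=60</connection-url>
    <driver-class>org.postgresql.Driver</driver-class>
  </local-tx-datasource>
</datasources>
```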
1. Set up a clustered JBoss Messaging service (JBoss EAP 5.1.0, PostgreSQL)
2. Deploy testDistributedQueue-service.xml
3. Boot both nodes
4. Execute $ sh run-client.sh "server:1099"
5. Pull out the cable after the "Sleep 20 sec, pull out a network cable to the
database server!" message appears
6. Wait 300 sec for the remoting socket timeout
7. Get a thread dump with kill -3 or jconsole and confirm that the client hangs
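For step 7, a small filter can make the hung thread easier to spot in a large dump. The following is a minimal sketch, not part of the reproducer: threads_in_valve is a hypothetical helper name, and it assumes the dump separates threads with blank lines, as jstack and kill -3 output normally do.

```shell
# Hypothetical helper: read a thread dump on stdin and print only the
# thread entries whose stack mentions FailoverValve2.
threads_in_valve() {
  # RS="" puts awk in paragraph mode, so each blank-line-separated
  # thread entry is treated as one record.
  awk 'BEGIN { RS = ""; ORS = "\n\n" } /FailoverValve2/'
}
```

Usage, assuming a JDK jstack is on the PATH and CLIENT_PID holds the PID of the client JVM: jstack "$CLIENT_PID" | threads_in_valve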
You can use the iptables command to emulate step 5:
$ sudo /sbin/iptables -A INPUT -p tcp --dport 5432 -j DROP && sudo /sbin/iptables -A OUTPUT -p tcp --dport 5432 -j DROP
5432 is the port number used by PostgreSQL.
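To restore connectivity after the test, the same rule specifications can be deleted with -D. This is a sketch that assumes the two DROP rules were appended exactly as shown above. unblock_db, DB_PORT, and IPTABLES are hypothetical names introduced here for illustration.

```shell
# Port the database listens on (5432 for PostgreSQL, as above).
DB_PORT=${DB_PORT:-5432}
# Command used to invoke iptables; override it (for example with
# IPTABLES="echo iptables") to print the commands instead of running them.
IPTABLES=${IPTABLES:-"sudo /sbin/iptables"}

# Delete the two DROP rules that were appended with -A.
unblock_db() {
  $IPTABLES -D INPUT  -p tcp --dport "$DB_PORT" -j DROP &&
  $IPTABLES -D OUTPUT -p tcp --dport "$DB_PORT" -j DROP
}
```

Run unblock_db once the thread dump has been taken. Listing the rules with sudo /sbin/iptables -L -n should then no longer show the DROP entries for port 5432.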