[ https://issues.jboss.org/browse/JBMESSAGING-1456?page=com.atlassian.jira.... ]
Jim Lebonitte commented on JBMESSAGING-1456:
--------------------------------------------
From reading through the comments, it doesn't seem like this was ever resolved,
but we are definitely seeing this issue in JBoss AS 6.1 with 3 clustered nodes.
It works fine for a week or so, then we start noticing that messages are missing.
When we look in the JMX console, we see thousands of messages being delivered and
exactly the same number in the queue, and the count never goes down. When we bounce
the servers, the message beans then consume the persisted messages, because once the
servers come back up the messages are no longer in the being-delivered state.
Should I open a new issue? The problem is that we don't really know how to re-create
it, but it happens regularly in our system. It seems to have started when we upgraded
from JBoss 4 to JBoss 6. We weren't using JMS clustering in JBoss 4 and we would like
to, but for now we are going to try shutting it off to see if it helps.
Messages stuck in being-delivered state in cluster
--------------------------------------------------
Key: JBMESSAGING-1456
URL: https://issues.jboss.org/browse/JBMESSAGING-1456
Project: JBoss Messaging
Issue Type: Bug
Affects Versions: 1.4.0.SP3_CP03, 1.4.0.SP3.CP07
Reporter: Justin Bertram
Assignee: Yong Hao Gao
Priority: Blocker
Fix For: 1.4.0.SP3.CP08, 1.4.4.GA
Attachments: DeliveringCount.png, kill3_thread_dump.txt, logs-and-config.zip,
MessageStucked.png, RemoveAllMessagesException.png, test-1456-jars.zip, thread_dump.txt
Messages become "stuck" in being-delivered state when clients use a clustered
XA connection factory in a cluster of at least 2 nodes.
JBoss setup:
-2 nodes of JBoss EAP 4.3 CP02
-commented out "ClusterPullConnectionFactory" in messaging-service.xml to
prevent message redistribution and eliminate the "message suckers" as the
potential culprit
-MySQL backend using the default mysql-persistence-service.xml (from
<JBOSS_HOME>/docs/examples/jms)
Client setup:
-both nodes have a client which is a separate process (i.e. not inside JBoss)
-clients are Spring based
-one client produces and consumes, the other client just consumes
-both clients use the ClusteredXAConnectionFactory from the default
connection-factories-service.xml
-both clients publish to and consume from "queue/testDistributedQueue"
-clients are configured to send persistent messages, use AUTO_ACKNOWLEDGE, and
transacted sessions
Symptoms of the issue:
-when running the clients I watch the JMX-Console for the
"queue/testDistributedQueue"
-as the consumers pull messages off the queue I can see the MessageCount and
DeliveringCount go to 0 every so often
-after a period of time (usually a few hours) the MessageCount and DeliveringCount
never go back to 0
-I "kill" the clients and wait for the DeliveringCount to go to 0, but it
never does
-after the clients are killed the ConsumerCount for the queue will drop, but never to 0
when messages are "stuck"
-a thread dump reveals at least one JBM server session that is apparently stuck (it
never goes away) - ostensibly this is the consumer that is showing in the JMX-Console for
"queue/testDistributedQueue"
-a "killall -3 java" doesn't produce anything from the clients so I know
they're dead
-nothing is in any DLQ or expiry queue
-the database contains as many rows in the JBM_MSG and JBM_MSG_REF tables as the
DeliveringCount in the JMX-Console
-rebooting the node with the stuck messages frees the messages to be consumed (i.e.
un-sticks them)
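
The symptom described above (a DeliveringCount that stays frozen above 0 once the
clients are gone) can be turned into a simple check over periodic readings of the
queue attributes. The following is only a minimal sketch, not part of JBoss Messaging:
the class StuckQueueCheck, the Sample holder, the looksStuck method, and the window
parameter are all hypothetical names, and the readings in main are made-up values,
not data from this report. How the readings are collected (for example by polling
the queue MBean) is left out.

```java
import java.util.Arrays;
import java.util.List;

class StuckQueueCheck {

    /** Hypothetical holder for one reading of the queue's JMX attributes. */
    static final class Sample {
        final int messageCount;
        final int deliveringCount;

        Sample(int messageCount, int deliveringCount) {
            this.messageCount = messageCount;
            this.deliveringCount = deliveringCount;
        }
    }

    /**
     * Returns true when the last `window` samples all show the same
     * non-zero DeliveringCount, i.e. the count is frozen above zero.
     * This is a heuristic: a busy but healthy queue can also stay above
     * zero for a while, so it is most meaningful once clients are stopped.
     */
    static boolean looksStuck(List<Sample> samples, int window) {
        if (window < 2 || samples.size() < window) {
            return false; // not enough readings to judge
        }
        List<Sample> recent = samples.subList(samples.size() - window, samples.size());
        int first = recent.get(0).deliveringCount;
        if (first == 0) {
            return false; // the queue drained, nothing is stuck
        }
        for (Sample s : recent) {
            if (s.deliveringCount != first) {
                return false; // the count is still moving
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Illustrative readings only.
        List<Sample> healthy = Arrays.asList(
            new Sample(120, 40), new Sample(35, 10), new Sample(0, 0));
        List<Sample> frozen = Arrays.asList(
            new Sample(900, 300), new Sample(2500, 2500),
            new Sample(2500, 2500), new Sample(2500, 2500));

        System.out.println("healthy looks stuck: " + looksStuck(healthy, 3));
        System.out.println("frozen looks stuck: " + looksStuck(frozen, 3));
    }
}
```

A larger window reduces false alarms at the cost of slower detection; a check like
this only flags the condition and does nothing to free the messages.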
Other notes:
-nothing else is happening on either node but running the client and running JBoss
-this only appears to happen when a clustered connection factory is used. I tested
using a normal connection factory and after 24 hours couldn't reproduce a stuck
message.