[jboss-jira] [JBoss JIRA] Commented: (JBMESSAGING-1631) Messages are piling up in the queues in clustered environment and not pulled by message sucker
Victor Starenky (JIRA)
jira-events at lists.jboss.org
Thu Jun 11 09:32:56 EDT 2009
[ https://jira.jboss.org/jira/browse/JBMESSAGING-1631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12471684#action_12471684 ]
Victor Starenky commented on JBMESSAGING-1631:
----------------------------------------------
We've switched to the latest messaging jars, which include the fix for JBMESSAGING-1456, and also updated all other required libraries, including remoting (to version 2.2.3, as per the documentation).
Still, this morning we saw one node piling up topic messages. As far as I can tell, we mostly experience this issue with topics: one or more nodes stop receiving messages, and in the JMX console we see thousands of them sitting on these nodes.
As I mentioned before, we're pushing the servers to their limits overnight, so the timeouts are more likely caused by excessive load than by the network.
> Messages are piling up in the queues in clustered environment and not pulled by message sucker
> ----------------------------------------------------------------------------------------------
>
> Key: JBMESSAGING-1631
> URL: https://jira.jboss.org/jira/browse/JBMESSAGING-1631
> Project: JBoss Messaging
> Issue Type: Bug
> Components: JMS Clustering
> Affects Versions: 1.4.0.SP3.CP08
> Environment: Cluster of a few JBoss 4.2.3 servers with JBM 1.4.2.GA-SP1 or 1.4.0.SP3_CP08 running on x64 windows servers.
> JBoss Remoting 2.5.1 or 2.2.3
> Clustered XA connection factory, clustered queues, both server and client (MDBs) are deployed as one application - identical deployment for all servers in a cluster
> Reporter: Victor Starenky
> Assignee: Howard Gao
> Fix For: 1.4.0.SP3.CP09, 1.4.5.GA
>
> Attachments: logs-and-config-prod.zip, logs-and-config-test.zip, TestRecommendationSentProcessorMDB.java
>
>
> We have an EJB3 application running on a cluster of 7 servers.
> All servers have the same application farmed across them.
> We have a clustered queue and are using a clustered connection factory.
> Originally we ran into the bug JBMESSAGING-1456, and while testing the fix we ran into a different problem. After the cluster has been running for some time (usually overnight, with lots of heavy background processing happening at that time), we see messages "piling up" in some queues on some nodes.
> Symptoms are:
> MessageCount is not zero (rather, on the order of hundreds) and DeliveringCount is zero. These nodes have ConsumerCount=0 for the queues experiencing the problem. The message sucker is configured, as far as I can tell. It looks like the problem might be related to client and/or server failover leaving some nodes without consumers while the sucker is not doing its job (if I understand it correctly).
> Once we raise the timeout values well above those in the original config files, the problem seems to disappear, or at least shows up much less frequently. Specifically, I'm talking about these values:
> messaging-service.xml:
> <attribute name="FailoverStartTimeout">180000</attribute>
> remoting-bisocket-service.xml:
> <attribute name="clientLeasePeriod" isParam="true">30000</attribute>
> <attribute name="validatorPingPeriod" isParam="true">30000</attribute>
> <attribute name="validatorPingTimeout" isParam="true">20000</attribute>
> <attribute name="registerCallbackListener">false</attribute>
> <attribute name="timeout" isParam="true">240000</attribute>
> While this serves as a temporary workaround, we don't feel we can rely on JBM in a production clustered environment without failover working properly.
> This is a big showstopper for us at the moment.
> Attached are the log files from the prod environment of 7 servers with the config files as well as log files from the test environment with just 2 servers (having same issue) and corresponding configs. The logs were produced by the test version of the messaging code with added logging as per JBMESSAGING-1456.
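
The symptom pattern in the description above (MessageCount above zero while DeliveringCount and ConsumerCount are both zero) can be written down as a small check. The Java below is only a sketch under stated assumptions: the names StuckQueueCheck, QueueStats, looksStuck and stuckNodes, as well as the node names and counter values in main, are hypothetical and made up for illustration. It does not call any JBoss Messaging or JMX API; the three counters are assumed to have been read from the JMX console by some other means.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StuckQueueCheck {

    /** Hypothetical snapshot of the three counters for one queue on one node. */
    static final class QueueStats {
        final int messageCount;
        final int deliveringCount;
        final int consumerCount;

        QueueStats(int messageCount, int deliveringCount, int consumerCount) {
            this.messageCount = messageCount;
            this.deliveringCount = deliveringCount;
            this.consumerCount = consumerCount;
        }
    }

    /** True when messages are waiting but nothing is delivering or consuming them. */
    static boolean looksStuck(QueueStats s) {
        return s.messageCount > 0 && s.deliveringCount == 0 && s.consumerCount == 0;
    }

    /** Returns the nodes whose counters match the symptom pattern, in map order. */
    static List<String> stuckNodes(Map<String, QueueStats> statsByNode) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, QueueStats> e : statsByNode.entrySet()) {
            if (looksStuck(e.getValue())) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample values; real numbers would come from the JMX console.
        Map<String, QueueStats> stats = new LinkedHashMap<>();
        stats.put("node1", new QueueStats(350, 0, 0)); // matches the symptoms
        stats.put("node2", new QueueStats(0, 0, 3));   // empty queue, has consumers
        stats.put("node3", new QueueStats(12, 12, 2)); // busy but delivering
        System.out.println(stuckNodes(stats)); // prints [node1]
    }
}
```

A check like this only flags the condition; it says nothing about why the consumers disappeared or why the message sucker did not pull the messages.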