[JBoss JIRA] Created: (JBMESSAGING-1631) Messages are piling up in the queues in clustered environment and not pulled by message sucker
by Victor Starenky (JIRA)
Messages are piling up in the queues in clustered environment and not pulled by message sucker
----------------------------------------------------------------------------------------------
Key: JBMESSAGING-1631
URL: https://jira.jboss.org/jira/browse/JBMESSAGING-1631
Project: JBoss Messaging
Issue Type: Bug
Components: JMS Clustering
Affects Versions: 1.4.0.SP3.CP08
Environment: Cluster of a few JBoss 4.2.3 servers with JBM 1.4.2.GA-SP1 or 1.4.0.SP3_CP08 running on x64 Windows servers.
JBoss Remoting 2.5.1 or 2.2.3
Clustered XA connection factory, clustered queues, both server and client (MDBs) are deployed as one application - identical deployment for all servers in a cluster
Reporter: Victor Starenky
Assignee: Tim Fox
We have an EJB3 application running on a cluster of 7 servers.
All servers have the same application farmed across them.
We have a clustered queue and are using a clustered connection factory.
Originally we ran into the bug JBMESSAGING-1456, and while testing the fix we ran into a different problem. After the cluster has been running for some time (usually overnight, with lots of heavy background processing happening at that time) we see messages "piling up" in some queues on some nodes.
Symptoms are:
MessageCount is not zero (rather on the order of hundreds), DeliveringCount is zero. These nodes have ConsumerCount=0 for the queues experiencing the problem. The message sucker is configured as far as I can tell. It looks like the problem might be related to client and/or server failover leaving some nodes without consumers while the sucker is not doing its job (if I understand it correctly).
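For anyone who wants to watch for this state, the symptom pattern above can be written as a simple check. This is only a rough sketch: the class and method names (StuckQueueCheck, looksStuck) are made up for illustration, and reading the actual MessageCount, DeliveringCount and ConsumerCount values from the queue MBeans is left out entirely.

```java
// Hypothetical helper, not part of JBoss Messaging.
class StuckQueueCheck {

    /**
     * Returns true when a queue on one node matches the symptoms described
     * above: messages are waiting, none are being delivered, and the node
     * has no local consumers for the queue.
     */
    static boolean looksStuck(int messageCount, int deliveringCount, int consumerCount) {
        return messageCount > 0 && deliveringCount == 0 && consumerCount == 0;
    }

    public static void main(String[] args) {
        // Example values only; real values would come from the queue's attributes.
        System.out.println(looksStuck(300, 0, 0)); // matches the reported symptoms
        System.out.println(looksStuck(300, 5, 2)); // messages are being consumed
        System.out.println(looksStuck(0, 0, 0));   // empty queue, nothing to pull
    }
}
```

A node that matches this check for a short moment is not necessarily broken; in our case the condition persists until the node is restarted or the timeouts are raised.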
Once we raise the timeout values well above those in the original config files, the problem seems to disappear or at least shows up much less frequently. Specifically, I'm talking about these values:
messaging-service.xml:
<attribute name="FailoverStartTimeout">180000</attribute>
remoting-bisocket-service.xml:
<attribute name="clientLeasePeriod" isParam="true">30000</attribute>
<attribute name="validatorPingPeriod" isParam="true">30000</attribute>
<attribute name="validatorPingTimeout" isParam="true">20000</attribute>
<attribute name="registerCallbackListener">false</attribute>
<attribute name="timeout" isParam="true">240000</attribute>
While this serves as a temporary workaround, we don't feel we can rely on JBM in a production clustered environment without failover working properly.
This is a big showstopper for us at the moment.
Attached are the log files from the prod environment of 7 servers with the config files as well as log files from the test environment with just 2 servers (having same issue) and corresponding configs. The logs were produced by the test version of the messaging code with added logging as per JBMESSAGING-1456.
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://jira.jboss.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
12 years, 3 months
[JBoss JIRA] Created: (JBMESSAGING-1621) Provide complete message ordering in a clustered environment
by Yusuke Yamamoto (JIRA)
Provide complete message ordering in a clustered environment
------------------------------------------------------------
Key: JBMESSAGING-1621
URL: https://jira.jboss.org/jira/browse/JBMESSAGING-1621
Project: JBoss Messaging
Issue Type: Feature Request
Reporter: Yusuke Yamamoto
Assignee: Tim Fox
While JBMESSAGING-1416 covers a message ordering feature in a singleton configuration, the customer is also requesting a complete message ordering feature that is applicable in a clustered environment.
Here's the background:
Where JMS is used, both performance and scalability are priorities.
A scalable JMS implementation conflicts with complete message ordering, since order-aware messages cannot be processed concurrently.
But order-unaware queues should still be processed in parallel for better throughput.
Additionally, there are some cases where no downtime is acceptable, but HA-Singleton requires a certain amount of time to fail over the service to another node.
12 years, 3 months