[jboss-user] [Clustering/JBoss] - Cluster Recoverability Issues...

lance.hankins do-not-reply at jboss.com
Fri Sep 21 15:33:02 EDT 2007


Hi Guys,

We're seeing some cluster recoverability issues. We're using JBoss 4.0.5 in a clustered configuration (just a tw

For the most part, everything works great.

A while back, a Quartz job caused an OutOfMemoryError on one node of the cluster, after which the whole cluster fell apart.

To try to reproduce this situation, I've created an admin-only URL that lets me trigger one of the following two things on a single node of the cluster:

  1) call System.exit
  2) start a Quartz job that purposely runs the node out of memory.
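
For anyone who wants to try the same thing, here is a minimal sketch of what the two triggers could look like in plain Java, without any servlet or Quartz wiring. This is not the actual code behind the admin URL: the class and method names (FailureInjector, killJvm, exhaustHeap) and the capBytes safety limit are hypothetical. One thing worth keeping in mind is that System.exit still runs the JVM's shutdown hooks, so scenario 1 is a fairly orderly death compared with a node that is struggling on a full heap.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical failure-injection helpers; all names are illustrative only. */
final class FailureInjector {

    /** Scenario 1: stop the JVM. System.exit runs shutdown hooks before exiting. */
    static void killJvm(int status) {
        System.exit(status);
    }

    /**
     * Scenario 2: keep allocating byte arrays and hold on to them.
     * capBytes is a safety limit for dry runs; pass Long.MAX_VALUE to keep
     * allocating until the JVM throws OutOfMemoryError.
     * Returns the number of bytes allocated once the cap is reached.
     */
    static long exhaustHeap(int chunkBytes, long capBytes) {
        if (chunkBytes <= 0) {
            throw new IllegalArgumentException("chunkBytes must be positive");
        }
        // Strong references keep the garbage collector from reclaiming the chunks.
        List<byte[]> hoard = new ArrayList<byte[]>();
        long allocated = 0;
        while (capBytes - allocated >= chunkBytes) {
            hoard.add(new byte[chunkBytes]);
            allocated += chunkBytes;
        }
        return allocated;
    }

    public static void main(String[] args) {
        // Dry run: allocate 16 MB in 1 MB chunks, then return normally.
        long bytes = exhaustHeap(1 << 20, 16L << 20);
        System.out.println("allocated " + bytes + " bytes");
    }
}
```

The cap only exists so the sketch can be exercised without taking a JVM down; a real reproduction of scenario 2 would run uncapped inside the job.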

I just ran a small test with scenario #1, and I can reproduce the problem. Basically, I hit the URL on node1, causing the JVM to exit. After that, node2 is still up, but the application is hamstrung (we get exceptions on any operation that touches JMS).

I've watched the logs on node2 when I cause node1 to die, and I do see the JMS queues, etc. migrate from node1 (now dead) to node2. Here are the migration-related log messages (immediately after node1 has died):


  | 2007-09-21 11:58:20,703 [JMSContainerInvoker(ReportServicePreExecutionMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Waiting for reconnect internal 10000ms for ReportServicePreExecutionMdb
  | 2007-09-21 11:58:20,703 [JMSContainerInvoker(ReportServicePostExecutionMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Waiting for reconnect internal 10000ms for ReportServicePostExecutionMdb
  | 2007-09-21 11:58:20,703 [JMSContainerInvoker(ReportServiceDownloadMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Waiting for reconnect internal 10000ms for ReportServiceDownloadMdb
  | 2007-09-21 11:58:20,703 [JMSContainerInvoker(DeliveryServiceMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Waiting for reconnect internal 10000ms for DeliveryServiceMdb
  | 2007-09-21 11:58:20,703 [JMSContainerInvoker(ApplicationEventsMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Waiting for reconnect internal 10000ms for ApplicationEventsMdb
  | 2007-09-21 11:58:22,140 [MessageDispatcher up processing thread] [] [] INFO  [org.jboss.ha.framework.interfaces.HAPartition.lifecycle.focus-rcl-cluster] New cluster view for partition focus-rcl-cluster (id: 2, delta: -1) : [10.10.11.14:1199]
  | 2007-09-21 11:58:22,156 [AsynchViewChangeHandler Thread] [] [] INFO  [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.focus-rcl-cluster] I am (10.10.11.14:1199) received membershipChanged event:
  | 2007-09-21 11:58:22,156 [AsynchViewChangeHandler Thread] [] [] INFO  [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.focus-rcl-cluster] Dead members: 1 ([10.10.11.13:1199])
  | 2007-09-21 11:58:22,156 [AsynchViewChangeHandler Thread] [] [] INFO  [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.focus-rcl-cluster] New Members : 0 ([])
  | 2007-09-21 11:58:22,156 [AsynchViewChangeHandler Thread] [] [] INFO  [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.focus-rcl-cluster] All Members : 1 ([10.10.11.14:1199])
  | 2007-09-21 11:58:22,656 [AsynchKeyChangeHandler Thread] [] [] INFO  [org.jboss.web.tomcat.tc5.TomcatDeployer] deploy, ctxPath=/jbossmq-httpil, warUrl=.../deploy-hasingleton/jms/jbossmq-httpil.sar/jbossmq-httpil.war/
  | 2007-09-21 11:58:23,672 [AsynchKeyChangeHandler Thread] [] [] INFO  [org.jboss.mq.il.uil2.UILServerILService] JBossMQ UIL service available at : /0.0.0.0:8193
  | 2007-09-21 11:58:23,703 [AsynchKeyChangeHandler Thread] [] [] INFO  [org.jboss.mq.server.jmx.Queue.DLQ] Bound to JNDI name: queue/DLQ
  | 2007-09-21 11:58:23,719 [AsynchKeyChangeHandler Thread] [] [] INFO  [org.jboss.mq.server.jmx.Queue.rcl/reportServicePreExecuteQueue] Bound to JNDI name: queue/rcl/reportServicePreExecuteQueue
  | 2007-09-21 11:58:23,719 [AsynchKeyChangeHandler Thread] [] [] INFO  [org.jboss.mq.server.jmx.Queue.rcl/reportServiceExecuteQueue] Bound to JNDI name: queue/rcl/reportServiceExecuteQueue
  | 2007-09-21 11:58:23,719 [AsynchKeyChangeHandler Thread] [] [] INFO  [org.jboss.mq.server.jmx.Queue.rcl/reportServiceDownloadQueue] Bound to JNDI name: queue/rcl/reportServiceDownloadQueue
  | 2007-09-21 11:58:23,719 [AsynchKeyChangeHandler Thread] [] [] INFO  [org.jboss.mq.server.jmx.Queue.rcl/reportServicePostExecuteQueue] Bound to JNDI name: queue/rcl/reportServicePostExecuteQueue
  | 2007-09-21 11:58:23,734 [AsynchKeyChangeHandler Thread] [] [] INFO  [org.jboss.mq.server.jmx.Queue.rcl/deliveryServiceQueue] Bound to JNDI name: queue/rcl/deliveryServiceQueue
  | 2007-09-21 11:58:23,750 [AsynchKeyChangeHandler Thread] [] [] INFO  [org.jboss.mq.server.jmx.Topic.rcl/events/reportEventsTopic] Bound to JNDI name: topic/rcl/events/reportEventsTopic
  | 2007-09-21 11:58:23,750 [AsynchKeyChangeHandler Thread] [] [] INFO  [org.jboss.mq.server.jmx.Topic.rcl/events/applicationEventsTopic] Bound to JNDI name: topic/rcl/events/applicationEventsTopic
  | 2007-09-21 11:58:25,297 [MessageDispatcher up processing thread] [] [] INFO  [org.jboss.cache.TreeCache] viewAccepted(): [magnum:3542|2] [magnum:3542]
  | 2007-09-21 11:58:30,703 [JMSContainerInvoker(ReportServiceExecutionMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Trying to reconnect to JMS provider for ReportServiceExecutionMdb
  | 2007-09-21 11:58:30,734 [JMSContainerInvoker(DeliveryServiceMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Trying to reconnect to JMS provider for DeliveryServiceMdb
  | 2007-09-21 11:58:30,734 [JMSContainerInvoker(ReportServiceDownloadMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Trying to reconnect to JMS provider for ReportServiceDownloadMdb
  | 2007-09-21 11:58:30,734 [JMSContainerInvoker(ReportServicePreExecutionMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Trying to reconnect to JMS provider for ReportServicePreExecutionMdb
  | 2007-09-21 11:58:30,734 [JMSContainerInvoker(ApplicationEventsMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Trying to reconnect to JMS provider for ApplicationEventsMdb
  | 2007-09-21 11:58:30,750 [JMSContainerInvoker(ReportServicePostExecutionMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Trying to reconnect to JMS provider for ReportServicePostExecutionMdb
  | 2007-09-21 11:58:30,828 [JMSContainerInvoker(ReportServiceExecutionMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Reconnected to JMS provider for ReportServiceExecutionMdb
  | 2007-09-21 11:58:30,844 [JMSContainerInvoker(ReportServicePostExecutionMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Reconnected to JMS provider for ReportServicePostExecutionMdb
  | 2007-09-21 11:58:30,859 [JMSContainerInvoker(ApplicationEventsMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Reconnected to JMS provider for ApplicationEventsMdb
  | 2007-09-21 11:58:30,859 [JMSContainerInvoker(DeliveryServiceMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Reconnected to JMS provider for DeliveryServiceMdb
  | 2007-09-21 11:58:30,859 [JMSContainerInvoker(ReportServicePreExecutionMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Reconnected to JMS provider for ReportServicePreExecutionMdb
  | 2007-09-21 11:58:30,859 [JMSContainerInvoker(ReportServiceDownloadMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Reconnected to JMS provider for ReportServiceDownloadMdb
  | 
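
A log excerpt like the one above can also be checked mechanically. The sketch below pairs every "Trying to reconnect to JMS provider for X" line with a later "Reconnected to JMS provider for X" line and reports any MDB that never got the second message. The class name ReconnectAudit and the regular expression are assumptions based only on the message text visible in this excerpt; this is not a JBoss API. In the excerpt above, all six MDBs log a matching "Reconnected" line, so the check comes back empty.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: finds MDBs that tried to reconnect but never succeeded. */
final class ReconnectAudit {

    // Matches the two JMSContainerInvoker messages seen in the excerpt and
    // captures the MDB name that follows them.
    private static final Pattern EVENT = Pattern.compile(
        "(Trying to reconnect to JMS provider for|Reconnected to JMS provider for) (\\S+)");

    /** Returns the MDB names with a "Trying" line and no later "Reconnected" line. */
    static Set<String> stillDisconnected(Iterable<String> logLines) {
        Set<String> pending = new LinkedHashSet<String>();
        for (String line : logLines) {
            Matcher m = EVENT.matcher(line);
            if (!m.find()) {
                continue; // unrelated log line
            }
            if (m.group(1).startsWith("Trying")) {
                pending.add(m.group(2));
            } else {
                pending.remove(m.group(2));
            }
        }
        return pending;
    }

    public static void main(String[] args) throws IOException {
        List<String> lines = new ArrayList<String>();
        if (args.length > 0) {
            // Read the server log named on the command line.
            BufferedReader in = new BufferedReader(new FileReader(args[0]));
            try {
                for (String l; (l = in.readLine()) != null; ) {
                    lines.add(l);
                }
            } finally {
                in.close();
            }
        } else {
            // Small built-in sample so the class runs without arguments.
            lines.add("INFO Trying to reconnect to JMS provider for DeliveryServiceMdb");
            lines.add("INFO Reconnected to JMS provider for DeliveryServiceMdb");
        }
        System.out.println("Never reconnected: " + stillDisconnected(lines));
    }
}
```

An empty result only says the MDB containers believe they reconnected; it does not rule out problems in other code paths that send to or look up the queues.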


The exceptions we see after we kill node1 and then try to perform an operation that touches JMS on node2 are the following:


  | 2007-09-21 11:58:20,703 [JMSContainerInvoker(ReportServicePreExecutionMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Waiting for reconnect internal 10000ms for ReportServicePreExecutionMdb
  | 2007-09-21 11:58:20,703 [JMSContainerInvoker(ReportServicePostExecutionMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Waiting for reconnect internal 10000ms for ReportServicePostExecutionMdb
  | 2007-09-21 11:58:20,703 [JMSContainerInvoker(ReportServiceDownloadMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Waiting for reconnect internal 10000ms for ReportServiceDownloadMdb
  | 2007-09-21 11:58:20,703 [JMSContainerInvoker(DeliveryServiceMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Waiting for reconnect internal 10000ms for DeliveryServiceMdb
  | 2007-09-21 11:58:20,703 [JMSContainerInvoker(ApplicationEventsMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Waiting for reconnect internal 10000ms for ApplicationEventsMdb
  | 2007-09-21 11:58:22,140 [MessageDispatcher up processing thread] [] [] INFO  [org.jboss.ha.framework.interfaces.HAPartition.lifecycle.focus-rcl-cluster] New cluster view for partition focus-rcl-cluster (id: 2, delta: -1) : [10.10.11.14:1199]
  | 2007-09-21 11:58:22,156 [AsynchViewChangeHandler Thread] [] [] INFO  [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.focus-rcl-cluster] I am (10.10.11.14:1199) received membershipChanged event:
  | 2007-09-21 11:58:22,156 [AsynchViewChangeHandler Thread] [] [] INFO  [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.focus-rcl-cluster] Dead members: 1 ([10.10.11.13:1199])
  | 2007-09-21 11:58:22,156 [AsynchViewChangeHandler Thread] [] [] INFO  [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.focus-rcl-cluster] New Members : 0 ([])
  | 2007-09-21 11:58:22,156 [AsynchViewChangeHandler Thread] [] [] INFO  [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.focus-rcl-cluster] All Members : 1 ([10.10.11.14:1199])
  | 2007-09-21 11:58:22,656 [AsynchKeyChangeHandler Thread] [] [] INFO  [org.jboss.web.tomcat.tc5.TomcatDeployer] deploy, ctxPath=/jbossmq-httpil, warUrl=.../deploy-hasingleton/jms/jbossmq-httpil.sar/jbossmq-httpil.war/
  | 2007-09-21 11:58:23,672 [AsynchKeyChangeHandler Thread] [] [] INFO  [org.jboss.mq.il.uil2.UILServerILService] JBossMQ UIL service available at : /0.0.0.0:8193
  | 2007-09-21 11:58:23,703 [AsynchKeyChangeHandler Thread] [] [] INFO  [org.jboss.mq.server.jmx.Queue.DLQ] Bound to JNDI name: queue/DLQ
  | 2007-09-21 11:58:23,719 [AsynchKeyChangeHandler Thread] [] [] INFO  [org.jboss.mq.server.jmx.Queue.rcl/reportServicePreExecuteQueue] Bound to JNDI name: queue/rcl/reportServicePreExecuteQueue
  | 2007-09-21 11:58:23,719 [AsynchKeyChangeHandler Thread] [] [] INFO  [org.jboss.mq.server.jmx.Queue.rcl/reportServiceExecuteQueue] Bound to JNDI name: queue/rcl/reportServiceExecuteQueue
  | 2007-09-21 11:58:23,719 [AsynchKeyChangeHandler Thread] [] [] INFO  [org.jboss.mq.server.jmx.Queue.rcl/reportServiceDownloadQueue] Bound to JNDI name: queue/rcl/reportServiceDownloadQueue
  | 2007-09-21 11:58:23,719 [AsynchKeyChangeHandler Thread] [] [] INFO  [org.jboss.mq.server.jmx.Queue.rcl/reportServicePostExecuteQueue] Bound to JNDI name: queue/rcl/reportServicePostExecuteQueue
  | 2007-09-21 11:58:23,734 [AsynchKeyChangeHandler Thread] [] [] INFO  [org.jboss.mq.server.jmx.Queue.rcl/deliveryServiceQueue] Bound to JNDI name: queue/rcl/deliveryServiceQueue
  | 2007-09-21 11:58:23,750 [AsynchKeyChangeHandler Thread] [] [] INFO  [org.jboss.mq.server.jmx.Topic.rcl/events/reportEventsTopic] Bound to JNDI name: topic/rcl/events/reportEventsTopic
  | 2007-09-21 11:58:23,750 [AsynchKeyChangeHandler Thread] [] [] INFO  [org.jboss.mq.server.jmx.Topic.rcl/events/applicationEventsTopic] Bound to JNDI name: topic/rcl/events/applicationEventsTopic
  | 2007-09-21 11:58:25,297 [MessageDispatcher up processing thread] [] [] INFO  [org.jboss.cache.TreeCache] viewAccepted(): [magnum:3542|2] [magnum:3542]
  | 2007-09-21 11:58:30,703 [JMSContainerInvoker(ReportServiceExecutionMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Trying to reconnect to JMS provider for ReportServiceExecutionMdb
  | 2007-09-21 11:58:30,734 [JMSContainerInvoker(DeliveryServiceMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Trying to reconnect to JMS provider for DeliveryServiceMdb
  | 2007-09-21 11:58:30,734 [JMSContainerInvoker(ReportServiceDownloadMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Trying to reconnect to JMS provider for ReportServiceDownloadMdb
  | 2007-09-21 11:58:30,734 [JMSContainerInvoker(ReportServicePreExecutionMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Trying to reconnect to JMS provider for ReportServicePreExecutionMdb
  | 2007-09-21 11:58:30,734 [JMSContainerInvoker(ApplicationEventsMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Trying to reconnect to JMS provider for ApplicationEventsMdb
  | 2007-09-21 11:58:30,750 [JMSContainerInvoker(ReportServicePostExecutionMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Trying to reconnect to JMS provider for ReportServicePostExecutionMdb
  | 2007-09-21 11:58:30,828 [JMSContainerInvoker(ReportServiceExecutionMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Reconnected to JMS provider for ReportServiceExecutionMdb
  | 2007-09-21 11:58:30,844 [JMSContainerInvoker(ReportServicePostExecutionMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Reconnected to JMS provider for ReportServicePostExecutionMdb
  | 2007-09-21 11:58:30,859 [JMSContainerInvoker(ApplicationEventsMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Reconnected to JMS provider for ApplicationEventsMdb
  | 2007-09-21 11:58:30,859 [JMSContainerInvoker(DeliveryServiceMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Reconnected to JMS provider for DeliveryServiceMdb
  | 2007-09-21 11:58:30,859 [JMSContainerInvoker(ReportServicePreExecutionMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Reconnected to JMS provider for ReportServicePreExecutionMdb
  | 2007-09-21 11:58:30,859 [JMSContainerInvoker(ReportServiceDownloadMdb) Reconnect] [] [] INFO  [org.jboss.ejb.plugins.jms.JMSContainerInvoker] Reconnected to JMS provider for ReportServiceDownloadMdb
  | 


Any ideas...? 

View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4087403#4087403

Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=4087403


