Hi,

Over the last year we've had a number of bugs/support issues in which JMS, which usually runs fine for weeks or months, temporarily or permanently develops one of the following problems:
a) Error publishing message: socket closed
b) Connection factory not bound: the JNDI tree loses its entry for the JMS connection factory
c) Corrupt Hsqldb prevents JMS from starting up.

The first two are "temporary," i.e. restarting makes the problem "go away". The third requires deleting the hsqldb before restarting the app server.

To clarify: these errors occur in production, in published software where all the JMS code (whether MDBs or hand-coded JMS) normally runs fine. That is, 99.99% of the time all the JMS-related code works without a hitch. The errors occur intermittently, with no obvious cause.

The question:
Why do these errors occur? How do you proactively prevent or monitor for such errors?  Any notes from the trenches? Any survival guide?

I've searched the JBoss wiki but haven't found any such documentation or notes.

Version notes: We run JBoss 4.2.2GA but have seen similar problems on JBoss 3.2.6.


thanks in advance,


bill

Here are some log excerpts:

socket closed:
[2008-04-27 14:05:29,046] [ERROR] com.participate.util.j2ee.JMSMessagePublisher (JMSMessagePublisher.java:104) - error publishing message (retry# 10)
org.jboss.mq.SpyJMSException: Cannot send a message to the JMS server; - nested throwable: (java.net.SocketException: socket closed)
    at org.jboss.mq.Connection.sendToServer(Connection.java:1028)
    at org.jboss.mq.SpySession.sendMessage(SpySession.java:1005)
    at org.jboss.mq.SpyMessageProducer.send(SpyMessageProducer.java:265)
    at org.jboss.mq.SpyMessageProducer.send(SpyMessageProducer.java:199)
    at org.jboss.mq.SpyTopicPublisher.publish(SpyTopicPublisher.java:58)
    at com.participate.util.j2ee.JMSMessagePublisher.publishMessage(JMSMessagePublisher.java:89)

connection factory not bound:
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:580)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
at java.lang.Thread.run(Unknown Source)
Caused by: javax.naming.NameNotFoundException: peJmsConnectionFactory not bound
at org.jnp.server.NamingServer.getBinding(NamingServer.java:529)
at org.jnp.server.NamingServer.getBinding(NamingServer.java:537)
at org.jnp.server.NamingServer.getObject(NamingServer.java:543)
at org.jnp.server.NamingServer.lookup(NamingServer.java:296)
at org.jnp.interfaces.NamingContext.lookup(NamingContext.java:667)
at org.jnp.interfaces.NamingContext.lookup(NamingContext.java:627)
at javax.naming.InitialContext.lookup(Unknown Source)
at com.participate.util.j2ee.JmsUtil.getDefaultTopicConnectionFactory(JmsUtil.java:55)


corrupt hypersonic db:
MBEANS THAT ARE THE ROOT CAUSE OF THE PROBLEM:
ObjectName: jboss:service=Hypersonic,database=localDB
state: FAILED
I Depend On:
Depends On Me: jboss.jca:service=ManagedConnectionFactory,name=DefaultDS
MBeanException: java.sql.SQLException: General error: java.lang.NullPointerException
Cause: java.sql.SQLException: General error: java.lang.NullPointerException