Husgaard [http://community.jboss.org/people/Husgaard] created the discussion
"Possible bug with JBoss Messaging on bisocket transport"
To view the discussion, visit:
http://community.jboss.org/message/634694#634694
--------------------------------------------------------------
Hi,
I am using JBoss Messaging 1.4.7.GA with the bisocket transport from JBoss Remoting
2.5.3.SP1 connecting to remote clients.
Ever since I started using this combination I have seen some stability issues on large and
busy production systems that I have been unable to reproduce in a test environment. I have
mostly seen these issues when the systems have been very busy, and I have noticed that
these issues become a lot worse when the network is not good (high packet loss and/or
parts of the network disconnecting at times).
Today I think I have found the cause of these issues, and it looks like a bug in the
interface between JBoss Messaging and JBoss Remoting. I would like some of you JBoss
Messaging experts here to comment on my observations before I open a JIRA ticket.
My findings start with two thread dumps taken from a production server experiencing
these issues. The thread dumps were taken 8 seconds apart, and both show the same stack
trace:
"WorkManager(2)-76" daemon prio=10 tid=0x00002aaae483c000 nid=0x3d40 in Object.wait() [0x000000004770a000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at org.jboss.remoting.transport.bisocket.BisocketClientInvoker.createSocket(BisocketClientInvoker.java:528)
    - locked <0x000000070f9cb8f8> (a java.util.HashSet)
    at org.jboss.remoting.transport.socket.MicroSocketClientInvoker.getConnection(MicroSocketClientInvoker.java:1165)
    at org.jboss.remoting.transport.socket.MicroSocketClientInvoker.transport(MicroSocketClientInvoker.java:816)
    at org.jboss.remoting.transport.bisocket.BisocketClientInvoker.transport(BisocketClientInvoker.java:461)
    at org.jboss.remoting.MicroRemoteClientInvoker.invoke(MicroRemoteClientInvoker.java:167)
    at org.jboss.remoting.Client.invoke(Client.java:2034)
    at org.jboss.remoting.Client.invoke(Client.java:877)
    at org.jboss.remoting.Client.invokeOneway(Client.java:926) - not client side, means we may block instead of executing in a new thread
    at org.jboss.remoting.callback.ServerInvokerCallbackHandler.handleCallback(ServerInvokerCallbackHandler.java:835)
    at org.jboss.remoting.callback.ServerInvokerCallbackHandler.handleCallbackOneway(ServerInvokerCallbackHandler.java:708) - serverSide=false
    at org.jboss.jms.server.endpoint.ServerSessionEndpoint.performDelivery(ServerSessionEndpoint.java:1467)
    at org.jboss.jms.server.endpoint.ServerSessionEndpoint.handleDelivery(ServerSessionEndpoint.java:1379)
    - locked <0x00000006a8c8dfd0> (a org.jboss.jms.server.endpoint.ServerSessionEndpoint)
    at org.jboss.jms.server.endpoint.ServerConsumerEndpoint.handle(ServerConsumerEndpoint.java:328)
    - locked <0x0000000695caa390> (a java.lang.Object)
    at org.jboss.messaging.core.impl.RoundRobinDistributor.handle(RoundRobinDistributor.java:119)
    at org.jboss.messaging.core.impl.MessagingQueue$DistributorWrapper.handle(MessagingQueue.java:590)
    at org.jboss.messaging.core.impl.ClusterRoundRobinDistributor.handle(ClusterRoundRobinDistributor.java:79)
    at org.jboss.messaging.core.impl.ChannelSupport.deliverInternal(ChannelSupport.java:665)
    at org.jboss.messaging.core.impl.MessagingQueue.deliverInternal(MessagingQueue.java:513)
    at org.jboss.messaging.core.impl.ChannelSupport.handle(ChannelSupport.java:246)
    - locked <0x0000000695caa498> (a java.lang.Object)
    at org.jboss.messaging.core.impl.postoffice.MessagingPostOffice.routeInternal(MessagingPostOffice.java:2504)
    at org.jboss.messaging.core.impl.postoffice.MessagingPostOffice.route(MessagingPostOffice.java:580)
    at org.jboss.jms.server.endpoint.ServerConnectionEndpoint.sendMessage(ServerConnectionEndpoint.java:779)
(snip)
This thread is trying to deliver a JMS message and is waiting in
BisocketClientInvoker.createSocket() for a remote client to open a connection back to the
server. At the same time the thread holds a read lock in MessagingPostOffice. This means
that any work in MessagingPostOffice that needs a write lock is blocked.
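To make the effect concrete, here is a minimal JDK-only sketch of the situation. It is a
simulation, not JBoss code: ReentrantReadWriteLock stands in for whatever lock
MessagingPostOffice actually uses, Thread.sleep() stands in for the timed wait in
createSocket(), and the class and method names are made up for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReadLockStallDemo {

    // Returns how long (in ms) a writer had to wait for the write lock while a
    // "delivery" thread held the read lock for holdMillis.
    static long writerWaitMillis(long holdMillis) {
        // Assumption: stand-in for the read/write lock in MessagingPostOffice.
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        CountDownLatch readLockHeld = new CountDownLatch(1);

        Thread delivery = new Thread(() -> {
            lock.readLock().lock();
            try {
                readLockHeld.countDown();
                // Stand-in for waiting on a client that never connects back.
                Thread.sleep(holdMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.readLock().unlock();
            }
        });
        delivery.start();

        try {
            readLockHeld.await();
            long start = System.nanoTime();
            lock.writeLock().lock(); // blocks until the delivery thread lets go
            long waited = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            lock.writeLock().unlock();
            delivery.join();
            return waited;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while measuring", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("writer waited about " + writerWaitMillis(500)
                + " ms for the write lock");
    }
}
```

The writer is stalled for roughly as long as the simulated network wait lasts, which is
the behaviour I believe the thread dumps show.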
The problem in this case is that the remote client is no longer connected to the network,
so it will never connect back to the server. The wait in
BisocketClientInvoker.createSocket() will time out after some time. There is a retry loop
in BisocketClientInvoker.createSocket() which will retry a few times, but eventually the
call will fail and the read lock in MessagingPostOffice is released. If I read the code
correctly, the read lock is held for 60 seconds with the default JBoss Remoting settings.
Having the MessagingPostOffice lock exposed to client communication problems in this way
is not good, and probably not what you want.
Looking at the way JBoss Messaging calls JBoss Remoting when delivering a message to a
remote client, I see that you use one-way calls where the caller does not care about a
response, and the call is executed in another thread. But in JBoss Remoting there are two
ways of doing this:
1. Starting a thread on the caller side to handle the call. This way the new thread will
take care of any communication problems, and the calling thread can return immediately so
the read lock in MessagingPostOffice is immediately released.
2. Starting a new thread on the remote side to handle the call. This way the calling
thread has to take care of any communication problems, and only when the invocation has
been delivered to the remote side will the remote side start a new thread to handle the
call.
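The difference between the two options can be sketched with plain JDK classes. This is
only an illustration under my own assumptions: the Transport interface, the method names
and the single-thread pool are hypothetical and are not JBoss Remoting APIs. The slow
send simulates a client that cannot be reached.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class OnewayDispatchDemo {

    // Hypothetical stand-in for a transport whose send may block on a bad network.
    interface Transport {
        void send(String message) throws Exception;
    }

    // Option 1: hand the send to a caller-side worker thread and return at once.
    static long dispatchOnCallerSideThread(Transport transport, String message,
                                           ExecutorService pool) {
        long start = System.nanoTime();
        pool.execute(() -> {
            try {
                transport.send(message);
            } catch (Exception e) {
                System.err.println("delivery failed in worker: " + e);
            }
        });
        return millisSince(start);
    }

    // Option 2: the calling thread performs the send itself and so absorbs any
    // network delay before the remote side would start its own thread.
    static long dispatchOnCallingThread(Transport transport, String message) {
        long start = System.nanoTime();
        try {
            transport.send(message);
        } catch (Exception e) {
            System.err.println("delivery failed in caller: " + e);
        }
        return millisSince(start);
    }

    static long millisSince(long startNanos) {
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos);
    }

    // Returns { time the caller spent in option 1, time the caller spent in option 2 }.
    static long[] compare(long sendMillis) {
        Transport slow = message -> Thread.sleep(sendMillis);
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            long option1 = dispatchOnCallerSideThread(slow, "msg", pool);
            long option2 = dispatchOnCallingThread(slow, "msg");
            return new long[] { option1, option2 };
        } finally {
            pool.shutdown(); // lets the queued send finish, then stops the worker
        }
    }

    public static void main(String[] args) {
        long[] timings = compare(500);
        System.out.println("option 1 blocked the caller for " + timings[0] + " ms");
        System.out.println("option 2 blocked the caller for " + timings[1] + " ms");
    }
}
```

With option 1 the caller returns almost immediately, so a lock held by the caller is
released right away. With option 2 the caller is stuck for the whole duration of the
send, and so is any lock it holds.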
Unfortunately it looks like JBoss Messaging is using the second way of doing one-way calls.
So my question is: Isn't this a bug? Shouldn't we use the first way of doing
one-way calls instead?