[jboss-jira] [JBoss JIRA] Commented: (JBMESSAGING-1114) JBoss Remoting fails under load

Tim Fox (JIRA) jira-events at lists.jboss.org
Thu Oct 18 04:53:03 EDT 2007


    [ http://jira.jboss.com/jira/browse/JBMESSAGING-1114?page=comments#action_12383187 ] 
            
Tim Fox commented on JBMESSAGING-1114:
--------------------------------------

I found out why this is happening (why the callback handler can't be found):

1. An invocation fails because remoting timed out waiting for a TCP connection to be returned to the pool (pool size too small).
2. In this case remoting throws a java.net.SocketException:

throw new SocketException("Can not obtain client socket connection from pool. " +
           "Have waited " + (System.currentTimeMillis() - start) +
           " milliseconds for available connection (" + usedPooled + "in use)");

3. JBM catches the exception and since it is a java.net.SocketException, assumes some fatal problem has happened with the connection, and falsely initiates failover to another node.

4. Before failing over JBM closes the failed connection.

5. This results in the JBR Callback Connector getting closed.

6. All this while callbacks are still arriving from the server, since that connection is actually still fine.

7. The connector closing process removes the callback handler from the callback ServerInvoker, so any callbacks that arrive afterwards fail because the handler can no longer be found.
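
To make steps 1 to 3 concrete, here is a minimal sketch of a bounded pool that gives up after a timeout. It is not the JBoss Remoting implementation: the class BoundedPool and its methods are hypothetical, and the message text is only modelled on the one quoted above. The point is that the SocketException is raised without any socket having failed.

```java
import java.net.SocketException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a bounded client connection pool; not JBoss Remoting code.
class BoundedPool {
    private final Semaphore permits;
    private final int maxSize;

    BoundedPool(int maxSize) {
        this.maxSize = maxSize;
        this.permits = new Semaphore(maxSize);
    }

    // Waits up to timeoutMillis for a free slot. On timeout it throws a
    // SocketException even though no socket has actually failed.
    void acquire(long timeoutMillis) throws SocketException, InterruptedException {
        long start = System.currentTimeMillis();
        if (!permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS)) {
            throw new SocketException("Can not obtain client socket connection from pool. " +
                    "Have waited " + (System.currentTimeMillis() - start) +
                    " milliseconds for available connection (" + inUse() + " in use)");
        }
    }

    // Returns a slot to the pool.
    void release() {
        permits.release();
    }

    // Number of slots currently handed out.
    int inUse() {
        return maxSize - permits.availablePermits();
    }
}
```

A caller that treats every SocketException as a dead connection cannot tell this case apart from a real network failure, which is what triggers the false failover.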

So:

After increasing the client pool size I'm not getting these exceptions any more, but imho there are two issues here:

1) Why does remoting throw SocketException? - I would say this is inappropriate since the socket is fine - I would suggest some kind of org.jboss.remoting.RemotingException should be thrown?
This would allow JBM to catch it and not initiate failover in this case.
In the mean-time I will have to do some kind of text comparison:

if (exception.getMessage().startsWith("Can not obtain client...."))
{
//don't do failover
}
else
{
//do failover
}
This will work for now but is ugly and brittle.
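
A slightly more defensive sketch of that decision, for illustration only. FailoverPolicy and PoolExhaustedException are hypothetical names (PoolExhaustedException stands in for whatever dedicated exception remoting might throw); neither exists in JBM or JBoss Remoting as far as this comment is concerned. The prefix is taken from the exception message quoted above, and getMessage() is null-checked because a SocketException may carry no message.

```java
import java.io.IOException;
import java.net.SocketException;

// Hypothetical exception type standing in for a dedicated remoting exception.
class PoolExhaustedException extends IOException {
    PoolExhaustedException(String message) {
        super(message);
    }
}

// Hypothetical helper; not part of JBoss Messaging.
final class FailoverPolicy {
    // Prefix of the pool timeout message quoted earlier in this comment.
    static final String POOL_TIMEOUT_PREFIX =
            "Can not obtain client socket connection from pool.";

    private FailoverPolicy() {
    }

    static boolean shouldFailover(Throwable t) {
        // Preferred route: a distinct type says the socket itself is fine.
        if (t instanceof PoolExhaustedException) {
            return false;
        }
        // Interim workaround: recognise the pool timeout by its message text.
        if (t instanceof SocketException) {
            String message = t.getMessage(); // may be null
            if (message != null && message.startsWith(POOL_TIMEOUT_PREFIX)) {
                return false;
            }
        }
        // Everything else is treated as a real connection failure, as before.
        return true;
    }
}
```

The string match breaks as soon as the message wording changes, which is why the typed exception is the better long-term fix.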

2) The connector close process is not clean. I would suggest the code should be changed so the server threads are shut down at the beginning of the connector close process.
I.e.
a) Wait for current invocations on that server invoker to complete - and don't allow any more.
b) Once all are complete shut it down.
(I believe we also have a deadlock on JBoss AS that Clebert discovered, which had something to do with connectors closing. I don't know if this is related?)
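
Steps a) and b) could be sketched roughly as follows. This is an assumption about how such a drain might look, not the remoting code: InvocationGate and its method names are hypothetical. The server invoker would call tryEnter() before dispatching a callback and exit() in a finally block; the connector close would call closeAndAwait() first and only then remove the handler and shut down.

```java
// Hypothetical gate guarding a callback server invoker; not JBoss Remoting code.
class InvocationGate {
    private int inFlight = 0;
    private boolean closing = false;

    // Called before dispatching an invocation. Returns false once close has begun.
    synchronized boolean tryEnter() {
        if (closing) {
            return false;
        }
        inFlight++;
        return true;
    }

    // Called (in a finally block) after the invocation has completed.
    synchronized void exit() {
        inFlight--;
        notifyAll();
    }

    // Step a): refuse new invocations and wait for current ones to complete.
    // Returns true if all completed within the timeout, so that step b),
    // the actual shutdown, can proceed safely.
    synchronized boolean closeAndAwait(long timeoutMillis) throws InterruptedException {
        closing = true;
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (inFlight > 0) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return false;
            }
            wait(remaining);
        }
        return true;
    }
}
```

With a gate like this, a callback that arrives during close is rejected up front instead of failing later on a missing handler.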

> JBoss Remoting fails under load
> -------------------------------
>
>                 Key: JBMESSAGING-1114
>                 URL: http://jira.jboss.com/jira/browse/JBMESSAGING-1114
>             Project: JBoss Messaging
>          Issue Type: Bug
>    Affects Versions: 1.4.0.GA
>            Reporter: Tim Fox
>         Assigned To: Tim Fox
>            Priority: Critical
>             Fix For: 1.4.0.SP1
>
>
> JBoss Remoting fails with various different errors when under extreme load.
> To replicate this, set up two clustered server nodes, using a MySQL database.
> These can both be on the same machine, using ServiceBindingManager.
> On a second machine run Ovidiu's messkit toolkit, first to send some messages:
> mess -stat send -size 10240 50000 
> And then to receive them back using 50 concurrent consumers:
> mess -stat -sessions 50 receive all
> You will notice that JBoss Remoting fails with errors:
> I believe this is due to remoting incorrectly thinking a connection has failed and shutting down the connection. Perhaps due to the load, the ping does not get through in time to refresh the lease?
> I would like a remoting solution that *does not ping* from server to client - for us this is unnecessary.
> It also seems remoting is continually timing out and recreating connections - this could also be a source of error.
> How do we configure remoting so it does not do this?
