[jboss-jira] [JBoss JIRA] (JBMESSAGING-1917) Networking Failure Locks Post Office

Doug Grove (JIRA) jira-events at lists.jboss.org
Mon Mar 12 16:13:47 EDT 2012


Doug Grove created JBMESSAGING-1917:
---------------------------------------

             Summary: Networking Failure Locks Post Office
                 Key: JBMESSAGING-1917
                 URL: https://issues.jboss.org/browse/JBMESSAGING-1917
             Project: JBoss Messaging
          Issue Type: Bug
          Components: Messaging Core
    Affects Versions: 1.4.8.SP5
         Environment: JBoss EAP .1.2
            Reporter: Doug Grove


Several network failure modes can cause the messaging post office to lock.  For example, loss of connectivity to a client can cause Messaging to stop accepting new clients and stop delivering messages.  Messages can, however, still be delivered from the clients.

A thread dump shows 220 threads waiting to acquire MessagingPostOffice's ReaderLock.  The lock is likely owned by:

"WorkManager(2)-7" daemon prio=10 tid=0x25108800 nid=0x1af7 in Object.wait() [0x1a675000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x3e6031f8> (a java.util.HashSet)
	at org.jboss.remoting.transport.bisocket.BisocketClientInvoker.createSocket(BisocketClientInvoker.java:538)
	- locked <0x3e6031f8> (a java.util.HashSet)
	at org.jboss.remoting.transport.socket.MicroSocketClientInvoker.getConnection(MicroSocketClientInvoker.java:1165)
	at org.jboss.remoting.transport.socket.MicroSocketClientInvoker.transport(MicroSocketClientInvoker.java:816)
	at org.jboss.remoting.transport.bisocket.BisocketClientInvoker.transport(BisocketClientInvoker.java:470)
	at org.jboss.remoting.MicroRemoteClientInvoker.invoke(MicroRemoteClientInvoker.java:169)
	at org.jboss.remoting.Client.invoke(Client.java:2070)
	at org.jboss.remoting.Client.invoke(Client.java:879)
	at org.jboss.remoting.Client.invokeOneway(Client.java:928)
	at org.jboss.remoting.callback.ServerInvokerCallbackHandler.handleCallback(ServerInvokerCallbackHandler.java:835)
	at org.jboss.remoting.callback.ServerInvokerCallbackHandler.handleCallbackOneway(ServerInvokerCallbackHandler.java:708)
	at org.jboss.jms.server.endpoint.ServerSessionEndpoint.performDelivery(ServerSessionEndpoint.java:1610)
	at org.jboss.jms.server.endpoint.ServerSessionEndpoint.handleDelivery(ServerSessionEndpoint.java:1522)
	- locked <0x3fd4bc18> (a org.jboss.jms.server.endpoint.ServerSessionEndpoint)
	at org.jboss.jms.server.endpoint.ServerConsumerEndpoint.handle(ServerConsumerEndpoint.java:353)
	- locked <0x3e5960b8> (a java.lang.Object)
	at org.jboss.messaging.core.impl.RoundRobinDistributor.handle(RoundRobinDistributor.java:119)
	at org.jboss.messaging.core.impl.MessagingQueue$DistributorWrapper.handle(MessagingQueue.java:617)
	at org.jboss.messaging.core.impl.ClusterRoundRobinDistributor.handle(ClusterRoundRobinDistributor.java:79)
	at org.jboss.messaging.core.impl.ChannelSupport.deliverInternal(ChannelSupport.java:677)
	at org.jboss.messaging.core.impl.MessagingQueue.deliverInternal(MessagingQueue.java:540)
	at org.jboss.messaging.core.impl.ChannelSupport.handle(ChannelSupport.java:251)
	- locked <0x3e595e28> (a java.lang.Object)
	at org.jboss.messaging.core.impl.postoffice.MessagingPostOffice.routeInternal(MessagingPostOffice.java:3163)
	at org.jboss.messaging.core.impl.postoffice.MessagingPostOffice.route(MessagingPostOffice.java:980)


The thread appears to stay in this while loop in BisocketClientInvoker.createSocket until the timeout passes:

   while ((isConnected()) && ((!this.pingFailed.flag) || (pingFailedTimeRemaining > 0L)) && ((timeout == 0) || (timeRemaining > 0L)))
    {
      synchronized (sockets)
      {
        try
        {
          sockets.wait(1000L);   // BisocketClientInvoker.java:538 in the stack trace above
        }
        catch (InterruptedException e)
        {
          log.debug(this + " unexpected interrupt");
        }

        if (!sockets.isEmpty())
        {
          Iterator it = sockets.iterator();
          Socket socket = (Socket)it.next();
          it.remove();
          configureSocket(socket);
          log.debug(this + " found socket (" + this.listenerId + "): " + socket);
          return socket;
        }
      }

      if (savedControlOutputStream != this.controlOutputStream)
      {
        savedControlOutputStream = this.controlOutputStream;
        log.debug(this + " rewriting Bisocket.CREATE_ORDINARY_SOCKET on " + this.controlOutputStream);
        try
        {
          this.controlOutputStream.write(4);
          log.debug(this + " rewrote Bisocket.CREATE_ORDINARY_SOCKET");
        }
        catch (IOException e)
        {
          log.debug(this + " unable to rewrite Bisocket.CREATE_ORDINARY_SOCKET" + e.getMessage());
        }
      }

      long elapsed = System.currentTimeMillis() - start;
      if (timeout > 0)
        timeRemaining = timeout - elapsed;
      pingFailedTimeRemaining = pingFailedWindow - elapsed;
    }
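
To make the timing of that loop easier to see, here is a minimal, simplified sketch of the same wait-with-deadline pattern. It is not the JBoss Remoting code: the class and method names (TimedWaitSketch, awaitItem) are hypothetical, and the isConnected() and pingFailed conditions from the real loop are left out. It only illustrates that when nothing is ever added to the set, the calling thread stays blocked for roughly the full timeout.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// Hypothetical, simplified model of the quoted loop; not the JBoss Remoting source.
public class TimedWaitSketch {

    /**
     * Waits until another thread adds an item to the set, or until
     * timeoutMillis has elapsed. Returns the item, or null on timeout
     * or interruption.
     */
    static <T> T awaitItem(Set<T> items, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        synchronized (items) {
            while (items.isEmpty()) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    return null; // timed out: the caller was blocked the whole time
                }
                try {
                    // Wake up at least once a second, as the quoted loop does.
                    items.wait(Math.min(remaining, 1000L));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return null;
                }
            }
            // Something arrived: take it out of the set and hand it back.
            Iterator<T> it = items.iterator();
            T item = it.next();
            it.remove();
            return item;
        }
    }

    public static void main(String[] args) {
        Set<String> sockets = new HashSet<>();
        long start = System.currentTimeMillis();
        String result = awaitItem(sockets, 2000L); // nobody ever adds a socket
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("result=" + result + ", blocked for about " + elapsed + " ms");
    }
}
```

In the sketch, a producer thread would add to the set and call notifyAll() while synchronized on it. When the client is unreachable that never happens, so the wait runs to the deadline.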


So when a client NIC is pulled, the thread on JBoss hangs on that client for the configured timeout, and it hangs while holding the MessagingPostOffice lock, which backs everything else up.  I don't see a way around that without a messaging code overhaul.  For now, just set a suitably short timeout and hope for no NIC failures :)
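
As a rough illustration of how one stalled thread can back up other threads that only want a read lock, here is a small self-contained sketch. It makes several assumptions that are not confirmed by the thread dump: it uses java.util.concurrent's ReentrantReadWriteLock in fair mode rather than whatever lock class MessagingPostOffice actually uses, it assumes a writer is queued behind the stalled reader, and all names (PostOfficeStallSketch, measureReaderDelayMillis) are hypothetical. A Thread.sleep stands in for the blocked createSocket call.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical model only; not the JBoss Messaging implementation.
public class PostOfficeStallSketch {

    /**
     * Returns how long (in ms) a new reader waited for the read lock while
     * another reader was stalled for stallMillis and a writer was queued.
     * Returns -1 if the calling thread is interrupted.
     */
    static long measureReaderDelayMillis(long stallMillis) {
        // Fair mode (assumption): once a writer is queued, later readers queue behind it.
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
        CountDownLatch readHeld = new CountDownLatch(1);

        Thread stuckDelivery = new Thread(() -> {
            lock.readLock().lock();
            try {
                readHeld.countDown();
                Thread.sleep(stallMillis); // stands in for waiting on the dead client
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.readLock().unlock();
            }
        });
        Thread writer = new Thread(() -> {
            lock.writeLock().lock(); // blocks until the stalled reader releases
            lock.writeLock().unlock();
        });

        try {
            stuckDelivery.start();
            readHeld.await();
            writer.start();
            // Wait until the writer is actually queued on the lock.
            while (writer.isAlive() && !lock.hasQueuedThread(writer)) {
                Thread.sleep(1);
            }
            long start = System.nanoTime();
            lock.readLock().lock(); // a new caller that only needs to read
            long waited = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            lock.readLock().unlock();
            stuckDelivery.join();
            writer.join();
            return waited;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println("new reader waited about "
                + measureReaderDelayMillis(1000L) + " ms");
    }
}
```

In this model the new reader's delay tracks the stall duration, which is why a shorter timeout on the blocked call shortens the outage but does not remove it.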


        

