[jboss-dev-forums] [Design of Messaging on JBoss (Messaging/JBoss)] - Ping issues under stress
jmesnil
do-not-reply at jboss.com
Thu Aug 13 09:27:34 EDT 2009
I managed to get ping issues and clients stuck doing a soak test on our messaging Labs.
Setup:
1 server (messaging-01) with 1 JMS topic
10 producers + 17 non-durable subscribers (splitted on 7 machines)
server config:
| Netty:
| <param key="jbm.remoting.netty.tcpnodelay" value="false" type="Boolean"/>
| <param key="jbm.remoting.netty.tcpsendbuffersize" value="1048576" type="Integer"/>
| <param key="jbm.remoting.netty.tcpreceivebuffersize" value="1048576" type="Integer"/>
|
| <security-enabled>false</security-enabled>
| <persistence-enabled>false</persistence-enabled>
|
JMS config:
<connection-factory name="ConnectionFactory">
<connector-ref connector-name="netty-connector"/>
<retry-interval>1000</retry-interval>
<retry-interval-multiplier>1.0</retry-interval-multiplier>
<reconnect-attempts>-1</reconnect-attempts>
<failover-on-server-shutdown>true</failover-on-server-shutdown>
<call-timeout>5000</call-timeout>
<block-on-non-persistent-send>false</block-on-non-persistent-send>
<block-on-persistent-send>true</block-on-persistent-send>
<pre-acknowledge>true</pre-acknowledge>
</connection-factory>
soak.properties:
| duration-in-minutes=120
| num-warmup-messages=100
| message-size=1024
| durable=false
| transacted=false
| batch-size=1000
| drain-queue=false
| destination-lookup=soakTopic
| connection-factory-lookup=/ConnectionFactory
| throttle-rate=-1
| dups-ok-acknowledge=false
| disable-message-id=true
| disable-message-timestamp=true
|
The clients (both producers and consumers) are connected to reconnect to the server
when they are notified by a JMSExceptionListener. Their connection factories will indefinitely try
to reconnect to the server.
1. I start the server. No activity:
| [java] **** Server Dump ****
| [java] date: Thu Aug 13 08:40:56 EDT 2009
| [java] heap memory: used=60.09 MB, max=1.92 GB
| [java] non-heap memory: used=60.09 MB, max=1.92 GB
| [java] # of thread: 16
| [java] # of conns: 0
| [java] JMS topics:
| [java] soakTopic: 0 subscription (0 non-durable) receiving 0 message
| [java] ********************
|
2. I then start 10 producers and 17 non-durable subscribers
At the beginning, all producers are producing and all subscribers are subscribing
3. But after a few seconds, the subscribers starts to warn about missing ping servers and failures to failover
| [java] Aug 13, 2009 8:51:30 AM org.jboss.messaging.core.logging.Logger warn
| [java] WARNING: Connection failure has been detected: Did not receive ping from server for org.jboss.messaging.integration.transports.netty.NettyConnection at 128ae45a[local= /172.16.8.12:41708, remote=/172.16.8.10:5445] [code=3]
| [java] Aug 13, 2009 8:51:30 AM org.jboss.messaging.core.logging.Logger warn
| [java] WARNING: Connection failure has been detected: Did not receive ping from server for org.jboss.messaging.integration.transports.netty.NettyConnection at 128ae45a[local= /172.16.8.12:41706, remote=/172.16.8.10:5445] [code=3]
| [java] Aug 13, 2009 8:51:36 AM org.jboss.jms.soak.example.reconnect.SoakReconnectableReceiver$3 onMessage
| [java] INFO: received 10000 messages in 11.60s (total: 117s)
| [java] Aug 13, 2009 8:51:53 AM org.jboss.jms.soak.example.reconnect.SoakReconnectableReceiver$3 onMessage
| [java] INFO: received 10000 messages in 16.80s (total: 133s)
| [java] Aug 13, 2009 8:52:09 AM org.jboss.jms.soak.example.reconnect.SoakReconnectableReceiver$3 onMessage
| [java] INFO: received 10000 messages in 16.07s (total: 150s)
| [java] Aug 13, 2009 8:52:15 AM org.jboss.messaging.core.logging.Logger warn
| [java] WARNING: Connection failure has been detected: Did not receive ping from server for org.jboss.messaging.integration.transports.netty.NettyConnection at 5f7779e3[local= /172.16.8.12:41735, remote=/172.16.8.10:5445] [code=3]
| [java] Aug 13, 2009 8:52:15 AM org.jboss.messaging.core.logging.Logger warn
| [java] WARNING: Connection failure has been detected: Did not receive ping from server for org.jboss.messaging.integration.transports.netty.NettyConnection at 1ce3570c[local= /172.16.8.12:41737, remote=/172.16.8.10:5445] [code=3]
| [java] Aug 13, 2009 8:52:20 AM org.jboss.messaging.core.logging.Logger error
| [java] SEVERE: Failed to handle failover
| [java] MessagingException[errorCode=3 message=Timed out waiting for response when sending packet 32]
| [java] at org.jboss.messaging.core.remoting.impl.ChannelImpl.sendBlocking(ChannelImpl.java:312)
| [java] at org.jboss.messaging.core.client.impl.ClientSessionImpl.handleFailover(ClientSessionImpl.java:751)
| [java] at org.jboss.messaging.core.client.impl.ConnectionManagerImpl.reattachSessions(ConnectionManagerImpl.java:794)
| [java] at org.jboss.messaging.core.client.impl.ConnectionManagerImpl.failoverOrReconnect(ConnectionManagerImpl.java:647)
| [java] at org.jboss.messaging.core.client.impl.ConnectionManagerImpl.handleConnectionFailure(ConnectionManagerImpl.java:518)
| [java] at org.jboss.messaging.core.client.impl.ConnectionManagerImpl.access$600(ConnectionManagerImpl.java:81)
| [java] at org.jboss.messaging.core.client.impl.ConnectionManagerImpl$DelegatingFailureListener.connectionFailed(ConnectionManagerImpl.java:1218)
| [java] at org.jboss.messaging.core.remoting.impl.RemotingConnectionImpl.callFailureListeners(RemotingConnectionImpl.java:393)
| [java] at org.jboss.messaging.core.remoting.impl.RemotingConnectionImpl.fail(RemotingConnectionImpl.java:244)
| [java] at org.jboss.messaging.core.client.impl.ConnectionManagerImpl$FailedConnectionAction$1.run(ConnectionManagerImpl.java:1174)
| [java] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
| [java] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
| [java] at java.lang.Thread.run(Thread.java:619)
| [java] javax.jms.JMSException: Connection is destroyed
| [java] at org.jboss.messaging.core.remoting.impl.ChannelImpl.sendBlocking(ChannelImpl.java:240)
| [java] at org.jboss.messaging.core.client.impl.ClientConsumerImpl.doCleanUp(ClientConsumerImpl.java:745)
| [java] at org.jboss.messaging.core.client.impl.ClientConsumerImpl.close(ClientConsumerImpl.java:315)
| [java] at org.jboss.messaging.jms.client.JBossMessageConsumer.close(JBossMessageConsumer.java:153)
| [java] at org.jboss.messaging.jms.client.JBossSession.close(JBossSession.java:275)
| [java] at org.jboss.messaging.jms.client.JBossConnection.close(JBossConnection.java:259)
| [java] at org.jboss.jms.soak.example.reconnect.SoakReconnectableReceiver.disconnect(SoakReconnectableReceiver.java:186)
| [java] at org.jboss.jms.soak.example.reconnect.SoakReconnectableReceiver.access$000(SoakReconnectableReceiver.java:41)
| [java] at org.jboss.jms.soak.example.reconnect.SoakReconnectableReceiver$2.onException(SoakReconnectableReceiver.java:99)
| [java] at org.jboss.messaging.jms.client.JBossConnection$JMSFailureListener$1.run(JBossConnection.java:550)
| [java] at java.lang.Thread.run(Thread.java:619)
| [java] Caused by: MessagingException[errorCode=2 message=Connection is destroyed]
| [java] ... 11 more
| [java] javax.jms.JMSException: Timed out waiting for response when sending packet 30
| [java] at org.jboss.messaging.core.remoting.impl.ChannelImpl.sendBlocking(ChannelImpl.java:312)
| [java] at org.jboss.messaging.core.client.impl.ConnectionManagerImpl.createSession(ConnectionManagerImpl.java:326)
| [java] at org.jboss.messaging.core.client.impl.ClientSessionFactoryImpl.createSessionInternal(ClientSessionFactoryImpl.java:977)
| [java] at org.jboss.messaging.core.client.impl.ClientSessionFactoryImpl.createSession(ClientSessionFactoryImpl.java:721)
| [java] at org.jboss.messaging.jms.client.JBossConnection.createSessionInternal(JBossConnection.java:463)
| [java] at org.jboss.messaging.jms.client.JBossConnection.createSession(JBossConnection.java:160)
| [java] at org.jboss.jms.soak.example.reconnect.SoakReconnectableReceiver.connect(SoakReconnectableReceiver.java:213)
| [java] at org.jboss.jms.soak.example.reconnect.SoakReconnectableReceiver.access$100(SoakReconnectableReceiver.java:41)
| [java] at org.jboss.jms.soak.example.reconnect.SoakReconnectableReceiver$2.onException(SoakReconnectableReceiver.java:100)
| [java] at org.jboss.messaging.jms.client.JBossConnection$JMSFailureListener$1.run(JBossConnection.java:550)
| [java] at java.lang.Thread.run(Thread.java:619)
| [java] Caused by: MessagingException[errorCode=3 message=Timed out waiting for response when sending packet 30]
| [java] ... 11 more
|
4. on the server, the number of remoting connection (and the memory) explodes:
| [java] **** Server Dump ****
| [java] date: Thu Aug 13 08:41:57 EDT 2009
| [java] heap memory: used=1.46 GB, max=1.88 GB
| [java] non-heap memory: used=1.46 GB, max=1.88 GB
| [java] # of thread: 62
| [java] # of conns: 112
| [java] JMS topics:
| [java] soakTopic: 17 subscription (17 non-durable) receiving 8337499 message
| [java] ********************
|
5. The exception continues on the subscribers
6. Things are messy on the server-side with more subscriptions than there are clients:
| [java] **** Server Dump ****
| [java] date: Thu Aug 13 08:43:57 EDT 2009
| [java] heap memory: used=1.70 GB, max=1.88 GB
| [java] non-heap memory: used=1.70 GB, max=1.88 GB
| [java] # of thread: 45
| [java] # of conns: 260
| [java] JMS topics:
| [java] soakTopic: 23 subscription (23 non-durable) receiving 10414915 message
| [java] ********************
|
6. the server start to clean up remoting connections and related resources
| [java] [Thread-0 (group:JBM-scheduled-threads-2030193044)] 08:44:25,717 WARNING [org.jboss.messaging.core.remoting.server.impl.RemotingServiceImpl] Did not receive initial ping from /172.16.8.15:38164, connection will be closed
| [java] [Thread-0 (group:JBM-scheduled-threads-2030193044)] 08:44:25,717 WARNING [org.jboss.messaging.core.remoting.server.impl.RemotingServiceImpl] Did not receive initial ping from /172.16.8.12:49942, connection will be closed
|
7. I also have some netty exception on the server:
| [java] [New I/O server worker #1-2] 08:44:31,281 WARNING [org.jboss.netty.channel.socket.nio.NioWorker] Unexpected exception in the selector loop.
| [java] org.jboss.netty.channel.ChannelException: Failed to register a socket to the selector.
| [java] at org.jboss.netty.channel.socket.nio.NioWorker$RegisterTask.run(NioWorker.java:754)
| [java] at org.jboss.netty.channel.socket.nio.NioWorker.processRegisterTaskQueue(NioWorker.java:260)
| [java] at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:199)
| [java] at org.jboss.netty.util.internal.IoWorkerRunnable.run(IoWorkerRunnable.java:53)
| [java] at org.jboss.messaging.integration.transports.netty.VirtualExecutorService$ChildExecutorRunnable.run(VirtualExecutorService.java:184)
| [java] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
| [java] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
| [java] at java.lang.Thread.run(Thread.java:619)
| [java] Caused by: java.nio.channels.ClosedChannelException
| [java] at java.nio.channels.spi.AbstractSelectableChannel.configureBlocking(AbstractSelectableChannel.java:252)
| [java] at org.jboss.netty.channel.socket.nio.NioWorker$RegisterTask.run(NioWorker.java:739)
| [java] ... 7 more
|
8. Failed connections now are unable to clean up related resource on the server
| [java] [Thread-3 (group:JBM-scheduled-threads-2030193044)] 08:47:55,595 SEVERE [org.jboss.messaging.core.management.impl.ManagementServiceImpl] Failed to call listener
| [java] java.lang.IllegalStateException: Cannot find queue info for queue b685ebae-d3b7-4a4d-81f1-862366829b1f8b8e8cc7-8806-11de-9b94-0015178e6d7c
| [java] at org.jboss.messaging.core.postoffice.impl.PostOfficeImpl.onNotification(PostOfficeImpl.java:387)
| [java] at org.jboss.messaging.core.management.impl.ManagementServiceImpl.sendNotification(ManagementServiceImpl.java:647)
| [java] at org.jboss.messaging.core.server.impl.ServerConsumerImpl.close(ServerConsumerImpl.java:272)
| [java] at org.jboss.messaging.core.server.impl.ServerSessionImpl.close(ServerSessionImpl.java:339)
| [java] at org.jboss.messaging.core.server.impl.ServerSessionImpl.doHandleClose(ServerSessionImpl.java:2217)
| [java] at org.jboss.messaging.core.server.impl.ServerSessionImpl.handleClose(ServerSessionImpl.java:882)
| [java] at org.jboss.messaging.core.server.impl.ServerSessionImpl.connectionFailed(ServerSessionImpl.java:1177)
| [java] at org.jboss.messaging.core.remoting.impl.RemotingConnectionImpl.callFailureListeners(RemotingConnectionImpl.java:393)
| [java] at org.jboss.messaging.core.remoting.impl.RemotingConnectionImpl.fail(RemotingConnectionImpl.java:244)
| [java] at org.jboss.messaging.core.remoting.server.impl.RemotingServiceImpl$FailedConnectionAction.run(RemotingServiceImpl.java:480)
| [java] at org.jboss.messaging.core.remoting.impl.Pinger.run(Pinger.java:116)
| [java] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
| [java] at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
| [java] at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
| [java] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
| [java] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
| [java] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
| [java] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
| [java] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
| [java] at java.lang.Thread.run(Thread.java:619)
|
9. During that time, the producers started to see missing pings from the server:
10. When things settle down, all clients are stuck
| [java] **** Server Dump ****
| [java] date: Thu Aug 13 09:22:10 EDT 2009
| [java] heap memory: used=1.70 GB, max=1.87 GB
| [java] non-heap memory: used=1.70 GB, max=1.87 GB
| [java] # of thread: 30
| [java] # of conns: 16
| [java] JMS topics:
| [java] soakTopic: 2 subscription (2 non-durable) receiving 0 message
| [java] ********************
|
I did the same run with <block-non-persistent-send> set to true and I did not get any error.
To clear things up, no servers were killed/restarted during these runs.
View the original post : http://www.jboss.org/index.html?module=bb&op=viewtopic&p=4249412#4249412
Reply to the post : http://www.jboss.org/index.html?module=bb&op=posting&mode=reply&p=4249412
More information about the jboss-dev-forums
mailing list