[netty-dev] Problem during connection to the server which hangs the server

Trustin Lee tlee at redhat.com
Thu Oct 23 04:34:24 EDT 2008


Google shows the following pages:

* http://weblogs.java.net/blog/binod/archive/2006/12/glassfish_and_w.html
* http://forums.sun.com/thread.jspa?threadID=579675&messageID=2940535

Is there any firwall settings that triggers this issue in your Windows
machine?

HTH,

On Wed, Oct 22, 2008 at 12:48:26AM +0900, Frederic Bregier wrote:
> 
> Hi Trustin,
> 
> I'm not sure this problem comes from Netty or from my program.
> I'm really stuck with it, also because I am not able to write a simple
> example that reproduces this behaviour.
> So let me explain what I observe, and perhaps (magic?) you can find a clue?
> 
> The Server side accepts as many clients as they come.
> Both client and server have bootstrap.setOption("connectTimeoutMillis",x)
> ("child.connect..." for server) where x = 30 000 (so 30 seconds).
> Both use a ChannelPipelineFactory in order to create a new
> SimpleChannelHandler
> for each connection (for private data for each connection on both sides).
> 
> The client is hosted by JSP within Tomcat, where each new connection to
> Tomcat will
> launch one new connection to Netty server through a new Netty client. This
> connection
> is closed immediately, so its lifetime is the lifetime of the execution of 1
> JSP 
> request.
> 
> Everything is OK when I run the test manually or if I benchmark with low
> number
> of connections. But when I load with 20 concurrent users and repeat this
> load more
> than 10 000 (long time testing) on my Windows Vista Laptop using JDK 1.6.5
> with
> the very last version of Netty (from svn), everything is ok up to about 2400
> tries.
> 
> Server has the following starts: (exec is CachedThreadPool).
> NB: Since I don't know what is good for server or child, I repeat almost
> everything.
> Except "readWriteFair" since it hangs when set as "child.readWriteFair".
> 
> ChannelFactory factory = new NioServerSocketChannelFactory(exec, exec, 64);
> ServerBootstrap bootstrap = new ServerBootstrap(factory);
> bootstrap.setPipelineFactory(new
> IDMPipelineFactory(FileServerHandler.class));
> bootstrap.setOption("child.tcpNoDelay", true);
> bootstrap.setOption("child.keepAlive", true);
> bootstrap.setOption("child.reuseAddress", true);
> bootstrap.setOption("child.connectTimeoutMillis", 30000);
> //NEVER bootstrap.setOption("child.readWriteFair", true);
> bootstrap.setOption("tcpNoDelay", true);
> bootstrap.setOption("reuseAddress", true);
> bootstrap.setOption("connectTimeoutMillis", 30000);
> bootstrap.setOption("readWriteFair", true);
> bootstrap.bind(new InetSocketAddress(SERVER_PORT));
> 
> I've got the same behaviour with as simple as:
> ChannelFactory factory = new NioServerSocketChannelFactory(exec, exec, 64);
> ServerBootstrap bootstrap = new ServerBootstrap(factory);
> bootstrap.setPipelineFactory(new
> IDMPipelineFactory(FileServerHandler.class));
> bootstrap.setOption("child.tcpNoDelay", true);
> bootstrap.setOption("child.keepAlive", true);
> bootstrap.bind(new InetSocketAddress(SERVER_PORT));
> 
> Then I've got the following (after about 2400 requests):
> 
> - On client side (Tomcat JSP), the client is waiting for the server answer,
> and the
> test load is blocked until a timeout occurs from the client point of view.
> 
> "2008-10-21 15:02:57,868 WARN [http-8080-Processor20]
> openidm.web.client.IDMGet [NetworkHandler.java : 355] Connection (exemple:
> timeout 30000 < 20938) impossible since Failed to create a selector." 
> 
> is an example of logging. The last part of the message is the
> exception.getMessage
> (i.e. "Failed to create a selector.").
> 
> Note that the first number is the theorical timeout (30 seconds), the second
> number 
> is the practical time (in ms) between the creation of the channelHandler and
> this 
> logging. We can observe that the number measured is less than the one given
> as 
> configuration in bootstrap.
> 
> - On server side, the server hangs on the bootstrap for client with the
> following 
> message directly from Netty:
> 
> 2008-10-21 15:02:57,985 WARN [New I/O server boss #1 (channelId:
> 1f7ea494-011d-1000-b292-01fd0b48a883, /0.0.0.0:6666)]
> org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink
> [LoggingHandler.java warn : 148] Failed to initialize an accepted socket.
> org.jboss.netty.channel.ChannelException: Failed to create a selector.
> 	at org.jboss.netty.channel.socket.nio.NioWorker.register(NioWorker.java:95)
> 	at
> org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink$Boss.run(NioServerSocketPipelineSink.java:209)
> 	at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> 	at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
> 	at java.lang.Thread.run(Thread.java:619)
> Caused by: java.io.IOException: Unable to establish loopback connection
> 	at sun.nio.ch.PipeImpl$Initializer.run(PipeImpl.java:106)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at sun.nio.ch.PipeImpl.<init>(PipeImpl.java:122)
> 	at sun.nio.ch.SelectorProviderImpl.openPipe(SelectorProviderImpl.java:27)
> 	at java.nio.channels.Pipe.open(Pipe.java:133)
> 	at sun.nio.ch.WindowsSelectorImpl.<init>(WindowsSelectorImpl.java:105)
> 	at
> sun.nio.ch.WindowsSelectorProvider.openSelector(WindowsSelectorProvider.java:26)
> 	at java.nio.channels.Selector.open(Selector.java:209)
> 	at org.jboss.netty.channel.socket.nio.NioWorker.register(NioWorker.java:93)
> 	... 4 common frames omitted
> Caused by: java.net.ConnectException: Connection timed out: connect
> 	at sun.nio.ch.Net.connect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:507)
> 	at java.nio.channels.SocketChannel.open(SocketChannel.java:146)
> 	at sun.nio.ch.PipeImpl$Initializer.run(PipeImpl.java:78)
> 	... 12 common frames omitted
> 2008-10-21 15:03:19,589 WARN [New I/O server boss #1 (channelId:
> 1f7ea494-011d-1000-b292-01fd0b48a883, /0.0.0.0:6666)]
> org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink
> [LoggingHandler.java warn : 148] Failed to initialize an accepted socket.
> org.jboss.netty.channel.ChannelException: Failed to create a selector.
> 	at org.jboss.netty.channel.socket.nio.NioWorker.register(NioWorker.java:95)
> 	at
> org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink$Boss.run(NioServerSocketPipelineSink.java:209)
> 	at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> 	at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
> 	at java.lang.Thread.run(Thread.java:619)
> Caused by: java.io.IOException: Unable to establish loopback connection
> 	at sun.nio.ch.PipeImpl$Initializer.run(PipeImpl.java:106)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at sun.nio.ch.PipeImpl.<init>(PipeImpl.java:122)
> 	at sun.nio.ch.SelectorProviderImpl.openPipe(SelectorProviderImpl.java:27)
> 	at java.nio.channels.Pipe.open(Pipe.java:133)
> 	at sun.nio.ch.WindowsSelectorImpl.<init>(WindowsSelectorImpl.java:105)
> 	at
> sun.nio.ch.WindowsSelectorProvider.openSelector(WindowsSelectorProvider.java:26)
> 	at java.nio.channels.Selector.open(Selector.java:209)
> 	at org.jboss.netty.channel.socket.nio.NioWorker.register(NioWorker.java:93)
> 	... 4 common frames omitted
> Caused by: java.net.ConnectException: Connection timed out: connect
> 	at sun.nio.ch.Net.connect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:507)
> 	at java.nio.channels.SocketChannel.open(SocketChannel.java:146)
> 	at sun.nio.ch.PipeImpl$Initializer.run(PipeImpl.java:78)
> 	... 12 common frames omitted
> 	
> No error or exception is going to the channel (no creation) so I can't have
> more info.
> 
> When this log appears, the server process is going 100% CPU and is no more
> responsive 
> to the bootstrap associated to the client. However, I've got another
> bootstrap (bind)
> for special client (shutdown order) which implies the shutdown of everything
> (channels and ExecutorServices). This second bootstrap is functionnal (the
> connection
> is done, the message is get and answered and the function is going). But the
> server
> does not stopped (it does when I didn't get the previous error).
> 
> So I've done an output of the JVM running. Here it is:
> NB: There is only one code still running, and this is "New I/O server boss
> #1"
> 
> C:\Program Files\Java\jdk1.6.0_05\bin>jstack -l 10204
> 2008-10-21 15:04:07
> Full thread dump Java HotSpot(TM) Server VM (10.0-b19 mixed mode):
> 
> "DestroyJavaVM" prio=6 tid=0x02011c00 nid=0x10d0 waiting on condition
> [0x00000000..0x0035fd20]
>    java.lang.Thread.State: RUNNABLE
> 
>    Locked ownable synchronizers:
>         - None
> 
> "New I/O server boss #1 (channelId: 1f7ea494-011d-1000-b292-01fd0b48a883,
> /0.0.0.0:6666)" prio=6 tid=0x298bf400 nid=0x58c runnable
> [0x2995f000..0x2995f9e8]
>    java.lang.Thread.State: RUNNABLE
>         at java.lang.Thread.yield(Native Method)
>         at
> org.jboss.netty.channel.socket.nio.NioWorker.register(NioWorker.java:102)
>         at
> org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink$Boss.run(NioServerSocketPipelineSink.java:209)
>         at
> org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:78)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> 
>    Locked ownable synchronizers:
>         - <0x092dd910> (a
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
> 
> "Low Memory Detector" daemon prio=6 tid=0x020eac00 nid=0x27f0 runnable
> [0x00000000..0x00000000]
>    java.lang.Thread.State: RUNNABLE
> 
>    Locked ownable synchronizers:
>         - None
> 
> "CompilerThread1" daemon prio=10 tid=0x020e4c00 nid=0x1830 waiting on
> condition[0x00000000..0x2961f258]
>    java.lang.Thread.State: RUNNABLE
> 
>    Locked ownable synchronizers:
>         - None
> 
> "CompilerThread0" daemon prio=10 tid=0x020e0c00 nid=0x2038 waiting on
> condition[0x00000000..0x295cf2d8]
>    java.lang.Thread.State: RUNNABLE
> 
>    Locked ownable synchronizers:
>         - None
> 
> "Attach Listener" daemon prio=10 tid=0x020e0400 nid=0x2da8 waiting on
> condition[0x00000000..0x00000000]
>    java.lang.Thread.State: RUNNABLE
> 
>    Locked ownable synchronizers:
>         - None
> 
> "Signal Dispatcher" daemon prio=10 tid=0x020e0000 nid=0x1cf4 runnable
> [0x00000000..0x00000000]
>    java.lang.Thread.State: RUNNABLE
> 
>    Locked ownable synchronizers:
>         - None
> 
> "Finalizer" daemon prio=8 tid=0x020c9c00 nid=0x177c in Object.wait()
> [0x294df000..0x294dfd68]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         - waiting on <0x09264c50> (a java.lang.ref.ReferenceQueue$Lock)
>         at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:116)
>         - locked <0x09264c50> (a java.lang.ref.ReferenceQueue$Lock)
>         at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:132)
>         at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
> 
>    Locked ownable synchronizers:
>         - None
> 
> "Reference Handler" daemon prio=10 tid=0x020c6000 nid=0x2be8 in
> Object.wait() [0x2948f000..0x2948f9e8]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         - waiting on <0x092b75a0> (a java.lang.ref.Reference$Lock)
>         at java.lang.Object.wait(Object.java:485)
>         at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
>         - locked <0x092b75a0> (a java.lang.ref.Reference$Lock)
> 
>    Locked ownable synchronizers:
>         - None
> 
> "VM Thread" prio=10 tid=0x020c2800 nid=0x2a94 runnable
> 
> "GC task thread#0 (ParallelGC)" prio=6 tid=0x008efc00 nid=0x2e80 runnable
> 
> "GC task thread#1 (ParallelGC)" prio=6 tid=0x02012800 nid=0x16a8 runnable
> 
> "VM Periodic Task Thread" prio=10 tid=0x020eb000 nid=0x1f54 waiting on
> condition
> 
> 
> JNI global references: 1163
> 
> 
> I don't know what's happening. I'm surely missing something...
> I know it will not be obvious without any functional example that reproduces
> this behaviour to know where could be the problem. But, who knows, perhaps
> you
> will find a clue... ?
> 
> Thank you,
> Frederic
> 
> NB: I've got another version where I try to implement something close to a
> pool
> of connection to Netty Server in order to keep low the cost with less
> connection/close,
> but I'm stuck with strange behaviour this time from what it seems to be
> Tomcat 
> servlet... (Content-Length: 0, Bogus chunk size or missing CR), and the
> problems seem
> to come from the client side this time, not the server side.
> 
> 
> -----
> Hardware/Software Architect
 
-- 
Trustin Lee, Principal Software Engineer, JBoss, a division of Red Hat
--
what we call human nature is actually human habit
--
http://gleamynode.net/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 197 bytes
Desc: not available
Url : http://lists.jboss.org/pipermail/netty-dev/attachments/20081023/faaed1b8/attachment.bin 


More information about the netty-dev mailing list