Shutting down the netty server takes a long time when using SSLHandler.

Virat Gohil virat4lug at gmail.com
Wed Apr 15 10:12:46 EDT 2009


Hi Trustin,

Thanks for the in-depth explanation! :)

I understand that switching to the new behavior will cause "connection
reset by peer" errors on the client side, which is acceptable in our
scenario. Could I file an enhancement request for an API that lets me
control this behavior? It could be provided in a later release.

Meanwhile, I will use future.awaitUninterruptibly(milliseconds) as a
workaround. This will let some connections close properly while
others are terminated.
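
Roughly, the workaround would look like the sketch below. Note that
GroupCloseFuture is a hypothetical stand-in built on java.util.concurrent,
not Netty's ChannelGroupFuture, so that the example runs on its own. The
names and the 100 ms timeout are assumptions for illustration only. The
point is that a timed wait returns false when some channels have not
finished closing, and the shutdown can then proceed anyway.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class BoundedShutdownSketch {

    /** Hypothetical stand-in for a group close future (not a Netty class). */
    static final class GroupCloseFuture {
        private final CountDownLatch remaining;

        GroupCloseFuture(int channelCount) {
            this.remaining = new CountDownLatch(channelCount);
        }

        /** Called once per channel when its close completes (success or failure). */
        void channelClosed() {
            remaining.countDown();
        }

        /** Waits up to timeoutMillis; returns true only if every channel closed in time. */
        boolean awaitUninterruptibly(long timeoutMillis) {
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
            boolean interrupted = false;
            try {
                for (;;) {
                    long left = deadline - System.nanoTime();
                    if (left <= 0) {
                        return remaining.getCount() == 0;
                    }
                    try {
                        return remaining.await(left, TimeUnit.NANOSECONDS);
                    } catch (InterruptedException e) {
                        // Remember the interrupt and keep waiting for the rest of the timeout.
                        interrupted = true;
                    }
                }
            } finally {
                if (interrupted) {
                    Thread.currentThread().interrupt();
                }
            }
        }

        /** Number of channels that have not reported a completed close yet. */
        long stillOpen() {
            return remaining.getCount();
        }
    }

    public static void main(String[] args) {
        GroupCloseFuture future = new GroupCloseFuture(3);
        future.channelClosed();
        future.channelClosed();
        // The third channel never completes its close, so the wait times out.
        boolean allClosed = future.awaitUninterruptibly(100);
        System.out.println("all closed: " + allClosed + ", still open: " + future.stillOpen());
        // At this point the real stop() would go on to release the remaining resources.
    }
}
```

In the real server, the same check would be made on the boolean returned by
the timed wait on the ChannelGroupFuture, as far as I understand the API.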

Thanks,

Virat

On Wed, Apr 15, 2009 at 7:30 PM, Trustin Lee <tlee at redhat.com> wrote:
> I realized that this issue can be resolved by closing the connection
> immediately after sending close_notify (new behavior) instead of
> waiting for the client to respond with close_notify (current
> behavior).  This is legal behavior according to RFC 2246 (7.2.1.
> Closure Alerts):
>
>   Each party is required to send a close_notify alert before closing
>   the write side of the connection. It is required that the other party
>   respond with a close_notify alert of its own and close down the
>   connection immediately, discarding any pending writes. It is not
>   required for the initiator of the close to wait for the responding
>   close_notify alert before closing the read side of the connection.
>
> However, doing this can cause a 'connection reset by peer' error on
> the client side because the client will try to respond to the
> close_notify while the server closes the connection.
>
> Actually, there's nothing we can do about the 'connection reset by peer'
> error, and it is safe to ignore the exception.  The problem is that
> Netty cannot tell whether a SocketException was raised because of a
> connection reset, because a simple string match will not work
> for localized messages.  If there is a definite way to detect the
> connection reset error, then Netty could swallow it, which would make
> perfect sense.
>
> -- Trustin Lee, http://gleamynode.net/
>
>
>
> On Wed, Apr 15, 2009 at 9:58 PM, Virat Gohil <virat4lug at gmail.com> wrote:
>> Hi Trustin,
>>
>> Thanks for the quick response!! :)
>>
>> I tried your suggestion on revision 1187, same result. I am willing to
>> test more suggestions :)
>>
>> Thanks,
>>
>> Virat
>>
>>
>> On Wed, Apr 15, 2009 at 6:04 PM, Trustin Lee <tlee at redhat.com> wrote:
>>> Please try Revision 1187 before making the modification I suggested.
>>> I've just checked in the potential fix for this issue.
>>>
>>> -- Trustin Lee, http://gleamynode.net/
>>>
>>> On Wed, Apr 15, 2009 at 9:25 PM, Trustin Lee <tlee at redhat.com> wrote:
>>>> Hi Virat,
>>>>
>>>> On Wed, Apr 15, 2009 at 8:33 PM, Virat Gohil <virat.gohil at gmail.com> wrote:
>>>>> Hi All!
>>>>>
>>>>> I am facing a small problem shutting down my Netty-based server with
>>>>> ~1200 connections.
>>>>>
>>>>> I am using the ChannelGroup as described in the Getting Started guide.
>>>>> Here is the code:
>>>>>
>>>>> public void stop()
>>>>> {
>>>>>     if (timer != null)
>>>>>     {
>>>>>         timer.stop();
>>>>>     }
>>>>>     if (g != null && factory != null)
>>>>>     {
>>>>>         ChannelGroupFuture future = g.close();
>>>>>         future.awaitUninterruptibly();
>>>>>         if (ch != null)
>>>>>         {
>>>>>             ch.unbind();
>>>>>         }
>>>>>         try {
>>>>>             bossExecutor.shutdownNow();
>>>>>             workerExecutor.shutdownNow();
>>>>>             workerExecutor.awaitTermination(3600, TimeUnit.SECONDS);
>>>>>             bossExecutor.awaitTermination(3600, TimeUnit.SECONDS);
>>>>>         } catch (InterruptedException e) {
>>>>>             //print the exception
>>>>>         }
>>>>>         factory.releaseExternalResources();
>>>>>     }
>>>>> }
>>>>>
>>>>> The execution gets stuck at future.awaitUninterruptibly(); I tried
>>>>> debugging the issue and found the following:
>>>>>
>>>>> 1. When g.close() is called, the channel group creates a hashtable and
>>>>> a new DefaultChannelGroupFuture, which becomes the registered
>>>>> listener on all these channels.
>>>>> 2. Whenever channel.close() is called, the DefaultChannelGroupFuture
>>>>> gets called and increments the success/failure count.
>>>>> 3. If the success + failure count equals the number of channels in the
>>>>> group, then the operation is considered finished and the thread waiting
>>>>> on the DefaultChannelGroupFuture is released.
>>>>>
>>>>> I observed in SSLHandler that Channels.close() is called only if the
>>>>> received frame was empty and the inbound was finished:
>>>>> SSLHandler.java:406 (decode())
>>>>>        if (frame == null && engine.isInboundDone()) {
>>>>>            synchronized (closeFutures) {
>>>>>                for (;;) {
>>>>>                    ChannelFuture future = closeFutures.poll();
>>>>>                    if (future == null) {
>>>>>                        break;
>>>>>                    }
>>>>>                    Channels.close(ctx, future);
>>>>>                }
>>>>>            }
>>>>>        }
>>>>> Sometimes either the frame is not null or the inbound is not done,
>>>>> which causes the SSLHandler to continue decoding.  This results in
>>>>> DefaultChannelGroupFuture.childListener.operationComplete() being
>>>>> called only after a very long time.
>>>>
>>>> 1) What is the state of the actual connection?  Is it closed or still connected?
>>>>
>>>> 2) What happens if you replace:
>>>>
>>>>        if (frame == null && engine.isInboundDone()) {
>>>> with:
>>>>
>>>>        if (frame == null && engine.isInboundDone() || !channel.isConnected()) {
>>>>
>>>> ?
>>>>
>>>>> What we would prefer to do is abandon the incomplete data in
>>>>> SSLHandler.decode() and close the channel as soon as the
>>>>> server's shutdown method is called. Please let me know if I am missing
>>>>> something or if there is another way of achieving a faster shutdown.
>>>>
>>>> You are doing it correctly, and the ChannelGroupFuture should return
>>>> quickly.  Thanks for reporting the problem!
>>>>
>>>> Trustin
>>>>
>>>
>>> _______________________________________________
>>> netty-users mailing list
>>> netty-users at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/netty-users
>>>
>>
>>
>
>
