If this issue is similar to the one we encountered on our machines, this
script helps to reproduce it (HTTP/2; TLSv1.3 is optional):

https://gist.github.com/lyind/ae076548cafb2cb0b46a0819b749d6f4#file-curl-ssltest-sh

The critical steps are:

1. Attach a debugger to your server process (normally via remote
   debugging).
2. On a client at least one layer 3 network segment away (behind one or
   more IP routers), start >= 6 requests in a loop (to increase the
   chance of "vulnerable" TCP connections at step 3).
3. Sever the connection WITHOUT one of the two operating system kernels
   (client, server) being able to update its TCP state: pull the LAN
   cable of the client or shut down the router on the client side.
4. Repeat 2) and 3) until:
   1. $ netstat -tnp | grep java
      -> some connections are in CLOSE_WAIT (then look at how the server
      handles these, and play with the operating system tunables)
   2. (optional) XNIO threads are stuck at 100% CPU
   3. (optional) unexpected exceptions occur

In theory one could also use some firewall or network emulator trickery
to cut the connections, but that would be harder to get right and would
still be quite complex for a unit or integration test. It sounds more
like a case for the physical network test lab.
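
The check in step 4.1 can also be scripted if you want to poll for it
while repeating steps 2 and 3. What follows is only a rough sketch, not
part of the linked gist: the class name CloseWaitCheck and its method are
made up for illustration, and it assumes the column layout printed by the
Linux net-tools netstat (state in the sixth column, PID/program name in
the seventh). Without sufficient privileges netstat prints "-" instead of
the program name, in which case nothing will match.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: lists java sockets that netstat reports in CLOSE_WAIT.
public class CloseWaitCheck {

    // Assumed Linux net-tools layout of "netstat -tnp":
    // Proto Recv-Q Send-Q Local-Address Foreign-Address State PID/Program
    static List<String> closeWaitLines(List<String> netstatLines) {
        return netstatLines.stream()
                .filter(line -> {
                    String[] fields = line.trim().split("\\s+");
                    return fields.length >= 7
                            && fields[5].equals("CLOSE_WAIT")
                            && fields[6].endsWith("/java");
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // Same command as in step 4.1; the filtering replaces "grep java".
        Process netstat = new ProcessBuilder("netstat", "-tnp")
                .redirectErrorStream(true)
                .start();
        List<String> lines;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(netstat.getInputStream(), StandardCharsets.UTF_8))) {
            lines = reader.lines().collect(Collectors.toList());
        }
        netstat.waitFor();

        List<String> stuck = closeWaitLines(lines);
        stuck.forEach(System.out::println);
        System.out.println(stuck.size() + " java connection(s) in CLOSE_WAIT");
    }
}
```

Run it on the server host after each round of cable pulling; a count
that stays above zero long after the clients are gone is the symptom
described above.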
On 2020-03-04 06:24, Flavia Rainone wrote:
> Can someone provide a reproducer for this error?
>
> As for the old version of XNIO, it will be upgraded in Undertow
> 2.1.0.Final.
>
> On Mon, Mar 2, 2020 at 8:47 PM Stuart Douglas <sdouglas@redhat.com>
> wrote:
>
>> Hmm, maybe this is a bug in the HTTP/2 close code then, and somehow
>> the connection is not being closed if the client hangs up abruptly.
>> I had a quick look at the code though and I think it looks ok, but
>> maybe some more investigation is needed.
>>
>> Stuart
>>
>> On Tue, 3 Mar 2020 at 03:41, Nishant Kumar
>> <nishantkumar35@gmail.com> wrote:
>>
>> Yes, I have no control over the client side. I am using HTTP2. I have
>> tried increasing the open file limit to 400k, but that consumes all
>> memory and the system hangs. I will probably try to put an nginx in
>> front of Undertow and test.
>>
>> setServerOption(UndertowOptions.ENABLE_HTTP2, true)
>>
>> On Mon, Mar 2, 2020, 7:48 PM David Lloyd <david.lloyd@redhat.com>
>> wrote:
>> On Mon, Mar 2, 2020 at 7:56 AM Stan Rosenberg
>> <stan.rosenberg@acm.org> wrote:
>>>
>>> Stuck in CLOSE_WAIT is a symptom of the client-side not properly
>>> shutting down [1].
>>
>> I would partially disagree. In the article you linked: "It all starts
>> with a listening application that leaks sockets and forgets to call
>> close(). This kind of bug does happen in complex applications." This
>> seems to be essentially what's happening here: the server isn't
>> completing the connection (for some reason), stranding the socket in
>> `CLOSE_WAIT`.
>>
>> We can't assume that the client is abandoning the connection after
>> `FIN_WAIT2` (the titular RFC violation); if the server stays in
>> `CLOSE_WAIT`, then even if the client dutifully stays in `FIN_WAIT2`
>> forever, the resolving condition still needs to be that the server
>> shuts down its side of the connection.
>>
>> This diagram is a useful visual aid, mapping TCP states to the XNIO
>> API:
>>
>> https://www.lucidchart.com/publicSegments/view/524ec20a-5c40-4fd0-8bde-0a1c0a0046e1/image.png
>>
>> --
>> - DML
> _______________________________________________
> undertow-dev mailing list
> undertow-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/undertow-dev
>
> --
> Flavia Rainone
> Principal Software Engineer
> Red Hat (https://www.redhat.com)
> frainone@redhat.com