I tried disabling HTTP/2, but there are still too many CLOSE_WAIT connections. I also tried putting Nginx in front of the Java server, but it shows the same issue: Nginx ends up holding the same number of connections to the Java backend.
server = Undertow.builder()
.addHttpListener(SERVER_LISTEN_PORT, SERVER_HOST)
.addHttpsListener(SERVER_SSL_LISTEN_PORT, SERVER_HOST, sslContext)
.setWorkerThreads(WORKER_THREAD)
.setServerOption(UndertowOptions.ENABLE_HTTP2, false)
.setServerOption(UndertowOptions.IDLE_TIMEOUT, 150000) // 150s
.setServerOption(UndertowOptions.NO_REQUEST_TIMEOUT, 150000) // 150s
.setServerOption(org.xnio.Options.SSL_SERVER_SESSION_CACHE_SIZE, 1024 * 20) // 20480 sessions
.setServerOption(org.xnio.Options.SSL_SERVER_SESSION_TIMEOUT, 150) // 150s (this option is in seconds, not ms)
.setIoThreads(IO_THREAD)
.setWorkerOption(org.xnio.Options.TCP_NODELAY, true)
.setSocketOption(org.xnio.Options.TCP_NODELAY, true)
.setSocketOption(org.xnio.Options.KEEP_ALIVE, true)
.setSocketOption(org.xnio.Options.REUSE_ADDRESSES, true)
.setSocketOption(org.xnio.Options.CONNECTION_HIGH_WATER, 100000)
.setSocketOption(org.xnio.Options.CONNECTION_LOW_WATER, 100000)
.setHandler(Handlers.routing().post("/", new RequestHandler(appContext)))
.build();
# netstat -nalp | grep -E ":80 |:443 " | awk '{split($4,a,":");print a[2] " " $6}'| sort | uniq -c
85918 443 CLOSE_WAIT
10279 443 ESTABLISHED
67 443 LAST_ACK
152 443 SYN_RECV
505 443 TIME_WAIT
31151 80 CLOSE_WAIT
3747 80 ESTABLISHED
108 80 LAST_ACK
146 80 SYN_RECV
2 LISTEN
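The same per-port, per-state tally can be produced programmatically. This is a minimal sketch in Java, assuming `netstat -nalp` style input where column 4 is the local address and column 6 is the TCP state; the sample lines in main() are illustrative, not taken from the output above.

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch: tally netstat lines by (local port, TCP state), mirroring the
// `awk '{split($4,a,":");print a[2] " " $6}' | sort | uniq -c` pipeline.
public class NetstatTally {
    static Map<String, Integer> tally(String[] lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            String[] cols = line.trim().split("\\s+");
            if (cols.length < 6) continue;                 // skip malformed lines
            String local = cols[3];                        // e.g. "10.0.0.1:443"
            String port = local.substring(local.lastIndexOf(':') + 1);
            counts.merge(port + " " + cols[5], 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] sample = {                                // illustrative lines only
            "tcp 0 0 10.0.0.1:443 10.0.0.2:50000 CLOSE_WAIT 123/java",
            "tcp 0 0 10.0.0.1:443 10.0.0.3:50001 CLOSE_WAIT 123/java",
            "tcp 0 0 10.0.0.1:80 10.0.0.4:50002 ESTABLISHED 123/java",
        };
        System.out.println(tally(sample)); // {443 CLOSE_WAIT=2, 80 ESTABLISHED=1}
    }
}
```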
Hmm, maybe this is a bug in the HTTP/2 close code then, and somehow the connection is not being closed if the client hangs up abruptly. I had a quick look at the code though and I think it looks ok, but maybe some more investigation is needed.
Stuart
Yes, I have no control over the client side. I am using HTTP/2. I have tried increasing the open-file limit to 400k, but that consumes all the memory and the system hangs. I will probably try putting an Nginx in front of Undertow and test.
setServerOption(UndertowOptions.ENABLE_HTTP2, true)
On Mon, Mar 2, 2020 at 7:56 AM Stan Rosenberg <stan.rosenberg@acm.org> wrote:
>
> Stuck in CLOSE_WAIT is a symptom of the client-side not properly shutting down [1].
I would partially disagree. From the article you linked: "It all starts
with a listening application that leaks sockets and forgets to call
close(). This kind of bug does happen in complex applications." That
seems to be essentially what's happening here: the server isn't
closing its side of the connection (for some reason), stranding the
socket in `CLOSE_WAIT`.
We can't assume that the client is abandoning the connection after
`FIN_WAIT2` (the titular RFC violation); even if the client dutifully
stays in `FIN_WAIT2` forever, the only thing that can move the server
out of `CLOSE_WAIT` is the server shutting down its side of the
connection.
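As a minimal illustration of that point (plain java.net sockets, nothing Undertow-specific): once the client hangs up, the server's accepted socket sees EOF and sits in `CLOSE_WAIT` until the server itself calls close(); nothing the client does can finish the teardown.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

// Sketch: after the client sends FIN, the server-side socket delivers EOF
// (read() returns -1) and stays in CLOSE_WAIT until the server calls
// close(). The kernel never reaps the socket on the application's behalf.
public class CloseWaitDemo {
    public static void main(String[] args) throws IOException {
        try (ServerSocket listener = new ServerSocket(0)) {
            Socket client = new Socket("127.0.0.1", listener.getLocalPort());
            Socket serverSide = listener.accept();

            client.close();                               // client hangs up (sends FIN)
            int eof = serverSide.getInputStream().read(); // -1: FIN received
            System.out.println("read() after client close: " + eof);
            System.out.println("server side closed? " + serverSide.isClosed()); // false: CLOSE_WAIT

            serverSide.close();                           // the only way out of CLOSE_WAIT
            System.out.println("server side closed? " + serverSide.isClosed()); // true
        }
    }
}
```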
This diagram is a useful visual aid, mapping TCP states to the XNIO
API: https://www.lucidchart.com/publicSegments/view/524ec20a-5c40-4fd0-8bde-0a1c0a0046e1/image.png
--
- DML