Re: [undertow-dev] Help, please: Observing low Undertow throughput under heavy loads

Saturday, 17 January 2015

Hello,

I can't help with Undertow-specific experience on this issue, but in
another NIO service we saw similar lockups that were resolved by
reverting to the poll-based selector provider:

-Djava.nio.channels.spi.SelectorProvider=sun.nio.ch.PollSelectorProvider

(I guess for XNIO it would be -Dxnio.nio.selector.provider)
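
To confirm that the JDK-level property actually took effect, something like the following could be run with the same -D flag. This is only a minimal sketch: the class name SelectorProviderCheck is made up for illustration, it reports the JVM-wide default provider only, and it says nothing about whether XNIO honours its own property.

```java
import java.nio.channels.Selector;
import java.nio.channels.spi.SelectorProvider;

// Hypothetical helper class, not part of Undertow or XNIO.
public class SelectorProviderCheck {

    // Returns the class name of the JVM-wide default selector provider.
    static String providerName() {
        return SelectorProvider.provider().getClass().getName();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("provider: " + providerName());
        // Opening a selector shows which implementation is really used.
        try (Selector selector = Selector.open()) {
            System.out.println("selector: " + selector.getClass().getName());
        }
    }
}
```

Run it once without the flag and once with it, and compare the printed class names. Whether sun.nio.ch.PollSelectorProvider is available depends on the JVM and platform in use.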

Maybe you can give it a try, even if it isn't the most desirable
solution.

BTW: which OS and JVM are you using?

Regards
Bernd

On Sat, 17 Jan 2015 12:42:34 +0800, Matt Clarkson
<mclarkson(a)eyeota.com> wrote:

...
 Hi Undertow Team,

 We recently deployed a large platform for processing high-frequency
 http signals from around the Internet.  We are using undertow as our
 embedded http server and are experiencing some serious throughput
 issues.  Hoping you can help us to remedy them.  Here are our
 findings so far.

 -When we dump thread stacks using jstack for a loaded server, we
 observe that the I/O threads (1/core) are all blocking at
 sun.nio.ch.EPollArrayWrapper.epollWait(Native Method).
 -At the same time we see large numbers of TCP Timeouts, TCP Listen
 Drops, and TCP Overflows, which would seem to imply that we are not
 processing connections fast enough.
 -There are large numbers of sockets in TIME_WAIT status.
 -TaskWorker threads are underutilized and most are in WAITING state,
 sitting at
 java.util.concurrent.locks.LockSupport.park(LockSupport.java:186).

 We've observed this situation even against a no-op endpoint which
 basically just dispatches a handler, so we've eliminated almost all
 of our code from the equation.  We also removed HTTPS traffic to take
 SSL out of the equation.  CPU utilization on the boxes is very low
 and memory is fine as well.  Disk I/O is also not an issue... we
 don't write to disk when hitting the no-op endpoint.

 We're currently running on c2-xlarge EC2 instances (8 gb ram/4
 cores) in 7 amazon regions.  We've tried tuning keepalive, IO thread
 count (currently set to 4) and core/max task worker count (40) to no
 avail.  We decided to move our compute instances behind haproxy,
 which has improved the tcp failure rates, but we are still seeing
 very low throughput (roughly 200-300 requests/sec max).

 We are using 1.1.0-Final version of undertow.  We tried 1.2.0-Beta 6,
 but after deploying it our servers froze after about 10 minutes, so
 we had to roll back.

 Do you have any tips on other things we can look at?

 Thanks in advance,

 Matt C.
