Hi Stuart,

Thanks for your reply:

> > We've observed this situation even against a no-op endpoint which basically
> > dispatches a handler, so we've eliminated almost all of our code from the
> > equation. We also removed HTTPS traffic to take SSL out of the equation. CPU
> > utilization on the boxes is very low and memory is fine as well. Disk I/O is
> > also not an issue... we don't write to disk when hitting the no-op endpoint
>
> What JVM and OS version are you using? This sounds like it might be an NIO
> issue, or some kind of NIO/TCP tuning issue.

We're running 1.7.0_45-b18 on Amazon Linux
(amzn-ami-hvm-2014.09.1.x86_64-ebs (ami-4b6f650e)).

> > We're currently running on c2-xlarge EC2 instances (8 GB RAM/4 cores) in 7
> > Amazon regions. We've tried tuning keepalive, IO thread count (currently set
> > to 4) and core/max task worker count (40) to no avail. We decided to move
> > our compute instances behind HAProxy, which has improved the TCP failure
> > rates but we are still seeing very low throughput (roughly 200-300
> > requests/sec max)
>
> Is it this low even with the empty endpoint?

We took those measurements with our normal endpoints. We're in the process
of setting up some new tests against a more highly instrumented build to
get some fresh numbers. Will post when we have them.
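
For the fresh numbers, a rough client-side throughput check could look like the sketch below. It is only a sketch under stated assumptions: the class name NoOpThroughput, the /noop path, the 8 client threads and the 2000 requests are all made up for illustration, and the JDK's built-in com.sun.net.httpserver.HttpServer is used purely as a stand-in target so the example runs on its own. It is not Undertow and says nothing about Undertow's performance. The syntax is kept Java 7 compatible.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class NoOpThroughput {

    /** Starts a stand-in no-op HTTP server on a free loopback port (hypothetical test target). */
    public static HttpServer startNoOpServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/noop", new HttpHandler() {
            @Override public void handle(HttpExchange ex) throws IOException {
                byte[] body = "ok".getBytes("UTF-8");
                ex.sendResponseHeaders(200, body.length);
                ex.getResponseBody().write(body);
                ex.close();
            }
        });
        server.start();
        return server;
    }

    /** Sends `total` GET requests from `clients` threads; returns how many came back as HTTP 200. */
    public static int run(final URL url, int clients, int total) {
        ExecutorService pool = Executors.newFixedThreadPool(clients);
        try {
            List<Future<Boolean>> results = new ArrayList<Future<Boolean>>();
            for (int i = 0; i < total; i++) {
                results.add(pool.submit(new Callable<Boolean>() {
                    @Override public Boolean call() throws IOException {
                        HttpURLConnection c = (HttpURLConnection) url.openConnection(Proxy.NO_PROXY);
                        c.setConnectTimeout(2000);
                        c.setReadTimeout(2000);
                        int code = c.getResponseCode();
                        try (InputStream in = c.getInputStream()) {
                            while (in.read() != -1) { /* drain so the connection can be reused */ }
                        }
                        return code == 200;
                    }
                }));
            }
            int ok = 0;
            for (Future<Boolean> f : results) {
                try {
                    if (f.get()) ok++;
                } catch (Exception failed) {
                    // Timeout, refused connection or error status: counted as a failure.
                }
            }
            return ok;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        HttpServer server = startNoOpServer();
        try {
            // Replace this URL with the real no-op endpoint to test an actual server.
            URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/noop");
            int total = 2000;
            long start = System.nanoTime();
            int ok = run(url, 8, total);
            double seconds = (System.nanoTime() - start) / 1e9;
            System.out.printf("%d/%d ok, %.0f requests/sec%n", ok, total, ok / seconds);
        } finally {
            server.stop(0);
        }
    }
}
```

Running client and server on the same box only gives a sanity baseline; a real measurement would drive the load from separate machines, both with and without the proxy in front.
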

> > We are using the 1.1.0-Final version of Undertow. We tried 1.2.0-Beta 6 but
> > after deploying, our servers froze after about 10 minutes so we had to roll
> > back.
>
> Did you happen to get a thread dump or any info from 1.2.0.Beta6 when it
> locked up?

I did but sadly I didn't keep it :-(. As I recall though, it was similar
to the others... IO threads sitting on epoll and task workers parked waiting
for jobs.
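
Since the earlier dump was lost, one way to keep a compact record next time is to tally threads by state and top stack frame from inside the JVM. The sketch below is an illustration only: the class name ThreadStateSummary and its output format are made up, and it sees only the threads of the JVM it runs in, so it would have to be called from within the server process (for example from a debug handler). jstack remains the tool for a full dump.

```java
import java.util.Map;
import java.util.TreeMap;

public class ThreadStateSummary {

    /**
     * Counts threads by "STATE @ class.method" of the top stack frame,
     * so many idle workers collapse into a single line.
     */
    public static Map<String, Integer> summarize(Map<Thread, StackTraceElement[]> stacks) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (Map.Entry<Thread, StackTraceElement[]> e : stacks.entrySet()) {
            StackTraceElement[] frames = e.getValue();
            String top = frames.length == 0
                    ? "(no frames)"
                    : frames[0].getClassName() + "." + frames[0].getMethodName();
            String key = e.getKey().getState() + " @ " + top;
            Integer old = counts.get(key);
            counts.put(key, old == null ? 1 : old + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Snapshot every live thread in this JVM and print one line per group.
        Map<String, Integer> summary = summarize(Thread.getAllStackTraces());
        for (Map.Entry<String, Integer> e : summary.entrySet()) {
            System.out.println(e.getValue() + "  " + e.getKey());
        }
    }
}
```

Logging this summary periodically would show whether the I/O threads are still in epollWait and the workers still parked at the moment a server stops responding.
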
I've upgraded one of our servers with Beta 6 tonight and am running it,
but so far it is performing normally. It's sitting behind HAProxy, which
seems to be smoothing out the traffic, so I may not be able to replicate the
issue until I can get it redeployed without HAProxy in front. Will advise
further when I've done that.
many thanks,
Matt
On Sun, Jan 18, 2015 at 4:57 AM, Stuart Douglas <sdouglas(a)redhat.com> wrote:
> ----- Original Message -----
> > From: "Matt Clarkson" <mclarkson(a)eyeota.com>
> > To: undertow-dev(a)lists.jboss.org
> > Sent: Saturday, 17 January, 2015 3:42:34 PM
> > Subject: [undertow-dev] Help, please: Observing low Undertow throughput under heavy loads
>
> > Hi Undertow Team,
>
> > We recently deployed a large platform for processing high-frequency HTTP
> > signals from around the Internet. We are using Undertow as our embedded HTTP
> > server and are experiencing some serious throughput issues. Hoping you can
> > help us to remedy them. Here are our findings so far.
>
> > - When we dump thread stacks using jstack for a loaded server, we observe
> >   that the I/O threads (1/core) are all blocking at
> >   sun.nio.ch.EPollArrayWrapper.epollWait(Native Method).
> > - At the same time we see large numbers of TCP Timeouts, TCP Listen Drops,
> >   and TCP Overflows, which would seem to imply that we are not processing
> >   connections fast enough.
> > - There are large numbers of sockets in TIME_WAIT status.
> > - TaskWorker threads are underutilized and most are in WAITING state,
> >   sitting at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186).
>
>
> > We've observed this situation even against a no-op endpoint which basically
> > dispatches a handler, so we've eliminated almost all of our code from the
> > equation. We also removed HTTPS traffic to take SSL out of the equation. CPU
> > utilization on the boxes is very low and memory is fine as well. Disk I/O is
> > also not an issue... we don't write to disk when hitting the no-op endpoint
>
> What JVM and OS version are you using? This sounds like it might be an NIO
> issue, or some kind of NIO/TCP tuning issue.
>
> > We're currently running on c2-xlarge EC2 instances (8 GB RAM/4 cores) in 7
> > Amazon regions. We've tried tuning keepalive, IO thread count (currently set
> > to 4) and core/max task worker count (40) to no avail. We decided to move
> > our compute instances behind HAProxy, which has improved the TCP failure
> > rates but we are still seeing very low throughput (roughly 200-300
> > requests/sec max)
>
> Is it this low even with the empty endpoint?
>
>
> > We are using the 1.1.0-Final version of Undertow. We tried 1.2.0-Beta 6 but
> > after deploying, our servers froze after about 10 minutes so we had to roll
> > back.
>
> Did you happen to get a thread dump or any info from 1.2.0.Beta6 when it
> locked up?
>
> Thanks,
> Stuart
>
> > Do you have any tips on other things we can look at?
>
> > Thanks in advance,
>
> > Matt C.
>