[undertow-dev] Loss of perfomance between 1.3.0.Beta9 and 1.3.18.Final

Jim Crossley jim at crossleys.org
Thu Mar 10 10:02:38 EST 2016


Hi Stuart,

Your fix definitely has an impact. Here's what I'm seeing:

Serial ab (unmodified bench.sh), non-persistent connections:

Beta9:                                   25229
1.3.18.Final:                            16007
1.3.18.Final+XNIO 3.3.6.Final-SNAPSHOT:  23153

Parallel ab, non-persistent connections:

Beta9:                                    3740
1.3.18.Final:                             2447
1.3.18.Final+XNIO 3.3.6.Final-SNAPSHOT:   6675

Serial ab, persistent connections (-k option):

Beta9:                                  108847
1.3.18.Final:                           106262
1.3.18.Final+XNIO 3.3.6.Final-SNAPSHOT: 107984

My numbers are lower than yours. My hardware is a quad core Lenovo T430s
running Linux 3.13.0-37-generic x86_64... due to be replaced in May ;)
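
For reference, the change relative to Beta9 can be worked out directly from the three result sets above. A minimal Java sketch follows; the class and method names are illustrative and not part of the benchmark app.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: percent change in requests/second relative to the Beta9 baseline.
final class ThroughputDelta {

    // Returns the change of `measured` relative to `baseline`, in percent.
    static double percentChange(double baseline, double measured) {
        return (measured - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // Each row: {Beta9, 1.3.18.Final, 1.3.18.Final + XNIO 3.3.6.Final-SNAPSHOT},
        // copied from the results listed above.
        Map<String, double[]> runs = new LinkedHashMap<>();
        runs.put("serial, non-persistent", new double[] {25229, 16007, 23153});
        runs.put("parallel, non-persistent", new double[] {3740, 2447, 6675});
        runs.put("serial, persistent (-k)", new double[] {108847, 106262, 107984});

        for (Map.Entry<String, double[]> e : runs.entrySet()) {
            double[] r = e.getValue();
            System.out.printf("%-26s Final: %+6.1f%%  Final+XNIO: %+6.1f%%%n",
                    e.getKey(), percentChange(r[0], r[1]), percentChange(r[0], r[2]));
        }
    }
}
```

On these numbers, 1.3.18.Final with the XNIO snapshot is roughly 8% below Beta9 in the serial non-persistent case and roughly 78% above it in the parallel case, while the persistent-connection results stay within about 2.5% of each other.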

Thanks,
Jim

On Wed, Mar 9, 2016 at 8:49 PM, Stuart Douglas <sdouglas at redhat.com> wrote:

>
> TLDR: There appears to be a problem in XNIO accept handling, and I have
> submitted a PR for it. With that PR applied it should be considerably faster
> than Beta9 (at least on my machine). This problem only affects
> non-persistent connections.
>
>
>
> I have looked into this a bit more and done some testing on my Linux
> machine (4 core CentOS).
>
> Running the benchmark as described with the NO_REQUEST_TIMEOUT set to -1 I
> get:
>
> Beta9:          23031
> 1.3.18.Final:   22578
>
> However adding -k to the ab command to use persistent connections gives a
> different result:
>
> Beta9:         169100
> 1.3.18.Final:  173462
>
> This implies that the issue is only with non-persistent connections.
> Something that did change between Beta9 and 1.3.18.Final is XNIO connection
> handling, which has been changed to use a dedicated accept thread. Looking
> into this I have noticed a potential problem that is causing contention. I
> have submitted a fix at https://github.com/xnio/xnio/pull/94.
>
> Something else to note is that ab was 100% maxed out on a CPU core when
> running these tests. In general ab is not a great load driver as it is
> single threaded.
>
> The results get much more interesting if you modify the bench.sh slightly
> by adding an '&' to the end of the ab line and removing the 'sleep 5' call.
> This makes all 5 ab instances run at once, which is enough to max out the
> CPU on my machine (I also multiplied the number of requests by 5, the
> current number is a bit low).
>
> Using this approach I get the following results (3.3.6.Final-SNAPSHOT
> includes the PR I linked above):
>
> Non persistent connections:
>
> Beta9:                                    8,810
> 1.3.18.Final:                             6,578
> 1.3.18.Final + XNIO 3.3.6.Final-SNAPSHOT: 10,285
>
> When under heavy load the accept thread approach performs much better than
> the old approach (which was what we saw in Specj); however, the XNIO bug was
> causing problems.
>
> Jim, because you saw a much greater performance loss than I did, would you
> be able to re-run some of these tests with my XNIO changes and verify that
> this also fixes the issue for you (ideally using multiple ab instances to
> really load the machine)? If there are still problems I would like to know
> what sort of hardware you are seeing this on.
>
>
> Stuart
>
>
>
>
> ----- Original Message -----
> > From: "Andrig T. Miller" <anmiller at redhat.com>
> > To: "Jim Crossley" <jim at crossleys.org>
> > Cc: "Stuart Douglas" <sdouglas at redhat.com>, undertow-dev at lists.jboss.org
> > Sent: Thursday, 10 March, 2016 8:28:37 AM
> > Subject: Re: [undertow-dev] Loss of perfomance between 1.3.0.Beta9 and
> > 1.3.18.Final
> >
> > Stuart,
> >
> > I'm not sure what this undertow-speed app does, but I'm nervous that tuning
> > for it may undo some of the improvements we made in the performance lab.
> > One thing I would suggest, though, is using perf on Linux and creating a
> > flamegraph for Beta9 and the Final. It's likely that some methods that were
> > being inlined are no longer being inlined, and the flamegraphs and the
> > underlying perf data will show that. In order to use perf you have to set
> > the JVM parameter to preserve frame pointers:
> >
> > -XX:+PreserveFramePointer
> >
> > Ping anyone from the performance team, and they can help you with the setup
> > of perf, and generating the flame graphs.
> >
> > Andy
> >
> > ----- Original Message -----
> >
> > > Hi Stuart,
> >
> > > Toby asked me to try on my machine, and I see an even bigger
> > > throughput disparity. I'm using his test app:
> > > https://github.com/tobias/undertow-speed
> >
> > > In one shell I run 'mvn clean compile exec:java' and in another I run
> > > './bench-avg.sh' and I get this output:
> >
> > > jim at minty ~/apps/undertow-speed $ ./bench-avg.sh
> > > Requests per second: 14717.03 [#/sec] (mean)
> > > Requests per second: 14527.32 [#/sec] (mean)
> > > Requests per second: 14288.32 [#/sec] (mean)
> > > Requests per second: 14375.64 [#/sec] (mean)
> > > Requests per second: 14653.08 [#/sec] (mean)
> > > Average: 14512
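
The averaging step shown in this output can be reproduced from the "Requests per second" lines alone. The following is a minimal Java sketch of that step; it is not the actual bench-avg.sh, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: average the "Requests per second" summary lines printed by ab.
final class AbSummary {

    // Matches e.g. "Requests per second: 14717.03 [#/sec] (mean)".
    private static final Pattern RPS =
            Pattern.compile("Requests per second:\\s+([0-9]+(?:\\.[0-9]+)?)");

    // Returns the mean of all matched values; lines that do not match are ignored.
    static double meanRequestsPerSecond(List<String> lines) {
        double sum = 0;
        int count = 0;
        for (String line : lines) {
            Matcher m = RPS.matcher(line);
            if (m.find()) {
                sum += Double.parseDouble(m.group(1));
                count++;
            }
        }
        if (count == 0) {
            throw new IllegalArgumentException("no 'Requests per second' lines found");
        }
        return sum / count;
    }

    public static void main(String[] args) {
        // The five summary lines from the 1.3.18.Final run above.
        List<String> run = Arrays.asList(
                "Requests per second: 14717.03 [#/sec] (mean)",
                "Requests per second: 14527.32 [#/sec] (mean)",
                "Requests per second: 14288.32 [#/sec] (mean)",
                "Requests per second: 14375.64 [#/sec] (mean)",
                "Requests per second: 14653.08 [#/sec] (mean)");
        System.out.printf("Average: %.1f%n", meanRequestsPerSecond(run));
    }
}
```

The sketch keeps the decimals, so its result can differ in the last digits from the script's "Average" line; the script's figures are consistent with each value being truncated to an integer before averaging.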
> >
> > > Then I run 'mvn clean compile exec:java -Pbeta9' in the first shell
> > > and re-run bench-avg.sh. I get this:
> >
> > > jim at minty ~/apps/undertow-speed $ ./bench-avg.sh
> > > Requests per second: 24069.72 [#/sec] (mean)
> > > Requests per second: 25002.35 [#/sec] (mean)
> > > Requests per second: 24885.36 [#/sec] (mean)
> > > Requests per second: 25261.30 [#/sec] (mean)
> > > Requests per second: 24800.82 [#/sec] (mean)
> > > Average: 24803.4
> >
> > > As you can see, quite a bit more than 7%, beta9 yields almost 70%
> > > better throughput for me!
> >
> > > I set the option you suggested for 1.3.18 and I get slightly better
> > > numbers:
> >
> > > jim at minty ~/apps/undertow-speed $ ./bench-avg.sh
> > > Requests per second: 15749.52 [#/sec] (mean)
> > > Requests per second: 15309.83 [#/sec] (mean)
> > > Requests per second: 15909.15 [#/sec] (mean)
> > > Requests per second: 16228.10 [#/sec] (mean)
> > > Requests per second: 16118.84 [#/sec] (mean)
> > > Average: 15862.6
> >
> > > But nowhere close to beta9.
> >
> > > Can you clone his app and reproduce locally?
> >
> > > Thanks,
> > > Jim
> >
> > > On Tue, Mar 8, 2016 at 6:35 PM, Stuart Douglas <sdouglas at redhat.com> wrote:
> > > > Can you re-run but with the following setting:
> > > >
> > > > .setServerOption(UndertowOptions.NO_REQUEST_TIMEOUT, -1)
> > > >
> > > > The default changed between these versions, so now idle connections will
> > > > eventually be timed out (otherwise browsers can hold connections for a
> > > > very long time, which was causing people to have issues with FD
> > > > exhaustion).
> > > >
> > > > Stuart
> > > >
> > > > ----- Original Message -----
> > > >> From: "Stuart Douglas" <sdouglas at redhat.com>
> > > >> To: "Toby Crawley" <toby at tcrawley.org>
> > > >> Cc: undertow-dev at lists.jboss.org
> > > >> Sent: Monday, 7 March, 2016 10:59:27 AM
> > > > >> Subject: Re: [undertow-dev] Loss of perfomance between 1.3.0.Beta9 and
> > > > >> 1.3.18.Final
> > > >>
> > > >> This is not a known issue, I will investigate.
> > > >>
> > > >> Stuart
> > > >>
> > > >> ----- Original Message -----
> > > >> > From: "Toby Crawley" <toby at tcrawley.org>
> > > >> > To: undertow-dev at lists.jboss.org
> > > >> > Sent: Saturday, 5 March, 2016 7:29:00 AM
> > > >> > Subject: [undertow-dev] Loss of perfomance between 1.3.0.Beta9 and
> > > >> > 1.3.18.Final
> > > >> >
> > > > >> > Is there a known decrease in throughput (measured in req/s)
> > > > >> > between 1.3.0.Beta9 and 1.3.18.Final? We currently ship the former
> > > > >> > with Immutant, and were looking at upgrading to the latter in the next
> > > > >> > release, but noticed a decrease in throughput with a simple Clojure
> > > > >> > benchmark app.
> > > >> >
> > > > >> > I have replicated the basics of our benchmark app in Java[1], and saw a
> > > > >> > decrease in req/s between the two versions of ~7% when testing with ab
> > > > >> > and averaging the output of several runs.
> > > >> >
> > > > >> > Is there something that changed between those versions that is known
> > > > >> > to have reduced performance?
> > > >> >
> > > >> > - Toby
> > > >> >
> > > >> > [1]: https://github.com/tobias/undertow-speed
> > > >> > _______________________________________________
> > > >> > undertow-dev mailing list
> > > >> > undertow-dev at lists.jboss.org
> > > >> > https://lists.jboss.org/mailman/listinfo/undertow-dev
> > > >> >
> >
>