<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Thu, Mar 10, 2016 at 4:29 PM, Stuart Douglas <span dir="ltr"><<a href="mailto:sdouglas@redhat.com" target="_blank">sdouglas@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class=""><br></span><span class="">> Serial ab, persistent connections (-k option):<br>
><br>
> Beta9: 108847<br>
> 1.3.18.Final: 106262<br>
> 1.3.18.Final+XNIO 3.3.6.Final-SNAPSHOT: 107984<br>
<br>
</span>How many requests were you running for this test? In general, to get stable numbers for persistent connections you should use more requests (as each one completes much faster).<br>
<span class="HOEnZb"><font color="#888888"><br></font></span></blockquote><div><br></div><div>The original value in bench.sh: 100K. I just tried 500K and it yielded about the same number.</div><div><br></div><div>Jim</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="HOEnZb"><font color="#888888">
Stuart<br>
</font></span><div class="HOEnZb"><div class="h5"><br>
<br>
><br>
> My numbers are lower than yours. My hardware is a quad-core Lenovo T430s<br>
> running Linux 3.13.0-37-generic x86_64... due to be replaced in May ;)<br>
><br>
> Thanks,<br>
> Jim<br>
><br>
> On Wed, Mar 9, 2016 at 8:49 PM, Stuart Douglas <<a href="mailto:sdouglas@redhat.com">sdouglas@redhat.com</a>> wrote:<br>
><br>
> ><br>
> > TLDR: It looks like there is a problem in XNIO accept handling. I have<br>
> > submitted a PR; with that fix applied it should be considerably faster than Beta9 (at<br>
> > least on my machine). This problem only affects non-persistent connections.<br>
> ><br>
> ><br>
> ><br>
> > I have looked into this a bit more and done some testing on my Linux<br>
> > machine (4-core CentOS).<br>
> ><br>
> > Running the benchmark as described, with NO_REQUEST_TIMEOUT set to -1, I<br>
> > get:<br>
> ><br>
> > Beta9: 23031<br>
> > 1.3.18.Final: 22578<br>
> ><br>
> > However adding -k to the ab command to use persistent connections gives a<br>
> > different result:<br>
> ><br>
> > Beta9: 169100<br>
> > 1.3.18.Final 173462<br>
> ><br>
> > This implies that the issue only affects non-persistent connections.<br>
> > Something that did change between Beta9 and 1.3.18.Final is XNIO connection<br>
> > handling, which has been changed to use a dedicated accept thread. Looking<br>
> > into this I noticed a potential problem that is causing contention. I<br>
> > have submitted a fix at <a href="https://github.com/xnio/xnio/pull/94" rel="noreferrer" target="_blank">https://github.com/xnio/xnio/pull/94</a>.<br>
> ><br>
> > Something else to note is that ab itself was pegged at 100% of a CPU core when<br>
> > running these tests. In general ab is not a great load driver, as it is<br>
> > single-threaded.<br>
> ><br>
> > The results get much more interesting if you modify the bench.sh slightly<br>
> > by adding an '&' to the end of the ab line and removing the 'sleep 5' call.<br>
> > This makes all 5 ab instances run at once, which is enough to max out the<br>
> > CPU on my machine (I also multiplied the number of requests by 5, the<br>
> > current number is a bit low).<br>
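The bench.sh script itself isn't shown in this thread, so the sketch below is a hypothetical version of the parallel variant described above (the URL, request count, and concurrency are assumptions):<br>

```shell
#!/bin/sh
# Hypothetical parallel variant of bench.sh: each ab run is backgrounded
# with '&' instead of being followed by 'sleep 5', so all five instances
# load the server at once.
URL=http://localhost:8080/
for i in 1 2 3 4 5; do
  ab -n 500000 -c 20 "$URL" > "run-$i.log" &   # 5x the original request count
done
wait   # block until every backgrounded ab instance has finished
grep 'Requests per second' run-*.log
```

Adding -k to the ab line turns the same loop into the persistent-connection variant.<br>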
> ><br>
> > Using this approach I get the following results (3.3.6.Final-SNAPSHOT<br>
> > includes the PR I linked above):<br>
> ><br>
> > Non persistent connections:<br>
> ><br>
> > Beta9: 8,810<br>
> > 1.3.18.Final: 6,578<br>
> > 1.3.18.Final + XNIO 3.3.6.Final-SNAPSHOT: 10,285<br>
> ><br>
> > Under heavy load the accept-thread approach performs much better than<br>
> > the old approach (which is what we saw in Specj); however, the XNIO bug was<br>
> > causing problems.<br>
> ><br>
> > Jim, because you saw a much greater performance loss than I did, would you<br>
> > be able to re-run some of these tests with my XNIO changes and verify that<br>
> > this also fixes the issue for you (ideally using multiple ab instances to<br>
> > really load the machine)? If there are still problems I would like to know<br>
> > what sort of hardware you are seeing this on.<br>
> ><br>
> ><br>
> > Stuart<br>
> ><br>
> ><br>
> ><br>
> ><br>
> > ----- Original Message -----<br>
> > > From: "Andrig T. Miller" <<a href="mailto:anmiller@redhat.com">anmiller@redhat.com</a>><br>
> > > To: "Jim Crossley" <<a href="mailto:jim@crossleys.org">jim@crossleys.org</a>><br>
> > > Cc: "Stuart Douglas" <<a href="mailto:sdouglas@redhat.com">sdouglas@redhat.com</a>>, <a href="mailto:undertow-dev@lists.jboss.org">undertow-dev@lists.jboss.org</a><br>
> > > Sent: Thursday, 10 March, 2016 8:28:37 AM<br>
> > > Subject: Re: [undertow-dev] Loss of performance between 1.3.0.Beta9 and<br>
> > 1.3.18.Final<br>
> > ><br>
> > > Stuart,<br>
> > ><br>
> > > I'm not sure what this undertow-speed app does, but I'm nervous that<br>
> > tuning<br>
> > > for it may undo some of the improvements we made in the performance lab.<br>
> > > One thing I would suggest though, is using perf on Linux and creating a<br>
> > > flamegraph for Beta9 and the Final. It's likely that some methods that<br>
> > were<br>
> > > being inlined are no longer being inlined, and the flamegraphs and the<br>
> > > underlying perf data will show that. In order to use perf you have to set<br>
> > > the JVM parameter to preserve frame pointers:<br>
> > ><br>
> > > -XX:+PreserveFramePointer<br>
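As a rough illustration (the process lookup and the FlameGraph checkout path below are assumptions; stackcollapse-perf.pl and flamegraph.pl come from Brendan Gregg's FlameGraph repository, and resolving JIT-compiled Java symbols additionally needs a perf map agent), the workflow looks something like:<br>

```shell
# Sample on-CPU stacks from the benchmark JVM for 30 seconds at 99 Hz.
# Frame walking only works if the JVM was started with
# -XX:+PreserveFramePointer.
perf record -F 99 -g -p "$(pgrep -f undertow-speed)" -- sleep 30

# Fold the captured stacks and render an SVG flame graph.
perf script \
  | ./FlameGraph/stackcollapse-perf.pl \
  | ./FlameGraph/flamegraph.pl > flame.svg
```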
> > ><br>
> > > Ping anyone from the performance team, and they can help you with the<br>
> > setup<br>
> > > of perf and with generating the flame graphs.<br>
> > ><br>
> > > Andy<br>
> > ><br>
> > > ----- Original Message -----<br>
> > ><br>
> > > > Hi Stuart,<br>
> > ><br>
> > > > Toby asked me to try on my machine, and I see an even bigger<br>
> > > > throughput disparity. I'm using his test app:<br>
> > > > <a href="https://github.com/tobias/undertow-speed" rel="noreferrer" target="_blank">https://github.com/tobias/undertow-speed</a><br>
> > ><br>
> > > > In one shell I run 'mvn clean compile exec:java' and in another I run<br>
> > > > './bench-avg.sh' and I get this output:<br>
> > ><br>
> > > > jim@minty ~/apps/undertow-speed $ ./bench-avg.sh<br>
> > > > Requests per second: 14717.03 [#/sec] (mean)<br>
> > > > Requests per second: 14527.32 [#/sec] (mean)<br>
> > > > Requests per second: 14288.32 [#/sec] (mean)<br>
> > > > Requests per second: 14375.64 [#/sec] (mean)<br>
> > > > Requests per second: 14653.08 [#/sec] (mean)<br>
> > > > Average: 14512<br>
> > ><br>
> > > > Then I run 'mvn clean compile exec:java -Pbeta9' in the first shell<br>
> > > > and re-run bench-avg.sh. I get this:<br>
> > ><br>
> > > > jim@minty ~/apps/undertow-speed $ ./bench-avg.sh<br>
> > > > Requests per second: 24069.72 [#/sec] (mean)<br>
> > > > Requests per second: 25002.35 [#/sec] (mean)<br>
> > > > Requests per second: 24885.36 [#/sec] (mean)<br>
> > > > Requests per second: 25261.30 [#/sec] (mean)<br>
> > > > Requests per second: 24800.82 [#/sec] (mean)<br>
> > > > Average: 24803.4<br>
> > ><br>
> > > > As you can see, that is quite a bit more than 7%: Beta9 yields almost 70%<br>
> > > > better throughput for me!<br>
> > ><br>
> > > > I set the option you suggested for 1.3.18 and I get slightly better<br>
> > > > numbers:<br>
> > ><br>
> > > > jim@minty ~/apps/undertow-speed $ ./bench-avg.sh<br>
> > > > Requests per second: 15749.52 [#/sec] (mean)<br>
> > > > Requests per second: 15309.83 [#/sec] (mean)<br>
> > > > Requests per second: 15909.15 [#/sec] (mean)<br>
> > > > Requests per second: 16228.10 [#/sec] (mean)<br>
> > > > Requests per second: 16118.84 [#/sec] (mean)<br>
> > > > Average: 15862.6<br>
> > ><br>
> > > > But still nowhere close to Beta9.<br>
> > ><br>
> > > > Can you clone his app and reproduce locally?<br>
> > ><br>
> > > > Thanks,<br>
> > > > Jim<br>
> > ><br>
> > > > On Tue, Mar 8, 2016 at 6:35 PM, Stuart Douglas <<a href="mailto:sdouglas@redhat.com">sdouglas@redhat.com</a>><br>
> > wrote:<br>
> > > > > Can you re-run but with the following setting:<br>
> > > > ><br>
> > > > > .setServerOption(UndertowOptions.NO_REQUEST_TIMEOUT, -1)<br>
> > > > ><br>
> > > > > The default changed between these versions, so now idle connections<br>
> > will<br>
> > > > > eventually be timed out (otherwise browsers can hold connections for<br>
> > a<br>
> > > > > very long time, which was causing people to have issues with FD<br>
> > > > > exhaustion).<br>
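For reference, a minimal 1.3.x server with that option applied might look like the sketch below; the handler body and port are illustrative, and only the setServerOption call is taken from this thread:<br>

```java
import io.undertow.Undertow;
import io.undertow.UndertowOptions;
import io.undertow.util.Headers;

public class BenchServer {
    public static void main(String[] args) {
        // NO_REQUEST_TIMEOUT = -1 restores the Beta9 behaviour: the server
        // never times out idle persistent connections.
        Undertow server = Undertow.builder()
                .addHttpListener(8080, "localhost")
                .setServerOption(UndertowOptions.NO_REQUEST_TIMEOUT, -1)
                .setHandler(exchange -> {
                    exchange.getResponseHeaders()
                            .put(Headers.CONTENT_TYPE, "text/plain");
                    exchange.getResponseSender().send("hello");
                })
                .build();
        server.start();
    }
}
```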
> > > > ><br>
> > > > > Stuart<br>
> > > > ><br>
> > > > > ----- Original Message -----<br>
> > > > >> From: "Stuart Douglas" <<a href="mailto:sdouglas@redhat.com">sdouglas@redhat.com</a>><br>
> > > > >> To: "Toby Crawley" <<a href="mailto:toby@tcrawley.org">toby@tcrawley.org</a>><br>
> > > > >> Cc: <a href="mailto:undertow-dev@lists.jboss.org">undertow-dev@lists.jboss.org</a><br>
> > > > >> Sent: Monday, 7 March, 2016 10:59:27 AM<br>
> > > > >> Subject: Re: [undertow-dev] Loss of performance between 1.3.0.Beta9<br>
> > and<br>
> > > > >> 1.3.18.Final<br>
> > > > >><br>
> > > > >> This is not a known issue, I will investigate.<br>
> > > > >><br>
> > > > >> Stuart<br>
> > > > >><br>
> > > > >> ----- Original Message -----<br>
> > > > >> > From: "Toby Crawley" <<a href="mailto:toby@tcrawley.org">toby@tcrawley.org</a>><br>
> > > > >> > To: <a href="mailto:undertow-dev@lists.jboss.org">undertow-dev@lists.jboss.org</a><br>
> > > > >> > Sent: Saturday, 5 March, 2016 7:29:00 AM<br>
> > > > >> > Subject: [undertow-dev] Loss of performance between 1.3.0.Beta9 and<br>
> > > > >> > 1.3.18.Final<br>
> > > > >> ><br>
> > > > >> > Is there a known decrease in throughput (measured in req/s)<br>
> > > > >> > between 1.3.0.Beta9 and 1.3.18.Final? We currently ship the former<br>
> > > > >> > with Immutant, and were looking at upgrading to the latter in the<br>
> > next<br>
> > > > >> > release, but noticed a decrease in throughput with a simple<br>
> > Clojure<br>
> > > > >> > benchmark app.<br>
> > > > >> ><br>
> > > > >> > I have replicated the basics of our benchmark app in Java[1], and<br>
> > saw<br>
> > > > >> > a<br>
> > > > >> > decrease in req/s between the two versions of ~7% when testing<br>
> > with ab<br>
> > > > >> > and averaging the output of several runs.<br>
> > > > >> ><br>
> > > > >> > Is there something that changed between those versions that is<br>
> > known<br>
> > > > >> > to have reduced performance?<br>
> > > > >> ><br>
> > > > >> > - Toby<br>
> > > > >> ><br>
> > > > >> > [1]: <a href="https://github.com/tobias/undertow-speed" rel="noreferrer" target="_blank">https://github.com/tobias/undertow-speed</a><br>
> > > > >> > _______________________________________________<br>
> > > > >> > undertow-dev mailing list<br>
> > > > >> > <a href="mailto:undertow-dev@lists.jboss.org">undertow-dev@lists.jboss.org</a><br>
> > > > >> > <a href="https://lists.jboss.org/mailman/listinfo/undertow-dev" rel="noreferrer" target="_blank">https://lists.jboss.org/mailman/listinfo/undertow-dev</a><br>
> > > > >> ><br>
> > ><br>
> ><br>
><br>
</div></div></blockquote></div><br></div></div>