<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Thu, Mar 10, 2016 at 4:29 PM, Stuart Douglas <span dir="ltr"><<a href="mailto:sdouglas@redhat.com" target="_blank">sdouglas@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class=""><br></span><span class="">> Serial ab, persistent connections (-k option):<br>
><br>
> Beta9: 108847<br>
> 1.3.18.Final: 106262<br>
> 1.3.18.Final+XNIO 3.3.6.Final-SNAPSHOT: 107984<br>
<br>
</span>How many requests were you running for this test? In general, to get stable numbers for persistent connections you should use more requests (as each one completes much faster).<br>
<span class="HOEnZb"><font color="#888888"><br></font></span></blockquote><div><br></div><div>The original value in bench.sh: 100K. I just tried 500K and it yielded about the same number.</div><div><br></div><div>Jim</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="HOEnZb"><font color="#888888">
Stuart<br>
</font></span><div class="HOEnZb"><div class="h5"><br>
<br>
><br>
> My numbers are lower than yours. My hardware is a quad-core Lenovo T430s<br>
> running Linux 3.13.0-37-generic x86_64... due to be replaced in May ;)<br>
><br>
> Thanks,<br>
> Jim<br>
><br>
> On Wed, Mar 9, 2016 at 8:49 PM, Stuart Douglas <<a href="mailto:sdouglas@redhat.com">sdouglas@redhat.com</a>> wrote:<br>
><br>
> ><br>
> > TLDR: It looks like there is a problem in XNIO accept handling. I have<br>
> > submitted a PR; with that fix applied it should be considerably faster than Beta9 (at<br>
> > least on my machine). This problem only affects non-persistent connections.<br>
> ><br>
> ><br>
> ><br>
> > I have looked into this a bit more and done some testing on my Linux<br>
> > machine (4-core CentOS).<br>
> ><br>
> > Running the benchmark as described, with NO_REQUEST_TIMEOUT set to -1, I<br>
> > get:<br>
> ><br>
> > Beta9: 23031<br>
> > 1.3.18.Final: 22578<br>
> ><br>
> > However adding -k to the ab command to use persistent connections gives a<br>
> > different result:<br>
> ><br>
> > Beta9: 169100<br>
> > 1.3.18.Final 173462<br>
> ><br>
> > This implies that the issue only affects non-persistent connections.<br>
> > Something that did change between Beta9 and 1.3.18.Final is XNIO connection<br>
> > handling, which has been changed to use a dedicated accept thread. Looking<br>
> > into this I noticed a potential problem that is causing contention. I<br>
> > have submitted a fix at <a href="https://github.com/xnio/xnio/pull/94" rel="noreferrer" target="_blank">https://github.com/xnio/xnio/pull/94</a>.<br>
> ><br>
> > Something else to note is that ab itself was pegged at 100% of a CPU core when<br>
> > running these tests. In general ab is not a great load driver, as it is<br>
> > single-threaded.<br>
> ><br>
> > The results get much more interesting if you modify the bench.sh slightly<br>
> > by adding an '&' to the end of the ab line and removing the 'sleep 5' call.<br>
> > This makes all 5 ab instances run at once, which is enough to max out the<br>
> > CPU on my machine (I also multiplied the number of requests by 5, the<br>
> > current number is a bit low).<br>
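The bench.sh script itself isn't shown in this thread, so the sketch below is a hypothetical version of the parallel variant described above (the URL, request count, and concurrency are assumptions):<br>

```shell
#!/bin/sh
# Hypothetical parallel variant of bench.sh: each ab run is backgrounded
# with '&' instead of being followed by 'sleep 5', so all five instances
# load the server at once.
URL=http://localhost:8080/
for i in 1 2 3 4 5; do
  ab -n 500000 -c 20 "$URL" > "run-$i.log" &   # 5x the original request count
done
wait   # block until every backgrounded ab instance has finished
grep 'Requests per second' run-*.log
```

Adding -k to the ab line turns the same loop into the persistent-connection variant.<br>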
> ><br>
> > Using this approach I get the following results (3.3.6.Final-SNAPSHOT<br>
> > includes the PR I linked above):<br>
> ><br>
> > Non persistent connections:<br>
> ><br>
> > Beta9: 8,810<br>
> > 1.3.18.Final: 6,578<br>
> > 1.3.18.Final + XNIO 3.3.6.Final-SNAPSHOT: 10,285<br>
> ><br>
> > Under heavy load the accept-thread approach performs much better than<br>
> > the old approach (which is what we saw in Specj); however, the XNIO bug was<br>
> > causing problems.<br>
> ><br>
> > Jim, because you saw a much greater performance loss than I did, would you<br>
> > be able to re-run some of these tests with my XNIO changes and verify that<br>
> > this also fixes the issue for you (ideally using multiple ab instances to<br>
> > really load the machine)? If there are still problems I would like to know<br>
> > what sort of hardware you are seeing this on.<br>
> ><br>
> ><br>
> > Stuart<br>
> ><br>
> ><br>
> ><br>
> ><br>
> > ----- Original Message -----<br>
> > > From: "Andrig T. Miller" <<a href="mailto:anmiller@redhat.com">anmiller@redhat.com</a>><br>
> > > To: "Jim Crossley" <<a href="mailto:jim@crossleys.org">jim@crossleys.org</a>><br>
> > > Cc: "Stuart Douglas" <<a href="mailto:sdouglas@redhat.com">sdouglas@redhat.com</a>>, <a href="mailto:undertow-dev@lists.jboss.org">undertow-dev@lists.jboss.org</a><br>
> > > Sent: Thursday, 10 March, 2016 8:28:37 AM<br>
> > > Subject: Re: [undertow-dev] Loss of performance between 1.3.0.Beta9 and<br>
> > 1.3.18.Final<br>
> > ><br>
> > > Stuart,<br>
> > ><br>
> > > I'm not sure what this undertow-speed app does, but I'm nervous that<br>
> > tuning<br>
> > > for it may undo some of the improvements we made in the performance lab.<br>
> > > One thing I would suggest though, is using perf on Linux and creating a<br>
> > > flamegraph for Beta9 and the Final. It's likely that some methods that<br>
> > were<br>
> > > being inlined are no longer being inlined, and the flamegraphs and the<br>
> > > underlying perf data will show that. In order to use perf you have to set<br>
> > > the JVM parameter to preserve frame pointers:<br>
> > ><br>
> > > -XX:+PreserveFramePointer<br>
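As a rough illustration (the process lookup and the FlameGraph checkout path below are assumptions; stackcollapse-perf.pl and flamegraph.pl come from Brendan Gregg's FlameGraph repository, and resolving JIT-compiled Java symbols additionally needs a perf map agent), the workflow looks something like:<br>

```shell
# Sample on-CPU stacks from the benchmark JVM for 30 seconds at 99 Hz.
# Frame walking only works if the JVM was started with
# -XX:+PreserveFramePointer.
perf record -F 99 -g -p "$(pgrep -f undertow-speed)" -- sleep 30

# Fold the captured stacks and render an SVG flame graph.
perf script \
  | ./FlameGraph/stackcollapse-perf.pl \
  | ./FlameGraph/flamegraph.pl > flame.svg
```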
> > ><br>
> > > Ping anyone from the performance team, and they can help you with the<br>
> > setup<br>
> > > of perf and with generating the flame graphs.<br>
> > ><br>
> > > Andy<br>
> > ><br>
> > > ----- Original Message -----<br>
> > ><br>
> > > > Hi Stuart,<br>
> > ><br>
> > > > Toby asked me to try on my machine, and I see an even bigger<br>
> > > > throughput disparity. I'm using his test app:<br>
> > > > <a href="https://github.com/tobias/undertow-speed" rel="noreferrer" target="_blank">https://github.com/tobias/undertow-speed</a><br>
> > ><br>
> > > > In one shell I run 'mvn clean compile exec:java' and in another I run<br>
> > > > './bench-avg.sh' and I get this output:<br>
> > ><br>
> > > > jim@minty ~/apps/undertow-speed $ ./bench-avg.sh<br>
> > > > Requests per second: 14717.03 [#/sec] (mean)<br>
> > > > Requests per second: 14527.32 [#/sec] (mean)<br>
> > > > Requests per second: 14288.32 [#/sec] (mean)<br>
> > > > Requests per second: 14375.64 [#/sec] (mean)<br>
> > > > Requests per second: 14653.08 [#/sec] (mean)<br>
> > > > Average: 14512<br>
> > ><br>
> > > > Then I run 'mvn clean compile exec:java -Pbeta9' in the first shell<br>
> > > > and re-run bench-avg.sh. I get this:<br>
> > ><br>
> > > > jim@minty ~/apps/undertow-speed $ ./bench-avg.sh<br>
> > > > Requests per second: 24069.72 [#/sec] (mean)<br>
> > > > Requests per second: 25002.35 [#/sec] (mean)<br>
> > > > Requests per second: 24885.36 [#/sec] (mean)<br>
> > > > Requests per second: 25261.30 [#/sec] (mean)<br>
> > > > Requests per second: 24800.82 [#/sec] (mean)<br>
> > > > Average: 24803.4<br>
> > ><br>
> > > > As you can see, that is quite a bit more than 7%: Beta9 yields almost 70%<br>
> > > > better throughput for me!<br>
> > ><br>
> > > > I set the option you suggested for 1.3.18 and I get slightly better<br>
> > > > numbers:<br>
> > ><br>
> > > > jim@minty ~/apps/undertow-speed $ ./bench-avg.sh<br>
> > > > Requests per second: 15749.52 [#/sec] (mean)<br>
> > > > Requests per second: 15309.83 [#/sec] (mean)<br>
> > > > Requests per second: 15909.15 [#/sec] (mean)<br>
> > > > Requests per second: 16228.10 [#/sec] (mean)<br>
> > > > Requests per second: 16118.84 [#/sec] (mean)<br>
> > > > Average: 15862.6<br>
> > ><br>
> > > > But still nowhere close to Beta9.<br>
> > ><br>
> > > > Can you clone his app and reproduce locally?<br>
> > ><br>
> > > > Thanks,<br>
> > > > Jim<br>
> > ><br>
> > > > On Tue, Mar 8, 2016 at 6:35 PM, Stuart Douglas <<a href="mailto:sdouglas@redhat.com">sdouglas@redhat.com</a>><br>
> > wrote:<br>
> > > > > Can you re-run but with the following setting:<br>
> > > > ><br>
> > > > > .setServerOption(UndertowOptions.NO_REQUEST_TIMEOUT, -1)<br>
> > > > ><br>
> > > > > The default changed between these versions, so now idle connections<br>
> > will<br>
> > > > > eventually be timed out (otherwise browsers can hold connections for<br>
> > a<br>
> > > > > very long time, which was causing people to have issues with FD<br>
> > > > > exhaustion).<br>
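For reference, a minimal 1.3.x server with that option applied might look like the sketch below; the handler body and port are illustrative, and only the setServerOption call is taken from this thread:<br>

```java
import io.undertow.Undertow;
import io.undertow.UndertowOptions;
import io.undertow.util.Headers;

public class BenchServer {
    public static void main(String[] args) {
        // NO_REQUEST_TIMEOUT = -1 restores the Beta9 behaviour: the server
        // never times out idle persistent connections.
        Undertow server = Undertow.builder()
                .addHttpListener(8080, "localhost")
                .setServerOption(UndertowOptions.NO_REQUEST_TIMEOUT, -1)
                .setHandler(exchange -> {
                    exchange.getResponseHeaders()
                            .put(Headers.CONTENT_TYPE, "text/plain");
                    exchange.getResponseSender().send("hello");
                })
                .build();
        server.start();
    }
}
```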
> > > > ><br>
> > > > > Stuart<br>
> > > > ><br>
> > > > > ----- Original Message -----<br>
> > > > >> From: "Stuart Douglas" <<a href="mailto:sdouglas@redhat.com">sdouglas@redhat.com</a>><br>
> > > > >> To: "Toby Crawley" <<a href="mailto:toby@tcrawley.org">toby@tcrawley.org</a>><br>
> > > > >> Cc: <a href="mailto:undertow-dev@lists.jboss.org">undertow-dev@lists.jboss.org</a><br>
> > > > >> Sent: Monday, 7 March, 2016 10:59:27 AM<br>
> > > > >> Subject: Re: [undertow-dev] Loss of performance between 1.3.0.Beta9<br>
> > and<br>
> > > > >> 1.3.18.Final<br>
> > > > >><br>
> > > > >> This is not a known issue, I will investigate.<br>
> > > > >><br>
> > > > >> Stuart<br>
> > > > >><br>
> > > > >> ----- Original Message -----<br>
> > > > >> > From: "Toby Crawley" <<a href="mailto:toby@tcrawley.org">toby@tcrawley.org</a>><br>
> > > > >> > To: <a href="mailto:undertow-dev@lists.jboss.org">undertow-dev@lists.jboss.org</a><br>
> > > > >> > Sent: Saturday, 5 March, 2016 7:29:00 AM<br>
> > > > >> > Subject: [undertow-dev] Loss of performance between 1.3.0.Beta9 and<br>
> > > > >> > 1.3.18.Final<br>
> > > > >> ><br>
> > > > >> > Is there a known decrease in throughput (measured in req/s)<br>
> > > > >> > between 1.3.0.Beta9 and 1.3.18.Final? We currently ship the former<br>
> > > > >> > with Immutant, and were looking at upgrading to the latter in the<br>
> > next<br>
> > > > >> > release, but noticed a decrease in throughput with a simple<br>
> > Clojure<br>
> > > > >> > benchmark app.<br>
> > > > >> ><br>
> > > > >> > I have replicated the basics of our benchmark app in Java[1], and<br>
> > saw<br>
> > > > >> > a<br>
> > > > >> > decrease in req/s between the two versions of ~7% when testing<br>
> > with ab<br>
> > > > >> > and averaging the output of several runs.<br>
> > > > >> ><br>
> > > > >> > Is there something that changed between those versions that is<br>
> > known<br>
> > > > >> > to have reduced performance?<br>
> > > > >> ><br>
> > > > >> > - Toby<br>
> > > > >> ><br>
> > > > >> > [1]: <a href="https://github.com/tobias/undertow-speed" rel="noreferrer" target="_blank">https://github.com/tobias/undertow-speed</a><br>
> > > > >> > _______________________________________________<br>
> > > > >> > undertow-dev mailing list<br>
> > > > >> > <a href="mailto:undertow-dev@lists.jboss.org">undertow-dev@lists.jboss.org</a><br>
> > > > >> > <a href="https://lists.jboss.org/mailman/listinfo/undertow-dev" rel="noreferrer" target="_blank">https://lists.jboss.org/mailman/listinfo/undertow-dev</a><br>
> > > > >> ><br>
> > ><br>
> ><br>
><br>
</div></div></blockquote></div><br></div></div>