[JBoss JIRA] (MODCLUSTER-372) Number of registered contexts negatively affects mod_cluster performance
by Radoslav Husar (JIRA)
[ https://issues.jboss.org/browse/MODCLUSTER-372?page=com.atlassian.jira.pl... ]
Radoslav Husar reassigned MODCLUSTER-372:
-----------------------------------------
Assignee: Jean-Frederic Clere (was: Michal Babacek)
> Number of registered contexts negatively affects mod_cluster performance
> ------------------------------------------------------------------------
>
> Key: MODCLUSTER-372
> URL: https://issues.jboss.org/browse/MODCLUSTER-372
> Project: mod_cluster
> Issue Type: Bug
> Affects Versions: 1.2.4.Final, 1.2.6.Final
> Environment: RHEL6, other platforms are to be confirmed.
> Reporter: Michal Babacek
> Assignee: Jean-Frederic Clere
> Labels: mod_cluster, performace
> Fix For: 1.3.0.Alpha2, 1.2.8.Final
>
> Attachments: 4-workers-1-context-balancer-cpu.png, 4-workers-61-context-accessing-1-balancer-cpu.png, 9-workers-1-context-balancer-cpu.png, 9-workers-31-context-accessing-1-balancer-cpu.png, 9-workers-31-context-round-robin-balancer-cpu.png, callgrind.zip, httpd.conf
>
>
> There is a performance concern regarding CPU usage on the Apache HTTP Server with mod_cluster set up as a load balancer. Perf tests revealed that the major variable affecting the CPU usage is the overall number of registered contexts. It's noteworthy that these contexts actually don't need to be accessed at all, it's enough that they are present.
> The first test [9 workers, 31 contexts, round robin|https://issues.jboss.org/browse/MODCLUSTER-372#9workers,31contexts,...] depicts a behavior where all contexts are being accessed in a round robin fashion. Note the CPU usage. If we compare it to the [9 workers, 1 context|https://issues.jboss.org/browse/MODCLUSTER-372#9workers,1context] test, it becomes apparent that there is something wrong with the CPU usage. As it is clear from the [9 workers, 31 contexts, access 1|https://issues.jboss.org/browse/MODCLUSTER-372#9workers,31contexts,acce...] test, accessing only one of these contexts doesn't help much.
> The last two tests, [4 workers, 61 contexts, accessing 1|https://issues.jboss.org/browse/MODCLUSTER-372#4workers,61contexts,acce...] and [4 workers, 1 context|https://issues.jboss.org/browse/MODCLUSTER-372#4workers,1context] confirm the results, environment differs only in number of nodes and number of contexts.
> [^httpd.conf] attached, stay tuned for some profiler outputs...
> h3. 9 workers, 31 contexts, round robin
> !9-workers-31-context-round-robin-balancer-cpu.png|thumbnail!
> ||Nodes||Sessions||Ses.Active||Samples||samples/s||mean resp. ms||max resp. ms||conn. errors||valid samples||%||
> |9|150|150|8843|147.4|15|60|0|8843|100%|
> |9|250|250|14422|240.3|37|215|0|14422|100%|
> |9|350|350|17551|292.5|196|443|0|17551|100%|
> |9|450|450|21303|355.0|269|603|0|21303|100%|
> |9|550|550|24818|413.6|325|686|0|24818|100%|
> |9|650|650|26804|446.7|469|800|0|26804|100%|
> |9|750|750|29930|498.8|507|985|0|29930|100%|
> |9|850|850|30665|511.0|665|1185|0|30665|100%|
> |9|950|950|34421|573.6|647|1316|0|34421|100%|
> |9|1050|1050|35067|584.4|800|1487|0|35067|100%|
> |9|1150|1150|36024|600.4|925|1612|0|36024|100%|
> |9|1250|1250|36635|610.5|1030|1815|0|36635|100%|
> |9|1350|1350|38395|639.9|1096|1942|0|38395|100%|
> |9|1450|1450|39713|661.8|1170|2466|0|39713|100%|
> |9|1550|1550|39455|657.5|1392|2340|0|39455|100%|
> |9|1650|1650|39849|664.1|1465|4240|0|39849|100%|
> |9|1750|1750|42435|707.2|1454|6511|0|42435|100%|
> |9|1850|1850|44714|745.2|1498|4866|0|44714|100%|
> |9|1950|1911|46072|767.8|1554|43016|0|46072|100%|
> |9|2050|1911|44496|741.5|1617|4953|39|44457|99%|
> |9|2150|1911|43715|728.5|1632|5348|125|43590|99%|
> |9|2250|1911|41112|685.1|1764|9800|155|40957|99%|
> h3. 9 workers, 31 contexts, accessing only one of them
> !9-workers-31-context-accessing-1-balancer-cpu.png|thumbnail!
> ||Nodes||Sessions||Ses.Active||Samples||samples/s||mean resp. ms||max resp. ms||conn. errors||valid samples||%||
> |9|150|150|8954|149.2|4|42|0|8954|100%|
> |9|250|250|14897|248.3|7|33|0|14897|100%|
> |9|350|350|20784|346.4|8|68|0|20784|100%|
> |9|450|450|26748|445.8|8|69|0|26748|100%|
> |9|550|550|32553|542.5|11|92|0|32553|100%|
> |9|650|650|38558|642.6|10|60|0|38558|100%|
> |9|750|750|43571|726.1|29|329|0|43571|100%|
> |9|850|850|46133|768.8|99|494|0|46133|100%|
> |9|950|950|50854|847.5|120|501|0|50854|100%|
> |9|1050|1050|54451|907.4|154|584|0|54451|100%|
> |9|1150|1150|59961|999.3|138|674|0|59961|100%|
> |9|1250|1250|62567|1,042.6|198|675|0|62567|100%|
> |9|1350|1350|61939|1,032.2|301|799|0|61939|100%|
> |9|1450|1450|67920|1,131.9|276|844|0|67920|100%|
> |9|1550|1550|73151|1,219.1|261|861|0|73151|100%|
> |9|1650|1650|73937|1,232.2|332|955|0|73937|100%|
> |9|1750|1750|73516|1,225.2|423|1046|0|73516|100%|
> |9|1850|1850|72556|1,209.1|515|1264|0|72556|100%|
> |9|1950|1911|78613|1,310.1|454|50273|0|78613|100%|
> |9|2050|1911|80141|1,335.6|431|1225|39|80102|99%|
> |9|2150|1911|76979|1,282.9|490|1338|127|76852|99%|
> |9|2250|1911|78048|1,300.7|464|1305|136|77912|99%|
> h3. 9 workers, 1 context
> !9-workers-1-context-balancer-cpu.png|thumbnail!
> ||Nodes||Sessions||Ses.Active||Samples||samples/s||mean resp. ms||max resp. ms||conn. errors||valid samples||%||
> |9|150|150|8965|149.4|2|12|0|8965|100%|
> |9|250|250|14965|249.4|2|22|0|14965|100%|
> |9|350|350|20950|349.1|2|23|0|20950|100%|
> |9|450|450|26941|449.0|2|26|0|26941|100%|
> |9|550|550|32937|548.9|1|26|0|32937|100%|
> |9|650|650|38900|648.3|1|19|0|38900|100%|
> |9|750|750|44918|748.6|1|11|0|44918|100%|
> |9|850|850|50902|848.3|2|22|0|50902|100%|
> |9|950|950|56878|947.9|1|14|0|56878|100%|
> |9|1050|1050|62874|1,047.8|2|12|0|62874|100%|
> |9|1150|1150|68845|1,147.3|2|99|0|68845|100%|
> |9|1250|1250|74851|1,247.4|2|103|0|74851|100%|
> |9|1350|1350|80826|1,347.0|2|100|0|80826|100%|
> |9|1450|1450|86806|1,446.7|2|19|0|86806|100%|
> |9|1550|1550|92817|1,546.8|2|52|0|92817|100%|
> |9|1650|1650|98774|1,646.1|2|18|0|98774|100%|
> |9|1750|1750|104755|1,745.8|2|18|0|104755|100%|
> |9|1850|1850|110734|1,845.4|2|20|0|110734|100%|
> |9|1950|1910|113419|1,890.2|9|41855|0|113419|100%|
> |9|2050|1911|114437|1,907.1|2|77962|39|114397|99%|
> |9|2150|1911|114481|1,907.9|2|15|128|114353|99%|
> |9|2250|1911|114545|1,908.9|2|24|144|114401|99%|
> h3. 4 workers, 61 contexts, accessing only one of them
> !4-workers-61-context-accessing-1-balancer-cpu.png|thumbnail!
> ||Nodes||Sessions||Ses.Active||Samples||samples/s||mean resp. ms||max resp. ms||conn. errors||valid samples||%||
> |4|500|500|29796|496.6|6|52|0|29796|100%|
> |4|650|650|38706|645.0|7|149|0|38706|100%|
> |4|800|800|47585|793.0|8|129|0|47585|100%|
> |4|950|950|54467|907.7|43|377|0|54467|100%|
> |4|1100|1100|62500|1,041.6|54|396|0|62500|100%|
> |4|1250|1250|69446|1,157.3|81|512|0|69446|100%|
> |4|1400|1400|76217|1,270.2|97|517|0|76217|100%|
> |4|1550|1550|80216|1,336.8|152|810|0|80216|100%|
> |4|1700|1700|80797|1,346.5|271|864|0|80797|100%|
> |4|1850|1850|94172|1,569.3|182|822|0|94172|100%|
> |4|2000|1916|91014|1,516.8|253|48650|0|91014|100%|
> |4|2150|1916|95852|1,597.4|205|848|83|95769|99%|
> h3. 4 workers, 1 context
> !4-workers-1-context-balancer-cpu.png|thumbnail!
> ||Nodes||Sessions||Ses.Active||Samples||samples/s||mean resp. ms||max resp. ms||conn. errors||valid samples||%||
> |4|500|500|29922|498.7|2|39|0|29922|100%|
> |4|650|650|38923|648.7|1|39|0|38923|100%|
> |4|800|800|47916|798.5|1|17|0|47916|100%|
> |4|950|950|56896|948.2|1|17|0|56896|100%|
> |4|1100|1100|65889|1,098.1|1|115|0|65889|100%|
> |4|1250|1250|74874|1,247.8|1|101|0|74874|100%|
> |4|1400|1400|83818|1,396.8|1|17|0|83818|100%|
> |4|1550|1550|92830|1,547.0|1|17|0|92830|100%|
> |4|1700|1700|101805|1,696.6|1|11|0|101805|100%|
> |4|1850|1850|110785|1,846.3|1|11|0|110785|100%|
> |4|2000|1916|113747|1,895.6|10|53108|0|113747|100%|
> |4|2150|1916|114825|1,913.6|1|24|83|114742|99%|
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
10 years, 11 months
[JBoss JIRA] (MODCLUSTER-381) [Patch] Stability and portability issues in native code
by Radoslav Husar (JIRA)
[ https://issues.jboss.org/browse/MODCLUSTER-381?page=com.atlassian.jira.pl... ]
Radoslav Husar updated MODCLUSTER-381:
--------------------------------------
Issue Type: Bug (was: Patch)
Fix Version/s: 1.3.0.Alpha2
1.2.8.Final
Indeed great contribution. By the way, you can also open a pull request on github (here https://github.com/modcluster/mod_cluster ) to get all the git/github perks.
The issues you outlined are major, marking the issue as "bug".
> [Patch] Stability and portability issues in native code
> -------------------------------------------------------
>
> Key: MODCLUSTER-381
> URL: https://issues.jboss.org/browse/MODCLUSTER-381
> Project: mod_cluster
> Issue Type: Bug
> Affects Versions: 1.2.6.Final
> Reporter: Jan Andres
> Assignee: Jean-Frederic Clere
> Fix For: 1.3.0.Alpha2, 1.2.8.Final
>
> Attachments: mod_cluster-1.2.6-20140109.diff
>
>
> Hi,
> Please find attached a patch for some issues I came across while deploying mod_cluster for Apache on Solaris/SPARC. The patch applies to the 1.2.6 release and addresses a total of 3 issues:
> 1. A memory leak in add_balancers_workers(). The current code will allocate new memory from the config pool for the balancer's "sticky" parameter on each update but throw away the pointer (thus losing memory) if the value generated equals the balancer's current value. The patch changes this to first do the apr_psprintf() from a temporary pool and only apr_pstrdup() this temporary string into the config pool in case it differs from the old value and actually needs to be updated.
> 2. Memory alignment issues. Some parts of the code use custom memory alignment logic and will generally align memory blocks to a multiple of sizeof(void*). This poses a problem on some architectures such a 64-bit SPARC CPU running a "32-bit application" which really means the process has a 32-bit memory space (i.e. sizeof(void*) is 4), but can still use native 64-bit instructions to operate on data. On SPARC and some other architectures, these instructions require 8-byte data alignment and will lead to a Bus error if this is not observed. The patch replaces the custom alignment logic by invocations of APR_ALIGN_DEFAULT() which also makes the code somewhat more concise and readable.
> 3. Heap corruption issues on process exit, sometimes resulting in core dumps, because mod_proxy_cluster's watchdog thread never terminates and continues to run while the main thread is already freeing up resources and destroying all the memory pools. The patch adds a mechanism to signal the watchdog thread to terminate and wait for it to exit prior to the freeing of the memory resources. This mechanism is invoked via a pre-cleanup hook on the config pool.
> Regards,
> Jan
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
10 years, 11 months