[mod_cluster-issues] [JBoss JIRA] (MODCLUSTER-372) Number of registered contexts negatively affects mod_cluster performance

Mon Nov 18 04:51:07 EST 2013

    [ https://issues.jboss.org/browse/MODCLUSTER-372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924355#comment-12924355 ] 

Michal Babacek commented on MODCLUSTER-372:
-------------------------------------------

I've identified these bottle necks:
Three the most expensive functions and their callers:

 # Function {{ap_slotmem_mem}} from mod_slotmem/sharedmem_util.c called by:
  * {{get_context}} from mod_manager/context.c
  * {{get_node}} from mod_manager/node.c
  * {{loc_read_node}} from mod_manager/mod_manager.c
  * {{find_node_context_host}} from mod_proxy_cluster/mod_proxy_cluster.c
  * {{loc_read_context}} mod_manager/context.c
  * {{read_context_table}} mod_proxy_cluster/mod_proxy_cluster.c
  * {{manager_info}} from mod_manager/mod_manager.c

 # Function {{ap_slotmem_do}} from mod_slotmem/sharedmem_util.c called mostly by
  * httpd_request and httpd_core functions from httpd sources.

 # Function {{find_node_context_host}} from mod_proxy_cluster/mod_proxy_cluster.c called by
  * httpd sources, mostly on request_process and request_connection.

So, IMHO, {{find_node_context_host}} is indeed the trouble :)

Attaching the profiling valgrind logs [^callgrind.zip], created as: 
{noformat}
valgrind --tool=callgrind --dump-instr=yes --simulate-cache=yes --collect-jumps=yes --compress-strings=no --compress-pos=no --collect-systime=yes ./httpd -f /tmp/hudson/httpd/conf/httpd.conf -E /tmp/hudson/httpd/logs/httpd.log
{noformat}
with this debug setting:
{code}
<IfModule worker.c>
ThreadLimit         50
StartServers        1
ServerLimit         1
MinSpareThreads     50
MaxSpareThreads     50
MaxClients          50
ThreadsPerChild     50
MaxRequestsPerChild 0
</IfModule>
{code}
and 4 worker nodes, 65 contexts each, and several dozens of concurrent client sessions.

> Number of registered contexts negatively affects mod_cluster performance
> ------------------------------------------------------------------------
>
>                 Key: MODCLUSTER-372
>                 URL: https://issues.jboss.org/browse/MODCLUSTER-372
>             Project: mod_cluster
>          Issue Type: Bug
>    Affects Versions: 1.2.4.Final, 1.2.6.Final
>         Environment: RHEL6, other platforms are to be confirmed.
>            Reporter: Michal Babacek
>            Assignee: Michal Babacek
>              Labels: mod_cluster, performace
>             Fix For: 1.2.7.Final, 1.3.0.Alpha2
>
>         Attachments: 4-workers-1-context-balancer-cpu.png, 4-workers-61-context-accessing-1-balancer-cpu.png, 9-workers-1-context-balancer-cpu.png, 9-workers-31-context-accessing-1-balancer-cpu.png, 9-workers-31-context-round-robin-balancer-cpu.png, httpd.conf
>
>
> There is a performance concern regarding CPU usage on the Apache HTTP Server with mod_cluster set up as a load balancer. Perf tests revealed that the major variable affecting the CPU usage is the overall number of registered contexts. It's noteworthy that these contexts actually don't need to be accessed at all, it's enough that they are present.
> The first test [9 workers, 31 contexts, round robin|https://issues.jboss.org/browse/MODCLUSTER-372#9workers,31contexts,roundrobin] depicts a behavior where all contexts are being accessed in a round robin fashion. Note the CPU usage. If we compare it to the [9 workers, 1 context|https://issues.jboss.org/browse/MODCLUSTER-372#9workers,1context] test, it becomes apparent that there is something wrong with the CPU usage. As it is clear from the [9 workers, 31 contexts, access 1|https://issues.jboss.org/browse/MODCLUSTER-372#9workers,31contexts,accessingonlyoneofthem] test, accessing only one of these contexts doesn't help much.
> The last two tests, [4 workers, 61 contexts, accessing 1|https://issues.jboss.org/browse/MODCLUSTER-372#4workers,61contexts,accessingonlyoneofthem] and [4 workers, 1 context|https://issues.jboss.org/browse/MODCLUSTER-372#4workers,1context] confirm the results, environment differs only in number of nodes and number of contexts.
> [^httpd.conf] attached, stay tuned for some profiler outputs...
> h3. 9 workers, 31 contexts, round robin
> !9-workers-31-context-round-robin-balancer-cpu.png|thumbnail!
> ||Nodes||Sessions||Ses.Active||Samples||samples/s||mean resp. ms||max resp. ms||conn. errors||valid samples||%||
> |9|150|150|8843|147.4|15|60|0|8843|100%|
> |9|250|250|14422|240.3|37|215|0|14422|100%|
> |9|350|350|17551|292.5|196|443|0|17551|100%|
> |9|450|450|21303|355.0|269|603|0|21303|100%|
> |9|550|550|24818|413.6|325|686|0|24818|100%|
> |9|650|650|26804|446.7|469|800|0|26804|100%|
> |9|750|750|29930|498.8|507|985|0|29930|100%|
> |9|850|850|30665|511.0|665|1185|0|30665|100%|
> |9|950|950|34421|573.6|647|1316|0|34421|100%|
> |9|1050|1050|35067|584.4|800|1487|0|35067|100%|
> |9|1150|1150|36024|600.4|925|1612|0|36024|100%|
> |9|1250|1250|36635|610.5|1030|1815|0|36635|100%|
> |9|1350|1350|38395|639.9|1096|1942|0|38395|100%|
> |9|1450|1450|39713|661.8|1170|2466|0|39713|100%|
> |9|1550|1550|39455|657.5|1392|2340|0|39455|100%|
> |9|1650|1650|39849|664.1|1465|4240|0|39849|100%|
> |9|1750|1750|42435|707.2|1454|6511|0|42435|100%|
> |9|1850|1850|44714|745.2|1498|4866|0|44714|100%|
> |9|1950|1911|46072|767.8|1554|43016|0|46072|100%|
> |9|2050|1911|44496|741.5|1617|4953|39|44457|99%|
> |9|2150|1911|43715|728.5|1632|5348|125|43590|99%|
> |9|2250|1911|41112|685.1|1764|9800|155|40957|99%|
> h3. 9 workers, 31 contexts, accessing only one of them
> !9-workers-31-context-accessing-1-balancer-cpu.png|thumbnail!
> ||Nodes||Sessions||Ses.Active||Samples||samples/s||mean resp. ms||max resp. ms||conn. errors||valid samples||%||
> |9|150|150|8954|149.2|4|42|0|8954|100%|
> |9|250|250|14897|248.3|7|33|0|14897|100%|
> |9|350|350|20784|346.4|8|68|0|20784|100%|
> |9|450|450|26748|445.8|8|69|0|26748|100%|
> |9|550|550|32553|542.5|11|92|0|32553|100%|
> |9|650|650|38558|642.6|10|60|0|38558|100%|
> |9|750|750|43571|726.1|29|329|0|43571|100%|
> |9|850|850|46133|768.8|99|494|0|46133|100%|
> |9|950|950|50854|847.5|120|501|0|50854|100%|
> |9|1050|1050|54451|907.4|154|584|0|54451|100%|
> |9|1150|1150|59961|999.3|138|674|0|59961|100%|
> |9|1250|1250|62567|1,042.6|198|675|0|62567|100%|
> |9|1350|1350|61939|1,032.2|301|799|0|61939|100%|
> |9|1450|1450|67920|1,131.9|276|844|0|67920|100%|
> |9|1550|1550|73151|1,219.1|261|861|0|73151|100%|
> |9|1650|1650|73937|1,232.2|332|955|0|73937|100%|
> |9|1750|1750|73516|1,225.2|423|1046|0|73516|100%|
> |9|1850|1850|72556|1,209.1|515|1264|0|72556|100%|
> |9|1950|1911|78613|1,310.1|454|50273|0|78613|100%|
> |9|2050|1911|80141|1,335.6|431|1225|39|80102|99%|
> |9|2150|1911|76979|1,282.9|490|1338|127|76852|99%|
> |9|2250|1911|78048|1,300.7|464|1305|136|77912|99%|
> h3. 9 workers, 1 context
> !9-workers-1-context-balancer-cpu.png|thumbnail!
> ||Nodes||Sessions||Ses.Active||Samples||samples/s||mean resp. ms||max resp. ms||conn. errors||valid samples||%||
> |9|150|150|8965|149.4|2|12|0|8965|100%|
> |9|250|250|14965|249.4|2|22|0|14965|100%|
> |9|350|350|20950|349.1|2|23|0|20950|100%|
> |9|450|450|26941|449.0|2|26|0|26941|100%|
> |9|550|550|32937|548.9|1|26|0|32937|100%|
> |9|650|650|38900|648.3|1|19|0|38900|100%|
> |9|750|750|44918|748.6|1|11|0|44918|100%|
> |9|850|850|50902|848.3|2|22|0|50902|100%|
> |9|950|950|56878|947.9|1|14|0|56878|100%|
> |9|1050|1050|62874|1,047.8|2|12|0|62874|100%|
> |9|1150|1150|68845|1,147.3|2|99|0|68845|100%|
> |9|1250|1250|74851|1,247.4|2|103|0|74851|100%|
> |9|1350|1350|80826|1,347.0|2|100|0|80826|100%|
> |9|1450|1450|86806|1,446.7|2|19|0|86806|100%|
> |9|1550|1550|92817|1,546.8|2|52|0|92817|100%|
> |9|1650|1650|98774|1,646.1|2|18|0|98774|100%|
> |9|1750|1750|104755|1,745.8|2|18|0|104755|100%|
> |9|1850|1850|110734|1,845.4|2|20|0|110734|100%|
> |9|1950|1910|113419|1,890.2|9|41855|0|113419|100%|
> |9|2050|1911|114437|1,907.1|2|77962|39|114397|99%|
> |9|2150|1911|114481|1,907.9|2|15|128|114353|99%|
> |9|2250|1911|114545|1,908.9|2|24|144|114401|99%|
> h3. 4 workers, 61 contexts, accessing only one of them
> !4-workers-61-context-accessing-1-balancer-cpu.png|thumbnail!
> ||Nodes||Sessions||Ses.Active||Samples||samples/s||mean resp. ms||max resp. ms||conn. errors||valid samples||%||
> |4|500|500|29796|496.6|6|52|0|29796|100%|
> |4|650|650|38706|645.0|7|149|0|38706|100%|
> |4|800|800|47585|793.0|8|129|0|47585|100%|
> |4|950|950|54467|907.7|43|377|0|54467|100%|
> |4|1100|1100|62500|1,041.6|54|396|0|62500|100%|
> |4|1250|1250|69446|1,157.3|81|512|0|69446|100%|
> |4|1400|1400|76217|1,270.2|97|517|0|76217|100%|
> |4|1550|1550|80216|1,336.8|152|810|0|80216|100%|
> |4|1700|1700|80797|1,346.5|271|864|0|80797|100%|
> |4|1850|1850|94172|1,569.3|182|822|0|94172|100%|
> |4|2000|1916|91014|1,516.8|253|48650|0|91014|100%|
> |4|2150|1916|95852|1,597.4|205|848|83|95769|99%|
> h3. 4 workers, 1 context
> !4-workers-1-context-balancer-cpu.png|thumbnail!
> ||Nodes||Sessions||Ses.Active||Samples||samples/s||mean resp. ms||max resp. ms||conn. errors||valid samples||%||
> |4|500|500|29922|498.7|2|39|0|29922|100%|
> |4|650|650|38923|648.7|1|39|0|38923|100%|
> |4|800|800|47916|798.5|1|17|0|47916|100%|
> |4|950|950|56896|948.2|1|17|0|56896|100%|
> |4|1100|1100|65889|1,098.1|1|115|0|65889|100%|
> |4|1250|1250|74874|1,247.8|1|101|0|74874|100%|
> |4|1400|1400|83818|1,396.8|1|17|0|83818|100%|
> |4|1550|1550|92830|1,547.0|1|17|0|92830|100%|
> |4|1700|1700|101805|1,696.6|1|11|0|101805|100%|
> |4|1850|1850|110785|1,846.3|1|11|0|110785|100%|
> |4|2000|1916|113747|1,895.6|10|53108|0|113747|100%|
> |4|2150|1916|114825|1,913.6|1|24|83|114742|99%|

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira