Michal Karm Babacek commented on MODCLUSTER-543:
------------------------------------------------
[~gzaronikas], could you take a look, please? We need to form general guidelines on
mod_cluster / BalancerMember compatibility. Beware of MODCLUSTER-430.
BalancerMember directives don't work and cause SegFaults
---------------------------------------------------------
Key: MODCLUSTER-543
URL: https://issues.jboss.org/browse/MODCLUSTER-543
Project: mod_cluster
Issue Type: Bug
Components: Native (httpd modules)
Affects Versions: 1.3.3.Final, 1.2.13.Final
Environment: RHEL (certainly other platforms too)
Reporter: Michal Karm Babacek
Assignee: George Zaronikas
Labels: balancerMember, proxy
Fix For: 1.3.6.Final
Attachments: clusterbench.war, mod_cluster.conf, proxy_test.conf, tses.war
There has been an ongoing discussion about interoperability between BalancerMember and
ProxyPass directives and mod_cluster. This is a follow-up on MODCLUSTER-391 and especially
MODCLUSTER-356.
h3. TL;DR
* BalancerMember directives don't work as expected (at all)
* it is possible to use them to cause a SegFault in httpd
* If these directives are *supposed to work*, then either my configuration is wrong or this
is a bug to be fixed
* If they are *not supposed to work* in conjunction with mod_cluster, then I should stop
testing them and remove all the ever-failing scenarios from the test suite
h3. Configuration and goal
* two web apps, [^clusterbench.war] and [^tses.war], both deployed on each of two
Tomcats
* one web app ([^tses.war]) is in the excluded contexts
* the other one ([^clusterbench.war]) is registered with the mod_cluster balancer
* main server: {{\*:2080}}
* mod_cluster VirtualHost: {{\*:8747}}
* ProxyPass BalancerMember VirtualHost: {{\*:2081}}
* I want to access [^clusterbench.war] via {{\*:8747}} and {{\*:2080}} (works (/)), and
[^tses.war] via {{\*:2081}} (fails (x))
* see [^proxy_test.conf] for the BalancerMember configuration (taken from an httpd 2.2.26
test run; Location access must be edited)
* see [^mod_cluster.conf] for the mod_cluster configuration (taken from an httpd 2.2.26
test run, as above); an illustrative sketch of the overall layout follows
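To make the setup concrete, here is a minimal, illustrative sketch of the layout described above. It is not the attached configuration verbatim: the attached [^mod_cluster.conf] and [^proxy_test.conf] are authoritative, and the backend addresses below are placeholders.
{noformat}
# Illustrative sketch only -- see the attached mod_cluster.conf / proxy_test.conf.
Listen 2080   # main server (also serves clusterbench via mod_cluster, see MODCLUSTER-430)
Listen 8747
Listen 2081

# mod_cluster VirtualHost: clusterbench.war registers itself here dynamically
<VirtualHost *:8747>
    EnableMCPMReceive
    ManagerBalancerName qacluster
</VirtualHost>

# Static ProxyPass / BalancerMember VirtualHost for the excluded tses.war
<VirtualHost *:2081>
    <Proxy balancer://xqacluster>
        # backend addresses are placeholders, not taken from this report
        BalancerMember http://192.168.122.204:8080
        BalancerMember http://192.168.122.205:8080
    </Proxy>
    ProxyPass /tses balancer://xqacluster/tses
    ProxyPassReverse /tses balancer://xqacluster/tses
</VirtualHost>
{noformat}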
h3. Test
* (/) check that only [^clusterbench.war] is registered and everything looks fine:
[mod_cluster-manager console|https://gist.github.com/Karm/26015dabf446360b0e019da6c907bed5]
* (/) [^clusterbench.war] on the mod_cluster VirtualHost works: {{curl
http://192.168.122.172:8747/clusterbench/requestinfo}}
* (/) [^clusterbench.war] on the main server also works: {{curl
http://192.168.122.172:2080/clusterbench/requestinfo}} (it works due to MODCLUSTER-430)
* httpd 2.2.26 / mod_cluster 1.2.13.Final:
** (x) [^tses.war] on the BalancerMember ProxyPass VirtualHost fails: {{curl
http://192.168.122.172:2081/tses}} with: {noformat}mod_proxy_cluster.c(2374): proxy:
byrequests balancer FAILED
proxy: CLUSTER: (balancer://xqacluster). All workers are in error state
{noformat} It doesn't matter whether I configure the same balancer (qacluster)
for both mod_cluster and the additional BalancerMember directives or use two balancers
(as in this case).
** (x) [^clusterbench.war] on the BalancerMember ProxyPass VirtualHost sometimes works and
sometimes causes a SegFault: {{curl
http://192.168.122.172:2081/clusterbench/requestinfo}}
(see below)
* httpd 2.4.23 / mod_cluster 1.3.3.Final:
** (x) [^tses.war] on the BalancerMember ProxyPass VirtualHost *always* fails with a
SegFault: {{curl http://192.168.122.172:2081/tses}} (see below)
** (/) [^clusterbench.war] on the BalancerMember ProxyPass VirtualHost works: {{curl
http://192.168.122.172:2081/clusterbench/requestinfo}}
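For quick reference, the results above condense to the following matrix (all requests go to the BalancerMember ProxyPass VirtualHost {{\*:2081}}):
{noformat}
URL                        httpd 2.2.26 / 1.2.13.Final       httpd 2.4.23 / 1.3.3.Final
/tses                      HTTP 503, all workers in error    SegFault (always)
/clusterbench/requestinfo  intermittent SegFault (~50%)      works
{noformat}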
h3. Intermittent and stable SegFaults
h4. httpd 2.2.26 / mod_cluster 1.2.13.Final (EWS 2.1.1)
With the aforementioned setup, roughly 50% of requests to {{curl
http://192.168.122.172:2081/clusterbench/requestinfo}} cause a SegFault on httpd 2.2.26 /
mod_cluster 1.2.13.Final; the rest pass fine and the web app is served.
*Offending line:*
[mod_proxy_cluster.c:3843|https://github.com/modcluster/mod_cluster/blob/1...]
*Trace:*
{noformat}
#0 proxy_cluster_pre_request (worker=<optimized out>, balancer=<optimized
out>, r=0x5555558be3e0, conf=0x5555558767d8, url=0x7fffffffdd40) at
mod_proxy_cluster.c:3843
#1 0x00007ffff0cfe3d6 in proxy_run_pre_request (worker=worker@entry=0x7fffffffdd38,
balancer=balancer@entry=0x7fffffffdd30, r=r@entry=0x5555558be3e0,
conf=conf@entry=0x5555558767d8, url=url@entry=0x7fffffffdd40) at
/builddir/build/BUILD/httpd-EWS_2.1.1.CR1/modules/proxy/mod_proxy.c:2428
#2 0x00007ffff0d01ef2 in ap_proxy_pre_request (worker=worker@entry=0x7fffffffdd38,
balancer=balancer@entry=0x7fffffffdd30, r=r@entry=0x5555558be3e0,
conf=conf@entry=0x5555558767d8, url=url@entry=0x7fffffffdd40) at
/builddir/build/BUILD/httpd-EWS_2.1.1.CR1/modules/proxy/proxy_util.c:1512
#3 0x00007ffff0cfeabb in proxy_handler (r=0x5555558be3e0) at
/builddir/build/BUILD/httpd-EWS_2.1.1.CR1/modules/proxy/mod_proxy.c:952
#4 0x00005555555805e0 in ap_run_handler (r=0x5555558be3e0) at
/builddir/build/BUILD/httpd-EWS_2.1.1.CR1/server/config.c:157
#5 0x00005555555809a9 in ap_invoke_handler (r=r@entry=0x5555558be3e0) at
/builddir/build/BUILD/httpd-EWS_2.1.1.CR1/server/config.c:376
#6 0x000055555558dc58 in ap_process_request (r=r@entry=0x5555558be3e0) at
/builddir/build/BUILD/httpd-EWS_2.1.1.CR1/modules/http/http_request.c:282
#7 0x000055555558aff8 in ap_process_http_connection (c=0x5555558ae2f0) at
/builddir/build/BUILD/httpd-EWS_2.1.1.CR1/modules/http/http_core.c:190
#8 0x0000555555587010 in ap_run_process_connection (c=0x5555558ae2f0) at
/builddir/build/BUILD/httpd-EWS_2.1.1.CR1/server/connection.c:43
#9 0x00005555555873b0 in ap_process_connection (c=c@entry=0x5555558ae2f0,
csd=<optimized out>) at
/builddir/build/BUILD/httpd-EWS_2.1.1.CR1/server/connection.c:190
#10 0x0000555555592b5b in child_main (child_num_arg=child_num_arg@entry=0) at
/builddir/build/BUILD/httpd-EWS_2.1.1.CR1/server/mpm/prefork/prefork.c:667
#11 0x0000555555592fae in make_child (s=0x5555557bf880, slot=0) at
/builddir/build/BUILD/httpd-EWS_2.1.1.CR1/server/mpm/prefork/prefork.c:712
#12 0x0000555555593b6e in ap_mpm_run (_pconf=_pconf@entry=0x5555557ba158,
plog=<optimized out>, s=s@entry=0x5555557bf880)
at /builddir/build/BUILD/httpd-EWS_2.1.1.CR1/server/mpm/prefork/prefork.c:988
#13 0x000055555556b50e in main (argc=8, argv=0x7fffffffe268) at
/builddir/build/BUILD/httpd-EWS_2.1.1.CR1/server/main.c:753
{noformat}
h4. httpd 2.4.23 / mod_cluster 1.3.3.Final (JBCS 2.4.23)
With the aforementioned setup, it is *always* possible to SegFault httpd by accessing
[^tses.war] on the BalancerMember ProxyPass VirtualHost: {{curl
http://192.168.122.172:2081/tses}}.
*Offending line:*
[mod_proxy_cluster.c:2230|https://github.com/modcluster/mod_cluster/blob/1...]
*Trace:*
{noformat}
#0 0x00007fffe61a598f in internal_find_best_byrequests (balancer=0x55555593ad38,
conf=0x555555918dd8, r=0x5555559a6630, domain=0x0, failoverdomain=0,
vhost_table=0x5555559a5c98, context_table=0x5555559a5e00, node_table=0x5555559a6088)
at mod_proxy_cluster.c:2230
#1 0x00007fffe61a90c8 in find_best_worker (balancer=0x55555593ad38, conf=0x555555918dd8,
r=0x5555559a6630, domain=0x0, failoverdomain=0, vhost_table=0x5555559a5c98,
context_table=0x5555559a5e00, node_table=0x5555559a6088, recurse=1) at
mod_proxy_cluster.c:3457
#2 0x00007fffe61a9f4d in proxy_cluster_pre_request (worker=0x7fffffffdb68,
balancer=0x7fffffffdb60, r=0x5555559a6630, conf=0x555555918dd8, url=0x7fffffffdb70)
at mod_proxy_cluster.c:3825
#3 0x00007fffec2fd9a6 in proxy_run_pre_request (worker=worker@entry=0x7fffffffdb68,
balancer=balancer@entry=0x7fffffffdb60, r=r@entry=0x5555559a6630,
conf=conf@entry=0x555555918dd8, url=url@entry=0x7fffffffdb70) at mod_proxy.c:2853
#4 0x00007fffec302652 in ap_proxy_pre_request (worker=worker@entry=0x7fffffffdb68,
balancer=balancer@entry=0x7fffffffdb60, r=r@entry=0x5555559a6630,
conf=conf@entry=0x555555918dd8, url=url@entry=0x7fffffffdb70) at proxy_util.c:1956
#5 0x00007fffec2fe1dc in proxy_handler (r=0x5555559a6630) at mod_proxy.c:1108
#6 0x00005555555aeff0 in ap_run_handler (r=r@entry=0x5555559a6630) at config.c:170
#7 0x00005555555af539 in ap_invoke_handler (r=r@entry=0x5555559a6630) at config.c:434
#8 0x00005555555c5b2a in ap_process_async_request (r=0x5555559a6630) at
http_request.c:410
#9 0x00005555555c5e04 in ap_process_request (r=r@entry=0x5555559a6630) at
http_request.c:445
#10 0x00005555555c1ded in ap_process_http_sync_connection (c=0x555555950050) at
http_core.c:210
#11 ap_process_http_connection (c=0x555555950050) at http_core.c:251
#12 0x00005555555b9470 in ap_run_process_connection (c=c@entry=0x555555950050) at
connection.c:42
#13 0x00005555555b99c8 in ap_process_connection (c=c@entry=0x555555950050,
csd=<optimized out>) at connection.c:226
#14 0x00007fffec513a30 in child_main (child_num_arg=child_num_arg@entry=0,
child_bucket=child_bucket@entry=0) at prefork.c:723
#15 0x00007fffec513c70 in make_child (s=0x55555582d400, slot=slot@entry=0,
bucket=bucket@entry=0) at prefork.c:767
#16 0x00007fffec51521d in prefork_run (_pconf=<optimized out>, plog=0x5555558313a8,
s=0x55555582d400) at prefork.c:979
#17 0x0000555555592aae in ap_run_mpm (pconf=pconf@entry=0x555555804188,
plog=0x5555558313a8, s=0x55555582d400) at mpm_common.c:94
#18 0x000055555558bb18 in main (argc=8, argv=0x7fffffffe1a8) at main.c:783
{noformat}
h3. About the test
This test has always been failing in one way or another: not serving the URL (HTTP 404) or
returning "All workers are in error state" (HTTP 503). The SegFault had been slipping under
the radar for some time, because the test failed on an assert earlier in the scenario - on
the first HTTP 503.
We should clearly document which BalancerMember integration is supported and which is
not. Furthermore, httpd must not SegFault even if the user tries to do something weird;
it must log an error message instead.
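A minimal sketch of what such a guard could look like, assuming the httpd 2.4 logging API; the function, its parameters, and the node lookup it relies on are hypothetical and not taken from the actual mod_proxy_cluster.c:
{noformat}
/* Illustrative sketch only -- not the actual mod_proxy_cluster.c code.
 * Idea: before touching per-node data in the pre_request / byrequests path,
 * verify that the worker was actually registered by mod_cluster. Workers
 * coming from static BalancerMember directives have no node entry, so log
 * an error and fail the request with 503 instead of dereferencing NULL. */
#include "httpd.h"
#include "http_config.h"
#include "http_log.h"        /* httpd 2.4 logging API assumed */

APLOG_USE_MODULE(proxy_cluster);

/* node_info is whatever the (hypothetical) node lookup returned for the worker */
static int reject_worker_without_node(request_rec *r, const void *node_info)
{
    if (node_info == NULL) {
        ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r,
                      "mod_proxy_cluster: no node is registered for this worker "
                      "(statically configured BalancerMember?) while serving %s",
                      r->uri);
        return HTTP_SERVICE_UNAVAILABLE;   /* 503 instead of a SegFault */
    }
    return OK;
}
{noformat}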