[
https://issues.jboss.org/browse/MODCLUSTER-407?page=com.atlassian.jira.pl...
]
Aaron Ogburn updated MODCLUSTER-407:
------------------------------------
Steps to Reproduce:
1) Configure jboss with worker-timeout="1" in the modcluster subsystem
2) Start httpd and JBoss. Run httpd on a multicore system (4+ cores).
3) Confirm JBoss is reachable through httpd/mod_cluster then kill JBoss so the mod_cluster
worker-timeout retry logic is used
4) Load up httpd with highly concurrent request traffic for JBoss for some time.
Then check for stalled requests/threads. Each request should finish within ~1 second, but
a stalled request can take minutes. You can check access logs with %T to see response
times once the requests complete, use pstack to inspect threads, or view the mod_status
page (it will show many threads in W state whose seconds-since-request-start value keeps
growing).
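Step 4 and the check that follows it can be approximated with a small script. This is a minimal sketch, not part of the original report: the function names, the request count, the concurrency level, and the 2-second stall threshold are all assumptions chosen for illustration, and the URL is whatever context httpd proxies to JBoss in your setup.

```python
import sys
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def timed_get(url, timeout):
    """Return the wall-clock seconds one GET took, whatever its outcome."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
    except Exception:
        # Error responses are expected once JBoss is down; only the time matters.
        pass
    return time.monotonic() - start


def run_load(url, total=2000, concurrency=64, timeout=300):
    """Send `total` GETs with `concurrency` threads and collect their durations."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(lambda _: timed_get(url, timeout), range(total)))


def summarize(durations, threshold=2.0):
    """Count requests slower than `threshold` seconds (assumed stall cutoff)."""
    stalled = [d for d in durations if d > threshold]
    return {
        "total": len(durations),
        "stalled": len(stalled),
        "slowest": max(durations, default=0.0),
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python load.py <URL proxied by httpd to JBoss>
    print(summarize(run_load(sys.argv[1])))
```

With worker-timeout="1", a non-zero "stalled" count or a "slowest" value far above one second would be consistent with the behavior described here. The access log %T values and mod_status remain the authoritative checks.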
was:
1) Configure jboss with worker-timeout="1" in the modcluster subsystem
2) Start httpd and JBoss
3) Confirm JBoss is reachable through httpd/mod_cluster then kill JBoss so the mod_cluster
worker-timeout retry logic is used
4) Load up httpd with requests for JBoss (a couple seconds holding refresh in a browser
even will do the trick)
Then check for stalled requests/threads. Each request should finish within ~1 second, but
a stalled request can take minutes. You can check access logs with %T to see response
times once the requests complete, use pstack to inspect threads, or view the mod_status
page (it will show many threads in W state whose seconds-since-request-start value keeps
growing).
worker-timeout can cause httpd thread stalls
--------------------------------------------
Key: MODCLUSTER-407
URL: https://issues.jboss.org/browse/MODCLUSTER-407
Project: mod_cluster
Issue Type: Bug
Affects Versions: 1.2.8.Final
Reporter: Aaron Ogburn
Assignee: Jean-Frederic Clere
Fix For: 1.3.1.Final, 1.2.9.Final
Setting a modcluster worker-timeout can stall requests and threads on the httpd side when
requests arrive while the workers are in a down state. A stack trace of a stalled thread
looks like the following (frames #160 through #2 are recursive calls within mod_proxy_cluster):
#0 0x00007ff8eb547533 in select () from /lib64/libc.so.6
#1 0x00007ff8eba39185 in apr_sleep () from /usr/lib64/libapr-1.so.0
#2 0x00007ff8e84be0d1 in ?? () from /etc/httpd/modules/mod_proxy_cluster.so
...
#160 0x00007ff8e84beb9f in ?? () from /etc/httpd/modules/mod_proxy_cluster.so
#161 0x00007ff8e88d2116 in proxy_run_pre_request () from /etc/httpd/modules/mod_proxy.so
#162 0x00007ff8e88d9186 in ap_proxy_pre_request () from /etc/httpd/modules/mod_proxy.so
#163 0x00007ff8e88d63c2 in ?? () from /etc/httpd/modules/mod_proxy.so
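When many threads have to be checked, pstack output of this shape can be screened with a short script. The following is a rough sketch under stated assumptions: the helper name inspect_stack and its result keys are hypothetical, the regular expression only targets frame lines formatted like the ones above, and the span calculation assumes (as the report states) that the elided frames between the lowest and highest mod_proxy_cluster frames belong to the same module.

```python
import re

# Matches frame lines such as:
#   #1 0x00007ff8eba39185 in apr_sleep () from /usr/lib64/libapr-1.so.0
FRAME = re.compile(
    r"^#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\S+)\s+\(.*?\)\s+from\s+(\S+)"
)

# Excerpt copied from the stack above; the "..." line is ignored by the parser.
SAMPLE = """\
#0 0x00007ff8eb547533 in select () from /lib64/libc.so.6
#1 0x00007ff8eba39185 in apr_sleep () from /usr/lib64/libapr-1.so.0
#2 0x00007ff8e84be0d1 in ?? () from /etc/httpd/modules/mod_proxy_cluster.so
...
#160 0x00007ff8e84beb9f in ?? () from /etc/httpd/modules/mod_proxy_cluster.so
#161 0x00007ff8e88d2116 in proxy_run_pre_request () from /etc/httpd/modules/mod_proxy.so
"""


def inspect_stack(text, module="mod_proxy_cluster.so"):
    """Report whether a thread is in apr_sleep and how deep it sits in `module`."""
    frames = []
    for line in text.splitlines():
        m = FRAME.match(line.strip())
        if m:
            frames.append((int(m.group(1)), m.group(2), m.group(3)))
    in_module = [n for n, _, lib in frames if lib.endswith(module)]
    return {
        # True if any parsed frame is apr_sleep.
        "sleeping": any(fn == "apr_sleep" for _, fn, _ in frames),
        # Frame-number span covered by the module, counting elided frames.
        "module_span": (max(in_module) - min(in_module) + 1) if in_module else 0,
    }


print(inspect_stack(SAMPLE))  # {'sleeping': True, 'module_span': 159}
```

A thread that is sleeping with a large module span matches the pattern in this report. A healthy thread would normally show only a few mod_proxy_cluster frames, if any.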
--
This message was sent by Atlassian JIRA
(v6.2.3#6260)