[mod_cluster-issues] [JBoss JIRA] (MODCLUSTER-407) worker-timeout can cause httpd thread stalls

Wed Jun 4 15:56:15 EDT 2014

     [ https://issues.jboss.org/browse/MODCLUSTER-407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron Ogburn updated MODCLUSTER-407:
------------------------------------

    Steps to Reproduce: 
1) Configure JBoss with worker-timeout="1" in the modcluster subsystem
2) Start httpd and JBoss. Run httpd on a multicore system (4+ cores).
3) Confirm JBoss is reachable through httpd/mod_cluster, then kill JBoss so the mod_cluster worker-timeout retry logic is used
4) Load up httpd with highly concurrent request traffic for JBoss for some time.

Then check for stalled requests/threads.  Each request should finish within ~1 second, but once stalled a request can take minutes.  You can check access logs with %T to see response times once the requests are done, use pstack to check threads, or watch the mod_status page (it will show many threads in the W state, with the number of seconds since their requests started continuing to grow).
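
For step 4 and the check that follows it, a minimal Python sketch of a concurrent load generator is given here. It is only an illustration and not part of the original report: the function names, the request count, the concurrency level and the 2-second limit are all assumptions, and any load tool that reports per-request times would serve equally well.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def timed_get(url, timeout=300.0):
    """Return the elapsed seconds for one GET request.

    Error responses (for example a 503 while JBoss is down) still count as
    finished requests; only the elapsed time matters here. The timeout is
    the socket timeout passed to urlopen, chosen large so stalls show up.
    """
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
    except Exception:
        pass  # a quick error is the expected outcome once JBoss is killed
    return time.monotonic() - start


def run_load(url, total=2000, concurrency=64, fetch=timed_get):
    """Issue `total` requests using `concurrency` threads; return durations."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(fetch, [url] * total))


def stalled(durations, limit=2.0):
    """Return the durations that exceeded `limit` seconds.

    With worker-timeout="1" each request should finish within ~1 second,
    so the (assumed) default limit of 2 seconds leaves some slack.
    """
    return [d for d in durations if d > limit]
```

Calling run_load() with a URL that httpd proxies to JBoss (for example a hypothetical http://localhost/myapp/) and passing the result to stalled() gives the requests that took longer than expected; an empty list means no stall was observed in that run.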

  was:
1) Configure jboss with worker-timeout="1" in the modcluster subsystem
2) Start httpd and JBoss
3) Confirm JBoss is reachable through httpd/mod_cluster then kill JBoss so the mod_cluster worker-timeout retry logic is used
4) Load up httpd with requests for JBoss (a couple seconds holding refresh in a browser even will do the trick)

Then check for stalled requests/threads.  Each request should finish within ~1 second, but once stalled a request can take minutes.  You can check access logs with %T to see response times once the requests are done, use pstack to check threads, or watch the mod_status page (it will show many threads in the W state, with the number of seconds since their requests started continuing to grow).

> worker-timeout can cause httpd thread stalls
> --------------------------------------------
>
>                 Key: MODCLUSTER-407
>                 URL: https://issues.jboss.org/browse/MODCLUSTER-407
>             Project: mod_cluster
>          Issue Type: Bug
>    Affects Versions: 1.2.8.Final
>            Reporter: Aaron Ogburn
>            Assignee: Jean-Frederic Clere
>             Fix For: 1.3.1.Final, 1.2.9.Final
>
>
> Setting a modcluster worker-timeout can stall requests and threads on the httpd side when requests are received while workers are in a down state.  A stack of the problem thread looks like the following (frames #160 down to #2 are recursive calls within mod_proxy_cluster):
> #0  0x00007ff8eb547533 in select () from /lib64/libc.so.6
> #1  0x00007ff8eba39185 in apr_sleep () from /usr/lib64/libapr-1.so.0
> #2  0x00007ff8e84be0d1 in ?? () from /etc/httpd/modules/mod_proxy_cluster.so
> ...
> #160 0x00007ff8e84beb9f in ?? () from /etc/httpd/modules/mod_proxy_cluster.so
> #161 0x00007ff8e88d2116 in proxy_run_pre_request () from /etc/httpd/modules/mod_proxy.so
> #162 0x00007ff8e88d9186 in ap_proxy_pre_request () from /etc/httpd/modules/mod_proxy.so
> #163 0x00007ff8e88d63c2 in ?? () from /etc/httpd/modules/mod_proxy.so
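
When checking many threads with pstack, the recursion shown above can be spotted by looking at the range of frame numbers that belong to mod_proxy_cluster.so. The following Python sketch does this for one backtrace; it is a hypothetical helper, not part of mod_cluster, and it assumes the frame format seen in the stack above ("#N  address in function () from /path/to/module").

```python
import re

# Matches a frame line such as:
#   #161 0x00007ff8e88d2116 in proxy_run_pre_request () from /etc/httpd/modules/mod_proxy.so
# A leading "> " quote prefix is tolerated because search() is used.
FRAME_RE = re.compile(r"#(\d+)\s+.*\bfrom\s+(\S+)\s*$")


def frame_span(backtrace, module):
    """Return (lowest, highest) frame number found in `module`, or None.

    `module` is compared with the file name only, so "mod_proxy.so" does
    not match frames from "mod_proxy_cluster.so".
    """
    numbers = []
    for line in backtrace.splitlines():
        match = FRAME_RE.search(line)
        if match and match.group(2).rsplit("/", 1)[-1] == module:
            numbers.append(int(match.group(1)))
    return (min(numbers), max(numbers)) if numbers else None
```

For the stack in this issue, frame_span(text, "mod_proxy_cluster.so") gives (2, 160). A span that wide within a single module suggests the thread is deep in the recursive retry path rather than handling one ordinary request.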
