From jira-events at lists.jboss.org Mon Sep 16 10:13:04 2013
Content-Type: multipart/mixed; boundary="===============8392960797184041012=="
MIME-Version: 1.0
From: Weinan Li (JIRA)
To: mod_cluster-issues at lists.jboss.org
Subject: [mod_cluster-issues] [JBoss JIRA] (MODCLUSTER-356) BalancerMember directives cause httpd crash when using mod_cluster 1.2.x
Date: Mon, 16 Sep 2013 10:13:04 -0400
Message-ID:
In-Reply-To: JIRA.12499493.1377680902000@jira02.app.mwc.hst.phx2.redhat.com

--===============8392960797184041012==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable

    [ https://issues.jboss.org/browse/MODCLUSTER-356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12804419#comment-12804419 ]

Weinan Li edited comment on MODCLUSTER-356 at 9/16/13 10:11 AM:
----------------------------------------------------------------

Yeah if I restart httpd it will always throw:
{code}
Error null sending STATUS command to min/10.0.1.13:10001, configuration will be reset: null
{code}
And here is the log from httpd:
{code}
[Mon Sep 16 22:04:51 2013] [notice] Apache/2.2.25 (Unix) mod_ssl/2.2.25 OpenSSL/1.0.1e mod_cluster/1.2.6.Final configured -- resuming normal operations
[Mon Sep 16 22:04:55 2013] [warn] manager_handler STATUS error: MEM: Can't read node
[Mon Sep 16 22:04:56 2013] [warn] manager_handler STATUS error: MEM: Can't read node
[Mon Sep 16 22:05:05 2013] [notice] Balancer other-server-group changed
[Mon Sep 16 22:05:06 2013] [notice] Balancer other-server-group changed
[Mon Sep 16 22:05:06 2013] [notice] Balancer other-server-group changed
[Mon Sep 16 22:05:06 2013] [notice] Balancer other-server-group changed
[Mon Sep 16 22:05:06 2013] [notice] Balancer other-server-group changed
[Mon Sep 16 22:05:08 2013] [notice] Balancer other-server-group changed
[Mon Sep 16 22:05:09 2013] [notice] Balancer other-server-group changed
[Mon Sep 16 22:05:09 2013] [notice] Balancer other-server-group changed
{code}
If I access the cluster it will throw 503 error:
{code}
min:~ weinanli$ telnet 10.0.1.13 80
Trying 10.0.1.13...
Connected to mini.
Escape character is '^]'.
GET /
503 Service Temporarily Unavailable

Service Temporarily Unavailable

The server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later.

Connection closed by foreign host.
min:~ weinanli$
{code}
And the log reported by httpd:
{code}
[Mon Sep 16 22:05:46 2013] [error] proxy: CLUSTER: (balancer://other-server-group). All workers are in error state
[Mon Sep 16 22:05:46 2013] [error] proxy: CLUSTER: (balancer://other-server-group). All workers are in error state
[Mon Sep 16 22:05:47 2013] [error] proxy: CLUSTER: (balancer://other-server-group). All workers are in error state
{code}
I could see that the nodes are registered in mod_cluster-manager:
{code}
mod_cluster/1.2.6.Final
Auto Refresh show DUMP output show INFO output
Node 770b156a-97b5-372f-9750-717824bf9aed (ajp://10.0.1.19:8009):
Enable Contexts Disable Contexts
Balancer: other-server-group,LBGroup: ,Flushpackets: Off,Flushwait: 10000,Ping: 10000000,Smax: 1,Ttl: 60000000,Status: OK,Elected: 0,Read: 0,Transferred: 0,Connected: 0,Load: 100
Node 30b18166-b3c4-3388-8fbe-50af073c7657 (ajp://10.0.1.13:8009):
Enable Contexts Disable Contexts
Balancer: other-server-group,LBGroup: ,Flushpackets: Off,Flushwait: 10000,Ping: 10000000,Smax: 1,Ttl: 60000000,Status: OK,Elected: 0,Read: 0,Transferred: 0,Connected: 0,Load: 100
{code}
So I believe that when httpd starts up, all the nodes are in error state because of that STATUS error:
{code}
manager_handler STATUS error: MEM: Can't read node
{code}

was (Author: weinanli):
    Yeah if I restart httpd/EAP sometimes it will throw:
{code}
Error null sending STATUS command to min/10.0.1.13:10001, configuration will be reset: null
{code}
And for me this only happens during server restart.

The httpd crashes stopped after I upgraded mod_cluster to 1.2.6.Final in httpd. There is no more crashing and no error in the logs on either the EAP or the httpd side, and the node can be seen correctly in '/mod_cluster-manager'. From Wireshark analysis, the CPING/CPONG exchange on the AJP port between EAP and httpd is fine. But the "503 Service Temporarily Unavailable" still exists.

The node is not serving.
I've also tried upgrading mod_cluster on the EAP side to 1.2.6.Final and the problem is still there.
Not sure where the problem could be. Because Wireshark doesn't understand the MCMP on port 10001, it's hard for me to see if there are any problems.

> BalancerMember directives cause httpd crash when using mod_cluster 1.2.x
> ------------------------------------------------------------------------
>
>                 Key: MODCLUSTER-356
>                 URL: https://issues.jboss.org/browse/MODCLUSTER-356
>             Project: mod_cluster
>          Issue Type: Bug
>    Affects Versions: 1.2.4.Final
>         Environment: SLES-11-SP1 x86_64 running httpd 2.2.24 or 2.4.6 with any mod_cluster-1.2.x version and JBoss-AS-7.2.0.Final
>            Reporter: Marco Danti
>            Assignee: Jean-Frederic Clere
>              Labels: balancerMember, crash, httpd, mod_cluster
>
> httpd dumps core in mod_proxy_cluster.c if the httpd.conf file contains BalancerMember directives.
> this happens at least with the following two setups:
> 1) httpd-2.4.6 and mod_cluster-1.2.x
> 2) httpd-2.2.24 and mod_cluster-1.2.3.Final

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

--===============8392960797184041012==--