Jean-Frederic Clere updated MODCLUSTER-369:
-------------------------------------------
Priority: Major (was: Critical)
httpd should remove lost node/worker
------------------------------------
Key: MODCLUSTER-369
URL: https://issues.jboss.org/browse/MODCLUSTER-369
Project: mod_cluster
Issue Type: Bug
Affects Versions: 1.2.0.Final, 1.2.6.Final
Environment: httpd 2.2.24+tomcat 7.0.37
Reporter: Stefano Nichele
Assignee: Jean-Frederic Clere
Attachments: httpd-error.log
Suppose this environment is running on Amazon:
- one virtual machine running HTTPD (let's call it FE)
- two virtual machines running Tomcat (let's call them NODE01 and NODE02)
If the NODE02 virtual machine dies or crashes for any reason, HTTPD still reports it in
the mod_cluster-manager view, sometimes with status OK and sometimes with status NOTOK.
Moreover, some requests are still sent to the dead node.
Please note that I reproduced this behavior just by blocking network traffic with iptables
on NODE02:
iptables -A OUTPUT -p tcp -m state --state NEW,ESTABLISHED -m tcp --dport 6666 -j DROP
iptables -A INPUT -m state --state NEW,ESTABLISHED -p tcp --dport 8009 -j DROP
This way the Tomcat instance is not able to send its status and HTTPD is not able to send
traffic to port 8009.
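To confirm from FE that the second rule is in effect, a small TCP probe can help. This is only a sketch: the function name `port_reachable` and the 3-second default timeout are assumptions made for illustration and are not part of mod_cluster or httpd.

```python
import socket


def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections as well as timeouts, which is what a
        # DROP rule produces because the SYN is silently discarded.
        return False
```

Run on FE, `port_reachable("10.2.2.2", 8009)` would be expected to return False after the timeout once the iptables rules are active on NODE02.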
In my opinion there are two issues here:
1. If HTTPD doesn't receive any STATUS from a worker, the worker must at least be
marked as NOTOK.
2. It seems that HTTPD considers the worker available during its retry policy, so
some requests are still forwarded to that node and the worker appears to be
flapping.
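The behavior requested in the two points above can be modeled roughly as follows. This is a toy model, not mod_cluster's implementation: the `Balancer` and `Worker` classes, the `status_timeout` parameter and the method names are all hypothetical. A worker is marked NOTOK when no STATUS has arrived within the timeout, and a worker that failed a request stays NOTOK until a fresh STATUS arrives, so it is not retried in between.

```python
import time
from dataclasses import dataclass


@dataclass
class Worker:
    name: str
    last_status: float  # clock value when the last STATUS was received
    ok: bool = True


class Balancer:
    """Hypothetical model of the proxy-side worker table."""

    def __init__(self, status_timeout: float, clock=time.monotonic):
        self.status_timeout = status_timeout
        self.clock = clock
        self.workers = {}

    def on_status(self, name: str) -> None:
        # Only a fresh STATUS from the node makes the worker OK (again).
        now = self.clock()
        worker = self.workers.get(name)
        if worker is None:
            self.workers[name] = Worker(name, now)
        else:
            worker.last_status = now
            worker.ok = True

    def on_request_failure(self, name: str) -> None:
        # A failed request marks the worker NOTOK; it is not re-enabled
        # by the retry policy, only by on_status().
        self.workers[name].ok = False

    def eligible(self) -> list:
        # Issue 1: workers that have been silent for too long become NOTOK.
        now = self.clock()
        for worker in self.workers.values():
            if now - worker.last_status > self.status_timeout:
                worker.ok = False
        return sorted(w.name for w in self.workers.values() if w.ok)
```

With this model, a node that stops sending STATUS drops out of `eligible()` after `status_timeout` and does not come back until it reports again, which would avoid the OK/NOTOK flapping described above.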
Original thread: https://community.jboss.org/thread/234235
I am marking this as critical since Amazon is one of the main players in virtualization,
and if an instance disappears in a production environment (which can happen at any moment)
the whole production environment is affected (since some requests are still sent to the
dead node).
Attached is the httpd error log file produced using mod_cluster 1.2.6.
at 08:32:30 HTTPD has been restarted (with two workers OK)
at 08:33:10 instance NODE02 (IP: 10.2.2.2) disappears
at 08:33:25 NODE02 is marked as NOTOK
at 08:33:32 NODE02 is marked as OK
at 08:33:41 NODE02 is marked as NOTOK
...and so on...
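The flapping in this timeline can be quantified with a few lines of Python. The helper names `count_flaps` and `intervals` are made up for this sketch, and the events are typed in by hand from the timeline above rather than parsed from the attached log.

```python
from datetime import datetime


def count_flaps(events):
    """Count status changes in a time-ordered list of (HH:MM:SS, status)."""
    return sum(1 for (_, prev), (_, cur) in zip(events, events[1:]) if prev != cur)


def intervals(events):
    """Seconds elapsed between consecutive events."""
    times = [datetime.strptime(stamp, "%H:%M:%S") for stamp, _ in events]
    return [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]


# NODE02 status changes copied from the timeline above.
node02 = [
    ("08:33:25", "NOTOK"),
    ("08:33:32", "OK"),
    ("08:33:41", "NOTOK"),
]

print(count_flaps(node02))  # 2
print(intervals(node02))    # [7.0, 9.0]
```

So the worker changes state twice within 16 seconds while the instance is unreachable the whole time.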