[mod_cluster-dev] mod-cluster and kill -9

Brian Stansberry brian.stansberry at redhat.com
Wed Aug 5 11:55:46 EDT 2009


I don't see why a hard kill would cause 5xx responses to the client 
(except perhaps for a few requests that were being handled by the killed 
server when it was killed).  The httpd side should try to route future 
requests to the failed node and, when it can't connect, fail them over.

"Very long timeouts" on subsequent requests sounds like something that 
should be solvable via configuration as well. (Paul, JFC please comment.) 
The backend node is dead; there's no reason it should be taking the 
httpd side a "very long" time to detect that when a new request comes in.
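
To illustrate the behavior I'd expect, here is a minimal Python sketch. 
It is not mod_cluster's or httpd's actual code; the function name 
pick_live_node and the connect_timeout parameter are made up for the 
example. The idea is just that a bounded connect attempt caps how long a 
dead backend can stall a new request before the next node is tried.

```python
import socket


def pick_live_node(nodes, connect_timeout=2.0):
    """Return the first (host, port) that accepts a TCP connection.

    `nodes` is an ordered list of (host, port) candidates. Each connect
    attempt is bounded by `connect_timeout` seconds, so an unreachable
    node costs at most that long before the next one is tried.
    Returns None if no candidate accepts a connection.
    """
    for host, port in nodes:
        try:
            # A refused connection fails immediately; an unreachable
            # host fails once the timeout expires. Both raise OSError.
            with socket.create_connection((host, port), timeout=connect_timeout):
                return (host, port)
        except OSError:
            continue  # fail over to the next candidate
    return None
```

If only the JBoss process was killed and the OS is still up, the connect 
is typically refused right away. If the whole host is gone, the attempt 
hangs until the timeout expires, so in this sketch the timeout value is 
what determines how long the delay is.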

(I'm not saying MODCLUSTER-66 isn't important; just that the symptoms 
you are describing seem more severe than is warranted. If people aren't 
using HAModClusterService, the MODCLUSTER-66 fix won't help them, but 
they still shouldn't be seeing what you describe.)

Paul Ferraro wrote:
> On Wed, 2009-08-05 at 16:48 +0200, Bela Ban wrote:
>> Hi Paul,
>>
>> wanted to reiterate the importance of [1] for the next release of 
>> mod-cluster.
>>
>> I'm constantly running into this when I deploy httpd/mod-cluster and 
>> JBoss 5.1.0 on Amazon's EC2 cloud. The easy way to stop an instance (OS 
>> + JBoss) on EC2 is to 'terminate' it via the AWS Console, which shuts 
>> down the OS ("shutdown -h now").
>>
>> Unfortunately, our AMIs only had an S98jboss in /etc/rc4.d for starting 
>> JBoss, but no corresponding K98jboss for stopping it *gracefully* on 
>> shutdown. Therefore the process was always killed via -9.
>>
>> This caused very long timeouts and 5XX HTTP responses, until 
>> httpd/mod-cluster finally figured out that the worker crashed and failed 
>> over to a different worker.
>>
>> As a workaround, I created a K98jboss link so now JBoss is shut down 
>> gracefully when the host is terminated.
>>
>> However, I figure we can get into this situation in many different ways, 
>> e.g.
>>
>>     * Not providing a K98jboss script on EC2
>>     * Killing JBoss with -9 via a script (I've seen this more than once)
>>     * Pulling a blade out of the rack. A crude way of shutting down an
>>       instance, but that's normal in large clusters!
>>
>> Can we have this feature in mod-cluster 1.1 ?
>>
>>
>> [1] https://jira.jboss.org/jira/browse/MODCLUSTER-66
> 
> Yes.  I've set the target version accordingly.
> 
> _______________________________________________
> mod_cluster-dev mailing list
> mod_cluster-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/mod_cluster-dev


-- 
Brian Stansberry
Lead, AS Clustering
JBoss by Red Hat

