[mod_cluster-dev] mod-cluster and kill -9

Sun Aug 9 01:57:48 EDT 2009

Brian Stansberry wrote:
> I don't see why a hard kill would cause 5xx responses to the client
> (except perhaps for a few requests that were being handled by the killed
> server when it was killed.) The httpd side should try to route future
> requests to the failed node and, when it can't connect, fail them over.
>
> "Very long timeouts" on subsequent requests sounds like something that
> should be solvable via configuration as well. (Paul, JFC please comment.)
> The backend node is dead; there's no reason it should be taking the
> httpd side a "very long" time to detect that when a new request comes in.
>
> (I'm not saying MODCLUSTER-66 isn't important; just that the symptoms
> you are describing seem more severe than is warranted. If people aren't
> using HAModClusterService, the MODCLUSTER-66 fix won't help them, but
> they still shouldn't be seeing what you describe.)

Right. The errors I've been seeing are probably caused by a long timeout 
value for ping/pong.

When the socket connection to worker W is closed, doesn't
httpd/mod-cluster automatically remove W from its worker list? Can this
be configured?
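
To make the expected behaviour concrete, here is a minimal sketch of
connect-time failover with a short timeout. It is an illustration only,
not how httpd or mod-cluster is implemented: the function name
pick_live_worker, the connect_timeout parameter and the plain list of
(host, port) workers are all hypothetical.

```python
import socket


def pick_live_worker(workers, connect_timeout=1.0):
    """Return (worker, sock) for the first worker that accepts a TCP
    connection, removing unreachable workers from the list on the way.

    Hypothetical helper for illustration; `workers` is a mutable list
    of (host, port) tuples.
    """
    # Iterate over a copy so we can remove dead entries from the original.
    for worker in list(workers):
        try:
            # A killed process normally yields "connection refused" at once;
            # the timeout bounds the wait when the host itself is unreachable.
            sock = socket.create_connection(worker, timeout=connect_timeout)
            return worker, sock
        except OSError:
            # Refused or timed out: drop the worker and fail over to the next.
            workers.remove(worker)
    return None, None
```

With a short connect timeout, a request aimed at a killed node costs at
most connect_timeout seconds before it is retried on the next worker,
which is the kind of bound I would expect to be configurable.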

-- 
Bela Ban
Lead JGroups / Clustering Team
JBoss