[mod_cluster-dev] Session temporarily unavailable

Brian Stansberry brian.stansberry at redhat.com
Wed Jul 22 12:01:38 EDT 2009


The MCMP SUSPECT message should get a response indicating which of the 
suspected nodes are and aren't still alive. That information can then be 
made available to management tools.

Bela Ban wrote:
> Paul Ferraro wrote:
>>>> If you are running mod_cluster via HAModClusterService and node1 crashed
>>>> (instead of a clean shut down), node2 will not send any MCMP messages to
>>>> httpd on behalf of node1 when it receives a view change.
>>> Ah, yes, I remember now ! I just killed -9 node1 and mod-cluster-manager
>>> didn't remove node1 from its worker list.
>>> Why don't we do this ? We cannot rely on a node being shutdown 
>>> gracefully !
>> Because currently a given node cannot distinguish between a crashed
>> member and a network partition
> OK, so {A,B} would remove C and D and {C,D} would remove A and B...
> How about we send an MCMP SUSPECT message to httpd ? This would cause 
> httpd to double-check, e.g. via the regular AJP cping/cpong or socket 
> connection checks, and if the double-check fails, remove a worker ?
> For the regular case (a crash), this would be faster than having to wait 
> until httpd checks the connection.
> So a view change would do nothing more than trigger an immediate 
> connection check on httpd.
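
To make the proposal concrete, here is a rough sketch of the flow in Java. 
Everything in it is an assumption for illustration: SUSPECT is only being 
proposed in this thread, and the class, the method names and the ping 
callback are hypothetical, not existing mod_cluster or httpd code. The 
ping stands in for whatever check httpd would do (AJP cping/cpong or a 
socket connect).

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical sketch only: SUSPECT is a proposed message, not an existing MCMP command.
final class SuspectSketch {

    // Node side: members present in the old view but missing from the new one
    // are the ones to report as suspected.
    static Set<String> suspects(Collection<String> oldView, Collection<String> newView) {
        Set<String> gone = new LinkedHashSet<>(oldView);
        gone.removeAll(newView);
        return gone;
    }

    // Proxy side: double-check each suspected worker, drop it only if the check
    // fails, and report per node whether it is still alive.
    static Map<String, Boolean> handleSuspect(Set<String> workers,
                                              Collection<String> suspected,
                                              Predicate<String> ping) {
        Map<String, Boolean> alive = new LinkedHashMap<>();
        for (String node : suspected) {
            boolean up = workers.contains(node) && ping.test(node);
            if (!up) {
                workers.remove(node);
            }
            alive.put(node, up);
        }
        return alive;
    }

    public static void main(String[] args) {
        // Partition: {A,B} no longer sees C and D, but the proxy can still reach them.
        Set<String> workers = new LinkedHashSet<>(List.of("A", "B", "C", "D"));
        Set<String> gone = suspects(List.of("A", "B", "C", "D"), List.of("A", "B"));
        System.out.println(handleSuspect(workers, gone, node -> true));
        System.out.println(workers);

        // Crash: node1 was killed, so the proxy's check fails and the worker is removed.
        Set<String> workers2 = new LinkedHashSet<>(List.of("node1", "node2"));
        System.out.println(handleSuspect(workers2, List.of("node1"), node -> !node.equals("node1")));
        System.out.println(workers2);
    }
}
```

In the partition case nothing is removed, since the proxy's own check still 
succeeds; in the crash case the worker goes away without waiting for the 
next regular connection check. The returned map is the kind of response 
that could be handed to management tools.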

