[mod_cluster-dev] Session temporarily unavailable

Bela Ban bban at redhat.com
Wed Jul 22 11:40:29 EDT 2009

Paul Ferraro wrote:
>>> If you are running mod_cluster via HAModClusterService and node1 crashed
>>> (instead of a clean shut down), node2 will not send any MCMP messages to
>>> httpd on behalf of node1 when it receives a view change.
>> Ah, yes, I remember now ! I just killed -9 node1 and mod-cluster-manager
>> didn't remove node1 from its worker list.
>> Why don't we do this? We cannot rely on a node being shut down
>> gracefully!
> Because currently a given node cannot distinguish between a crashed
> member and a network partition

OK, so {A,B} would remove C and D and {C,D} would remove A and B...
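
To spell out the split-brain case, here is a minimal sketch. It assumes each partition simply treats every member missing from its new view as dead. The function name members_to_remove is made up for illustration and is not mod_cluster or JGroups API.

```python
def members_to_remove(all_members, local_view):
    """Members a partition would drop if it treated every absentee as crashed."""
    return sorted(set(all_members) - set(local_view))


cluster = ["A", "B", "C", "D"]

# Each side of the partition sees only itself and would remove the other side.
print(members_to_remove(cluster, ["A", "B"]))  # ['C', 'D']
print(members_to_remove(cluster, ["C", "D"]))  # ['A', 'B']
```

With both halves still alive, httpd would end up with no workers at all, even though all four nodes can still serve requests.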

How about we send an MCMP SUSPECT message to httpd? This would cause
httpd to double-check, e.g. via the regular AJP cping/cpong or socket
connection checks, and remove the worker only if the double-check fails.

For the regular case (a crash), this would be faster than having to wait
until httpd checks the connection on its own.

So a view change would do nothing more than trigger an immediate
connection check on httpd.
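
A rough sketch of what the httpd side of this could look like, written in Python only to show the logic. SUSPECT is just the proposal here, not an existing MCMP message. handle_suspect, tcp_probe and the workers table are hypothetical names, and a plain TCP connect stands in for the AJP cping/cpong check.

```python
import socket


def tcp_probe(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def handle_suspect(workers, suspected, probe=tcp_probe):
    """Double-check a suspected worker and drop it only if the probe fails.

    workers maps a worker name to its (host, port) address (assumed layout).
    Returns True if the worker was removed.
    """
    address = workers.get(suspected)
    if address is None:
        return False  # unknown worker, nothing to do
    if probe(*address):
        return False  # still reachable: probably a partition, keep it
    del workers[suspected]
    return True
```

In the partition case the probe succeeds and the worker stays, and in the crash case it fails and the worker is removed right away, for example:

```python
workers = {"node1": ("10.0.0.1", 8009), "node2": ("10.0.0.2", 8009)}
handle_suspect(workers, "node1", probe=lambda host, port: False)  # removed
handle_suspect(workers, "node2", probe=lambda host, port: True)   # kept
```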

Bela Ban
Lead JGroups / Clustering Team
