[mod_cluster-dev] Session temporarily unavailable
Bela Ban
bban at redhat.com
Wed Jul 22 11:40:29 EDT 2009
Paul Ferraro wrote:
>
>>>
>>> If you are running mod_cluster via HAModClusterService and node1 crashed
>>> (instead of a clean shut down), node2 will not send any MCMP messages to
>>> httpd on behalf of node1 when it receives a view change.
>> Ah, yes, I remember now ! I just killed -9 node1 and mod-cluster-manager
>> didn't remove node1 from its worker list.
>>
>> Why don't we do this ? We cannot rely on a node being shutdown
>> gracefully !
>
> Because currently a given node cannot distinguish between a crashed
> member and a network partition
OK, so {A,B} would remove C and D and {C,D} would remove A and B...
How about we send an MCMP SUSPECT message to httpd? This would cause
httpd to double-check, e.g. via the regular AJP cping/cpong or socket
connection checks, and remove the worker only if the double-check fails.
For the regular case (a crash), this would be faster than having to wait
until httpd checks the connection.
So a view change would do nothing more than trigger an immediate
connection check on httpd.
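
To make the proposal concrete, here is a minimal sketch in Java. It is an
illustration only: MCMP has no SUSPECT message today, and the class name,
the method names (departed, handleSuspect, tcpProbe) and the string worker
names are all hypothetical. The probe is pluggable so that a plain TCP
connect can stand in for the AJP cping/cpong check that httpd would really
perform.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

public class SuspectSketch {

    /** Node side: members present in the old view but missing from the new one. */
    static Set<String> departed(Set<String> oldView, Set<String> newView) {
        Set<String> gone = new LinkedHashSet<>(oldView);
        gone.removeAll(newView);
        return gone;
    }

    /**
     * Proxy side (hypothetical SUSPECT handling): probe each suspected worker
     * and remove only those whose probe fails. Workers that still answer are
     * kept, so a network partition between nodes does not evict live workers.
     */
    static List<String> handleSuspect(Set<String> workers,
                                      Set<String> suspects,
                                      Predicate<String> probe) {
        List<String> removed = new ArrayList<>();
        for (String name : suspects) {
            if (workers.contains(name) && !probe.test(name)) {
                workers.remove(name);
                removed.add(name);
            }
        }
        return removed;
    }

    /** Assumed stand-in for httpd's connection check: a TCP connect with a timeout. */
    static boolean tcpProbe(InetSocketAddress addr, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(addr, timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Partition as seen from {A,B}: C and D have left the view.
        Set<String> oldView = new LinkedHashSet<>(List.of("A", "B", "C", "D"));
        Set<String> newView = new LinkedHashSet<>(List.of("A", "B"));
        Set<String> suspects = departed(oldView, newView);

        // Simulated probe: C has really crashed, D is alive but partitioned from A and B.
        Set<String> workers = new LinkedHashSet<>(oldView);
        List<String> removed = handleSuspect(workers, suspects, w -> !w.equals("C"));

        System.out.println("suspects=" + suspects);
        System.out.println("removed=" + removed);
        System.out.println("workers=" + workers);
    }
}
```

In this sketch {A,B} suspects both C and D, but only C is removed because D
still answers the probe. The decision stays with httpd, which is the only
party that can see all workers regardless of how the cluster is partitioned.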
--
Bela Ban
Lead JGroups / Clustering Team
JBoss