[mod_cluster-dev] Session temporarily unavailable
Brian Stansberry
brian.stansberry at redhat.com
Wed Jul 22 12:01:38 EDT 2009
+1.
The MCMP SUSPECT message should get a response reporting which of the
suspected nodes are still alive and which are not. That information can
then be made available to management tools.
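
To make the idea concrete, here is a rough sketch of what the proxy side could do with a SUSPECT request: probe each suspected node once, drop the ones that fail, and report the outcome back. Everything in it is an assumption for illustration only. The class and method names, the ALIVE/DEAD wording, and the response format are made up and are not part of MCMP or mod_cluster. The probe is passed in as a function so it can stand for an AJP cping/cpong or a socket connection check.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch: none of these names come from mod_cluster or MCMP.
final class SuspectCheck {

    // Probe each suspected node once and record whether it answered.
    // 'probe' stands in for whatever liveness check the proxy uses
    // (for example an AJP cping/cpong or a plain socket connect).
    static Map<String, Boolean> doubleCheck(List<String> suspects,
                                            Predicate<String> probe) {
        Map<String, Boolean> alive = new LinkedHashMap<>();
        for (String node : suspects) {
            alive.put(node, probe.test(node));
        }
        return alive;
    }

    // Nodes that failed the double-check are the ones the proxy would remove.
    static List<String> toRemove(Map<String, Boolean> alive) {
        List<String> dead = new ArrayList<>();
        for (Map.Entry<String, Boolean> e : alive.entrySet()) {
            if (!e.getValue()) {
                dead.add(e.getKey());
            }
        }
        return dead;
    }

    // Render the per-node result as a response body (format is invented here).
    static String response(Map<String, Boolean> alive) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Boolean> e : alive.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(e.getKey()).append('=')
              .append(e.getValue() ? "ALIVE" : "DEAD");
        }
        return sb.toString();
    }
}
```

Because the proxy does its own check before removing anything, a node on the far side of a network partition that httpd can still reach would be reported as alive and kept, while a crashed node would be removed right away.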
Bela Ban wrote:
>
> Paul Ferraro wrote:
>>>> If you are running mod_cluster via HAModClusterService and node1 crashed
>>>> (instead of a clean shut down), node2 will not send any MCMP messages to
>>>> httpd on behalf of node1 when it receives a view change.
>>> Ah, yes, I remember now ! I just killed -9 node1 and mod-cluster-manager
>>> didn't remove node1 from its worker list.
>>>
>>> Why don't we do this ? We cannot rely on a node being shutdown
>>> gracefully !
>> Because currently a given node cannot distinguish between a crashed
>> member and a network partition
>
> OK, so {A,B} would remove C and D and {C,D} would remove A and B...
>
> How about we send an MCMP SUSPECT message to httpd ? This would cause
> httpd to double-check, e.g. via the regular AJP cping/cpong or socket
> connection checks, and if the double-check fails, remove a worker ?
>
> For the regular case (a crash), this would be faster than having to wait
> until httpd checks the connection.
>
> So a view change would do nothing more than trigger an immediate
> connection check on httpd.
>
--
Brian Stansberry
Lead, AS Clustering
JBoss, a division of Red Hat
brian.stansberry at redhat.com