[mod_cluster-dev] Session temporarily unavailable

Wed Jul 22 13:06:50 EDT 2009

Ah - this solves the partition problem nicely.

I think a more primitive command, like PING <jvmRoute>, is preferable,
since it simplifies the proxy by keeping the suspect/remove logic on the
server-side.
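
To make the division of labour concrete, here is a rough sketch of how the
server side could drive the suspect/remove decision if httpd offered only a
PING primitive. Everything in it is an assumption for illustration:
ProxyClient, SuspectHandler and onViewChange are hypothetical names, not
existing mod_cluster classes, and the boolean result of ping() stands in for
whatever response format PING <jvmRoute> would end up having.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical view of the proxy; not the real mod_cluster API. */
interface ProxyClient {
    /** Sends the proposed "PING <jvmRoute>"; true if httpd can still reach that node (assumed semantics). */
    boolean ping(String jvmRoute);

    /** Asks httpd to drop the worker registered under this jvmRoute. */
    void remove(String jvmRoute);
}

/** Server-side suspect/remove logic, triggered by a cluster view change. */
final class SuspectHandler {
    private final ProxyClient proxy;

    SuspectHandler(ProxyClient proxy) {
        this.proxy = proxy;
    }

    /** Returns the jvmRoutes that were removed from the proxy. */
    List<String> onViewChange(Set<String> oldView, Set<String> newView) {
        // Members present in the old view but missing from the new one are suspects.
        Set<String> suspects = new TreeSet<>(oldView);
        suspects.removeAll(newView);

        List<String> removed = new ArrayList<>();
        for (String jvmRoute : suspects) {
            // Only remove a suspect that httpd cannot reach either.
            if (!proxy.ping(jvmRoute)) {
                proxy.remove(jvmRoute);
                removed.add(jvmRoute);
            }
        }
        return removed;
    }
}
```

In a partition, the members that dropped out of the view still answer
httpd's ping, so nothing is removed; after a crash the ping fails and the
worker is removed right away.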

On Wed, 2009-07-22 at 12:01 -0400, Brian Stansberry wrote:
> +1.
> 
> The MCMP SUSPECT should get a response informing what suspected nodes 
> are/aren't still alive. That information can then be made available to 
> management tools.
> 
> Bela Ban wrote:
> > 
> > Paul Ferraro wrote:
> >>>> If you are running mod_cluster via HAModClusterService and node1 crashed
> >>>> (instead of a clean shut down), node2 will not send any MCMP messages to
> >>>> httpd on behalf of node1 when it receives a view change.
> >>> Ah, yes, I remember now ! I just killed -9 node1 and mod-cluster-manager
> >>> didn't remove node1 from its worker list.
> >>>
> >>> Why don't we do this ? We cannot rely on a node being shut down 
> >>> gracefully !
> >> Because currently a given node cannot distinguish between a crashed
> >> member and a network partition
> > 
> > OK, so {A,B} would remove C and D and {C,D} would remove A and B...
> > 
> > How about we send an MCMP SUSPECT message to httpd ? This would cause 
> > httpd to double-check, e.g. via the regular AJP cping/cpong or socket 
> > connection checks, and if the double-check fails, remove a worker ?
> > 
> > For the regular case (a crash), this would be faster than having to wait 
> > until httpd checks the connection.
> > 
> > So a view change would do nothing more than trigger an immediate 
> > connection check on httpd.