[mod_cluster-dev] Handling crashed/hung AS nodes

Brian Stansberry brian.stansberry at redhat.com
Fri Mar 27 10:20:03 EDT 2009


Bela Ban wrote:
> 
> 
> Paul Ferraro wrote:
>> Currently, the HAModClusterService (where httpd communication is
>> coordinated by an HA singleton) does not react to crashed/hung members.
>> Specifically, when the HA singleton gets a callback that the group
>> membership changes, it does not send any REMOVE-APP messages to httpd on
>> behalf of the member that just left. Currently, httpd will detect the
>> failure (via a disconnected socket) on its own and set its internal
>> state accordingly, e.g. a STATUS message will return NOTOK.
> 
> I assume this feature is on the roadmap? Brian and I have been talking
> about it and I think this is a crucial feature. I suggest we add this
> before GA. Thoughts?
> 

I just commented in detail on another post. Whether I think it's crucial 
depends on the details. Is there any other factor you're thinking of 
that we're not covering?

-- 
Brian Stansberry
Lead, AS Clustering
JBoss, a division of Red Hat
brian.stansberry at redhat.com


