On 08/13/2009 06:18 PM, Bela Ban wrote:
jean-frederic clere wrote:
> On 08/13/2009 04:56 PM, Bela Ban wrote:
>> But this works in mod_jk doesn't it ?
>
> mod_jk has a static configuration; it is not going to remove anything
> from the configuration.
>
>>
>> Besides, when we have *non-clustered* workers, there is no HA singleton
>> telling httpd to remove the crashed worker, so httpd has to do it
>> itself.
>>
>> I mean, why can't httpd simply remove a worker W if the socket
>> connection to W is closed (by W crashing) ?
>
> First, a corresponding entry shouldn't disturb mod_cluster, see
> MODCLUSTER-92 about reporting that it is broken. Of course it would be
> possible to remove broken workers after a while, but I think it must be
> a switchable option and the delay should be a parameter... Please
> create a JIRA.
I think what should happen is
- httpd periodically tests the connection to worker W
Via cping/cpong but only if the worker is not active (no requests for a
while).
- httpd detects connection loss to W
- httpd removes W from its worker tables, so requests are not dispatched
to W
As soon as the worker W is marked in error, requests are no longer
forwarded to it. (That has been the case since the very first beta of
mod_cluster.) So removing would mean removing a node, its virtual hosts
and contexts.
- httpd also starts a timer, going off in configurable intervals, which
tests W again. If W comes up again, httpd adds it again
NO:
- We can't unremove stuff.
- Maybe the customer has changed his configuration and doesn't want
httpd to proxy this node.
- the JAVA part will create it anyway (your comment below).
Actually, I think the last point is not necessary, because when W is
started again, mod-cluster/jboss will tell httpd ! We should actually
*not* implement the last point because W might never get started again !
Yep ;-)
Cheers
Jean-Frederic
WDYT ?
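
For reference, the behaviour the thread converges on can be sketched roughly as below. This is only an illustration in Python, not mod_cluster code: WorkerTable, register, check, idle_before_ping, remove_broken and remove_after are all hypothetical names, and the ping callable merely stands in for a cping/cpong round trip. It assumes an idle worker is probed, a failed probe marks it in error so it stops receiving requests, removal is an optional switch with a configurable delay, and httpd never re-adds a removed worker on its own.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Worker:
    name: str
    last_request: float                  # time the last request was forwarded
    in_error: bool = False
    error_since: Optional[float] = None  # time the worker was marked in error


class WorkerTable:
    """Hypothetical stand-in for httpd's worker table (not mod_cluster's API)."""

    def __init__(self, ping: Callable[[str], bool],
                 idle_before_ping: float = 10.0,
                 remove_broken: bool = False,
                 remove_after: float = 60.0) -> None:
        self.ping = ping                      # stands in for cping/cpong
        self.idle_before_ping = idle_before_ping
        self.remove_broken = remove_broken    # the "switchable option"
        self.remove_after = remove_after      # the configurable delay
        self.workers: Dict[str, Worker] = {}

    def register(self, name: str, now: float) -> None:
        # Only the Java side (re)creates a worker; httpd never does it itself.
        self.workers[name] = Worker(name, last_request=now)

    def eligible(self) -> List[str]:
        # Workers marked in error never receive requests.
        return sorted(n for n, w in self.workers.items() if not w.in_error)

    def check(self, now: float) -> None:
        for name in list(self.workers):
            w = self.workers[name]
            if w.in_error:
                # Optionally drop a worker that has been broken long enough.
                if self.remove_broken and now - w.error_since >= self.remove_after:
                    del self.workers[name]
                continue
            if now - w.last_request < self.idle_before_ping:
                continue                      # active worker: no probe needed
            if not self.ping(name):
                w.in_error = True
                w.error_since = now
```

In this sketch a removed worker only comes back through register(), which corresponds to the restarted node announcing itself to httpd rather than httpd polling for it.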