[mod_cluster-dev] Failover failed

Mon Dec 15 16:01:55 EST 2008

Paul Ferraro wrote:
>> Secondly, why is load-demo.war/WEB-INF/web.xml not marked as
>> <distributable/> ?
>>
> Session replication was not part of our original scope for the demo.
> Though - I see no reason not to enable it.

Got you

>>
>>
>> #2
>>
>> * Start httpd
>> * Start node1
>> * Start node2
> Which server.xml Listener are you using? ModClusterListener,
> ModClusterService, or HAModClusterService? Only the latter 2 use the
> config from mod-cluster-jboss-beans.xml.

HAModClusterService

>
>> * Start the demo with 80 clients
>> * Observe the same as above: node1 serves roughly 40 sessions and
>> so does node2
>> * Kill node2
> Gracefully shutdown? or killed?

CTRL-C.

>> * Now I have 80 failed clients and 0 active clients !
>
> OK - that's weird. Let me try to reproduce this on my end.

OK

>
>> * If I call Server.shutdown() on node2 (via JMX), then node1 serves
>> 40 sessions. But why does node1 not pick up the other 40 sessions
>> from the departed node2 ? Is it the client that simply terminates
>> threads which got a 500 ?
> Yes

Why don't we change that, and ignore HTTP error responses ? This way, 
eventually, all clients would fail over to the running node.
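To illustrate what I mean, here is a minimal sketch of a client loop that counts an error response as a failure but keeps the thread alive. It is not the demo client's actual code: the RetryingClient class and the IntSupplier standing in for "send one HTTP request and return the status code" are assumptions made up for this example.

```java
import java.util.function.IntSupplier;

// Hypothetical sketch, not the load-demo client: a client thread that
// survives HTTP error responses instead of terminating on the first 500.
final class RetryingClient {

    /**
     * Issues maxRequests requests through sendRequest (a stand-in for a
     * real HTTP call that returns the status code) and returns the number
     * of 2xx responses. Error responses are skipped, not fatal.
     */
    static int run(IntSupplier sendRequest, int maxRequests) {
        int successes = 0;
        for (int i = 0; i < maxRequests; i++) {
            int status = sendRequest.getAsInt();
            if (status >= 200 && status < 300) {
                successes++;
            }
            // Any other status: keep looping, so that a later request can
            // be routed by httpd to a node that is still running.
        }
        return successes;
    }

    public static void main(String[] args) {
        // Simulated responses: three errors while the node is down,
        // then two successes once requests reach the surviving node.
        int[] responses = {500, 500, 500, 200, 200};
        int[] next = {0};
        int ok = run(() -> responses[next[0]++], responses.length);
        System.out.println("successful requests: " + ok);
    }
}
```

With a loop like this, a client that was stuck to the killed node would show up as active again as soon as httpd routes it elsewhere, rather than staying in the failed count.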

>> * I guess we don't send a DISABLE-NODE, STOP-SERVER or DISABLE-APP
>> to httpd when we kill a server ? Can we do this, e.g. by adding a
>> shutdown hook ? Or can I at least try this out through the command
>> line ? What's the recommended way of doing this currently (JMX
>> console) ?
> The [STOP-APP, REMOVE-APP, REMOVE-APP *] message sequence is only sent
> during graceful shutdown of a server.

So Server.shutdown() should have triggered this, right ?

> If the node is killed, then
> mod_cluster httpd modules will follow traditional failover logic (same
> as mod_jk/mod_proxy).

OK

> I've a tentative plan to enhance this in a later release to utilize
> JGroups view changes to allow the HA singleton master to send these
> messages on behalf of the failed server. This sounds more useful than it
> actually is, since with high volume, it is likely that httpd will have
> already detected the failure before the master has the chance to send
> the corresponding shutdown message sequence.

Agreed. But the graceful shutdown case (maybe complemented with a 
shutdown hook) should allow us to tell httpd to redirect requests for 
all apps of a given node before the node shuts down.
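A rough sketch of the shutdown hook idea, using the standard Runtime.addShutdownHook() call. Everything else here is an assumption for illustration: the ShutdownNotifier class is made up, the Consumer stands in for whatever actually talks to httpd, and the message names are only labels borrowed from this thread, not a verified wire format. Note that a JVM shutdown hook runs on normal exit and on CTRL-C, but not when the process is killed with kill -9.

```java
import java.util.function.Consumer;

// Hypothetical sketch: notify the proxy from a JVM shutdown hook.
final class ShutdownNotifier {

    // Labels taken from the discussion above; assumed, not a verified
    // description of what mod_cluster sends on the wire.
    static final String[] SEQUENCE = {"STOP-APP", "REMOVE-APP"};

    /**
     * Builds (but does not register) a hook thread that passes each
     * message in SEQUENCE to sendToProxy, in order. sendToProxy is a
     * placeholder for the real code that contacts httpd.
     */
    static Thread buildHook(Consumer<String> sendToProxy) {
        return new Thread(() -> {
            for (String message : SEQUENCE) {
                sendToProxy.accept(message);
            }
        }, "proxy-notify-hook");
    }

    public static void main(String[] args) {
        // Register the hook; here the "send" is just a log line.
        Runtime.getRuntime().addShutdownHook(
                buildHook(m -> System.out.println("would send " + m)));
        System.out.println("running; hook fires when the JVM exits");
    }
}
```

The hook has to finish quickly, since the JVM is already going down while it runs, so in practice the send would need a short timeout.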

-- 
Bela Ban
Lead JGroups / Clustering Team
JBoss - a division of Red Hat