[mod_cluster-dev] Failover failed
Paul Ferraro
paul.ferraro at redhat.com
Wed Dec 17 10:40:00 EST 2008
Bela,
I'm seeing similar strange asymmetric load balancing when using 2 jboss
instances + httpd all on the same machine. I'd like to run some tests
on the cluster lab, but I see that you've got a process running. Are
you still using it? If not, can you kill it?
Paul
On Mon, 2008-12-15 at 14:06 -0500, Paul Ferraro wrote:
> On Mon, 2008-12-15 at 15:25 +0100, Bela Ban wrote:
> > First off, shouldn't we set the maxThreads in server.xml in the AJP
> > connector to a higher value than the default (40, I think)? The
> > demo creates 80 clients and if we have only 1 node up and running, it
> > has to serve 80 connections !
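For reference, bumping the AJP connector thread pool in server.xml might look something like this (just a sketch; the port, address, and redirectPort attributes are the usual defaults, not taken from this thread):

```xml
<!-- server.xml: raise maxThreads so a single node can absorb all 80 demo clients -->
<Connector protocol="AJP/1.3" port="8009" address="${jboss.bind.address}"
           redirectPort="8443" maxThreads="100"/>
```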
> >
> > Secondly, why is load-demo.war/WEB-INF/web.xml not marked as
> > <distributable/> ?
> >
> Session replication was not part of our original scope for the demo.
> Though - I see no reason not to enable it.
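Enabling it would just mean adding the empty element to the demo's deployment descriptor, e.g. (a sketch; the namespace/version line is an assumption about which servlet spec the demo targets):

```xml
<!-- load-demo.war/WEB-INF/web.xml: opt the webapp into session replication -->
<web-app xmlns="http://java.sun.com/xml/ns/javaee" version="2.5">
    <distributable/>
</web-app>
```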
>
> > Then, I ran 2 experiments and they didn't behave as I expected them. Can
> > you confirm/reject my assumptions ?
> >
> > #1
> >
> > * Run httpd, then only node1. The only change I made from the
> > default is to set maxThreads=100 in the AJP connector, and to use
> > SimpleLoadbalanceFactorPolicy
> > * Run the demo app with 80 clients
> > * You'll see that node1 was serving 80 sessions
> > * Start node2, keep the demo running
> > * My expectation is that now node1's sessions will go down to 40 and
> > node2 will serve roughly 40 sessions
> > * Result: node2 picked up 46 sessions, and node1 went down to 34. I
> > waited for the session life timeout (120 secs), but the result was
> > still the same. Question: why didn't both nodes serve 40 sessions
> > each ?
> > * Update: the # of sessions in node1 and node2 got closer after some
> > time, I guess my session life of 120 secs was the reason for this !
> > * Update 2: I see that with a session life of 10, node1 and node2
> > take turns at allocating new sessions, sometimes node1 is on top,
> > sometimes node2, but on average both nodes serve 40 sessions !
> > Great ! Disregard this first item !
> >
> >
> > #2
> >
> > * Start httpd
> > * Start node1
> > * Start node2
> Which server.xml Listener are you using? ModClusterListener,
> ModClusterService, or HAModClusterService? Only the latter 2 use the
> config from mod-cluster-jboss-beans.xml.
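For anyone following along, the standalone variant is declared as a Listener in server.xml, roughly like this (a sketch from memory -- the class name and attributes are assumptions, so check your distribution):

```xml
<!-- server.xml: standalone mod_cluster listener, configured via its own
     attributes rather than mod-cluster-jboss-beans.xml -->
<Listener className="org.jboss.modcluster.ModClusterListener"
          advertise="true" stickySession="true"/>
```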
>
> > * Start the demo with 80 clients
> > * Observe the same as above: node1 serves roughly 40 sessions and
> > so does node2
> > * Kill node2
> Graceful shutdown, or a hard kill?
> > * Now I have 80 failed clients and 0 active clients !
> :(
> OK - that's weird. Let me try to reproduce this on my end.
>
> > * If I call Server.shutdown() on node2 (via JMX), then node1 serves
> > 40 sessions. But why does node1 not pick up the other 40 sessions
> > left by node2 ? Is this the client, which simply terminates
> > threads which got a 500 ?
> Yes.
> > * I guess we don't send a DISABLE-NODE, STOP-SERVER or DISABLE-APP
> > to httpd when we kill a server ? Can we do this, e.g. by adding a
> > shutdown hook ? Or can I at least try this out through the command
> > line ? What's the recommended way of doing this currently (JMX
> > console) ?
> The [STOP-APP, REMOVE-APP, REMOVE-APP *] message sequence is only sent
> during graceful shutdown of a server. If the node is killed, then
> mod_cluster httpd modules will follow traditional failover logic (same
> as mod_jk/mod_proxy).
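For the record, these commands are plain HTTP requests with custom methods sent to httpd's management address, so they can in principle be driven by hand. A STOP-APP for a dead node might look roughly like this (a sketch -- the JVMRoute value is an assumption; consult the MCMP docs for the exact parameter set):

```
STOP-APP / HTTP/1.0
Content-Length: 14

JVMRoute=node2
```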
> I have a tentative plan to enhance this in a later release by using a
> JGroups view change to let the HA singleton master send these
> messages on behalf of the failed server. This sounds more useful than it
> actually is, since with high volume, it is likely that httpd will have
> already detected the failure before the master has a chance to send
> the corresponding shutdown message sequence.
>
>
> _______________________________________________
> mod_cluster-dev mailing list
> mod_cluster-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/mod_cluster-dev