Failover failed

Monday, 15 December 2008

First off, shouldn't we set maxThreads on the AJP connector in
server.xml to a higher value than the value of 40 (the default, I
think) ? The demo creates 80 clients, and if only 1 node is up and
running, it has to serve all 80 connections !
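
As a rough sketch of the change being discussed, the AJP connector entry in server.xml might look like the following. The port, address and redirectPort values are assumptions based on common defaults and may differ in your installation; only maxThreads is the setting in question, and 100 is simply a value above the 80 demo clients.

```xml
<!-- AJP connector: port, address and redirectPort are assumed defaults -->
<!-- maxThreads is raised so a single node can serve all 80 demo clients -->
<Connector protocol="AJP/1.3"
           port="8009"
           address="${jboss.bind.address}"
           redirectPort="8443"
           maxThreads="100" />
```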

Secondly, why is load-demo.war/WEB-INF/web.xml not marked as 
<distributable/> ?
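
For illustration, marking the web application as distributable would be a one-element change in load-demo.war/WEB-INF/web.xml. The sketch below assumes a Servlet 2.5 descriptor; the actual descriptor version and the other elements of the demo's web.xml are not shown here and may differ.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Descriptor version 2.5 is an assumption, not taken from the demo -->
<web-app xmlns="http://java.sun.com/xml/ns/javaee"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://java.sun.com/xml/ns/javaee
                             http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd"
         version="2.5">

    <!-- Tells the container that sessions may be replicated across nodes -->
    <distributable/>

    <!-- the demo's existing servlet definitions and mappings stay as they are -->
</web-app>
```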

Then, I ran 2 experiments and they didn't behave as I expected. Can
you confirm/reject my assumptions ?

#1

    * Run httpd, then only node1. The only change I made from the
      default is to set maxThreads=100 in the AJP connector, and to use
      SimpleLoadbalanceFactorPolicy
    * Run the demo app with 80 clients
    * You'll see that node1 was serving 80 sessions
    * Start node2, keep the demo running
    * My expectation is that now node1's sessions will go down to 40 and
      node2 will serve roughly 40 sessions
    * Result: node2 picked up 46 sessions, and node1 went down to 34. I
      waited for the session life timeout (120 secs), but the result was
      still the same. Question: why didn't both nodes serve 40 sessions
      each ?
    * Update: the # of sessions in node1 and node2 got closer after some
      time, I guess my session life of 120 secs was the reason for this !
    * Update 2: I see that with a session life of 10, node1 and node2
      take turns at allocating new sessions, sometimes node1 is on top,
      sometimes node2, but on average both nodes serve 40 sessions !
      Great ! Disregard this first item !

#2

    * Start httpd
    * Start node1
    * Start node2
    * Start the demo with 80 clients
    * Observe the same as above: node1 serves roughly 40 sessions and
      so does node2
    * Kill node2
    * Now I have 80 failed clients and 0 active clients !
    * If I call Server.shutdown() on node2 (via JMX), then node1 serves
      40 sessions. But why does node1 not pick up the other 40 sessions
      from node2 after it has left ? Is this because the client simply
      terminates threads which got a 500 ?
    * I guess we don't send a DISABLE-NODE, STOP-SERVER or DISABLE-APP
      to httpd when we kill a server ? Can we do this, e.g. by adding a
      shutdown hook ? Or can I at least try this out through the command
      line ? What's the recommended way of doing this currently (JMX
      console) ?

-- 
Bela Ban
Lead JGroups / Clustering Team
JBoss - a division of Red Hat
