[jboss-cvs] JBossAS SVN: r67559 - projects/docs/trunk/Clustering_Guide/en-US.

jboss-cvs-commits at lists.jboss.org jboss-cvs-commits at lists.jboss.org
Wed Nov 28 08:54:48 EST 2007


Author: skittoli at redhat.com
Date: 2007-11-28 08:54:47 -0500 (Wed, 28 Nov 2007)
New Revision: 67559

Modified:
   projects/docs/trunk/Clustering_Guide/en-US/Author_Group.xml
   projects/docs/trunk/Clustering_Guide/en-US/Book_Info.xml
   projects/docs/trunk/Clustering_Guide/en-US/Clustering_Guide.xml
Log:
updates

Modified: projects/docs/trunk/Clustering_Guide/en-US/Author_Group.xml
===================================================================
--- projects/docs/trunk/Clustering_Guide/en-US/Author_Group.xml	2007-11-28 13:54:21 UTC (rev 67558)
+++ projects/docs/trunk/Clustering_Guide/en-US/Author_Group.xml	2007-11-28 13:54:47 UTC (rev 67559)
@@ -1,5 +1,34 @@
 <?xml version='1.0'?>
 <!DOCTYPE authorgroup PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
-]>
-
-<authorgroup><corpauthor> JBoss Community Documentation Project </corpauthor></authorgroup>
+	  ]>
+<authorgroup>
+	<author>
+		<firstname>Brian</firstname>
+		<surname>Stansberry</surname>
+		<!--<affiliation>
+			<shortaffil></shortaffil>
+			<jobtitle></jobtitle>
+			<orgname>Red Hat</orgname>
+		</affiliation>-->
+	</author>
+	<author>
+		<firstname>Galder</firstname>
+		<surname>Zamarreno</surname>
+		<!--<affiliation>
+			<shortaffil></shortaffil>
+			<jobtitle></jobtitle>
+			<orgname>Red Hat</orgname>
+		</affiliation>-->
+	</author>
+	
+	
+	<editor>
+		<firstname>Samson</firstname>
+		<surname>Kittoli</surname>
+		<!--	<affiliation>
+			<shortaffil></shortaffil>
+			<jobtitle></jobtitle>
+			<orgname>Red Hat</orgname>
+		</affiliation>-->
+	</editor>
+</authorgroup>

Modified: projects/docs/trunk/Clustering_Guide/en-US/Book_Info.xml
===================================================================
--- projects/docs/trunk/Clustering_Guide/en-US/Book_Info.xml	2007-11-28 13:54:21 UTC (rev 67558)
+++ projects/docs/trunk/Clustering_Guide/en-US/Book_Info.xml	2007-11-28 13:54:47 UTC (rev 67559)
@@ -1,31 +1,15 @@
 <?xml version="1.0" standalone="no"?>
-<!DOCTYPE bookinfo PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN" "http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd" [
-	<!ENTITY % RH-ENTITIES SYSTEM "Common_Config/rh-entities.ent"> 
-	%RH-ENTITIES;
+<!DOCTYPE bookinfo PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN" "http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd" [ ]>
 
-	<!ENTITY % SERVICES-PLAN-ENTITIES-EN SYSTEM "./services-plan.ent">
-	%SERVICES-PLAN-ENTITIES-EN;
-]>
 <bookinfo>
-	<title>JBASS Clustering Guide</title>
-	<subtitle>Clustering Guide</subtitle>
-	<issuenum>4.0</issuenum>
-	<productnumber>1</productnumber>
+	<title>JBoss Application Server Clustering Guide</title>
+	<issuenum>4.2</issuenum>
+	<productnumber>2</productnumber>
 	<abstract>
 		<para>
-			This book represents is about installing Jboss Application Server.
+			This book is the JBoss Application Server clustering guide.
 		</para>
 	</abstract>
-	<copyright>
-		<year>2006</year>
-		<holder>&FORMAL-RHI;</holder>
-	</copyright>
-	<authorgroup>
-		<corpauthor>&FORMAL-RHI;</corpauthor>
-	</authorgroup>
-	<mediaobject>
-		<imageobject>
-			<imagedata fileref="images/esologo.png"></imagedata>
-		</imageobject>
-	</mediaobject>
+	<subtitle>Authors</subtitle>
+	<xi:include href="Author_Group.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
 </bookinfo>

Modified: projects/docs/trunk/Clustering_Guide/en-US/Clustering_Guide.xml
===================================================================
--- projects/docs/trunk/Clustering_Guide/en-US/Clustering_Guide.xml	2007-11-28 13:54:21 UTC (rev 67558)
+++ projects/docs/trunk/Clustering_Guide/en-US/Clustering_Guide.xml	2007-11-28 13:54:47 UTC (rev 67559)
@@ -2,43 +2,42 @@
 <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN" "http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd">
 <book>
   
- <bookinfo>
-	<title>JBoss Application Server 4.2.2</title>
-	<subtitle>Clustering Guide</subtitle>
-	<issuenum>4.2</issuenum>
-	<productnumber>2</productnumber>
-	<xi:include href="Author_Group.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
-	<xi:include href="Legal_Notice.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
-</bookinfo>
+<xi:include href="Book_Info.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
 
 
-
   <chapter id="cluster.chapt">
     <title>Clustering</title>
     <subtitle>High Availability Enterprise Services via JBoss Clusters</subtitle>
-    <para/>
+    
     <section id="clustering-intro">
       <title>Introduction</title>
-      <para>Clustering allows us to run an applications on several parallel servers (a.k.a cluster nodes). The
-                load is distributed across different servers, and even if any of the servers fails, the application is
-                still accessible via other cluster nodes. Clustering is crucial for scalable enterprise applications, as
-                you can improve performance by simply adding more nodes to the cluster.</para>
-      <para>The JBoss Application Server (AS) comes with clustering support out of the box. The simplest way to
-                start a JBoss server cluster is to start several JBoss instances on the same local network, using the
-                    <literal>run -c all</literal> command for each instance. Those server instances, all started in the
-                    <literal>all</literal> configuration, detect each other and automatically form a cluster.</para>
-      <para>In the first section of this chapter, I discuss basic concepts behind JBoss's clustering services. It
-                is important that you understand those concepts before reading the rest of the chapter. Clustering
-                configurations for specific types of applications are covered after this section.</para>
+      <para>
+	      Clustering allows us to run an application on several parallel servers (a.k.a. cluster nodes) while providing a single view to application clients. Load is distributed across different servers, and even if one or more of the servers fails, the application is still accessible via the surviving cluster nodes. Clustering is crucial for scalable enterprise applications, as you can improve performance simply by adding more nodes to the cluster; it is equally crucial for highly available enterprise applications, as it is the clustering infrastructure that supports the redundancy needed for high availability.
+      </para>
+	
+      <para>
+	      The JBoss Application Server (AS) comes with clustering support out of the box. The simplest way to start a JBoss server cluster is to start several JBoss instances on the same local network, using the <literal>run -c all</literal> command for each instance. Those server instances, all started in the <literal>all</literal> configuration, detect each other and automatically form a cluster.
+      </para>
+      <para>
+	      In the first section of this chapter, we discuss basic concepts behind JBoss's clustering services. It is important that you understand these concepts before reading the rest of the chapter. Clustering configurations for specific types of applications are covered after this section. 
+      </para>
+</section>
+      
       <section id="clustering-intro-def">
         <title>Cluster Definition</title>
-        <para>A cluster is a set of nodes. In a JBoss cluster, a node is a JBoss server instance. Thus, to build
-                    a cluster, several JBoss instances have to be grouped together (known as a "partition"). On a same
-                    network, we may have different clusters. In order to differentiate them, each cluster must have an
-                    individual name.</para>
-        <para><xref linkend="clustering-Partition.fig"/> shows an example network of JBoss server instances
-                    divided into three clusters, with each cluster only having one node. Nodes can be added to or
-                    removed from clusters at any time.</para>
+        <para>
+		A cluster is a set of nodes that communicate with each other and work toward a common goal. In a JBoss Application Server cluster (also known as a “partition”), a node is a JBoss Application Server instance. Communication between the nodes is handled by the JGroups group communication library, with a JGroups Channel providing the core functionality of tracking who is in the cluster and reliably exchanging messages between the cluster members. JGroups Channels with the same configuration and name have the ability to dynamically discover each other and form a group. This is why simply executing “run -c all” on two AS instances on the same network is enough for them to form a cluster – each AS starts a Channel (actually, several) with the same default configuration, so they dynamically discover each other and form a cluster. Nodes can be dynamically added to or removed from clusters at any time, simply by starting or stopping a Channel with a configuration and name that matches the other cluster members.
+		
+		In summary, a JBoss cluster is a set of AS server instances each of which is running an identically configured and named JGroups Channel.
+	</para>
+	<para>
+		On the same AS instance, different services can create their own Channel. In a default 4.2.x AS, four different services create channels – the web session replication service, the EJB3 SFSB replication service, the EJB3 entity caching service, and a core general purpose clustering service known as HAPartition. In order to differentiate these channels, each must have a unique name, and its configuration must match that of its peers on the other nodes while differing from the configuration of the other channels.
+	</para>
+	<para>
+		So, if you go to two AS 4.2.x instances and execute <literal>run -c all</literal>, the channels will discover each other and you'll have a conceptual <literal>cluster</literal>. It's easy to think of this as a two-node cluster, but it's important to understand that you really have four channels, and hence four two-node clusters.
+	</para>
+	
+	<para>On the same network, even for the same service, we may have different clusters. <xref linkend="clustering-Partition.fig"/> shows an example network of JBoss server instances divided into three clusters, with the third cluster only having one node.  This sort of topology can be set up simply by configuring the AS instances such that within a set of nodes meant to form a cluster the Channel configurations and names match while they differ from any other channels on the same network. </para>
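The matching rule described above can be sketched as a toy model: nodes end up in the same cluster exactly when both their channel name and channel configuration match. This is purely illustrative Java, not the JGroups API; the <literal>Node</literal> record and <literal>formClusters</literal> helper are hypothetical names invented for this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model (NOT the JGroups API): a node joins the cluster whose
// channel name AND channel configuration both match its own.
public class ClusterGrouping {
    public record Node(String name, String channelName, String channelConfig) {}

    // Group nodes into clusters keyed by (channel name, channel config).
    public static Map<String, List<String>> formClusters(List<Node> nodes) {
        Map<String, List<String>> clusters = new LinkedHashMap<>();
        for (Node n : nodes) {
            // Any difference in name or config yields a different key,
            // and therefore a different cluster.
            String key = n.channelName() + "|" + n.channelConfig();
            clusters.computeIfAbsent(key, k -> new ArrayList<>()).add(n.name());
        }
        return clusters;
    }

    public static void main(String[] args) {
        List<Node> nodes = List.of(
            new Node("node1", "DefaultPartition", "udp-A"),
            new Node("node2", "DefaultPartition", "udp-A"),
            new Node("node3", "OtherPartition",   "udp-B"));
        // node1 and node2 share a key and form one cluster; node3 is alone.
        System.out.println(formClusters(nodes).values());
    }
}
```

This mirrors the topology in the figure below: matching channel name/config pairs define cluster membership, nothing else.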
         <figure id="clustering-Partition.fig">
           <title>Clusters and server nodes</title>
           <mediaobject>
@@ -47,20 +46,20 @@
             </imageobject>
           </mediaobject>
         </figure>
-        <note>
-          <para>While it is technically possible to put a JBoss server instance into multiple clusters at the
-                        same time, this practice is generally not recommended, as it increases the management
-                        complexity.</para>
-        </note>
-        <para>Each JBoss server instance (node) specifies which cluster (i.e., partition) it joins in the
-                        <literal>ClusterPartition</literal> MBean in the <literal>deploy/cluster-service.xml</literal>
-                    file. All nodes that have the same <literal>ClusterPartition</literal> MBean configuration join the
-                    same cluster. Hence, if you want to divide JBoss nodes in a network into two clusters, you can just
-                    come up with two different <literal>ClusterPartition</literal> MBean configurations, and each node
-                    would have one of the two configurations depending on which cluster it needs to join. If the
-                    designated cluster does not exist when the node is started, the cluster would be created. Likewise,
-                    a cluster is removed when all its nodes are removed.</para>
-        <para>The following example shows the MBean definition packaged with the standard JBoss AS distribution.
+    
+        <para>
+		The section on “JGroups Configuration” and on “Isolating JGroups Channels” covers in detail how to configure Channels such that desired peers find each other and unwanted peers do not. As mentioned above, by default JBoss AS uses four separate JGroups Channels.  These can be divided into two broad categories: the Channel used by the general purpose HAPartition service, and three Channels created by JBoss Cache for special purpose caching and cluster wide state replication.
+	</para>
+</section>
+<section> <title>HAPartition</title>
+	
+<para>
+		    HAPartition is a general purpose service used for a variety of tasks in AS clustering.  At its core, it is an abstraction built on top of a JGroups Channel that provides support for making/receiving RPC invocations on/from one or more cluster members.  HAPartition also supports a distributed registry of which clustering services are running on which cluster members. It provides notifications to interested listeners when the cluster membership changes or the clustered service registry changes. HAPartition forms the core of many of the clustering services we'll be discussing in the rest of this guide, including smart client-side clustered proxies, EJB 2 SFSB replication and entity cache management, farming, HA-JNDI and HA singletons.
+	    </para>
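The registry-plus-notifications idea can be illustrated with a toy sketch. This is NOT the actual HAPartition API – every class and method name here is hypothetical – but it models the two behaviors just described: tracking which clustered services run on which members, and notifying listeners when that registry changes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model (NOT the real HAPartition API) of a distributed service
// registry with change notifications.
public class ServiceRegistrySketch {
    public interface Listener {
        void registryChanged(String service, Set<String> nodes);
    }

    private final Map<String, Set<String>> registry = new HashMap<>();
    private final List<Listener> listeners = new ArrayList<>();

    public void addListener(Listener l) { listeners.add(l); }

    // A node announces that it runs the named clustered service.
    public void register(String service, String node) {
        registry.computeIfAbsent(service, s -> new TreeSet<>()).add(node);
        notifyListeners(service);
    }

    // A node leaves (shutdown or crash): drop it from every service entry.
    public void nodeLeft(String node) {
        for (String service : registry.keySet())
            if (registry.get(service).remove(node)) notifyListeners(service);
    }

    public Set<String> nodesFor(String service) {
        return registry.getOrDefault(service, Set.of());
    }

    private void notifyListeners(String service) {
        for (Listener l : listeners) l.registryChanged(service, nodesFor(service));
    }
}
```

A service such as HA-JNDI would consult such a registry to find peers, and react to the change notifications when the membership shifts.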
+	    
+	    
+	    
+        <para>The following example shows the <literal>HAPartition</literal> MBean definition packaged with the standard JBoss AS distribution.
                     So, if you simply start JBoss servers with their default clustering settings on a local network, you
                     would get a default cluster named <literal>DefaultPartition</literal> that includes all server
                     instances as its nodes.</para>
@@ -89,17 +88,16 @@
     &lt;/attribute&gt;
 &lt;/mbean&gt;
             </programlisting>
-        <para>Here, we omitted the detailed JGroups protocol configuration for this cluster. JGroups handles the
+        <para>Here, we omitted the detailed JGroups protocol configuration for this channel. JGroups handles the
                     underlying peer-to-peer communication between nodes, and its configuration is discussed in <xref linkend="jbosscache-jgroups"/>. The following list shows the available configuration attributes
-                    in the <literal>ClusterPartition</literal> MBean.</para>
+                    in the <literal>HAPartition</literal> MBean.</para>
         <itemizedlist>
           <listitem>
             <para><emphasis role="bold">PartitionName</emphasis> is an optional attribute to specify the
-                            name of the cluster. Its default value is <literal>DefaultPartition</literal>.</para>
+		    name of the cluster. Its default value is <literal>DefaultPartition</literal>. Use the <literal>-g</literal> (a.k.a. <literal>--partition</literal>) command line switch to set this value at JBoss startup.</para>
           </listitem>
           <listitem>
-            <para><emphasis role="bold">NodeAddress</emphasis> is an optional attribute to specify the
-                            binding IP address of this node.</para>
+		  <para><emphasis role="bold">NodeAddress</emphasis> is an optional attribute used to help generate a unique name for this node.</para>
           </listitem>
           <listitem>
             <para><emphasis role="bold">DeadlockDetection</emphasis> is an optional boolean attribute that
@@ -107,69 +105,48 @@
                             value is <literal>false</literal>.</para>
           </listitem>
           <listitem>
-            <para><emphasis role="bold">StateTransferTimeout</emphasis> is an optional attribute to specify
-                            the timeout for state replication across the cluster (in milliseconds). Its default value is
-                                <literal>30000</literal>.</para>
+            <para><emphasis role="bold">StateTransferTimeout</emphasis> is an optional attribute to specify the timeout for state replication across the cluster (in milliseconds). State replication refers to the process of obtaining initial application state from other already-running cluster members at service startup.  Its default value is <literal>30000</literal>. </para>
           </listitem>
           <listitem>
             <para><emphasis role="bold">PartitionConfig</emphasis> is an element to specify JGroup
                             configuration options for this cluster (see <xref linkend="jbosscache-jgroups"/>).</para>
           </listitem>
         </itemizedlist>
-        <para>In order for nodes to form a cluster, they must have the exact same
-                    <literal>PartitionName</literal> and the <literal>ParitionConfig</literal> elements. Changes in
-                    either element on some but not all nodes would cause the cluster to split. It is generally easier to
-                    change the <literal>ParitionConfig</literal> (i.e., the address/port) to run multiple cluster rather
-                    than changing the <literal>PartitionName</literal> due to the mulititude of places the former needs
-                    to be changed in other configuration files. However, changing the <literal>PartitionName</literal>
-                    is made easier in 4.0.2+ due to the use of the <literal>${jboss.partition.name}</literal> property
-                    which allows the name to be change via a single <literal>jboss.partition.name</literal> system
-                    property</para>
+        <para>In order for nodes to form a cluster, they must have the exact same <literal>PartitionName</literal> and <literal>PartitionConfig</literal> elements. Changes in either element on some but not all nodes would cause the cluster to split.
+	 </para>
+	    
         <para>You can view the current cluster information by pointing your browser to the JMX console of any
                     JBoss instance in the cluster (i.e., <literal>http://hostname:8080/jmx-console/</literal>) and then
-                    clicking on the <literal>jboss:service=DefaultPartition</literal> MBean (change the MBean name to
-                    reflect your cluster name if this node does not join <literal>DefaultPartition</literal>). A list of
-                    IP addresses for the current cluster members is shown in the <literal>CurrentView</literal> field.</para>
-        <note>
-          <para>A cluster (partition) contains a set of nodes that work toward a same goal. Some clustering
-                        features require to sub-partition the cluster to achieve a better scalability. For example,
-                        let's imagine that we have a 10-node cluster and we want to replicate in memory the state of
-                        stateful session beans on all 10 different nodes to provide for fault-tolerant behaviour. It
-                        would mean that each node has to store a backup of the 9 other nodes. This would not scale at
-                        all (each node would need to carry the whole state cluster load). It is probably much better to
-                        have some kind of sub-partitions inside a cluster and have beans state exchanged only between
-                        nodes that are part of the same sub-partition. The future JBoss clustering implementation will
-                        support sub-partitions and it will allow the cluster administrator to determine the optimal size
-                        of a sub-partition. The sub-partition topology computation will be done dynamically by the
-                        cluster.</para>
-        </note>
-      </section>
+		    clicking on the <literal>jboss:service=DefaultPartition</literal> MBean (change the MBean name to reflect your partition name if you use the <literal>-g</literal> startup switch). A list of IP addresses for the current cluster members is shown in the <literal>CurrentView</literal> field.</para>
+	    
+        <note><title>Note</title>
+	<para>
+		While it is technically possible to put a JBoss server instance into multiple HAPartitions at the same time, this practice is generally not recommended, as it increases management complexity.
+	</para>
+	</note>
+
+</section>
+<section><title>JBoss Cache channels</title>
+<para>
+	JBoss Cache is a fully featured distributed cache framework that can be used in any application server environment or standalone. JBoss AS integrates JBoss Cache to provide cache services for HTTP sessions, EJB 3.0 session beans, and EJB 3.0 entity beans. Each of these cache services is defined in a separate MBean, and each cache creates its own JGroups Channel. We will cover those MBeans when we discuss specific services in the next several sections.
+</para>
+
       <section id="clustering-intro-arch">
         <title>Service Architectures</title>
-        <para>The clustering topography defined by the <literal>ClusterPartition</literal> MBean on each node is
-                    of great importance to system administrators. But for most application developers, you are probably
-                    more concerned about the cluster architecture from a client application's point of view. JBoss AS
-                    supports two types of clustering architectures: client-side interceptors (a.k.a proxies or stubs)
-                    and load balancers.</para>
+        <para>The clustering topography defined by the <literal>HAPartition</literal> MBean on each node is
+		of great importance to system administrators. But for most application developers, you are probably more concerned about the cluster architecture from a client application's point of view. Two basic clustering architectures are used with JBoss AS: client-side interceptors (a.k.a. smart proxies or stubs) and external load balancers. Which architecture your application will use depends on what type of client you have.
+	    </para>
+	    
+	    
         <section id="clustering-intro-arch-proxy">
-          <title>Client-side interceptor</title>
-          <para>Most remote services provided by the JBoss application server, including JNDI, EJB, RMI and
-                        JBoss Remoting, require the client to obtain (e.g., to look up and download) a stub (or proxy)
-                        object. The stub object is generated by the server and it implements the business interface of
-                        the service. The client then makes local method calls against the stub object. The call is
-                        automatically routed across the network and invoked against service objects managed in the
-                        server. In a clustering environment, the server-generated stub object is also an interceptor
-                        that understand how to route calls to nodes in the cluster. The stub object figures out how to
-                        find the appropriate server node, marshal call parameters, un-marshall call results, return the
-                        results to the caller client.</para>
-          <para>The stub interceptors have updated knowledge about the cluster. For instance, they know the IP
-                        addresses of all available server nodes, the algorithm to distribute load across nodes (see next
-                        section), and how to failover the request if the target node not available. With every service
-                        request, the server node updates the stub interceptor with the latest changes in the cluster.
-                        For instance, if a node drops out of the cluster, each of the client stub interceptor is updated
-                        with the new configuration the next time it connects to any active node in the cluster. All the
-                        manipulations on the service stub are transparent to the client application. The client-side
-                        interceptor clustering architecture is illustrated in <xref linkend="clustering-InterceptorArch.fig"/>.</para>
+          <title>Client-side interceptor architecture</title>
+<para>
+		  Most remote services provided by the JBoss application server, including JNDI, EJB, JMS, RMI and JBoss Remoting, require the client to obtain (e.g., to look up and download) a stub (or proxy) object. The stub object is generated by the server and it implements the business interface of the service. The client then makes local method calls against the stub object. The stub automatically routes the call across the network, where it is invoked against service objects managed in the server. In a clustering environment, the server-generated stub object includes an interceptor that understands how to route calls to multiple nodes in the cluster. The stub object figures out how to find the appropriate server node, marshal call parameters, un-marshall call results, and return the result to the caller client.
+</para>
+
+
+<para>The stub interceptors maintain up-to-date knowledge about the cluster. For instance, they know the IP addresses of all available server nodes, the algorithm to distribute load across nodes (see next section), and how to fail over the request if the target node is not available. As part of handling each service request, if the cluster topology has changed the server node updates the stub interceptor with the latest changes in the cluster. For instance, if a node drops out of the cluster, each client stub interceptor is updated with the new configuration the next time it connects to any active node in the cluster. All the manipulations done by the service stub are transparent to the client application. The client-side interceptor clustering architecture is illustrated in <xref linkend="clustering-InterceptorArch.fig"/>.
+</para>
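The routing-and-failover behavior described above can be sketched in miniature. The following is illustrative Java only, not the real JBoss proxy code; <literal>FailoverStub</literal>, its constructor, and the <literal>isAlive</literal> probe are hypothetical stand-ins for the generated stub and its network calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Toy model (NOT real JBoss proxy code): a client-side stub that holds
// the current cluster view, routes each call to a node, and fails over
// to the next node when the target is unavailable.
public class FailoverStub {
    private List<String> nodes;                        // current cluster view
    private final Function<String, Boolean> isAlive;   // stand-in for a real network call

    public FailoverStub(List<String> nodes, Function<String, Boolean> isAlive) {
        this.nodes = new ArrayList<>(nodes);
        this.isAlive = isAlive;
    }

    // The server piggybacks topology changes on responses;
    // the stub applies them transparently to the client.
    public void updateView(List<String> newView) {
        nodes = new ArrayList<>(newView);
    }

    // Try nodes in order until one accepts the call; throw only if none do.
    public String invoke(String request) {
        for (String node : nodes)
            if (isAlive.apply(node))
                return "handled " + request + " on " + node;
        throw new IllegalStateException("no cluster node available");
    }
}
```

The point of the sketch is that the failover loop and view updates live entirely inside the stub, which is why they are invisible to the calling application.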
           <figure id="clustering-InterceptorArch.fig">
             <title>The client-side interceptor (proxy) architecture for clustering</title>
             <mediaobject>
@@ -179,78 +156,85 @@
             </mediaobject>
           </figure>
           <note>
-            <para><xref linkend="clustering-session-slsb21-retry"/> describes how to enable the client proxy
+            <para><xref linkend="clustering-session-slsb21"/> describes how to enable the client proxy
                             to handle the entire cluster restart.</para>
           </note>
         </section>
         <section id="clustering-intro-arch-balancer">
           <title>Load balancer</title>
-          <para>Other JBoss services, in particular the HTTP web services, do not require the client to
-                        download anything. The client (e.g., a web browser) sends in requests and receives responses
-                        directly over the wire according to certain communication protocols (e.g., the HTTP protocol).
-                        In this case, a load balancer is required to process all requests and dispatch them to server
-                        nodes in the cluster. The load balancer is typically part of the cluster. It understands the
-                        cluster configuration as well as failover policies. The client only needs to know about the load
-                        balancer. The load balancer clustering architecture is illustrated in <xref linkend="clustering-BalancerArch.fig"/>.</para>
+	  <para>
+		  Other JBoss services, in particular the HTTP-based services, do not require the client to download anything. The client (e.g., a web browser) sends in requests and receives responses directly over the wire according to certain communication protocols (e.g., the HTTP protocol). In this case, an external load balancer is required to process all requests and dispatch them to server nodes in the cluster. The client only needs to know how to contact the load balancer; it has no knowledge of the JBoss AS instances behind the load balancer. The load balancer is logically part of the cluster, but we refer to it as “external” because it is not running in the same process as either the client or any of the JBoss AS instances. It can be implemented either in software or hardware. There are many vendors of hardware load balancers; the mod_jk Apache module is an excellent example of a software load balancer. An external load balancer implements its own mechanism for understanding the cluster configuration and provides its own load balancing and failover policies. The external load balancer clustering architecture is illustrated in <xref linkend="clustering-BalancerArch.fig"/>.
+	  </para>
           <figure id="clustering-BalancerArch.fig">
-            <title>The load balancer architecture for clustering</title>
+            <title>The external load balancer architecture for clustering</title>
             <mediaobject>
               <imageobject>
                 <imagedata align="center" fileref="images/clustering-BalancerArch.png"/>
               </imageobject>
             </mediaobject>
           </figure>
-          <para>A potential problem with the load balancer solution is that the load balancer itself is a
-                        single point of failure. It needs to be monitored closely to ensure high availability of the
-                        entire cluster services.</para>
+	  <para>
+		  A potential problem with an external load balancer architecture is that the load balancer itself may be a single point of failure. It needs to be monitored closely to ensure high availability of the entire cluster's services.
+	  </para>
         </section>
-      </section>
+      
+</section>
       <section id="clustering-intro-balancepolicy">
         <title>Load-Balancing Policies</title>
-        <para>Both the JBoss client-side interceptor (stub) and load balancer use load balancing policies to
-                    determine which server node to send a new request to. In this section, let's go over the load
-                    balancing policies available in JBoss AS.</para>
+	<para>
+		Both the JBoss client-side interceptor (stub) and load balancer use load balancing policies to determine the server node to which a new request should be sent. In this section, let's go over the load balancing policies available in JBoss AS.
+	</para>
         <section id="clustering-intro-balancepolicy-30">
-          <title>JBoss AS 3.0.x</title>
-          <para>In JBoss 3.0.x, the following two load balancing options are available.</para>
+		<title>Client-side interceptor architecture</title>
+		<para>
+			In JBoss 4.2.2, the following load balancing options are available when the client-side interceptor architecture is used. The client-side stub maintains a list of all nodes providing the target service; the job of the load balance policy is to pick a node from this list for each request.
+		</para>
           <itemizedlist>
             <listitem>
-              <para>Round-Robin (<literal>org.jboss.ha.framework.interfaces.RoundRobin</literal>): each
-                                call is dispatched to a new node. The first target node is randomly selected from the
-                                list.</para>
+		    <para>
+			    Round-Robin (<literal>org.jboss.ha.framework.interfaces.RoundRobin</literal>): each call is dispatched to a new node, proceeding sequentially through the list of nodes. The first target node is randomly selected from the list.
+		    </para>
             </listitem>
+	    
+	    <listitem>
+		    	<para>
+				Random-Robin (<literal>org.jboss.ha.framework.interfaces.RandomRobin</literal>): for each call the target node is randomly selected from the list.
+    			</para>
+            </listitem>
             <listitem>
-              <para>First Available (<literal>org.jboss.ha.framework.interfaces.FirstAvailable</literal>):
-                                one of the available target nodes is elected as the main target and is used for every
-                                call: this elected member is randomly chosen from the list of members in the cluster.
-                                When the list of target nodes changes (because a node starts or dies), the policy will
-                                re-elect a target node unless the currently elected node is still available. Each
-                                client-side interceptor or load balancer elects its own target node independently of the
-                                other proxies.</para>
+              <para>
+		      First Available (<literal>org.jboss.ha.framework.interfaces.FirstAvailable</literal>): one of the available target nodes is elected as the main target and is thereafter used for every call; this elected member is randomly chosen from the list of members in the cluster. When the list of target nodes changes (because a node starts or dies), the policy will choose a new target node unless the currently elected node is still available. Each client-side stub elects its own target node independently of the other stubs, so if a particular client downloads two stubs for the same target service (e.g., an EJB), each stub will independently pick its target.  This is an example of a policy that provides “session affinity” or “sticky sessions”, since the target node does not change once established.
+	      </para>
             </listitem>
+	    
+    
+            <listitem>
+	    <para>
+		    First Available Identical All Proxies (<literal>org.jboss.ha.framework.interfaces.FirstAvailableIdenticalAllProxies</literal>): has the same behaviour as the "First Available" policy but the elected target node is shared by all stubs in the same client-side VM that are associated with the same target service. So if a particular client downloads two stubs for the same target service (e.g. an EJB), each stub will use the same target.
+    </para>
+            </listitem>
+	    
           </itemizedlist>
+        <para>
+		Each of the above is an implementation of the <literal>org.jboss.ha.framework.interfaces.LoadBalancePolicy</literal> interface; users are free to write their own implementation of this simple interface if they need some special behavior. In later sections we'll see how to configure the load balance policies used by different services.
+	</para>
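The interface is small: for each call, a policy simply picks one node from the stub's current target list. As an illustration only (not the actual JBoss source, which implements <literal>LoadBalancePolicy</literal> against JBoss-specific types such as <literal>FamilyClusterInfo</literal>), the core selection logic of a round-robin policy might look like this:

```java
import java.util.List;

// Illustrative sketch only: the real policy implements
// org.jboss.ha.framework.interfaces.LoadBalancePolicy and obtains the
// target list from the stub's FamilyClusterInfo. Types are simplified here.
public class RoundRobinSketch {
    private int cursor = -1; // -1 until the first (random) pick is made

    public Object chooseTarget(List<Object> targets) {
        if (targets.isEmpty()) {
            return null; // no nodes available
        }
        if (cursor < 0) {
            // the first target is chosen at random, as Round-Robin does
            cursor = (int) (Math.random() * targets.size());
        } else {
            // subsequent calls proceed sequentially through the list
            cursor = (cursor + 1) % targets.size();
        }
        return targets.get(cursor);
    }
}
```

A First Available policy would instead remember its first pick and keep returning it until that node leaves the target list.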
+</section>
+	<section><title>External load balancer architecture</title>
+		
+<para>
+As noted above, an external load balancer provides its own load balancing capabilities. What capabilities are supported depends on the provider of the load balancer. The only JBoss requirement is that the load balancer support “session affinity” (a.k.a. “sticky sessions”). With session affinity enabled, once the load balancer routes a request from a client to node A and the server initiates a session, all future requests associated with that session must be routed to node A, so long as node A is available.
+	</para>
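For example, with Apache httpd and mod_jk as the external load balancer, session affinity is enabled in the workers.properties file. The node names, hosts and ports below are hypothetical, chosen only for illustration:

```ini
# Hypothetical workers.properties for a two-node cluster behind mod_jk
worker.list=loadbalancer

worker.node1.type=ajp13
worker.node1.host=192.168.0.101
worker.node1.port=8009

worker.node2.type=ajp13
worker.node2.host=192.168.0.102
worker.node2.port=8009

worker.loadbalancer.type=lb
worker.loadbalancer.balance_workers=node1,node2
# route requests carrying an existing session back to the node that created it
worker.loadbalancer.sticky_session=True
```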
+	  
+	  
         </section>
-        <section id="clustering-intro-balancepolicy-32">
-          <title>JBoss AS 3.2+</title>
-          <para>In JBoss 3.2+, three load balancing options are available. The Round-Robin and First Available
-                        options have the same meaning as the ones in JBoss AS 3.0.x.</para>
-          <para>The new load balancing option in JBoss 3.2 is "First AvailableIdenticalAllProxies"
-                            (<literal>org.jboss.ha.framework.interfaces.FirstAvailableIdenticalAllProxies</literal>). It
-                        has the same behaviour as the "First Available" policy but the elected target node is shared by
-                        all client-side interceptors of the same "family".</para>
-          <para>In JBoss 3.2 (and later), the notion of "Proxy Family" is defined. A Proxy Family is a set of
-                        stub interceptors that all make invocations against the same replicated target. For EJBs for
-                        example, all stubs targeting the same EJB in a given cluster belong to the same proxy family.
-                        All interceptors of a given family share the same list of target nodes. Each interceptor also
-                        has the ability to share arbitrary information with other interceptors of the same family. A use
-                        case for the proxy family is give in <xref linkend="clustering-session-slsb21"/>.</para>
-        </section>
       </section>
+      
+      
       <section id="clustering-intro-farm">
         <title>Farming Deployment</title>
         <para>The easiest way to deploy an application into the cluster is to use the farming service. That is
                     to hot-deploy the application archive file (e.g., the EAR, WAR or SAR file) in the
-                    <code>all/farm/</code> directory of any of the cluster member and the application is automatically
+		    <code>all/farm/</code> directory of any of the cluster members and the application will be automatically
                    duplicated across all nodes in the same cluster. If a node joins the cluster later, it will pull in
                     all farm deployed applications in the cluster and deploy them locally at start-up time. If you
                     delete the application from one of the running cluster server node's <literal>farm/</literal>
@@ -258,24 +242,21 @@
                     nodes farm folder (triggers undeployment.) You should manually delete the application from the farm
                     folder of any server node not currently connected to the cluster.</para>
         <note>
-          <para>Currently, due to an implementation bug, the farm deployment service only works for
-                        hot-deployed archives. If you put an application in the <literal>farm/</literal> directory first
-                        and then start the server, the application would not be detected and pushed across the cluster.
-                        We are working to resolve this issue.</para>
+		<para>Currently, due to an implementation weakness, the farm deployment service only works for 1) archives located in the farm/ directory of the first node to join the cluster or 2) hot-deployed archives. If you first put a new application in the farm/ directory and then start the server to have it join an already running cluster, the application will not be pushed across the cluster or deployed. This is because the farm service does not know whether the application really represents a new deployment or represents an old deployment that was removed from the rest of the cluster while the newly starting node was off-line. We are working to resolve this issue.</para>
         </note>
         <note>
-          <para>You can only put archive files, not exploded directories, in the <literal>farm</literal>
-                        directory. This way, the application on a remote node is only deployed when the entire archive
-                        file is copied over. Otherwise, the application might be deployed (and failed) when the
-                        directory is only partially copied.</para>
+		<para>You can only put zipped archive files, not exploded directories, in the <literal>farm</literal> directory. If exploded directories are placed in the farm directory, their contents will be replicated around the cluster piecemeal, and it is very likely that remote nodes will begin trying to deploy things before all the pieces have arrived, leading to deployment failure. </para>
         </note>
+	<note>
+		<para>Farmed deployment is not atomic. A problem deploying, undeploying or redeploying an application on one node in the cluster will not prevent the deployment, undeployment or redeployment being done on the other nodes.  There is no rollback capability. Deployment is also not staggered; it is quite likely, for example, that a redeployment will happen on all nodes in the cluster simultaneously, briefly leaving no nodes in the cluster providing service.
+		</para>
+	</note>
+	
         <para>Farming is enabled by default in the <literal>all</literal> configuration in JBoss AS
-                    distributions, so you will not have to set it up yourself. The configuration file is located in the
-                        <literal>deploy/deploy.last</literal> directory. If you want to enable farming in your custom
-                    configuration, simply create the XML file shown below (named it <literal>farm-service.xml</literal>)
-                    and copy it to the JBoss deploy directory
-                    <literal>$JBOSS_HOME/server/your_own_config/deploy</literal>. Make sure that you custom
-                    configuration has clustering enabled.</para>
+		distributions, so you will not have to set it up yourself. The <literal>farm-service.xml</literal> configuration file is located in the <literal>deploy/deploy.last</literal> directory. If you want to enable farming in a custom configuration, simply copy the farm-service.xml file to the JBoss deploy directory <literal>$JBOSS_HOME/server/your_own_config/deploy/deploy.last</literal>. Make sure that your custom configuration has clustering enabled.</para>
+	<para>
+	After deploying <literal>farm-service.xml</literal> you are ready to rumble. The required <literal>FarmMemberService</literal> MBean attributes for configuring a farm are listed below.
+</para>
         <programlisting>
 &lt;?xml version="1.0" encoding="UTF-8"?&gt;    
 &lt;server&gt;        
@@ -283,125 +264,205 @@
     &lt;mbean code="org.jboss.ha.framework.server.FarmMemberService"     
             name="jboss:service=FarmMember,partition=DefaultPartition"&gt;     
         ...      
-        &lt;attribute name="PartitionName"&gt;DefaultPartition&lt;/attribute&gt;      
-        &lt;attribute name="ScanPeriod"&gt;5000&lt;/attribute&gt;      
-        &lt;attribute name="URLs"&gt;farm/&lt;/attribute&gt;     
-    &lt;/mbean&gt;       
+	
+	&lt;depends optional-attribute-name="ClusterPartition" 
+	proxy-type="attribute"&gt;
+		jboss:service=${jboss.partition.name:DefaultPartition}
+		&lt;/depends&gt;     
+		&lt;attribute name="ScanPeriod"&gt;5000&lt;/attribute&gt;      
+		&lt;attribute name="URLs"&gt;farm/&lt;/attribute&gt;     
+	...
+	&lt;/mbean&gt;       
 &lt;/server&gt;
             </programlisting>
-        <para>After deploying <literal>farm-service.xml</literal> you are ready to rumble. The required
-                        <literal>FarmMemberService</literal> MBean attributes for configuring a farm are listed below.</para>
+      
+	    
         <itemizedlist>
           <listitem>
-            <para><emphasis role="bold">PartitionName</emphasis> specifies the name of the cluster for this
-                            deployed farm. Its default value is <literal>DefaultPartition</literal>.</para>
+		  <para><emphasis role="bold">ClusterPartition</emphasis> is a required attribute to inject the HAPartition service that the farm service uses for intra-cluster communication.</para>
           </listitem>
           <listitem>
            <para><emphasis role="bold">URLs</emphasis> points to the directory where the deployer watches for
                            files to be deployed. This MBean will create this directory if it does not already exist.
-                            Also, "." pertains to the configuration directory (i.e.,
-                            <literal>$JBOSS_HOME/server/all/</literal>).</para>
+			    If a full URL is not provided, it is assumed that the value is a filesystem path relative to the configuration directory (e.g. <literal>$JBOSS_HOME/server/all/</literal>).</para>
           </listitem>
           <listitem>
            <para><emphasis role="bold">ScanPeriod</emphasis> specifies the interval at which the folder
                            must be scanned for changes. Its default value is <literal>5000</literal>.</para>
           </listitem>
         </itemizedlist>
-        <para>The Farming service is an extension of the <literal>URLDeploymentScanner</literal>, which scans
-                    for hot deployments in <literal>deploy/</literal> directory. So, you can use all the attributes
+	<para>The farming service is an extension of the <literal>URLDeploymentScanner</literal>, which scans for hot deployments in the <literal>deploy/</literal> directory. So, you can use all the attributes
                     defined in the <literal>URLDeploymentScanner</literal> MBean in the
                     <literal>FarmMemberService</literal> MBean. In fact, the <literal>URLs</literal> and
                         <literal>ScanPeriod</literal> attributes listed above are inherited from the
                         <literal>URLDeploymentScanner</literal> MBean.</para>
       </section>
+      
+      
       <section id="clustering-intro-state">
         <title>Distributed state replication services</title>
         <para>In a clustered server environment, distributed state management is a key service the cluster must
                     provide. For instance, in a stateful session bean application, the session state must be
                     synchronized among all bean instances across all nodes, so that the client application reaches the
                     same session state no matter which node serves the request. In an entity bean application, the bean
-                    object sometimes needs to be cached across the cluster to reduce the database load. Currently, the
-                    state replication and distributed cache services in JBoss AS are provided via two ways: the
-                        <literal>HASessionState</literal> MBean and the JBoss Cache framework.</para>
+		    object sometimes needs to be cached across the cluster to reduce the database load. Currently, the state replication and distributed cache services in JBoss AS are provided in three ways: the <literal>HASessionState</literal> MBean, the <literal>DistributedState</literal> MBean and the JBoss Cache framework.</para>
         <itemizedlist>
           <listitem>
-            <para>The <literal>HASessionState</literal> MBean provides session replication and distributed
-                            cache services for EJB 2.x stateful session beans and HTTP load balancers in JBoss 3.x and
-                            4.x. The MBean is defined in the <literal>all/deploy/cluster-service.xml</literal> file. We
-                            will show its configuration options in the EJB 2.x stateful session bean section
-                        later.</para>
+		  <para>The <literal>HASessionState</literal> MBean is a legacy service that provides session replication and distributed cache services for EJB 2.x stateful session beans. The MBean is defined in the  <literal>all/deploy/cluster-service.xml</literal> file. We will show its configuration options in the EJB 2.x stateful session bean section later.</para>
           </listitem>
+	  <listitem>
+		  <para>
+			  The <literal>DistributedState</literal> MBean is a legacy service built on the HAPartition service. It is supported for backwards compatibility reasons, but new applications should not use it; they should use the much more sophisticated JBoss Cache instead.
+		  </para>
+	  </listitem>
+	  
           <listitem>
-            <para>JBoss Cache is a fully featured distributed cache framework that can be used in any
-                            application server environment and standalone. It gradually replaces the
-                                <literal>HASessionState</literal> service. JBoss AS integrates JBoss Cache to provide
-                            cache services for HTTP sessions, EJB 3.0 session and entity beans, as well as Hibernate
-                            persistence objects. Each of these cache services is defined in a separate MBean. We will
-                            cover those MBeans when we discuss specific services in the next several sections.</para>
+            <para>
+		    As mentioned above, JBoss Cache is used to provide cache services for HTTP sessions, EJB 3.0 session beans and EJB 3.0 entity beans. It is the primary distributed state management tool in JBoss AS, and is an excellent choice for any custom caching requirements your applications may have. We will cover JBoss Cache in more detail when we discuss specific services in the next several sections.</para>
           </listitem>
         </itemizedlist>
+
       </section>
     </section>
-    <section id="clustering-jndi">
+</chapter>
+
+
+<chapter id="clustering-jndi">
       <title>Clustered JNDI Services</title>
-      <para>JNDI is one of the most important services provided by the application server. The JBoss clustered
-                JNDI service is based on the client-side interceptor architecture. The client must obtain a JNDI stub
-                object (via the <literal>InitialContext</literal> object) and invoke JNDI lookup services on the remote
-                server through the stub. Furthermore, JNDI is the basis for many other interceptor-based clustering
-                services: those services register themselves with the JNDI so that the client can lookup their stubs and
-                make use of their services.</para>
+      <para>
+	      JNDI is one of the most important services provided by the application server. The JBoss HA-JNDI (High Availability JNDI) service brings the following features to JNDI:</para>
+	      <itemizedlist>
+		      <listitem>
+				<para>
+	      			Transparent failover of naming operations. If an HA-JNDI naming Context is connected to the HA-JNDI service on a particular JBoss AS instance, and that service fails or is shut down, the HA-JNDI client can transparently fail over to another AS instance.
+				</para>
+			</listitem>
+			<listitem>
+				<para>
+	      			Load balancing of naming operations. An HA-JNDI naming Context will automatically load balance its requests across all the HA-JNDI servers in the cluster.
+			</para>
+		</listitem>
+		<listitem>
+				<para>
+				Automatic client discovery of HA-JNDI servers (using multicast).
+			</para>
+		</listitem>
+		<listitem>
+				<para>
+	      Unified view of JNDI trees cluster-wide. Clients can connect to the HA-JNDI service running on any node in the cluster and find objects bound in JNDI on any other node. This is accomplished via two mechanisms:
+      </para>
+</listitem>
+</itemizedlist>
+
+	<itemizedlist>
+		<listitem>
+			<para>Cross-cluster lookups. A client can perform a lookup and the server-side HA-JNDI service has the ability to find things bound in regular JNDI on any node in the cluster.
+			</para>
+		</listitem>
+		<listitem>
+			<para>A replicated cluster-wide context tree. An object bound into the HA-JNDI service will be replicated around the cluster, and a copy of that object will be available in-VM on each node in the cluster.
+			</para>
+		</listitem>
+	</itemizedlist>
+
+
+			
+
+	      
+	<para>
+		JNDI is a key component for many other interceptor-based clustering services: those services register themselves with the JNDI so that the client can lookup their proxies and make use of their services. HA-JNDI completes the picture by ensuring that clients have a highly-available means to look up those proxies. However, it is important to understand that using HA-JNDI (or not) has no effect whatsoever on the clustering behavior of the objects that are looked up. To illustrate:
+	</para>
+	      <itemizedlist>
+		      <listitem>
+			      <para>
+				      If an EJB is not configured as clustered, looking up the EJB via HA-JNDI does not somehow result in the addition of clustering capabilities (load balancing of EJB calls, transparent failover, state replication) to the EJB.
+			      </para>
+		      </listitem>
+		      <listitem>
+			      <para>
+				      If an EJB is configured as clustered, looking up the EJB via regular JNDI instead of HA-JNDI does not somehow result in the removal of the bean proxy's clustering capabilities.
+			      </para>
+		      </listitem>
+	      </itemizedlist>
+	      
+	      
+
+	
       <section id="clustering-jndi-how">
         <title>How it works</title>
-        <para>The JBoss HA-JNDI (High Availability JNDI) service maintains a cluster-wide context tree. The
-                    cluster wide tree is always available as long as there is one node left in the cluster. Each JNDI
-                    node in the cluster also maintains its own local JNDI context. The server side application can bind
-                    its objects to either trees. In this section, you will learn the distinctions of the two trees and
-                    the best practices in application development. The design rational of this architecture is as
-                    follows.</para>
+        <para>
+		The JBoss client-side HA-JNDI naming Context is based on the client-side interceptor architecture. The client obtains an HA-JNDI proxy object (via the InitialContext object) and invokes JNDI lookup services on the remote server through the proxy. The client specifies that it wants an HA-JNDI proxy by configuring the naming properties used by the InitialContext object. This is covered in detail in the “Client Configuration” section. Other than the need to ensure the appropriate naming properties are provided to the InitialContext, the fact that the naming Context is using HA-JNDI is completely transparent to the client.
+	</para>
+	<para>
+		On the server side, the HA-JNDI service maintains a cluster-wide context tree. The cluster-wide tree is always available as long as there is one node left in the cluster. Each node in the cluster also maintains its own local JNDI context tree. The HA-JNDI service on that node is able to find objects bound into the local JNDI context tree. An application can bind its objects to either tree. The design rationale for this architecture is as follows:
+	</para>
         <itemizedlist>
           <listitem>
-            <para>We didn't want any migration issues with applications already assuming that their JNDI
-                            implementation was local. We wanted clustering to work out-of-the-box with just a few tweaks
-                            of configuration files.</para>
+		  <para>
+			  It avoids migration issues with applications that assume that their JNDI implementation is local. This allows clustering to work out-of-the-box with just a few tweaks of configuration files.
+		  </para>
           </listitem>
+          
           <listitem>
-            <para>We needed a clean distinction between locally bound objects and cluster-wide
-                        objects.</para>
+		  <para>
+			  In a homogeneous cluster, this configuration actually cuts down on the amount of network traffic. A homogenous cluster is one where the same types of objects are bound under the same names on each node.
+		  </para>
           </listitem>
           <listitem>
-            <para>In a homogeneous cluster, this configuration actually cuts down on the amount of network
-                            traffic.</para>
+            	<para>
+		    Designing it in this way makes the HA-JNDI service an optional service, since all underlying cluster code uses a straight <literal>new InitialContext()</literal> to look up or create bindings.
+	    	</para>
           </listitem>
-          <listitem>
-            <para>Designing it in this way makes the HA-JNDI service an optional service since all
-                            underlying cluster code uses a straight new <literal>InitialContext()</literal> to lookup or
-                            create bindings.</para>
-          </listitem>
         </itemizedlist>
-        <para>On the server side, <literal>new InitialContext()</literal>, will be bound to a local-only,
-                    non-cluster-wide JNDI Context (this is actually basic JNDI). So, all EJB homes and such will not be
-                    bound to the cluster-wide JNDI Context, but rather, each home will be bound into the local JNDI.
-                    When a remote client does a lookup through HA-JNDI, HA-JNDI will delegate to the local JNDI Context
-                    when it cannot find the object within the global cluster-wide Context. The detailed lookup rule is
-                    as follows.</para>
+	
+        <para>
+		On the server side, a naming <literal>Context</literal> obtained via a call to new <literal>InitialContext()</literal>  will be bound to the local-only, non-cluster-wide JNDI Context (this is actually basic JNDI). So, all EJB homes and such will not be bound to the cluster-wide JNDI Context, but rather, each home will be bound into the local JNDI. 
+	</para>
+	<para>
+		When a remote client does a lookup through HA-JNDI, HA-JNDI will delegate to the local JNDI Context when it cannot find the object within the global cluster-wide Context. The detailed lookup rule is as follows.
+	</para>
         <itemizedlist>
           <listitem>
-            <para>If the binding is available in the cluster-wide JNDI tree and it returns it.</para>
+		  <para>If the binding is available in the cluster-wide JNDI tree, return it.</para>
           </listitem>
           <listitem>
-            <para>If the binding is not in the cluster-wide tree, it delegates the lookup query to the local
-                            JNDI service and returns the received answer if available.</para>
+		  <para>If the binding is not in the cluster-wide tree, delegate the lookup query to the local JNDI service and return the received answer if available.</para>
           </listitem>
           <listitem>
-            <para>If not available, the HA-JNDI services asks all other nodes in the cluster if their local
-                            JNDI service owns such a binding and returns the an answer from the set it receives.</para>
+            <para>If not available, the HA-JNDI service asks all other nodes in the cluster if their local JNDI service owns such a binding and returns the answer from the set it receives.</para>
           </listitem>
           <listitem>
-            <para>If no local JNDI service owns such a binding, a <literal>NameNotFoundException</literal>
-                            is finally raised.</para>
+            <para>If no local JNDI service owns such a binding, a <literal>NameNotFoundException</literal> is finally raised.</para>
           </listitem>
         </itemizedlist>
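The delegation order above can be sketched as follows. This is a simplified illustration only, not the actual HA-JNDI implementation: plain maps stand in for the cluster-wide tree, the local JNDI tree, and the local trees on the other cluster nodes, where the real service works against JNDI Contexts.

```java
import java.util.List;
import java.util.Map;

// Simplified sketch of the HA-JNDI lookup delegation order. Maps stand in
// for the JNDI trees; the real service queries actual JNDI Contexts and
// raises javax.naming.NameNotFoundException.
public class HaJndiLookupSketch {
    public static Object lookup(String name,
                                Map<String, Object> clusterWideTree,
                                Map<String, Object> localTree,
                                List<Map<String, Object>> otherNodesLocalTrees)
            throws Exception {
        // 1. the cluster-wide JNDI tree is consulted first
        if (clusterWideTree.containsKey(name)) {
            return clusterWideTree.get(name);
        }
        // 2. then the lookup is delegated to this node's local JNDI service
        if (localTree.containsKey(name)) {
            return localTree.get(name);
        }
        // 3. then every other node is asked whether its local JNDI owns it
        for (Map<String, Object> remote : otherNodesLocalTrees) {
            if (remote.containsKey(name)) {
                return remote.get(name);
            }
        }
        // 4. nothing found anywhere: a NameNotFoundException is raised
        throw new Exception("NameNotFoundException: " + name);
    }
}
```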
-        <para>So, an EJB home lookup through HA-JNDI, will always be delegated to the local JNDI instance. If
+	
+	<para>
+		In practice, objects are rarely bound in the cluster-wide JNDI tree; rather they are bound in the local JNDI tree.  For example, when EJBs are deployed, their proxies are always bound in local JNDI, not HA-JNDI. So, an EJB home lookup done through HA-JNDI will always be delegated to the local JNDI instance.
+	</para>
+        
+	<note><title>Note</title>
+		<para>
+			If different beans (even of the same type, but participating in different clusters) use the same JNDI name, this means that each JNDI server will have a logically different "target" bound (JNDI on node 1 will have a binding for bean A and JNDI on node 2 will have a binding, under the same name, for bean B). Consequently, if a client performs a HA-JNDI query for this name, the query will be invoked on any JNDI server of the cluster and will return the locally bound stub. Nevertheless, it may not be the correct stub that the client is expecting to receive! So, it is always best practice to ensure that across the cluster different names are used for logically different bindings.
+		</para>
+	</note>
+	
+	
+	<note><title>Note</title>
+		<para>
+			You cannot currently use a non-JNP JNDI implementation (e.g., LDAP) for your local JNDI implementation if you want to use HA-JNDI. However, you can use JNDI federation using the ExternalContext MBean to bind non-JBoss JNDI trees into the JBoss JNDI namespace. Furthermore, nothing prevents you from using one centralized JNDI server for your whole cluster and scrapping HA-JNDI and JNP.
+		</para>
+	</note>	
+	
+	<note><title>Note</title>
+		<para>
+			If a binding is only made available on a few nodes in the cluster (for example because a bean is only deployed on a small subset of nodes in the cluster), the probability that a lookup will hit a HA-JNDI server that does not own this binding is higher and thus the lookup will need to be forwarded to all nodes in the cluster. Consequently, the query time will be longer than if the binding would have been available locally. Moral of the story: as much as possible, cache the result of your JNDI queries in your client.
+		</para>
+	</note>
+	
+	
+	
+	
+	<para>So, an EJB home lookup through HA-JNDI will always be delegated to the local JNDI instance. If
                     different beans (even of the same type, but participating in different clusters) use the same JNDI
                     name, it means that each JNDI server will have a different "target" bound (JNDI on node 1 will have
                     a binding for bean A and JNDI on node 2 will have a binding, under the same name, for bean B).
@@ -423,57 +484,63 @@
                         been available locally. Moral of the story: as much as possible, cache the result of your JNDI
                         queries in your client.</para>
         </note>
-        <para>If you want to access HA-JNDI from the server side, you must explicitly get an
-                        <literal>InitialContext</literal> by passing in JNDI properties. The following code shows how to
-                    access the HA-JNDI.</para>
-        <programlisting>
-Properties p = new Properties();  
-p.put(Context.INITIAL_CONTEXT_FACTORY,   
-      "org.jnp.interfaces.NamingContextFactory");  
-p.put(Context.URL_PKG_PREFIXES, "jboss.naming:org.jnp.interfaces");  
-p.put(Context.PROVIDER_URL, "localhost:1100"); // HA-JNDI port.  
-return new InitialContext(p); 
-            </programlisting>
-        <para>The <literal>Context.PROVIDER_URL</literal> property points to the HA-JNDI service configured in
-                    the <literal>HANamingService</literal> MBean (see <xref linkend="clustering-jndi-jboss"/>).</para>
+
       </section>
+      
+      
       <section id="clustering-jndi-client">
         <title>Client configuration</title>
-        <para>The JNDI client needs to be aware of the HA-JNDI cluster. You can pass a list of JNDI servers
-                    (i.e., the nodes in the HA-JNDI cluster) to the <literal>java.naming.provider.url</literal> JNDI
-                    setting in the <literal>jndi.properties</literal> file. Each server node is identified by its IP
-                    address and the JNDI port number. The server nodes are separated by commas (see <xref linkend="clustering-jndi-jboss"/> on how to configure the servers and ports).</para>
-        <programlisting>
+	
+	<section><title>For clients running inside the application server</title>
+		<para>
+			If you want to access HA-JNDI from inside the application server, you must explicitly get an InitialContext by passing in JNDI properties. The following code shows how to create a naming Context bound to HA-JNDI:
+		</para>
+<programlisting>
+	Properties p = new Properties();  
+	p.put(Context.INITIAL_CONTEXT_FACTORY,   
+	"org.jnp.interfaces.NamingContextFactory");  
+	p.put(Context.URL_PKG_PREFIXES, "jboss.naming:org.jnp.interfaces");  
+	p.put(Context.PROVIDER_URL, "localhost:1100"); // HA-JNDI port.  
+	return new InitialContext(p); 
+</programlisting>
+<para>		
+The <literal>Context.PROVIDER_URL</literal> property points to the HA-JNDI service configured in the <literal>HANamingService</literal> MBean (see the section called “JBoss configuration”).
+</para>
+<para>
+	Do not attempt to simplify things by placing a <literal>jndi.properties</literal> file in your deployment or by editing the AS's <literal>conf/jndi.properties</literal> file. Doing either will almost certainly break things for your application and quite possibly across the application server. If you want to externalize your client configuration, one approach is to deploy a properties file not named jndi.properties, and then programmatically create a Properties object that loads that file's contents.
+</para>
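As a minimal sketch of that externalized approach, the following helper loads settings from a classpath resource deliberately not named <literal>jndi.properties</literal>; the resource name <literal>ha-jndi.properties</literal> and the fallback values are illustrative assumptions, not a prescribed convention.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

// Hypothetical helper: loads HA-JNDI settings from a classpath resource that
// is deliberately NOT named jndi.properties, so the JNP client never picks it
// up automatically. The resource name "ha-jndi.properties" is illustrative.
public class HaJndiConfig {

    public static Properties loadConfig() {
        Properties p = new Properties();
        InputStream in = HaJndiConfig.class.getResourceAsStream("/ha-jndi.properties");
        if (in != null) {
            try {
                p.load(in);
            } catch (IOException ignored) {
                // fall through to the defaults below
            }
        }
        // Defaults mirroring the in-server example above
        setDefault(p, "java.naming.factory.initial",
                   "org.jnp.interfaces.NamingContextFactory");
        setDefault(p, "java.naming.factory.url.pkgs",
                   "jboss.naming:org.jnp.interfaces");
        setDefault(p, "java.naming.provider.url", "localhost:1100"); // HA-JNDI port
        return p;
    }

    private static void setDefault(Properties p, String key, String value) {
        if (!p.containsKey(key)) {
            p.put(key, value);
        }
    }
}
```

The returned <literal>Properties</literal> object can then be passed to <literal>new InitialContext(p)</literal> exactly as in the inline example above.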
+
+</section>
+
+
+<section><title>For clients running outside the application server</title>
+			
+        <para>The JNDI client needs to be aware of the HA-JNDI cluster. You can pass a list of JNDI servers (i.e., the nodes in the HA-JNDI cluster) to the <literal>java.naming.provider.url</literal> JNDI setting in the <literal>jndi.properties</literal> file. Each server node is identified by its IP address and the JNDI port number. The server nodes are separated by commas (see <xref linkend="clustering-jndi-jboss"/> for how to configure the servers and ports).</para>
+	
+<programlisting>
 java.naming.provider.url=server1:1100,server2:1100,server3:1100,server4:1100
             </programlisting>
         <para>When initializing, the JNP client code will try to contact each server node from the
                     list, one after the other, stopping as soon as one server has been reached. It will then download
                     the HA-JNDI stub from that node.</para>
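The client-side environment for this bootstrap can be sketched as follows; the host names <literal>server1</literal> through <literal>server4</literal> are placeholders for your actual cluster nodes.

```java
import java.util.Properties;

// Client-side environment for an external HA-JNDI client. The JNP client
// walks the comma-separated provider list in order and downloads the HA-JNDI
// stub from the first node that answers; the host names are placeholders.
public class ExternalHaJndiEnv {

    public static Properties buildEnv(String... hostPorts) {
        StringBuilder url = new StringBuilder();
        for (int i = 0; i < hostPorts.length; i++) {
            if (i > 0) {
                url.append(',');
            }
            url.append(hostPorts[i]);
        }
        Properties env = new Properties();
        env.put("java.naming.factory.initial",
                "org.jnp.interfaces.NamingContextFactory");
        env.put("java.naming.factory.url.pkgs", "jboss.naming:org.jnp.interfaces");
        env.put("java.naming.provider.url", url.toString());
        return env;
    }
}
```

A client would call, for example, <literal>buildEnv("server1:1100", "server2:1100")</literal> and hand the result to <literal>new InitialContext(env)</literal>.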
         <note>
-          <para>There is no load balancing behavior in the JNP client lookup process. It just goes through the
-                        provider list and use the first available server. The HA-JNDI provider list only needs to
-                        contain a subset of HA-JNDI nodes in the cluster.</para>
+          <para>There is no load balancing behavior in the JNP client lookup process itself. It just goes through the provider list and uses the first available server to obtain the stub. The HA-JNDI provider list only needs to contain a subset of HA-JNDI nodes in the cluster.</para>
         </note>
-        <para>The downloaded smart stub contains the logic to fail-over to another node if necessary and the
-                    updated list of currently running nodes. Furthermore, each time a JNDI invocation is made to the
-                    server, the list of targets in the stub interceptor is updated (only if the list has changed since
-                    the last call).</para>
-        <para>If the property string <literal>java.naming.provider.url</literal> is empty or if all servers it
-                    mentions are not reachable, the JNP client will try to discover a bootstrap HA-JNDI server through a
-                    multicast call on the network (auto-discovery). See <xref linkend="clustering-jndi-jboss"/> on how
-                    to configure auto-discovery on the JNDI server nodes. Through auto-discovery, the client might be
-                    able to get a valid HA-JNDI server node without any configuration. Of course, for the auto-discovery
-                    to work, the client must reside in the same LAN as the server cluster (e.g., the web servlets using
-                    the EJB servers). The LAN or WAN must also be configured to propagate such multicast datagrams.</para>
+	
+        <para>
+		The downloaded smart proxy contains the list of currently running nodes and the logic to load-balance naming requests and to fail over to another node if necessary. Furthermore, each time a JNDI invocation is made to the server, the list of targets in the proxy interceptor is updated (only if the list has changed since the last call).
+	</para>
+	
+        <para>
+		If the <literal>java.naming.provider.url</literal> property is empty, or if none of the servers it lists are reachable, the JNP client will try to discover an HA-JNDI server through a multicast call on the network (auto-discovery). See <xref linkend="clustering-jndi-jboss"/> for how to configure auto-discovery on the JNDI server nodes. Through auto-discovery, the client may be able to reach a valid HA-JNDI server node without any configuration. Of course, for auto-discovery to work, the network segment(s) between the client and the server cluster must be configured to propagate such multicast datagrams.
+	</para>
         <note>
-          <para>The auto-discovery feature uses multicast group address 230.0.0.4:1102.</para>
+		<para>By default, the auto-discovery feature uses multicast group address 230.0.0.4 and port 1102.</para>
         </note>
-        <para>In addition to the <literal>java.naming.provier.url</literal> property, you can specify a set of
-                    other properties. The following list shows all client side properties you can specify, when creating
-                    a new <literal>InitialContext</literal>.</para>
+	<para>In addition to the <literal>java.naming.provider.url</literal> property, you can specify a set of other properties. The following list shows all clustering-related client side properties you can specify when creating a new InitialContext. (All of the standard, non-clustering-related environment properties used with regular JNDI are also available.)</para>
         <itemizedlist>
           <listitem>
-            <para><literal>java.naming.provier.url</literal>: Provides a list of IP addresses and port
+            <para><literal>java.naming.provider.url</literal>: Provides a list of IP addresses and port
                             numbers for HA-JNDI provider nodes in the cluster. The client tries those providers one by
                             one and uses the first one that responds.</para>
           </listitem>
@@ -483,28 +550,29 @@
                         <literal>false</literal>.</para>
           </listitem>
           <listitem>
-            <para><literal>jnp.partitionName</literal>: In an environment where multiple HA-JNDI services,
-                            which are bound to distinct clusters (i.e., partitions), are started, this property allows
-                            you to configure which cluster you broadcast to when the automatic discovery feature is
-                            used. If you do not use the automatic discovery feature (e.g., you could explicitly provide
-                            a list of valid JNDI nodes in <literal>java.naming.provider.url</literal>), this property is
-                            not used. By default, this property is not set and the automatic discovery select the first
-                            HA-JNDI server that responds, independently of the cluster partition name.</para>
+		  <para><literal>jnp.partitionName</literal>: In an environment where multiple HA-JNDI services bound to distinct clusters (i.e., partitions) are running, this property allows you to ensure that your client only accepts automatic-discovery responses from servers in the desired partition. If you do not use the automatic discovery feature (i.e., <literal>jnp.disableDiscovery</literal> is <literal>true</literal>), this property is not used. By default, this property is not set, and automatic discovery selects the first HA-JNDI server that responds, regardless of the cluster partition name.</para>
           </listitem>
           <listitem>
             <para><literal>jnp.discoveryTimeout</literal>: Determines how much time the context will wait
                             for a response to its automatic discovery packet. Default is 5000 ms.</para>
           </listitem>
           <listitem>
-            <para><literal>jnp.discoveryGroup</literal>: Determines which multicast group address is used
-                            for the automatic discovery. Default is <literal>230.0.0.4</literal>.</para>
+		  <para><literal>jnp.discoveryGroup</literal>: Determines which multicast group address is used for automatic discovery. Default is <literal>230.0.0.4</literal>. Must match the value of the <literal>AutoDiscoveryAddress</literal> configured on the server-side HA-JNDI service.</para>
           </listitem>
           <listitem>
-            <para><literal>jnp.discoveryPort</literal>: Determines which multicast group port is used for
-                            the automatic discovery. Default is <literal>1102</literal>.</para>
+		  <para><literal>jnp.discoveryPort</literal>: Determines which multicast port is used for automatic discovery. Default is <literal>1102</literal>. Must match the value of the <literal>AutoDiscoveryGroup</literal> configured on the server-side HA-JNDI service.</para>
           </listitem>
+  	<listitem>
+		<para><literal>jnp.discoveryTTL</literal>: Specifies the TTL (time-to-live) for auto-discovery IP multicast packets. This value represents the number of network hops a multicast packet is allowed to propagate before networking equipment should drop it. Despite its name, it does not represent a unit of time.</para>
+	</listitem>
+		  
         </itemizedlist>
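The properties above might be combined as follows for a client that relies purely on auto-discovery; the partition name and timeout are example values, and the group address and port simply restate the documented defaults.

```java
import java.util.Properties;

// Environment for a client relying on auto-discovery: no provider URL is set,
// and jnp.partitionName pins discovery to the intended cluster. The partition
// name and timeout used by callers are example settings only.
public class AutoDiscoveryEnv {

    public static Properties buildEnv(String partitionName) {
        Properties env = new Properties();
        env.put("java.naming.factory.initial",
                "org.jnp.interfaces.NamingContextFactory");
        env.put("java.naming.factory.url.pkgs", "jboss.naming:org.jnp.interfaces");
        // No java.naming.provider.url: forces the multicast auto-discovery path
        env.put("jnp.partitionName", partitionName); // only accept answers from this cluster
        env.put("jnp.discoveryTimeout", "10000");    // wait up to 10 s for a response
        env.put("jnp.discoveryGroup", "230.0.0.4");  // must match the server-side address
        env.put("jnp.discoveryPort", "1102");        // must match the server-side port
        return env;
    }
}
```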
       </section>
+
+
+      
+      
       <section id="clustering-jndi-jboss">
         <title>JBoss configuration</title>
         <para>The <literal>cluster-service.xml</literal> file in the <literal>all/deploy</literal> directory
@@ -512,24 +580,21 @@
         <programlisting>
 &lt;mbean code="org.jboss.ha.jndi.HANamingService"            
        name="jboss:service=HAJNDI"&gt;       
-    &lt;depends&gt;jboss:service=DefaultPartition&lt;/depends&gt;    
-&lt;/mbean&gt;
+       &lt;depends optional-attribute-name="ClusterPartition" 
+		proxy-type="attribute"&gt;jboss:service=${jboss.partition.name:DefaultPartition}&lt;/depends&gt; 
+       
+&lt;/mbean&gt;
             </programlisting>
-        <para>You can see that this MBean depends on the <literal>DefaultPartition</literal> MBean defined above
-                    it (discussed in an earlier section in this chapter). In other configurations, you can put that
-                    element in the <literal>jboss-services.xml</literal> file or any other JBoss configuration files in
+        <para>You can see that this MBean depends on the <literal>DefaultPartition</literal> MBean defined above it (discussed earlier in this chapter). In other configurations, you can put that
+                    element in the <literal>jboss-service.xml</literal> file or any other JBoss configuration files in
                     the <literal>/deploy</literal> directory to enable HA-JNDI services. The available attributes for
                     this MBean are listed below.</para>
         <itemizedlist>
           <listitem>
-            <para><emphasis role="bold">PartitionName</emphasis> is an optional attribute to specify the
-                            name of the cluster for the different nodes of the HA-JNDI service to communicate. The
-                            default value is <literal>DefaultPartition</literal>.</para>
+		  <para><emphasis role="bold">ClusterPartition</emphasis> is a required attribute to inject the HAPartition service that HA-JNDI uses for intra-cluster communication.</para>
           </listitem>
           <listitem>
-            <para><emphasis role="bold">BindAddress</emphasis> is an optional attribute to specify the
-                            address to which the HA-JNDI server will bind waiting for JNP clients. Only useful for
-                            multi-homed computers.</para>
+		  <para><emphasis role="bold">BindAddress</emphasis> is an optional attribute to specify the address to which the HA-JNDI server will bind waiting for JNP clients. It is only useful for multi-homed computers. The default value is the value of the <literal>jboss.bind.address</literal> system property, or the host's default address if that property is not set. The <literal>jboss.bind.address</literal> system property is set if the <literal>-b</literal> command line switch is used when JBoss is started.</para>
           </listitem>
           <listitem>
             <para><emphasis role="bold">Port</emphasis> is an optional attribute to specify the port to
@@ -542,104 +607,100 @@
                                 <literal>50</literal>.</para>
           </listitem>
           <listitem>
-            <para><emphasis role="bold">RmiPort</emphasis> determines which port the server should use to
-                            communicate with the downloaded stub. This attribute is optional. If it is missing, the
-                            server automatically assigns a RMI port.</para>
+		  <para><emphasis role="bold">RmiPort</emphasis> determines which port the server should use to communicate with the downloaded stub. This attribute is optional. The default value is <literal>1101</literal>; if the value is set to <literal>0</literal>, the server automatically assigns an RMI port.</para>
           </listitem>
+	  <listitem>
+		  <para><emphasis role="bold">DiscoveryDisabled</emphasis> is a boolean flag that disables configuration of the auto-discovery multicast listener.</para>
+	  </listitem>
+	  
           <listitem>
-            <para><emphasis role="bold">AutoDiscoveryAddress</emphasis> is an optional attribute to specify
-                            the multicast address to listen to for JNDI automatic discovery. The default value is
-                                <literal>230.0.0.4</literal>.</para>
+		  <para><emphasis role="bold">AutoDiscoveryAddress</emphasis> is an optional attribute to specify the multicast address to listen to for JNDI automatic discovery. The default value is the value of the <literal>jboss.partition.udpGroup</literal> system property, or <literal>230.0.0.4</literal> if that is not set. The <literal>jboss.partition.udpGroup</literal> system property is set if the <literal>-u</literal> command line switch is used when JBoss is started.</para>
           </listitem>
           <listitem>
             <para><emphasis role="bold">AutoDiscoveryGroup</emphasis> is an optional attribute to specify
                             the multicast port to listen to for JNDI automatic discovery. The default value is
                                 <literal>1102</literal>.</para>
           </listitem>
+         
           <listitem>
-            <para><emphasis role="bold">LookupPool</emphasis> specifies the thread pool service used to
-                            control the bootstrap and auto discovery lookups.</para>
+		  <para><emphasis role="bold">AutoDiscoveryBindAddress</emphasis> sets the interface on which HA-JNDI should listen for auto-discovery request packets. If this attribute is not specified and a <literal>BindAddress</literal> is specified, the <literal>BindAddress</literal> will be used.</para>
           </listitem>
           <listitem>
-            <para><emphasis role="bold">DiscoveryDisabled</emphasis> is a boolean flag that disables
-                            configuration of the auto discovery multicast listener.</para>
+		  <para><emphasis role="bold">AutoDiscoveryTTL</emphasis> specifies the TTL (time-to-live) for auto-discovery IP multicast packets. This value represents the number of network hops a multicast packet is allowed to propagate before networking equipment should drop it. Despite its name, it does not represent a unit of time.</para>
           </listitem>
-          <listitem>
-            <para><emphasis role="bold">AutoDiscoveryBindAddress</emphasis> sets the auto-discovery
-                            bootstrap multicast bind address. If this attribute is not specified and a
-                                <literal>BindAddress</literal> is specified, the <literal>BindAddress</literal> will be
-                            used..</para>
-          </listitem>
-          <listitem>
-            <para><emphasis role="bold">AutoDiscoveryTTL</emphasis> specifies the TTL (time-to-live) for
-                            autodiscovery IP multicast packets.</para>
-          </listitem>
+	  <listitem>
+		  <para><emphasis role="bold">LoadBalancePolicy</emphasis> specifies the class name of the <literal>LoadBalancePolicy</literal> implementation that should be included in the client proxy. See the earlier section on load-balancing policies for details.</para>
+	  </listitem>
+	<listitem>
+		<para><emphasis role="bold">LookupPool</emphasis> specifies the thread pool service used to control the bootstrap and auto-discovery lookups.</para>
+	</listitem>
+	 
         </itemizedlist>
         <para>The full default configuration of the <literal>HANamingService</literal> MBean is as follows.</para>
-        <programlisting>
-&lt;mbean code="org.jboss.ha.jndi.HANamingService" 
-      name="jboss:service=HAJNDI"&gt; 
-    &lt;depends&gt;
-        jboss:service=${jboss.partition.name:DefaultPartition}
-    &lt;/depends&gt; 
-    &lt;! -- Name of the partition to which the service is linked --&gt; 
-    &lt;attribute name="PartitionName"&gt;
-        ${jboss.partition.name:DefaultPartition}
-    &lt;/attribute&gt; 
-    &lt;! -- Bind address of bootstrap and HA-JNDI RMI endpoints --&gt; 
-    &lt;attribute name="BindAddress"&gt;${jboss.bind.address}&lt;/attribute&gt; 
-    &lt;! -- Port on which the HA-JNDI stub is made available --&gt; 
-    &lt;attribute name="Port"&gt;1100&lt;/attribute&gt; 
-    &lt;! -- RmiPort to be used by the HA-JNDI service once bound. 
-        0 is for auto. --&gt; 
-    &lt;attribute name="RmiPort"&gt;1101&lt;/attribute&gt; 
-    &lt;! -- Accept backlog of the bootstrap socket --&gt; 
-    &lt;attribute name="Backlog"&gt;50&lt;/attribute&gt; 
-    &lt;! -- The thread pool service used to control the bootstrap and 
-      auto discovery lookups --&gt; 
-    &lt;depends optional-attribute-name="LookupPool" 
-        proxy-type="attribute"&gt;jboss.system:service=ThreadPool&lt;/depends&gt;
+<programlisting><![CDATA[
+<mbean code="org.jboss.ha.jndi.HANamingService"
+       name="jboss:service=HAJNDI">
+   <!-- We now inject the partition into the HA-JNDI service instead
+        of requiring that the partition name be passed -->
+   <depends optional-attribute-name="ClusterPartition"
+            proxy-type="attribute">jboss:service=${jboss.partition.name:DefaultPartition}</depends>
+   <!-- Bind address of bootstrap and HA-JNDI RMI endpoints -->
+   <attribute name="BindAddress">${jboss.bind.address}</attribute>
+   <!-- Port on which the HA-JNDI stub is made available -->
+   <attribute name="Port">1100</attribute>
+   <!-- RmiPort to be used by the HA-JNDI service once bound. 0 => auto. -->
+   <attribute name="RmiPort">1101</attribute>
+   <!-- Accept backlog of the bootstrap socket -->
+   <attribute name="Backlog">50</attribute>
+   <!-- The thread pool service used to control the bootstrap and auto-discovery lookups -->
+   <depends optional-attribute-name="LookupPool"
+            proxy-type="attribute">jboss.system:service=ThreadPool</depends>
+   <!-- A flag to disable the auto discovery via multicast -->
+   <attribute name="DiscoveryDisabled">false</attribute>
+   <!-- Set the auto-discovery bootstrap multicast bind address. If not
+        specified and a BindAddress is specified, the BindAddress will be used. -->
+   <attribute name="AutoDiscoveryBindAddress">${jboss.bind.address}</attribute>
+   <!-- Multicast address and group port used for auto-discovery -->
+   <attribute name="AutoDiscoveryAddress">${jboss.partition.udpGroup:230.0.0.4}</attribute>
+   <attribute name="AutoDiscoveryGroup">1102</attribute>
+   <!-- The TTL (time-to-live) for auto-discovery IP multicast packets -->
+   <attribute name="AutoDiscoveryTTL">16</attribute>
+   <!-- The load balancing policy for HA-JNDI -->
+   <attribute name="LoadBalancePolicy">org.jboss.ha.framework.interfaces.RoundRobin</attribute>
+   <!-- Client socket factory to be used for client-server
+        RMI invocations during JNDI queries
+   <attribute name="ClientSocketFactory">custom</attribute>
+   -->
+   <!-- Server socket factory to be used for client-server
+        RMI invocations during JNDI queries
+   <attribute name="ServerSocketFactory">custom</attribute>
+   -->
+</mbean>]]>
+</programlisting>
+</programlisting>
+<para>
+	It is possible to start several HA-JNDI services that use different clusters. This can be used, for example, if a node is part of several clusters. In this case, make sure that you set a different port or IP address for each service. For instance, if you wanted to hook up HA-JNDI to the example cluster you set up and change the binding port, the MBean descriptor would look as follows.
+</para>
+<programlisting><![CDATA[
+<mbean code="org.jboss.ha.jndi.HANamingService"    
+      name="jboss:service=HAJNDI">    
 
-    &lt;! -- A flag to disable the auto discovery via multicast --&gt; 
-    &lt;attribute name="DiscoveryDisabled"&gt;false&lt;/attribute&gt; 
-    &lt;! -- Set the auto-discovery bootstrap multicast bind address. --&gt; 
-    &lt;attribute name="AutoDiscoveryBindAddress"&gt;
-        ${jboss.bind.address}
-    &lt;/attribute&gt; 
+   <depends optional-attribute-name="ClusterPartition"
+            proxy-type="attribute">jboss:service=MySpecialPartition</depends>
+   <attribute name="Port">56789</attribute>
+</mbean>]]>
+</programlisting>
+      </section>
+</section>
     
-    &lt;! -- Multicast Address and group port used for auto-discovery --&gt; 
-    &lt;attribute name="AutoDiscoveryAddress"&gt;
-        ${jboss.partition.udpGroup:230.0.0.4}
-    &lt;/attribute&gt; 
-    &lt;attribute name="AutoDiscoveryGroup"&gt;1102&lt;/attribute&gt; 
-    &lt;! -- The TTL (time-to-live) for autodiscovery IP multicast packets --&gt; 
-    &lt;attribute name="AutoDiscoveryTTL"&gt;16&lt;/attribute&gt;
+</chapter>
 
-    &lt;! -- Client socket factory to be used for client-server 
-           RMI invocations during JNDI queries 
-    &lt;attribute name="ClientSocketFactory"&gt;custom&lt;/attribute&gt; 
-    --&gt; 
-    &lt;! -- Server socket factory to be used for client-server 
-           RMI invocations during JNDI queries 
-    &lt;attribute name="ServerSocketFactory"&gt;custom&lt;/attribute&gt; 
-    --&gt; 
-&lt;/mbean&gt;            
-            </programlisting>
-        <para>It is possible to start several HA-JNDI services that use different clusters. This can be used,
-                    for example, if a node is part of many clusters. In this case, make sure that you set a different
-                    port or IP address for both services. For instance, if you wanted to hook up HA-JNDI to the example
-                    cluster you set up and change the binding port, the Mbean descriptor would look as follows.</para>
-        <programlisting>
-&lt;mbean code="org.jboss.ha.jndi.HANamingService"    
-       name="jboss:service=HAJNDI"&gt;    
-    &lt;depends&gt;jboss:service=MySpecialPartition&lt;/depends&gt;    
-    &lt;attribute name="PartitionName"&gt;MySpecialPartition&lt;/attribute&gt;    
-    &lt;attribute name="Port"&gt;56789&lt;/attribute&gt;  
-&lt;/mbean&gt; 
-            </programlisting>
-      </section>
-    </section>
-    <section id="clustering-session">
+<chapter id="clustering-session">
       <title>Clustered Session EJBs</title>
       <para>Session EJBs provide remote invocation services. They are clustered based on the client-side
                 interceptor architecture. The client application for a clustered session bean is exactly the same as the
@@ -673,16 +734,14 @@
     &lt;/enterprise-beans&gt;
 &lt;/jboss&gt;
             </programlisting>
-        <note>
+        
+	  <note>
           <para>The <literal>&lt;clustered&gt;True&lt;/clustered&gt;</literal> element is really just an
                         alias for the <literal>&lt;configuration-name&gt;Clustered Stateless
-                            SessionBean&lt;/configuration-name&gt;</literal> element.</para>
+				SessionBean&lt;/configuration-name&gt;</literal> element in the conf/standard-jboss.xml file.</para>
         </note>
-        <para>In the bean configuration, only the <literal>&lt;clustered&gt;</literal> element is mandatory. It
-                    indicates that the bean works in a cluster. The <literal>&lt;cluster-config&gt;</literal> element
-                    is optional and the default values of its attributes are indicated in the sample configuration
-                    above. Below is a description of the attributes in the <literal>&lt;cluster-config&gt;</literal>
-                    element.</para>
+	
+	<para>In the bean configuration, only the &lt;clustered&gt; element is mandatory. It indicates that the bean needs to support clustering features. The &lt;cluster-config&gt; element is optional, and the default values of its attributes are indicated in the sample configuration above. Below is a description of the attributes in the &lt;cluster-config&gt; element.</para>
         <itemizedlist>
           <listitem>
             <para><emphasis role="bold">partition-name</emphasis> specifies the name of the cluster the bean
@@ -703,89 +762,9 @@
                                 <literal>home-load-balance-policy</literal> attribute also apply.</para>
           </listitem>
         </itemizedlist>
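To illustrate what a policy such as round robin does conceptually, here is a minimal selection sketch. It is analogous in spirit to <literal>org.jboss.ha.framework.interfaces.RoundRobin</literal>, but it is not JBoss's actual implementation.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal sketch of round-robin target selection, analogous in spirit to the
// RoundRobin load-balance policy; NOT the actual JBoss implementation.
public class RoundRobinSketch<T> {
    private final AtomicInteger cursor = new AtomicInteger();

    // Returns the next target, cycling through the list on successive calls.
    public T choose(List<T> targets) {
        int i = Math.floorMod(cursor.getAndIncrement(), targets.size());
        return targets.get(i);
    }
}
```

Each proxy carrying such a policy spreads successive invocations evenly across the known target nodes, which is why a freshly downloaded proxy always starts from the same point in the list.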
-        <para>In JBoss 3.0.x, each client-side stub has its own list of available target nodes. Consequently,
-                    some side-effects can occur. For example, if you cache your home stub and re-create a remote stub
-                    for a stateless session bean (with the Round-Robin policy) each time you need to make an invocation,
-                    a new remote stub, containing the list of available targets, will be downloaded for each invocation.
-                    Consequently, as the first target node is always the first in the list, calls will not seemed to be
-                    load-balanced because there is no usage-history between different stubs. In JBoss 3.2+, the proxy
-                    families (i.e., the "First AvailableIdenticalAllProxies" load balancing policy, see <xref linkend="clustering-intro-balancepolicy-32"/>) remove this side effect as the home and remote
-                    stubs of a given EJB are in two different families.</para>
-        <section id="clustering-session-slsb21-retry">
-          <title>Handle Cluster Restart</title>
-          <para>We have covered the HA smart client architecture in <xref linkend="clustering-intro-arch-proxy"/>. The default HA smart proxy client can only failover
-                        as long as one node in the cluster exists. If there is a complete cluster shutdown, the proxy
-                        becomes orphanned and looses knowledge of the available nodes in the cluster. There is no way
-                        for the proxy to recover from this. The proxy needs to be looked up out of JNDI/HAJNDI when the
-                        nodes are restarted.</para>
-          <para>The 3.2.7+/4.0.2+ releases contain a <literal>RetryInterceptor</literal> that can be added to
-                        the proxy client side interceptor stack to allow for a transparent recovery from such a restart
-                        failure. To enable it for an EJB, setup an <literal>invoker-proxy-binding</literal> that
-                        includes the <literal>RetryInterceptor</literal>. Below is an example
-                        <literal>jboss.xml</literal> configuration.</para>
-          <programlisting>
-&lt;jboss&gt;
-    &lt;session&gt;
-        &lt;ejb-name&gt;nextgen_RetryInterceptorStatelessSession&lt;/ejb-name&gt;
-        &lt;invoker-bindings&gt;
-            &lt;invoker&gt;
-                &lt;invoker-proxy-binding-name&gt;
-                    clustered-retry-stateless-rmi-invoker
-                &lt;/invoker-proxy-binding-name&gt;
-                &lt;jndi-name&gt;
-                    nextgen_RetryInterceptorStatelessSession
-                &lt;/jndi-name&gt;
-            &lt;/invoker&gt;
-        &lt;/invoker-bindings&gt;
-        &lt;clustered&gt;true&lt;/clustered&gt;
-    &lt;/session&gt;
-
-    &lt;invoker-proxy-binding&gt;
-        &lt;name&gt;clustered-retry-stateless-rmi-invoker&lt;/name&gt;
-        &lt;invoker-mbean&gt;jboss:service=invoker,type=jrmpha&lt;/invoker-mbean&gt;
-        &lt;proxy-factory&gt;org.jboss.proxy.ejb.ProxyFactoryHA&lt;/proxy-factory&gt;
-        &lt;proxy-factory-config&gt;
-            &lt;client-interceptors&gt;
-                &lt;home&gt;
-                    &lt;interceptor&gt;
-                        org.jboss.proxy.ejb.HomeInterceptor
-                    &lt;/interceptor&gt;
-                    &lt;interceptor&gt;
-                        org.jboss.proxy.SecurityInterceptor
-                    &lt;/interceptor&gt;
-                    &lt;interceptor&gt;
-                        org.jboss.proxy.TransactionInterceptor
-                    &lt;/interceptor&gt;
-                    &lt;interceptor&gt;
-                        org.jboss.proxy.ejb.RetryInterceptor
-                    &lt;/interceptor&gt;
-                    &lt;interceptor&gt;
-                        org.jboss.invocation.InvokerInterceptor
-                    &lt;/interceptor&gt;
-                &lt;/home&gt;
-                &lt;bean&gt;
-                    &lt;interceptor&gt;
-                        org.jboss.proxy.ejb.StatelessSessionInterceptor
-                    &lt;/interceptor&gt;
-                    &lt;interceptor&gt;
-                        org.jboss.proxy.SecurityInterceptor
-                    &lt;/interceptor&gt;
-                    &lt;interceptor&gt;
-                        org.jboss.proxy.TransactionInterceptor
-                    &lt;/interceptor&gt;
-                    &lt;interceptor&gt;
-                        org.jboss.proxy.ejb.RetryInterceptor
-                    &lt;/interceptor&gt;
-                    &lt;interceptor&gt;
-                        org.jboss.invocation.InvokerInterceptor
-                    &lt;/interceptor&gt;
-                &lt;/bean&gt;
-            &lt;/client-interceptors&gt;
-        &lt;/proxy-factory-config&gt;
-    &lt;/invoker-proxy-binding&gt;
-                </programlisting>
-        </section>
+        
       </section>
+      
       <section id="clustering-session-sfsb21">
         <title>Stateful Session Bean in EJB 2.x</title>
         <para>Clustering stateful session beans is more complex than clustering their stateless counterparts
@@ -850,36 +829,39 @@
           <title>The HASessionState service configuration</title>
           <para>The <literal>HASessionState</literal> service MBean is defined in the
                             <code>all/deploy/cluster-service.xml</code> file.</para>
-          <programlisting>
-&lt;mbean code="org.jboss.ha.hasessionstate.server.HASessionStateService"
-      name="jboss:service=HASessionState"&gt;
-    &lt;depends&gt;
-        jboss:service=${jboss.partition.name:DefaultPartition}
-    &lt;/depends&gt;
-    &lt;!-- Name of the partition to which the service is linked --&gt;
-    &lt;attribute name="PartitionName"&gt;
-        ${jboss.partition.name:DefaultPartition}
-    &lt;/attribute&gt;
-    &lt;!-- JNDI name under which the service is bound --&gt;
-    &lt;attribute name="JndiName"&gt;/HASessionState/Default&lt;/attribute&gt;
-    &lt;!-- Max delay before cleaning unreclaimed state.
-           Defaults to 30*60*1000 =&gt; 30 minutes --&gt;
-    &lt;attribute name="BeanCleaningDelay"&gt;0&lt;/attribute&gt;
-&lt;/mbean&gt;
-                </programlisting>
-          <para>The configuration attributes in the <literal>HASessionState</literal> MBean are listed below.</para>
+<programlisting><![CDATA[
+<mbean code="org.jboss.ha.hasessionstate.server.HASessionStateService"
+       name="jboss:service=HASessionState">
+   <depends>jboss:service=Naming</depends>
+   <!-- We now inject the partition into the HASessionState service
+        instead of requiring that the partition name be passed -->
+   <depends optional-attribute-name="ClusterPartition"
+            proxy-type="attribute">jboss:service=${jboss.partition.name:DefaultPartition}</depends>
+   <!-- JNDI name under which the service is bound -->
+   <attribute name="JndiName">/HASessionState/Default</attribute>
+   <!-- Max delay before cleaning unreclaimed state.
+        Defaults to 30*60*1000 => 30 minutes -->
+   <attribute name="BeanCleaningDelay">0</attribute>
+</mbean>]]>
+</programlisting>
+          
+<para>The configuration attributes in the <literal>HASessionState</literal> MBean are listed below.</para>
           <itemizedlist>
-            <listitem>
+		  <listitem>
+			  <para><emphasis role="bold">ClusterPartition</emphasis> is a required attribute to inject the HAPartition service that HASessionState uses for intra-cluster communication.</para>
+            </listitem>
+		  
+		  
+		  <listitem>
              <para><emphasis role="bold">JndiName</emphasis> is an optional attribute to specify the JNDI
                                name under which this <literal>HASessionState</literal> service is bound. The default
                                value is <literal>/HASessionState/Default</literal>.</para>
             </listitem>
+         
             <listitem>
-              <para><emphasis role="bold">PartitionName</emphasis> is an optional attribute to specify the
-                                name of the cluster in which the current <literal>HASessionState</literal> protocol will
-                                work. The default value is <literal>DefaultPartition</literal>.</para>
-            </listitem>
-            <listitem>
               <para><emphasis role="bold">BeanCleaningDelay</emphasis> is an optional attribute to specify
                                the number of milliseconds after which the <literal>HASessionState</literal> service can
                                clean a state that has not been modified. If a node owning a bean crashes, its brother
@@ -891,13 +873,121 @@
             </listitem>
           </itemizedlist>
         </section>
+	<section><title>Handling Cluster Restart</title>
+		<para>
+			We have covered the HA smart client architecture in the section called “Client-side interceptor architecture”. The default HA smart proxy client can only failover as long as at least one node in the cluster remains available. If there is a complete cluster shutdown, the proxy becomes orphaned and loses knowledge of the available nodes in the cluster. There is no way for the proxy to recover from this. The proxy needs to look up a fresh set of targets out of JNDI/HAJNDI when the nodes are restarted.
+		</para>
+		<para>
+			The 3.2.7+/4.0.2+ releases contain a RetryInterceptor that can be added to the proxy client side interceptor stack to allow for transparent recovery from such a restart failure. To enable it for an EJB, set up an invoker-proxy-binding that includes the RetryInterceptor. Below is an example jboss.xml configuration.
+		</para>
+<programlisting><![CDATA[ 
+ <jboss>
+ <session>
+ 	<ejb-name>nextgen_RetryInterceptorStatelessSession</ejb-name>
+ 	<invoker-bindings>
+ 	<invoker>
+ 	<invoker-proxy-binding-name>
+ 	clustered-retry-stateless-rmi-invoker
+ 	</invoker-proxy-binding-name>
+ 	<jndi-name>
+ 	nextgen_RetryInterceptorStatelessSession
+ 	</jndi-name>
+ 	</invoker>
+ 	</invoker-bindings>
+ 	<clustered>true</clustered>
+ </session>
+  
+ <invoker-proxy-binding>
+ 	<name>clustered-retry-stateless-rmi-invoker</name>
+	 <invoker-mbean>jboss:service=invoker,type=jrmpha</invoker-mbean>
+ 	<proxy-factory>org.jboss.proxy.ejb.ProxyFactoryHA</proxy-factory>
+	 <proxy-factory-config>
+ 	<client-interceptors>
+ 		<home>
+ 		<interceptor>
+ 		org.jboss.proxy.ejb.HomeInterceptor
+ 		</interceptor>
+		<interceptor>
+		org.jboss.proxy.SecurityInterceptor
+		</interceptor>
+		<interceptor>
+  		org.jboss.proxy.TransactionInterceptor
+  		</interceptor>
+ 		<interceptor>
+ 		org.jboss.proxy.ejb.RetryInterceptor
+ 		</interceptor>
+  		<interceptor>
+  		org.jboss.invocation.InvokerInterceptor
+  		</interceptor>
+  	</home>
+ 	<bean>
+ 		 <interceptor>
+  		org.jboss.proxy.ejb.StatelessSessionInterceptor
+ 		 </interceptor>
+		 <interceptor>
+		 org.jboss.proxy.SecurityInterceptor
+ 		</interceptor>
+		 <interceptor>
+		org.jboss.proxy.TransactionInterceptor
+		</interceptor>
+		<interceptor>
+		org.jboss.proxy.ejb.RetryInterceptor
+ 		</interceptor>
+ 		<interceptor>
+ 		org.jboss.invocation.InvokerInterceptor
+		</interceptor>
+	</bean>
+	  </client-interceptors>
+	  </proxy-factory-config>
+ </invoker-proxy-binding> ]]>
+</programlisting>	
+		
+	</section>
+	
+	<section><title>JNDI Lookup Process</title>
+		<para>In order to recover the HA proxy, the RetryInterceptor does a lookup in JNDI. This means that internally it creates a new InitialContext and does a JNDI lookup. But for that lookup to succeed, the InitialContext needs to be configured properly to find your naming server. The RetryInterceptor will go through the following steps in attempting to determine the proper naming environment properties:
+		</para>
+		<orderedlist>
+			<listitem>
+				<para>
+					It will check its own static retryEnv field. This field can be set by client code via a call to RetryInterceptor.setRetryEnv(Properties). This approach to configuration has two downsides: first, it reduces portability by introducing JBoss-specific calls to the client code; and second, since a static field is used, only a single configuration per JVM is possible. 
+				</para>
+			</listitem>
+			<listitem>
+				<para>
+					If the retryEnv field is null, it will check for any environment properties bound to a ThreadLocal by the org.jboss.naming.NamingContextFactory class. To use this class as your naming context factory, set the property java.naming.factory.initial=org.jboss.naming.NamingContextFactory in your jndi.properties. The advantage of this approach is that the use of org.jboss.naming.NamingContextFactory is simply a configuration option in your jndi.properties file, and thus your Java code is unaffected. The downside is that the naming properties are stored in a ThreadLocal and thus are only visible to the thread that originally created an InitialContext. 
+				</para>
+			</listitem>
+			<listitem>
+				<para>
+					If neither of the above approaches yields a set of naming environment properties, a default InitialContext is used. If the attempt to contact a naming server is unsuccessful, by default the InitialContext will attempt to fall back on multicast discovery to find an HA-JNDI naming server. See the section on “Clustered JNDI Services” for more on multicast discovery of HA-JNDI. 
+				</para>
+			</listitem>
+			
+		</orderedlist>
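The first two options above can be sketched in plain client code. The sketch below only builds the naming environment; the provider URL is a hypothetical HA-JNDI address list, not a default.

```java
import java.util.Properties;

public class RetryEnvExample {
    // Builds the kind of naming environment a client could pass to
    // RetryInterceptor.setRetryEnv(Properties) (option 1), or express
    // equivalently in jndi.properties so that
    // org.jboss.naming.NamingContextFactory binds it to a ThreadLocal
    // (option 2). The host:port list below is a placeholder.
    public static Properties buildRetryEnv() {
        Properties env = new Properties();
        env.setProperty("java.naming.factory.initial",
                "org.jboss.naming.NamingContextFactory");
        env.setProperty("java.naming.provider.url",
                "node1:1100,node2:1100");
        return env;
    }
}
```

In client code, option 1 would then be a single JBoss-specific call such as `RetryInterceptor.setRetryEnv(RetryEnvExample.buildRetryEnv())`, with the portability caveats described above.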
+		
+	</section>
+	
+	<section><title>SingleRetryInterceptor</title>
+		<para>
+			The RetryInterceptor is useful in many use cases, but it has a disadvantage: it will keep attempting to re-look up the HA proxy in JNDI until it succeeds. If for some reason it cannot succeed, this process could go on forever, and thus the EJB call that triggered the RetryInterceptor would never return. For many client applications, this possibility is unacceptable. As a result, JBoss doesn't make the RetryInterceptor part of its default client interceptor stacks for clustered EJBs.
+		</para>
+		<para>
+			In the 4.0.4.RC1 release, a new flavor of retry interceptor was introduced, the org.jboss.proxy.ejb.SingleRetryInterceptor. This version works like the RetryInterceptor, but only makes a single attempt to re-lookup the HA proxy in JNDI. If this attempt fails, the EJB call will fail just as if no retry interceptor was used. Beginning with 4.0.4.CR2, the SingleRetryInterceptor is part of the default client interceptor stacks for clustered EJBs.
+		</para>
+		<para>
+			The downside of the SingleRetryInterceptor is that if the retry attempt is made during a portion of a cluster restart where no servers are available, the retry will fail and no further attempts will be made. 
+		</para>
+	</section>
+	
+	
+	
       </section>
-<!-- TBD: Would be good to give a more complex example with
-                attributes on the annotations -->
+
       <section id="clustering-session-slsb30">
         <title>Stateless Session Bean in EJB 3.0</title>
         <para>To cluster a stateless session bean in EJB 3.0, all you need to do is to annotate the bean class
-                    withe the <literal>@Cluster</literal> annotation. You can pass in the load balance policy and
+                    with the <literal>@Clustered</literal> annotation. You can pass in the load balance policy and
                     cluster partition as parameters to the annotation. The default load balance policy is
                         <literal>org.jboss.ha.framework.interfaces.RandomRobin</literal> and the default cluster is
                        <literal>DefaultPartition</literal>. Below is the definition of the <literal>@Clustered</literal>
@@ -905,7 +995,7 @@
         <programlisting>
 public @interface Clustered {
    Class loadBalancePolicy() default LoadBalancePolicy.class;
-   String partition() default "DefaultPartition";
+   String partition() default  "${jboss.partition.name:DefaultPartition}";
 }
             </programlisting>
         <para>Here is an example of a clustered EJB 3.0 stateless session bean implementation.</para>
@@ -919,15 +1009,71 @@
    }
 }
             </programlisting>
+	    <para>
+		    The <literal>@Clustered</literal> annotation can also be omitted and the clustering configuration applied in jboss.xml:
+	    </para>
+<programlisting><![CDATA[ 
+<jboss>    
+	 <enterprise-beans>
+	 <session>
+		 <ejb-name>NonAnnotationStateful</ejb-name>
+		<clustered>true</clustered>
+			<cluster-config>
+			<partition-name>FooPartition</partition-name>
+			<load-balance-policy>
+			org.jboss.ha.framework.interfaces.RandomRobin
+			 </load-balance-policy>
+		 </cluster-config>
+	 </session>    
+	 </enterprise-beans>
+</jboss>   ]]>
+</programlisting>
+	
+	    
       </section>
       <section id="clustering-session-sfsb30">
-        <title>Stateful Session Bean in EJB 3.0</title>
+        <title>Stateful Session Beans in EJB 3.0</title>
         <para>To cluster stateful session beans in EJB 3.0, you need to tag the bean implementation class with
                    the <literal>@Clustered</literal> annotation, just as we did with the EJB 3.0 stateless session bean
-                    earlier.</para>
+		    earlier. The @org.jboss.ejb3.annotation.cache.tree.CacheConfig annotation can also be applied to the bean to specify caching behavior. Below is the definition of the @CacheConfig annotation:
+</para>
+
+<programlisting><![CDATA[ 
+public @interface CacheConfig
+{
+String name() default "jboss.cache:service=EJB3SFSBClusteredCache";
+int maxSize() default 10000;
+long idleTimeoutSeconds() default 300;   
+boolean replicationIsPassivation() default true;   
+long removalTimeoutSeconds() default 0;
+} ]]>
+</programlisting>
+
+<itemizedlist>
+	<listitem><para><literal>name</literal> specifies the object name of the JBoss Cache MBean that should be used for caching the bean (see below for more on this MBean).</para></listitem>
+	
+	<listitem><para><literal>maxSize</literal> specifies the maximum number of beans that can be cached before the cache should start passivating beans, using an LRU algorithm.</para></listitem>
+	
+	<listitem><para><literal>idleTimeoutSeconds</literal> specifies the maximum period of time a bean can go unused before the cache should passivate it (regardless of whether maxSize beans are cached).</para></listitem>
+	
+	<listitem><para><literal>removalTimeoutSeconds</literal> specifies the max period of time a bean can go unused before the cache should remove it altogether.</para></listitem>
+	
+	<listitem><para><literal>replicationIsPassivation</literal> specifies whether the cache should consider a replication as being equivalent to a passivation, and invoke any @PrePassivate and @PostActivate callbacks on the bean. This defaults to true, since replication involves serializing the bean, and preparing for and recovering from serialization is a common reason for implementing the callback methods.</para></listitem>
+
+</itemizedlist>
+
+
+<para>
+Here is an example of a clustered EJB 3.0 stateful session bean implementation.
+</para>
+
+
+
+
         <programlisting>
 @Stateful
 @Clustered
+@CacheConfig(maxSize=5000,removalTimeoutSeconds=18000)
 public class MyBean implements MySessionInt {
    
    private int state = 0;
@@ -937,80 +1083,149 @@
    }
 }
             </programlisting>
-        <para>JBoss Cache provides the session state replication service for EJB 3.0 stateful session beans. The
+
+<para>
+	As with stateless beans, the @Clustered annotation can also be omitted and the clustering configuration applied in jboss.xml; see the example above.
+</para>
+<para>
+	As with EJB 2.0 clustered SFSBs, JBoss provides a mechanism whereby a bean implementation can expose a method the container invokes to check whether the bean's state was modified during a request and thus needs to be replicated. With EJB3, the mechanism is a little more formal; instead of just exposing a method with a known signature, an EJB3 SFSB must implement the org.jboss.ejb3.cache.Optimized interface:
+</para>
+<programlisting>
+public interface Optimized {
+   boolean isModified();
+}
+</programlisting>
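As a self-contained sketch of how a bean might satisfy this contract, the interface is redeclared locally below (in a real deployment you would implement org.jboss.ejb3.cache.Optimized itself), and the bean class is hypothetical:

```java
// Local stand-in for org.jboss.ejb3.cache.Optimized, so this
// sketch compiles without the JBoss EJB3 libraries.
interface Optimized {
    boolean isModified();
}

// Hypothetical stateful bean that only reports itself as modified
// when a request actually changed its state, so the container can
// skip replication of unchanged state.
public class CounterBean implements Optimized {
    private int state = 0;
    private boolean dirty = false;

    public int increment() {
        dirty = true;
        return ++state;
    }

    // Invoked after each request; the flag is reset so that a
    // follow-up request that changes nothing is not replicated.
    public boolean isModified() {
        boolean wasDirty = dirty;
        dirty = false;
        return wasDirty;
    }
}
```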
+
+<para>JBoss Cache provides the session state replication service for EJB 3.0 stateful session beans. The
                     related MBean service is defined in the <literal>ejb3-clustered-sfsbcache-service.xml</literal> file
                     in the <literal>deploy</literal> directory. The contents of the file are as follows.</para>
-        <programlisting>
-&lt;server&gt;
-   &lt;mbean code="org.jboss.ejb3.cache.tree.PassivationTreeCache"
-       name="jboss.cache:service=EJB3SFSBClusteredCache"&gt;
-      
-        &lt;attribute name="IsolationLevel"&gt;READ_UNCOMMITTED&lt;/attribute&gt;
-        &lt;attribute name="CacheMode"&gt;REPL_SYNC&lt;/attribute&gt;
-        &lt;attribute name="ClusterName"&gt;SFSB-Cache&lt;/attribute&gt;
-        &lt;attribute name="ClusterConfig"&gt;
-            ... ...
-        &lt;/attribute&gt;
-
-        &lt;!--  Number of milliseconds to wait until all responses for a
-              synchronous call have been received.
-        --&gt;
-        &lt;attribute name="SyncReplTimeout"&gt;10000&lt;/attribute&gt;
-
-        &lt;!--  Max number of milliseconds to wait for a lock acquisition --&gt;
-        &lt;attribute name="LockAcquisitionTimeout"&gt;15000&lt;/attribute&gt;
-
-        &lt;!--  Name of the eviction policy class. --&gt;
-        &lt;attribute name="EvictionPolicyClass"&gt;
-            org.jboss.ejb3.cache.tree.StatefulEvictionPolicy
-        &lt;/attribute&gt;
-
-        &lt;!--  Specific eviction policy configurations. This is LRU --&gt;
-        &lt;attribute name="EvictionPolicyConfig"&gt;
-            &lt;config&gt;
-                &lt;attribute name="wakeUpIntervalSeconds"&gt;1&lt;/attribute&gt;
-                &lt;name&gt;statefulClustered&lt;/name&gt;
-                &lt;region name="/_default_"&gt;
-                    &lt;attribute name="maxNodes"&gt;1000000&lt;/attribute&gt;
-                    &lt;attribute name="timeToIdleSeconds"&gt;300&lt;/attribute&gt;
-                &lt;/region&gt;
-            &lt;/config&gt;
-        &lt;/attribute&gt;
-
-        &lt;attribute name="CacheLoaderFetchPersistentState"&gt;false&lt;/attribute&gt;
-        &lt;attribute name="CacheLoaderFetchTransientState"&gt;true&lt;/attribute&gt;
-        &lt;attribute name="FetchStateOnStartup"&gt;true&lt;/attribute&gt;
-        &lt;attribute name="CacheLoaderClass"&gt;
-            org.jboss.ejb3.cache.tree.StatefulCacheLoader
-        &lt;/attribute&gt;
-        &lt;attribute name="CacheLoaderConfig"&gt;
-            location=statefulClustered
-        &lt;/attribute&gt;
-   &lt;/mbean&gt;
-&lt;/server&gt;
-            </programlisting>
-        <para>The configuration attributes in the <literal>PassivationTreeCache</literal> MBean are essentially
-                    the same as the attributes in the standard JBoss Cache <literal>TreeCache</literal> MBean discussed
+<programlisting><![CDATA[ 
+<server>
+	<mbean code="org.jboss.cache.TreeCache"
+	name="jboss.cache:service=EJB3SFSBClusteredCache">
+	  
+		<attribute name="ClusterName">
+			${jboss.partition.name:DefaultPartition}-SFSBCache
+			</attribute>
+			<attribute name="IsolationLevel">REPEATABLE_READ</attribute>
+			<attribute name="CacheMode">REPL_ASYNC</attribute> 
+		  
+			<!-- We want to activate/inactivate regions as beans are deployed --> 
+			 <attribute name="UseRegionBasedMarshalling">true</attribute> 
+			<!-- Must match the value of "useRegionBasedMarshalling" --> 
+			<attribute name="InactiveOnStartup">true</attribute>
+			  
+			<attribute name="ClusterConfig">
+			... ...
+			</attribute> 
+			  
+			<!-- The max amount of time (in milliseconds) we wait until the 
+			initial state (ie. the contents of the cache) are retrieved from 
+			existing members.  --> 
+			<attribute name="InitialStateRetrievalTimeout">17500</attribute>
+			  
+			<!--  Number of milliseconds to wait until all responses for a
+				synchronous call have been received.
+				-->
+			<attribute name="SyncReplTimeout">17500</attribute>
+			  
+			<!--  Max number of milliseconds to wait for a lock acquisition -->
+			<attribute name="LockAcquisitionTimeout">15000</attribute>
+			  
+			 <!--  Name of the eviction policy class. -->
+			<attribute name="EvictionPolicyClass">
+				org.jboss.cache.eviction.LRUPolicy
+			</attribute>
+			  
+			<!--  Specific eviction policy configurations. This is LRU -->
+			<attribute name="EvictionPolicyConfig">
+			 <config>
+				<attribute name="wakeUpIntervalSeconds">5</attribute>
+				 <name>statefulClustered</name> 
+				<!-- So default region would never timeout -->
+				<region name="/_default_">
+				<attribute name="maxNodes">0</attribute>
+				 <attribute name="timeToIdleSeconds">0</attribute>
+				</region>
+			</config>
+		</attribute> 
+					  
+	<!-- Store passivated sessions to the file system --> 
+	 <attribute name="CacheLoaderConfiguration"> 
+	<config> 
+	  
+	 <passivation>true</passivation> 
+	<shared>false</shared> 
+							  
+	  <cacheloader> 
+		 <class>org.jboss.cache.loader.FileCacheLoader</class> 
+		<!-- Passivate to the server data dir --> 
+		 <properties> 
+			location=${jboss.server.data.dir}${/}sfsb 
+		</properties> 
+		<async>false</async> 
+		<fetchPersistentState>true</fetchPersistentState> 
+		<ignoreModifications>false</ignoreModifications> 
+		</cacheloader> 
+		  
+			 </config> 
+	   </attribute>
+	</mbean>
+</server>]]>
+</programlisting>
+	    
+<para>The configuration attributes in this MBean are essentially the same as the attributes in the standard JBoss Cache <literal>TreeCache</literal> MBean discussed
                     in <xref linkend="jbosscache.chapt"/>. Again, we omitted the JGroups configurations in the
-                        <literal>ClusterConfig</literal> attribute (see more in <xref linkend="jbosscache-jgroups"/>).</para>
+		    <literal>ClusterConfig</literal> attribute (see more in <xref linkend="jbosscache-jgroups"/>). Two noteworthy items:</para>
+<itemizedlist>
+	<listitem>
+		<para>The cache is configured to support eviction. The EJB3 SFSB container uses the JBoss Cache eviction mechanism to manage SFSB passivation. When beans are deployed, the EJB container will programmatically add eviction regions to the cache, one region per bean type.
+		</para>
+	</listitem>
+	<listitem>
+		<para>A JBoss Cache CacheLoader is also configured, again to support SFSB passivation. When beans are evicted from the cache, the cache loader passivates them to a persistent store; in this case to the filesystem in the $JBOSS_HOME/server/all/data/sfsb directory. JBoss Cache supports a variety of different CacheLoader implementations that know how to store data to different persistent store types; see the JBoss Cache documentation for details. However, if you change the CacheLoaderConfiguration, be sure that you do not use a shared store (e.g., a single schema in a shared database). Each node in the cluster must have its own persistent store; otherwise, as nodes independently passivate and activate clustered beans, they will corrupt each other's data.
+		</para>
+	</listitem>
+</itemizedlist>
       </section>
-    </section>
-    <section id="clustering-entity">
+    
+
+</chapter>    
+    
+    <chapter id="clustering-entity">
       <title>Clustered Entity EJBs</title>
-      <para>In a JBoss AS cluster, the entity bean instances need to replicated across all nodes. If an entity
-                bean provides remote services, the service methods need to be load balanced as well.</para>
-      <para>To use a clustered entity bean, the application does not need to do anything special, except for
-                looking up bean references from the clustered HA-JNDI.</para>
+      <para>In a JBoss AS cluster, the entity bean instance caches need to be kept in sync across all nodes. If an entity bean provides remote services, the service methods need to be load balanced as well.</para>
+      
+      <para>To use a clustered entity bean, the application does not need to do anything special, except for looking up EJB 2.x remote bean references from the clustered HA-JNDI.</para>
       <section id="clustering-entity-21">
         <title>Entity Bean in EJB 2.x</title>
-        <para>First of all, it is worth to note that clustering 2.x entity beans is a bad thing to do. Its
-                    exposes elements that generally are too fine grained for use as remote objects to clustered remote
-                    objects and introduces data synchronization problems that are non-trivial. Do NOT use EJB 2.x entity
-                    bean clustering unless you fit into the sepecial case situation of read-only, or one read-write node
-                    with read-only nodes synched with the cache invalidation services.</para>
-<!--
-                TODO: Discuss what is cache invalidation service
-            -->
+	<para>First of all, it is worth noting that clustering 2.x entity beans is a bad thing to do. It exposes elements that generally are too fine-grained for use as remote objects as clustered remote objects, and it introduces data synchronization problems that are non-trivial. Do NOT use EJB 2.x entity bean clustering unless you fit into the special case situation of read-only beans, or one read-write node with read-only nodes synched with the cache invalidation services.</para>
+
         <para>To cluster EJB 2.x entity beans, you need to add the <literal>&lt;clustered&gt;</literal> element
                     to the application's <literal>jboss.xml</literal> descriptor file. Below is a typical
                         <literal>jboss.xml</literal> file.</para>
@@ -1055,18 +1270,15 @@
       </section>
       <section id="clustering-entity-30">
         <title>Entity Bean in EJB 3.0</title>
-<!--
-                TODO: Discuss the drawback of EJB 3.0 clustering
-            -->
+
         <para>In EJB 3.0, the entity beans primarily serve as a persistence data model. They do not provide
                     remote services. Hence, the entity bean clustering service in EJB 3.0 primarily deals with
                     distributed caching and replication, instead of load balancing.</para>
-        <section id="clustering-entity-30-cache">
+        
+	    
+	    <section id="clustering-entity-30-cache">
           <title>Configure the distributed cache</title>
-          <para>To avoid round trips to the database, you can use a cache for your entities. JBoss EJB 3.0 is
-                        implemented by Hibernate, which has support for a second-level cache. The Hibernate setup used
-                        for the JBoss EJB 3.0 implementation uses JBoss Cache as its underlying cache implementation.
-                        The cache provides the following functionalities.</para>
+	  <para>To avoid round trips to the database, you can use a cache for your entities. JBoss EJB 3.0 entity beans are implemented by Hibernate, which has support for a second-level cache. The Hibernate setup used for the JBoss EJB 3.0 implementation uses JBoss Cache as its underlying second-level cache implementation. The second-level cache provides the following functionalities.</para>
           <itemizedlist>
             <listitem>
               <para>If you persist a cache enabled entity bean instance to the database via the entity
@@ -1085,65 +1297,75 @@
                                 does not exist in the database, it will be inserted into the cache.</para>
             </listitem>
           </itemizedlist>
-          <para>JBoss Cache service for EJB 3.0 entity beans is configured in a <literal>TreeCache</literal>
-                        MBean (see <xref linkend="jbosscache-cache"/>) in the
+          <para>The JBoss Cache service for EJB 3.0 entity beans is configured in a <literal>TreeCache</literal>
+                        MBean <!--(see <xref linkend="jbosscache-cache"/>) -->in the
                             <literal>deploy/ejb3-entity-cache-service.xml</literal> file. The name of the cache MBean
-                        service is <literal>jboss.cache:service=EJB3EntityTreeCache</literal>. Below is the contents of
+                        service is <literal>jboss.cache:service=EJB3EntityTreeCache</literal>. Below are the contents of
                         the <literal>ejb3-entity-cache-service.xml</literal> file in the standard JBoss distribution.
                         Again, we omitted the JGroups configuration element <literal>ClusterConfig</literal>.</para>
-          <programlisting>
-&lt;server&gt;
-    &lt;mbean code="org.jboss.cache.TreeCache" 
-            name="jboss.cache:service=EJB3EntityTreeCache"&gt;
-        
-        &lt;depends&gt;jboss:service=Naming&lt;/depends&gt;
-        &lt;depends&gt;jboss:service=TransactionManager&lt;/depends&gt;
+<programlisting><![CDATA[ 
+ <server>
+  <mbean code="org.jboss.cache.TreeCache" 
+ name="jboss.cache:service=EJB3EntityTreeCache">
+	  
+  <depends>jboss:service=Naming</depends>
+  <depends>jboss:service=TransactionManager</depends> 
+    
+  <!-- Name of cluster. Needs to be the same on all nodes in the clusters, 
+	       in order to find each other --> 
+	  <attribute name="ClusterName">
+  		${jboss.partition.name:DefaultPartition}-EntityCache
+	  </attribute>
+	  
+	  <!-- Configure the TransactionManager -->
+	 <attribute name="TransactionManagerLookupClass">
+	   org.jboss.cache.JBossTransactionManagerLookup
+	 </attribute>
+	  
+	 <attribute name="IsolationLevel">REPEATABLE_READ</attribute>
+	 <attribute name="CacheMode">REPL_SYNC</attribute> 
+	  
+	 <!-- Must be true if any entity deployment uses a scoped classloader --> 
+	 <attribute name="UseRegionBasedMarshalling">true</attribute> 
+	 <!-- Must match the value of "useRegionBasedMarshalling" --> 
+	 <attribute name="InactiveOnStartup">true</attribute>
+	  
+	 <attribute name="ClusterConfig">
+	  ... ...
+	 </attribute>
+	  
+	 <attribute name="InitialStateRetrievalTimeout">17500</attribute>
+	 <attribute name="SyncReplTimeout">17500</attribute>
+	 <attribute name="LockAcquisitionTimeout">15000</attribute>
+	  
+	 <attribute name="EvictionPolicyClass">
+	 org.jboss.cache.eviction.LRUPolicy
+	 </attribute>
+	  
+	 <!--  Specific eviction policy configurations. This is LRU -->
+	  <attribute name="EvictionPolicyConfig">
+	  <config>
+	  <attribute name="wakeUpIntervalSeconds">5</attribute>
+	  <!--  Cache wide default -->
+		  <region name="/_default_">
+		  <attribute name="maxNodes">5000</attribute>
+		  <attribute name="timeToLiveSeconds">1000</attribute>
+		  </region>
+	  </config>
+	 </attribute>
+	 </mbean>
+</server>
+]]>
+</programlisting>
 
-        &lt;!-- Configure the TransactionManager --&gt;
-        &lt;attribute name="TransactionManagerLookupClass"&gt;
-            org.jboss.cache.JBossTransactionManagerLookup
-        &lt;/attribute&gt;
+<para>This is a replicated cache: if the cache is updated while running within a cluster, changes to the entries in one node will be replicated to the corresponding entries in the other nodes in the cluster.
+</para>
 
-        &lt;attribute name="IsolationLevel"&gt;REPEATABLE_READ&lt;/attribute&gt;
-        &lt;attribute name="CacheMode"&gt;REPL_SYNC&lt;/attribute&gt;
-
-        &lt;!--Name of cluster. Needs to be the same for all clusters, 
-            in order to find each other --&gt;
-        &lt;attribute name="ClusterName"&gt;EJB3-entity-cache&lt;/attribute&gt;
-
-        &lt;attribute name="ClusterConfig"&gt;
-            ... ...
-        &lt;/attribute&gt;
-
-        &lt;attribute name="InitialStateRetrievalTimeout"&gt;5000&lt;/attribute&gt;
-        &lt;attribute name="SyncReplTimeout"&gt;10000&lt;/attribute&gt;
-        &lt;attribute name="LockAcquisitionTimeout"&gt;15000&lt;/attribute&gt;
-
-        &lt;attribute name="EvictionPolicyClass"&gt;
-            org.jboss.cache.eviction.LRUPolicy
-        &lt;/attribute&gt;
-
-        &lt;!--  Specific eviction policy configurations. This is LRU --&gt;
-        &lt;attribute name="EvictionPolicyConfig"&gt;
-            &lt;config&gt;
-                &lt;attribute name="wakeUpIntervalSeconds"&gt;5&lt;/attribute&gt;
-                &lt;!--  Cache wide default --&gt;
-                &lt;region name="/_default_"&gt;
-                    &lt;attribute name="maxNodes"&gt;5000&lt;/attribute&gt;
-                    &lt;attribute name="timeToLiveSeconds"&gt;1000&lt;/attribute&gt;
-                &lt;/region&gt;
-            &lt;/config&gt;
-        &lt;/attribute&gt;
-    &lt;/mbean&gt;
-&lt;/server&gt;
-                </programlisting>
-          <para>As we discussed in <xref linkend="jbosscache-cache"/>, JBoss Cache allows you to specify
+          <para><!--As discussed in <xref linkend="jbosscache-cache"/>,--> JBoss Cache allows you to specify
                         timeouts to cached entities. Entities not accessed within a certain amount of time are dropped
-                        from the cache in order to save memory. If running within a cluster, and the cache is updated,
-                        changes to the entries in one node will be replicated to the corresponding entries in the other
-                        nodes in the cluster.</para>
-          <para>Now, we have JBoss Cache configured to support distributed caching of EJB 3.0 entity beans. We
-                        still have to configure individual entity beans to use the cache service.</para>
+			from the cache in order to save memory. The above configuration sets up a default region specifying that the cache will hold at most 5000 nodes, after which nodes will start being evicted from memory, least recently used nodes first. Also, if any node has not been accessed within the last 1000 seconds, it will be evicted from memory. In general, a node in the cache represents a cached item (entity, collection, or query result set), although there are also a few other nodes that are used for internal purposes. If the above values of 5000 maxNodes and 1000 idle seconds are not appropriate for your application(s), you can change the cache-wide defaults. You can also add separate eviction regions for each of your entities; more on this below.
+		</para>
+          <para>Now, we have JBoss Cache configured to support distributed caching of EJB 3.0 entity beans. We still have to configure individual entity beans to use the cache service.</para>
         </section>
         <section id="clustering-entity-30-bean">
           <title>Configure the entity beans for cache</title>
@@ -1159,62 +1381,242 @@
     org.jboss.ejb3.entity.TreeCacheProviderHook
 &lt;/property&gt;
                 </programlisting>
-          <para>The following property element defines the object name of the cache to be used, and the MBean
-                        name.</para>
+		<para>The following property element defines the object name of the cache to be used, i.e., the name of the TreeCache MBean shown above.</para>
           <programlisting>
 &lt;property name="treecache.mbean.object_name"&gt;
     jboss.cache:service=EJB3EntityTreeCache
 &lt;/property&gt;
                 </programlisting>
-          <para>Next we need to configure what entities be cached. The default is to not cache anything, even
-                        with the settings shown above. We use the <literal>@Cache</literal> annotation to tag entity
-                        beans that needs to be cached.</para>
+<para>
+	Finally, you should give a “region_prefix” to this configuration.  This ensures that all cached items associated with this persistence.xml are properly grouped together in JBoss Cache.  The jboss.cache:service=EJB3EntityTreeCache cache is a shared resource, potentially used by multiple persistence units.  The items cached in that shared cache need to be properly grouped to allow the cache to properly manage classloading.
+	&lt;property name="hibernate.cache.region_prefix" value="myprefix"/&gt;
+</para>
+<para>
+	If you do not provide a region prefix, JBoss will automatically provide one for you, building it up from the name of the EAR (if any) and the name of the JAR that includes the persistence.xml. For example, a persistence.xml packaged in foo.ear, bar.jar would be given “foo_ear,bar_jar” as its region prefix. This is not a particularly friendly region prefix if you need to use it to set up specialized eviction regions (see below), so specifying your own region prefix is recommended.
+</para>
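The automatic prefix derivation described above can be sketched as follows. This is a hypothetical illustration, not JBoss source code: the assumption is simply that dots in each deployment unit name become underscores and the units are joined with commas, matching the "foo.ear, bar.jar" to "foo_ear,bar_jar" example.

```java
// Illustrative sketch only: derive a default region prefix from
// deployment unit names ("foo.ear", "bar.jar" -> "foo_ear,bar_jar").
public class RegionPrefix {
    static String defaultPrefix(String... deploymentUnits) {
        StringBuilder sb = new StringBuilder();
        for (String unit : deploymentUnits) {
            if (sb.length() > 0) sb.append(',');
            // Assumption for the sketch: '.' in a unit name becomes '_'
            sb.append(unit.replace('.', '_'));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(defaultPrefix("foo.ear", "bar.jar")); // foo_ear,bar_jar
    }
}
```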
+		
+<para>Next we need to configure which entities will be cached. The default is to not cache anything, even with the settings shown above. We use the <literal>@org.hibernate.annotations.Cache</literal> annotation to tag entity beans that need to be cached.</para>
           <programlisting>
 @Entity 
 @Cache(usage=CacheConcurrencyStrategy.TRANSACTIONAL) 
-public class Customer implements Serializable { 
+public class Account implements Serializable { 
   // ... ... 
 }
                 </programlisting>
           <para>A very simplified rule of thumb is that you will typically want to do caching for objects that
-                        rarely change, and which are frequently read. You can fine tune the cache for each entity bean
-                        in the <literal>ejb3-entity-cache-service.xml</literal> configuration file. For instance, you
-                        can specify the size of the cache. If there are too many objects in the cache, the cache could
-                        evict oldest objects (or least used objects, depending on configuration) to make room for new
-                        objects. The cache for the <literal>mycompany.Customer</literal> entity bean is
-                            <literal>/mycompany/Customer</literal> cache region.</para>
-          <programlisting>
-&lt;server&gt;  
-  &lt;mbean code="org.jboss.cache.TreeCache" 
-         name="jboss.cache:service=EJB3EntityTreeCache"&gt;  
-    &lt;depends&gt;jboss:service=Naming 
-    &lt;depends&gt;jboss:service=TransactionManager 
-    ... ... 
-    &lt;attribute name="EvictionPolicyConfig"&gt;  
-      &lt;config&gt;  
-        &lt;attribute name="wakeUpIntervalSeconds"&gt;5&lt;/attribute&gt;  
-        &lt;region name="/_default_"&gt;  
-          &lt;attribute name="maxNodes"&gt;5000&lt;/attribute&gt;  
-          &lt;attribute name="timeToLiveSeconds"&gt;1000&lt;/attribute&gt;  
-        &lt;/region&gt;  
-        &lt;region name="/mycompany/Customer"&gt;  
-          &lt;attribute name="maxNodes"&gt;10&lt;/attribute&gt;  
-          &lt;attribute name="timeToLiveSeconds"&gt;5000&lt;/attribute&gt;  
-        &lt;/region&gt;  
-        ... ... 
-      &lt;/config&gt;  
-    &lt;/attribute&gt;  
-  &lt;/mbean&gt; 
-&lt;/server&gt;
-                </programlisting>
-          <para>If you do not specify a cache region for an entity bean class, all instances of this class
-                        will be cached in the <literal>/_default</literal> region as defined above. The EJB3
-                            <literal>Query</literal> API provides means for you to save to load query results (i.e.,
-                        collections of entity beans) from specified cache regions.</para>
+		  rarely change, and which are frequently read. You can fine-tune the cache for each entity bean in the <literal>ejb3-entity-cache-service.xml</literal> configuration file. For instance, you can specify the size of the cache. If there are too many objects in the cache, the cache can evict the oldest objects (or the least used objects, depending on configuration) to make room for new objects.  Assuming the region_prefix specified in <literal>persistence.xml</literal> was <literal>myprefix</literal>, the default name of the cache region for the <literal>com.mycompany.entities.Account</literal> entity bean is <literal>/myprefix/com/mycompany/entities/Account</literal>.</para>
+          <programlisting><![CDATA[
+<server>  
+  <mbean code="org.jboss.cache.TreeCache" 
+		 name="jboss.cache:service=EJB3EntityTreeCache"> 
+		  ... ... 
+	  <attribute name="EvictionPolicyConfig">  
+		  <config>  
+			  <attribute name="wakeUpIntervalSeconds">5</attribute>  
+			  <region name="/_default_">  
+				  <attribute name="maxNodes">5000</attribute>  
+				  <attribute name="timeToLiveSeconds">1000</attribute>  
+			  </region>  
+		  <!-- Separate eviction rules for Account entities -->
+			  <region name="/myprefix/com/mycompany/entities/Account">  
+				  <attribute name="maxNodes">10000</attribute>  
+				  <attribute name="timeToLiveSeconds">5000</attribute>  
+			  </region>  
+		  ... ... 
+		 </config>  
+	 </attribute>  
+ </mbean> 
+</server>]]>
+</programlisting>
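The default region naming just described, the region_prefix plus the entity's fully-qualified class name with dots becoming path separators, can be sketched as follows. This is an illustrative helper, not JBoss code:

```java
// Illustrative sketch only: build the default cache region path for an
// entity from the region_prefix and the entity's fully-qualified name,
// e.g. ("myprefix", "com.mycompany.entities.Account")
//   -> "/myprefix/com/mycompany/entities/Account".
public class RegionName {
    static String regionFor(String regionPrefix, String entityClass) {
        return "/" + regionPrefix + "/" + entityClass.replace('.', '/');
    }

    public static void main(String[] args) {
        System.out.println(regionFor("myprefix", "com.mycompany.entities.Account"));
    }
}
```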
+          
+
+<para>
+	If you do not specify a cache region for an entity bean class, all instances of this class will be cached in the <literal>/_default_</literal> region as defined above. The @Cache annotation exposes an optional attribute “region” that lets you specify the cache region where an entity is to be stored, rather than having it be automatically created from the fully-qualified class name of the entity class.
+</para>
+<programlisting>
+@Entity 
+@Cache(usage=CacheConcurrencyStrategy.TRANSACTIONAL,
+       region="Account") 
+public class Account implements Serializable { 
+// ... ... 
+}
+</programlisting>
+<para>The eviction configuration would then become:</para>
+<programlisting><![CDATA[			
+<server>  
+	<mbean code="org.jboss.cache.TreeCache" 
+	      name="jboss.cache:service=EJB3EntityTreeCache"> 
+		... ... 
+	<attribute name="EvictionPolicyConfig">  
+	<config>  
+		<attribute name="wakeUpIntervalSeconds">5</attribute>  
+		<region name="/_default_">  
+		<attribute name="maxNodes">5000</attribute>  
+		<attribute name="timeToLiveSeconds">1000</attribute>  
+			</region>  
+		<!-- Separate eviction rules for Account entities -->
+			<region name="/myprefix/Account">  
+				<attribute name="maxNodes">10000</attribute>  
+				<attribute name="timeToLiveSeconds">5000</attribute>  
+			</region>  
+			... ... 
+	</config>  
+	</attribute>  
+	</mbean> 
+</server>]]>
+</programlisting>
+			
+</section>
+
+
+<section><title>Query result caching</title>
+<para>	
+	The EJB3 Query API also provides means for you to save in the second-level cache the results (i.e., collections of primary keys of entity beans, or collections of scalar values) of specified queries.   Here we show a simple example of annotating a bean with a named query, also providing the Hibernate-specific hint that tells Hibernate to cache the query.
+</para>
+<para>
+	First, in persistence.xml you need to tell Hibernate to enable query caching:
+</para>
+<screen>&lt;property name="hibernate.cache.use_query_cache" value="true"/&gt;</screen>
+<para>	
+Next, you create a named query associated with an entity, and tell Hibernate you want to cache the results of that query:
+</para>
+<programlisting><![CDATA[ 
+@Entity
+@Cache(usage=CacheConcurrencyStrategy.TRANSACTIONAL,
+       region="Account")
+@NamedQueries({
+@NamedQuery(name="account.bybranch",
+query="select acct from Account as acct where acct.branch = ?1",
+hints={@QueryHint(name="org.hibernate.cacheable",value="true")})           
+})
+public class Account implements Serializable { 
+// ... ... 
+}]]>
+</programlisting>
+
+<para>
+	The @NamedQueries, @NamedQuery and @QueryHint annotations are all in the javax.persistence package. See the Hibernate and EJB3 documentation for more on how to use EJB3 queries and on how to instruct EJB3 to cache queries. 
+</para>
+<para>
+By default, Hibernate stores query results in JBoss Cache in a region named {region_prefix}/org/hibernate/cache/StandardQueryCache. Based on this, you can set up separate eviction handling for your query results. So, if the region prefix were set to myprefix in persistence.xml, you could, for example, create this sort of eviction handling:
+</para>
+
+<programlisting>
+<![CDATA[ 
+<server>  
+	  <mbean code="org.jboss.cache.TreeCache" 
+		 name="jboss.cache:service=EJB3EntityTreeCache">
+		  ... ... 
+		  <attribute name="EvictionPolicyConfig">  
+			  <config>  
+			  <attribute name="wakeUpIntervalSeconds">5</attribute>  
+				  <region name="/_default_">  
+				  <attribute name="maxNodes">5000</attribute>  
+				  <attribute name="timeToLiveSeconds">1000</attribute>  
+				  </region>  
+				  <!-- Separate eviction rules for Account entities -->
+				  <region name="/myprefix/Account">  
+					  <attribute name="maxNodes">10000</attribute>  
+					  <attribute name="timeToLiveSeconds">5000</attribute>  
+				  </region>
+				  <!-- Cache queries for 10 minutes -->
+				  <region name="/myprefix/org/hibernate/cache/StandardQueryCache">  
+					  <attribute name="maxNodes">100</attribute>  
+					  <attribute name="timeToLiveSeconds">600</attribute>  
+				  </region>  
+				  ... ... 
+			  </config>  
+		  </attribute>  
+	  </mbean> 
+</server>
+	  ]]>
+</programlisting>
+
+<para>
+	The @NamedQuery.hints attribute shown above takes an array of vendor-specific @QueryHints as a value. Hibernate accepts the “org.hibernate.cacheRegion” query hint, where the value is the name of a cache region to use instead of the default /org/hibernate/cache/StandardQueryCache. For example:
+</para>
+<programlisting><![CDATA[
+	@Entity
+	@Cache(usage=CacheConcurrencyStrategy.TRANSACTIONAL,
+	       region="Account")
+	@NamedQueries({
+	@NamedQuery(name="account.bybranch",
+	query="select acct from Account as acct where acct.branch = ?1",
+	hints={@QueryHint(name="org.hibernate.cacheable",value="true"),
+	@QueryHint(name="org.hibernate.cacheRegion",value="Queries")
+	})           
+	})
+	public class Account implements Serializable { 
+	// ... ... 
+	}]]>
+</programlisting>
+	<para>
+	The related eviction configuration:
+</para>
+<programlisting><![CDATA[	
+<server>  
+	<mbean code="org.jboss.cache.TreeCache" 
+	       name="jboss.cache:service=EJB3EntityTreeCache">
+		... ... 
+		<attribute name="EvictionPolicyConfig">  
+			<config>  
+				<attribute name="wakeUpIntervalSeconds">5</attribute>  
+				<region name="/_default_">  
+					<attribute name="maxNodes">5000</attribute>  
+					<attribute name="timeToLiveSeconds">1000</attribute>  
+				</region>  
+				<!-- Separate eviction rules for Account entities -->
+				<region name="/myprefix/Account">  
+					<attribute name="maxNodes">10000</attribute>  
+					<attribute name="timeToLiveSeconds">5000</attribute>  
+				</region>
+				<!-- Cache queries for 10 minutes -->
+				<region name="/myprefix/Queries">  
+					<attribute name="maxNodes">100</attribute>  
+					<attribute name="timeToLiveSeconds">600</attribute>  
+				</region>  
+				... ... 
+			</config>  
+		</attribute>  
+	</mbean> 
+</server>]]>
+</programlisting>
+
+
+
         </section>
       </section>
-    </section>
-    <section id="clustering-http">
+ 
+</chapter>
+    
+    <chapter id="clustering-http">
       <title>HTTP Services</title>
       <para>HTTP session replication is used to replicate the state associated with your web clients on other
                 nodes of a cluster. Thus, in the event one of your node crashes, another node in the cluster will be
@@ -1224,33 +1626,24 @@
           <para>Session state replication</para>
         </listitem>
         <listitem>
-          <para>Load-balance of incoming invocations</para>
+          <para>Load-balancing of incoming invocations</para>
         </listitem>
       </itemizedlist>
       <para>State replication is directly handled by JBoss. When you run JBoss in the <literal>all</literal>
-                configuration, session state replication is enabled by default. Just deploy your web application and its
-                session state is already replicated across all JBoss instances in the cluster.</para>
-      <para>However, Load-balancing is a different story, it is not handled by JBoss itself and requires
-                additional software. As a very common scenario, we will demonstrate how to setup Apache and mod_jk. This
-                activity could be either performed by specialized hardware switches or routers (Cisco LoadDirector for
-                example) or any other dedicated software though.</para>
-      <note>
-        <para>A load-balancer tracks the HTTP requests and, depending on the session to which is linked the
-                    request, it dispatches the request to the appropriate node. This is called a load-balancer with
-                    sticky-sessions: once a session is created on a node, every future request will also be processed by
-                    the same node. Using a load-balancer that supports sticky-sessions without replicating the sessions
-                    allows you to scale very well without the cost of session state replication: each query will always
-                    be handled by the same node. But in the case a node dies, the state of all client sessions hosted by
-                    this node are lost (the shopping carts, for example) and the clients will most probably need to
-                    login on another node and restart with a new session. In many situations, it is acceptable not to
-                    replicate HTTP sessions because all critical state is stored in the database. In other situations,
-                    loosing a client session is not acceptable and, in this case, session state replication is the price
-                    one has to pay.</para>
+	      configuration, session state replication is enabled by default. Just configure your web application as distributable in its <filename>web.xml</filename> (see below), deploy it, and its session state is automatically replicated across all JBoss instances in the cluster.</para>
+      <para>However, load-balancing is a different story; it is not handled by JBoss itself and requires an external load balancer. This function could be provided by specialized hardware switches or routers (Cisco LoadDirector for example) or by specialized software running on commodity hardware. As a very common scenario, we will demonstrate how to set up a software load balancer using Apache httpd and mod_jk.</para>
+      
+      
+      <note><title>Note</title>
+	
+	      <para>A load-balancer tracks HTTP requests and, depending on the session to which the request is linked, it dispatches the request to the appropriate node. This is called load-balancing with sticky-sessions: once a session is created on a node, every future request will also be processed by that same node. Using a load-balancer that supports sticky-sessions but not configuring your web application for session replication allows you to scale very well by avoiding the cost of session state replication: each query will always be handled by the same node. But in case a node dies, the state of all client sessions hosted by this node (the shopping carts, for example) will be lost and the clients will most probably need to login on another node and restart with a new session. In many situations, it is acceptable not to replicate HTTP sessions because all critical state is stored in a database. In other situations, losing a client session is not acceptable and, in this case, session state replication is the price one has to pay.</para>
       </note>
-      <para>Apache is a well-known web server which can be extended by plugging modules. One of these modules,
-                mod_jk (and the newest mod_jk2) has been specifically designed to allow forward requests from Apache to
-                a Servlet container. Furthermore, it is also able to load-balance HTTP calls to a set of Servlet
-                containers while maintaining sticky sessions, and this is what is actually interesting for us.</para>
+      <section><title>Configuring load balancing using Apache and mod_jk</title>
+		      
+      <para>Apache is a well-known web server which can be extended by plugging in modules. One of these modules, mod_jk, has been specifically designed to allow the forwarding of requests from Apache to a Servlet container. Furthermore, it is also able to load-balance HTTP calls to a set of Servlet containers while maintaining sticky sessions, which is what is most interesting for us in this section.</para>
+      
+      
+</section>
       <section id="clustering-http-download">
         <title>Download the software</title>
         <para>First of all, make sure that you have Apache installed. You can download Apache directly from
@@ -1358,7 +1751,7 @@
       <section id="clustering-http-nodes">
         <title>Configure worker nodes in mod_jk</title>
         <para>Next, you need to configure mod_jk workers file <literal>conf/workers.properties</literal>. This
-                    file specify where are located the different Servlet containers and how calls should be
+                    file specifies where the different Servlet containers are located and how calls should be
                     load-balanced across them. The configuration file contains one section for each target servlet
                     container and one global section. For a two nodes setup, the file could look like this:</para>
 <!-- The local worker comment is from here: http://jira.jboss.com/jira/browse/JBDOCS-102 -->
@@ -1397,13 +1790,11 @@
                     8009.</para>
         <para>In the <literal>works.properties</literal> file, each node is defined using the
                         <literal>worker.XXX</literal> naming convention where <literal>XXX</literal> represents an
-                    arbitrary name you choose for one of the target Servlet container. For each worker, you must give
-                    the host name (or IP address) and port number of the AJP13 connector running in the Servlet
-                    container.</para>
+			arbitrary name you choose for each of the target Servlet containers. For each worker, you must specify the host name (or IP address) and the port number of the AJP13 connector running in the Servlet container.</para>
+	    
+	    
         <para>The <literal>lbfactor</literal> attribute is the load-balancing factor for this specific worker.
-                    It is used to define the priority (or weight) a node should have over other nodes. The higher this
-                    number is, the more HTTP requests it will receive. This setting can be used to differentiate servers
-                    with different processing power.</para>
+		It is used to define the priority (or weight) a node should have over other nodes. The higher this number is for a given worker relative to the other workers, the more HTTP requests the worker will receive. This setting can be used to differentiate servers with different processing power.</para>
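To build intuition for lbfactor, here is a small, self-contained weighted balancer sketch. It is not mod_jk's actual algorithm; it just demonstrates the documented effect that a worker with twice the lbfactor of another receives roughly twice the requests:

```java
import java.util.*;

// Illustrative sketch (not mod_jk's real implementation): a smooth
// weighted round-robin where each worker accumulates "credit" equal to
// its lbfactor per round, and the worker with the most credit serves
// the next request.
public class WeightedBalancer {
    private final Map<String, Integer> lbfactor = new LinkedHashMap<>();
    private final Map<String, Double> credit = new HashMap<>();

    void addWorker(String name, int factor) {
        lbfactor.put(name, factor);
        credit.put(name, 0.0);
    }

    // Pick the worker with the highest accumulated credit, then charge it
    // the total weight so the others catch up over subsequent rounds.
    String next() {
        String best = null;
        for (String w : lbfactor.keySet()) {
            credit.merge(w, (double) lbfactor.get(w), Double::sum);
            if (best == null || credit.get(w) > credit.get(best)) best = w;
        }
        int total = lbfactor.values().stream().mapToInt(Integer::intValue).sum();
        credit.merge(best, (double) -total, Double::sum);
        return best;
    }

    public static void main(String[] args) {
        WeightedBalancer lb = new WeightedBalancer();
        lb.addWorker("node1", 1);
        lb.addWorker("node2", 2);
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i < 300; i++) counts.merge(lb.next(), 1, Integer::sum);
        System.out.println(counts); // node2 serves twice as many requests
    }
}
```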
         <para>The <literal>cachesize</literal> attribute defines the size of the thread pools associated to the
                     Servlet container (i.e. the number of concurrent requests it will forward to the Servlet container).
                     Make sure this number does not outnumber the number of threads configured on the AJP13 connector of
@@ -1416,23 +1807,20 @@
                     defined in the same file: load-balancing will happen over these workers.</para>
         <para>The <literal>sticky_session</literal> property specifies the cluster behavior for HTTP sessions.
                     If you specify <literal>worker.loadbalancer.sticky_session=0</literal>, each request will be load
-                    balanced between node1 and node2. But when a user opens a session on one server, it is a good idea
-                    to always forward this user's requests to the same server. This is called a "sticky session", as the
-                    client is always using the same server he reached on his first request. Otherwise the user's session
-                    data would need to be synchronized between both servers (session replication, see <xref linkend="clustering-http-state"/>). To enable session stickiness, you need to set
+		    balanced between node1 and node2; i.e., different requests for the same session will go to different servers. But when a user opens a session on one server, it is necessary to always forward this user's requests to the same server, as long as that server is available. This is called a "sticky session", as the client is always using the same server he reached on his first request. To enable session stickiness, you need to set
                         <literal>worker.loadbalancer.sticky_session</literal> to 1.</para>
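Conceptually, sticky routing works because the worker name (the jvmRoute, see below) is carried in the session identifier, and the balancer extracts it to pick the worker that owns the session. A minimal, illustrative sketch; the exact session-id format shown is an assumption for the example:

```java
// Illustrative sketch: extract the route (worker name) appended after
// the last '.' in a session id such as "ab12cd34.node1"; null means no
// route is present, so the balancer is free to pick any worker.
public class StickyRouter {
    static String routeOf(String sessionId) {
        int dot = sessionId.lastIndexOf('.');
        return (dot >= 0 && dot < sessionId.length() - 1)
                ? sessionId.substring(dot + 1)
                : null;
    }

    public static void main(String[] args) {
        System.out.println(routeOf("ab12cd34.node1")); // node1
        System.out.println(routeOf("ab12cd34"));       // null -> any worker
    }
}
```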
         <note>
-          <para>A non-loadbalanced setup with a single node required the <literal>worker.list=node1</literal>
-                        entry before mod_jk would function correctly.</para>
+          <para>A non-loadbalanced setup with a single node requires a <literal>worker.list=node1</literal>
+                        entry.</para>
         </note>
       </section>
       <section id="clustering-http-jboss">
-        <title>Configure JBoss</title>
+	      <title>Configuring JBoss to work with mod_jk</title>
         <para>Finally, we must configure the JBoss Tomcat instances on all clustered nodes so that they can
                     expect requests forwarded from the mod_jk loadbalancer.</para>
         <para>On each clustered JBoss node, we have to name the node according to the name specified in
                         <literal>workers.properties</literal>. For instance, on JBoss instance node1, edit the
-                        <literal>JBOSS_HOME/server/all/deploy/jbossweb-tomcat50.sar/server.xml</literal> file (replace
+			<literal>JBOSS_HOME/server/all/deploy/jboss-web.deployer/server.xml</literal> file (replace
                         <literal>/all</literal> with your own server name if necessary). Locate the
                         <literal>&lt;Engine&gt;</literal> element and add an attribute <literal>jvmRoute</literal>:</para>
         <programlisting>
@@ -1440,9 +1828,15 @@
 ... ...
 &lt;/Engine&gt;
             </programlisting>
-        <para>Then, for each JBoss Tomcat instance in the cluster, we need to tell it to add the
-                        <literal>jvmRoute</literal> value to its session cookies so that mod_jk can route incoming
-                    requests. Edit the
+	    <para>You also need to be sure the AJP connector in server.xml is enabled (i.e., uncommented). It is enabled by default. 
+	    </para>
+<programlisting><![CDATA[ 
+<!-- Define an AJP 1.3 Connector on port 8009 --> 
+<Connector port="8009" address="${jboss.bind.address}" protocol="AJP/1.3" 
+emptySessionPath="true" enableLookups="false" redirectPort="8443" /> ]]>
+</programlisting>
+	    
+<para>Then, for each JBoss Tomcat instance in the cluster, we need to tell it that mod_jk is in use, so it can properly manage the <literal>jvmRoute</literal> appended to its session cookies, allowing mod_jk to route incoming requests. Edit the
                         <literal>JBOSS_HOME/server/all/deploy/jbossweb-tomcat50.sar/META-INF/jboss-service.xml</literal>
                     file (replace <literal>/all</literal> with your own server name). Locate the
                         <literal>&lt;attribute&gt;</literal> element with a name of <literal>UseJK</literal>, and set
@@ -1459,25 +1853,23 @@
                         <literal>http://wiki.jboss.org/wiki/Wiki.jsp?page=UsingMod_jk1.2WithJBoss</literal>.</para>
         </note>
       </section>
-      <section id="clustering-http-state">
-        <title>Configure HTTP session state replication</title>
-        <para>In <xref linkend="clustering-http-nodes"/>, we covered how to use sticky sessions to make sure
-                    that a client in a session always hits the same server node in order to maintain the session state.
-                    However, that is not an ideal solution. The load might be unevenly distributed over the nodes over
-                    time and if a node goes down, all its session data is lost. A better and more reliable solution is
-                    to replicate session data across all nodes in the cluster. This way, the client can hit any server
-                    node and obtain the same session states.</para>
+
+      
+<section id="clustering-http-state">
+	      <title>Configuring HTTP session state replication</title>
+	      <para>The preceding discussion has focused on using mod_jk as a load balancer. The remainder of our discussion of clustering HTTP services in JBoss AS applies no matter which load balancer is used.
+	      </para>
+	      
+	      <para>In <xref linkend="clustering-http-nodes"/>, we covered how to use sticky sessions to make sure that a client in a session always hits the same server node in order to maintain the session state. However, sticky sessions by themselves are not an ideal solution. If a node goes down, all its session data is lost. A better and more reliable solution is to replicate session data across the nodes in the cluster. This way, the client can hit any server node and obtain the same session state.</para>
+	      
         <para>The <literal>jboss.cache:service=TomcatClusteringCache</literal> MBean makes use of JBoss Cache to
-                    provide HTTP session replication service to the HTTP load balancer in a JBoss Tomcat cluster. This
-                    MBean is defined in the <literal>deploy/tc5-cluster.sar/META-INF/jboss-service.xml</literal> file.</para>
+		provide HTTP session replication services to the JBoss Tomcat cluster. This MBean is defined in the <literal>deploy/jboss-web-cluster.sar/META-INF/jboss-service.xml</literal> file.</para>
         <note>
-          <para>Before AS 4.0.4 CR2, the HTTP session cache configuration file is the
-                            <literal>deploy/tc5-cluster-service.xml</literal> file. Please see AS 4.0.3 documentation
-                        for more details.</para>
+		<para>Before AS 4.2.0, the location of the HTTP session cache configuration file was <literal>deploy/tc5-cluster.sar/META-INF/jboss-service.xml</literal>. Prior to AS 4.0.4 CR2, the file was named <literal>deploy/tc5-cluster-service.xml</literal>. </para>
         </note>
-        <para>Below is a typical <literal>deploy/tc5-cluster.sar/META-INF/jboss-service.xml</literal> file. The
-                    configuration attributes in the <literal>TomcatClusteringCache</literal> MBean is very similar to
-                    those in <xref linkend="jbosscache-cache"/>.</para>
+        <para>Below is a typical <literal>deploy/jboss-web-cluster.sar/META-INF/jboss-service.xml</literal> file. The
+                    configuration attributes in the <literal>TomcatClusteringCache</literal> MBean are very similar to
+                    those in the JBoss AS cache configuration.<!--<xref linkend="jbosscache-cache"/>.--></para>
         <programlisting>
 &lt;mbean code="org.jboss.cache.aop.TreeCacheAop"
     name="jboss.cache:service=TomcatClusteringCache"&gt;
@@ -1506,41 +1898,32 @@
         ... ...
     &lt;/attribute&gt;
     
+   
     &lt;attribute name="LockAcquisitionTimeout"&gt;15000&lt;/attribute&gt;
+    &lt;attribute name="SyncReplTimeout"&gt;20000&lt;/attribute&gt;
 &lt;/mbean&gt;
             </programlisting>
-        <para>The detailed configuration for the <literal>TreeCache</literal> MBean is covered in <xref linkend="jbosscache-cache"/>. Below, we will just discuss several attributes that are most
-                    relevant to the HTTP cluster session replication.</para>
+	    
+	    <para>Note that the value of the mbean element's code attribute is org.jboss.cache.aop.TreeCacheAop, which is different from the other JBoss Cache MBeans used in JBoss AS. This is because FIELD granularity HTTP session replication (covered below) needs the added features of the <literal>TreeCacheAop</literal> (a.k.a. <literal>PojoCache</literal>) class.
+	    </para>
+	    
+	    <para>The details of all the configuration options for a TreeCache MBean are covered in the JBoss Cache documentation. Below, we will just discuss several attributes that are most relevant to the HTTP cluster session replication.</para>
         <itemizedlist>
           <listitem>
             <para><emphasis role="bold">TransactionManagerLookupClass</emphasis> sets the transaction
                             manager factory. The default value is
                                 <literal>org.jboss.cache.BatchModeTransactionManagerLookup</literal>. It tells the cache
-                            NOT to participate in JTA-specific transactions. Instead, the cache manages its own
-                            transaction to support finely grained replications.</para>
+				NOT to participate in JTA-specific transactions. Instead, the cache manages its own transactions. Please do not change this.</para>
           </listitem>
+          
           <listitem>
-            <para><emphasis role="bold">IsolationLevel</emphasis> sets the isolation level for updates to
-                            the transactional distributed cache. The valid values are <literal>SERIALIZABLE</literal>,
-                                <literal>REPEATABLE_READ</literal>, <literal>READ_COMMITTED</literal>,
-                                <literal>READ_UNCOMMITTED</literal>, and <literal>NONE</literal>. These isolation levels
-                            mean the same thing as isolation levels on the database. The default isolation of
-                                <literal>REPEATABLE_READ</literal> makes sense for most web applications.</para>
-          </listitem>
-          <listitem>
             <para><emphasis role="bold">CacheMode</emphasis> controls how the cache is replicated. The valid
-                            values are <literal>REPL_SYNC</literal> and <literal>REPL_ASYNC</literal>, which determine
-                            whether changes are made synchronously or asynchronously. Using synchronous replication
-                            makes sure changes propagated to the cluster before the web request completes. However,
-                            synchronous replication is much slower. For asyncrhonous access, you will want to enable and
-                            tune the replication queue.</para>
+		    values are <literal>REPL_SYNC</literal> and <literal>REPL_ASYNC</literal>. With either setting the client request thread updates the local cache with the current session contents and then sends a message to the caches on the other members of the cluster, telling them to make the same change.  With REPL_ASYNC (the default) the request thread returns as soon as the update message has been put on the network.  With REPL_SYNC, the request thread blocks until it gets a reply message from all cluster members, informing it that the update was successfully applied. Using synchronous replication makes sure changes are applied around the cluster before the web request completes. However, synchronous replication is much slower.</para>
           </listitem>
           <listitem>
             <para><emphasis role="bold">ClusterName</emphasis> specifies the name of the cluster that the
                             cache works within. The default cluster name is the the word "Tomcat-" appended by the
-                            current JBoss partition name. All the nodes should use the same cluster name. Although
-                            session replication can share the same channel (multicast address and port) with other
-                            clustered services in JBoss, replication should have it's own cluster name.</para>
+			    current JBoss partition name. All the nodes must use the same cluster name.</para>
           </listitem>
           <listitem>
             <para>The <emphasis role="bold">UseMarshalling</emphasis> and <emphasis role="bold">InactiveOnStartup</emphasis> attributes must have the same value. They must be
@@ -1548,36 +1931,23 @@
                            (see later). Otherwise, they default to <literal>false</literal>.</para>
           </listitem>
           <listitem>
-            <para><emphasis role="bold">ClusterConfig</emphasis> configures the underlying JGroups stack.
-                            The most import configuration elements are the muliticast adress and port,
-                                <literal>mcast_addr</literal> and <literal>mcast_port</literal> respectively, to use for
-                            clustered communication. These values should make sense for your network. Please refer to
-                                <xref linkend="jbosscache-jgroups"/> for more information.</para>
+		  <para><emphasis role="bold">ClusterConfig</emphasis> configures the underlying JGroups stack. Please refer to <xref linkend="jbosscache-jgroups"/> for more information.</para>
           </listitem>
           <listitem>
             <para><emphasis role="bold">LockAcquisitionTimeout</emphasis> sets the maximum number of
-                            milliseconds to wait for a lock acquisition. The default value is 15000.</para>
+		    milliseconds to wait for a lock acquisition when trying to lock a cache node. The default value is 15000.</para>
           </listitem>
           <listitem>
-            <para><emphasis role="bold">UseReplQueue</emphasis> determines whether to enable the replication
-                            queue when using asynchronous replication. This allows multiple cache updates to be bundled
-                            together to improve performance. The replication queue properties are controlled by the
-                                <literal>ReplQueueInterval</literal> and <literal>ReplQueueMaxElements</literal>
-                            properties.</para>
+		  <para><emphasis role="bold">SyncReplTimeout</emphasis> sets the maximum number of milliseconds to wait for a response from all nodes in the cluster when a synchronous replication message is sent out. The default value is 20000; it should be set a few seconds longer than LockAcquisitionTimeout.</para>
           </listitem>
-          <listitem>
-            <para><emphasis role="bold">ReplQueueInterval</emphasis> specifies the time in milliseconds
-                            JBoss Cache will wait before sending items in the replication queue.</para>
-          </listitem>
-          <listitem>
-            <para><emphasis role="bold">ReplQueueMaxElements</emphasis>: specifies the maximum number of
-                            elements allowed in the replication queue before JBoss Cache will send an update.</para>
-          </listitem>
+     
         </itemizedlist>
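The REPL_SYNC versus REPL_ASYNC behavior described in the CacheMode item above can be sketched in plain Java. This is an illustrative stand-in only (all names here are invented; this is not the JBoss Cache API): the request thread either hands the update to the network layer and returns, or blocks until every peer acknowledges it.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the two replication modes, not JBoss Cache code.
public class ReplModeSketch {
    static final ExecutorService network = Executors.newCachedThreadPool();

    // REPL_ASYNC: fire-and-forget; the request thread does not wait for peers
    static Future<?> replAsync(Runnable updateOnPeer) {
        return network.submit(updateOnPeer);
    }

    // REPL_SYNC: the request thread blocks until all 'members' peers have acked
    static boolean replSync(Runnable updateOnPeer, int members) {
        CountDownLatch acks = new CountDownLatch(members);
        for (int i = 0; i < members; i++) {
            network.submit(() -> { updateOnPeer.run(); acks.countDown(); });
        }
        try {
            return acks.await(5, TimeUnit.SECONDS); // true once every peer replied
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        replAsync(() -> {});                      // returns immediately
        boolean allAcked = replSync(() -> {}, 2); // blocks for both acks
        System.out.println("all peers acked: " + allAcked);
        network.shutdown();
    }
}
```

The blocking in replSync is where SyncReplTimeout would apply in the real cache; the async path is why a REPL_ASYNC request can complete before peers have applied the change.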
       </section>
+      
+      
       <section id="clustering-http-app">
         <title>Enabling session replication in your application</title>
-        <para>To enable clustering of your web application you must it as distributable in the
+        <para>To enable clustering of your web application you must tag it as distributable in the
                     <literal>web.xml</literal> descriptor. Here's an example:</para>
         <programlisting>&lt;?xml version="1.0"?&gt; 
 &lt;web-app  xmlns="http://java.sun.com/xml/ns/j2ee"
@@ -1598,79 +1968,87 @@
     &lt;/replication-config&gt;
 &lt;/jboss-web&gt;</programlisting>
         <para>The <literal>replication-trigger</literal> element determines what triggers a session replication
-                    (or when is a session is considered dirty). It has 4 options:</para>
+                    (i.e., when a session is considered <literal>dirty</literal> and in need of replication). It has 4 options:</para>
         <itemizedlist>
           <listitem>
-            <para><emphasis role="bold">SET</emphasis>: With this policy, the session is considered dirty
-                            only when an attribute is set in the session. If your application always writes changed
-                            value back into the session, this option will be most optimized in term of performance. If
-                            an object is retrieved from the session and modified without being written back into the
-                            session, the change to that object will not be replicated.</para>
+		  <para><emphasis role="bold">SET</emphasis>: With this policy, the session is considered dirty only when an attribute is set in the session (i.e., HttpSession.setAttribute() is invoked). If your application always writes changed values back into the session, this option will be the most performant. The downside of SET is that if an object is retrieved from the session and modified without being written back into the session, the session manager will not know the attribute is dirty and the change to that object may not be replicated.</para>
           </listitem>
           <listitem>
-            <para><emphasis role="bold">SET_AND_GET</emphasis>: With this policy, any attribute that is get
-                            or set will be marked as dirty. If an object is retrieved from the session and modified
-                            without being written back into the session, the change to that object will be replicated.
-                            This option can have significant performance implications.</para>
+		  <para><emphasis role="bold">SET_AND_GET</emphasis>: With this policy, any attribute that is read (get) or written (set) will be marked as dirty. If an object is retrieved from the session and modified without being written back into the session, the change to that object will be replicated. The downside of SET_AND_GET is that it can have significant performance implications, since even reading immutable objects from the session (e.g., strings, numbers) will mark the read attributes as needing to be replicated.</para>
           </listitem>
           <listitem>
-            <para><emphasis role="bold">SET_AND_NON_PRIMITIVE_GET</emphasis>: This policy is similar to the
-                            SET_AND_GET policy except that only non-primitive get operations are considered dirty. For
-                            example, the http session request may retrieve a non-primitive object instance from the
-                            attribute and then modify the instance. If we don't specify that non-primitive get is
-                            considered dirty, then the modification will not be replication properly. This is the
-                            default value.</para>
+		  <para><emphasis role="bold">SET_AND_NON_PRIMITIVE_GET</emphasis>: This policy is similar to the SET_AND_GET policy, except that get operations that return attribute values with primitive types do not mark the attribute as dirty. Primitive system types (i.e., String, Integer, Long, etc.) are immutable, so there is no reason to mark an attribute with such a type as dirty just because it has been read. If a get operation returns a value of a non-primitive type, the session manager has no simple way to know whether the object is mutable, so it assumes it is and marks the attribute as dirty. This setting avoids the downside of SET while reducing the performance impact of SET_AND_GET. It is the default setting.</para>
           </listitem>
           <listitem>
-            <para><emphasis role="bold">ACCESS</emphasis>: This option causes the session to be marked as
-                            dirty whenever it is accessed. Since a the session is accessed during each HTTP request, it
-                            will be replicated with each request. The access time stamp in the session instance will be
-                            updated as well. Since the time stamp may not be updated in other clustering nodes because
-                            of no replication, the session in other nodes may expire before the active node if the HTTP
-                            request does not retrieve or modify any session attributes. When this option is set, the
-                            session timestamps will be synchronized throughout the cluster nodes. Note that use of this
-                            option can have a significant performance impact, so use it with caution.</para>
+		  <para><emphasis role="bold">ACCESS</emphasis>: This option causes the session to be marked as dirty whenever it is accessed. Since the session is accessed during each HTTP request, it will be replicated with each request. The purpose of ACCESS is to ensure session last-access timestamps are kept in sync around the cluster. With the other replication-trigger options, the timestamp may not be updated on the other nodes because no replication occurs, so the session on those nodes may expire before it does on the active node if the HTTP request does not retrieve or modify any session attributes. Note that use of this option can have a significant performance impact, so use it with caution. With the other replication-trigger options, if a session has gone 80% of its expiration interval without being replicated, as a safeguard its timestamp will be replicated no matter what. So, ACCESS is only useful in special circumstances where the above safeguard is considered inadequate.</para>
           </listitem>
         </itemizedlist>
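The pitfall the SET trigger carries (an in-place mutation that is never written back) can be sketched with a minimal, hypothetical session-manager stand-in. All class and method names below are invented for illustration; this is not the JBossWeb session manager:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of SET-trigger dirty tracking, not JBossWeb code.
public class SetTriggerSketch {
    private final Map<String, Object> attributes = new HashMap<>();
    private boolean dirty = false;

    public void setAttribute(String name, Object value) {
        attributes.put(name, value);
        dirty = true;                 // writes mark the session dirty
    }

    public Object getAttribute(String name) {
        return attributes.get(name);  // SET policy: reads do NOT mark dirty
    }

    // Called at the end of a request; "replicates" and clears the flag.
    public boolean replicateIfDirty() {
        boolean replicated = dirty;
        dirty = false;
        return replicated;
    }

    public static void main(String[] args) {
        SetTriggerSketch session = new SetTriggerSketch();
        session.setAttribute("cart", new StringBuilder("bookA"));
        session.replicateIfDirty();   // the initial write is replicated

        // Mutate in place without writing back: SET misses this change.
        ((StringBuilder) session.getAttribute("cart")).append(",bookB");
        System.out.println(session.replicateIfDirty()); // false: change lost

        // Writing the value back makes the change visible again.
        session.setAttribute("cart", session.getAttribute("cart"));
        System.out.println(session.replicateIfDirty()); // true
    }
}
```

Under SET_AND_GET or SET_AND_NON_PRIMITIVE_GET, the getAttribute call on the mutable StringBuilder would itself have marked the attribute dirty, which is exactly the trade-off the options above describe.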
         <para>The <literal>replication-granularity</literal> element controls the size of the replication units.
                     The supported values are: </para>
         <itemizedlist>
+          
           <listitem>
-            <para><emphasis role="bold">SESSION</emphasis>: Replication is per session instance. As long as
-                            it is considered modified when the snapshot manager is called, the whole session object will
-                            be serialized.</para>
+		  <para><emphasis role="bold">ATTRIBUTE</emphasis>: Replication is only for the dirty attributes in the session plus some session data, like the last-accessed timestamp. For sessions that carry large amounts of data, this option can increase replication performance. However, attributes will be separately serialized, so if there are any shared references between objects stored in the attributes, those shared references may be broken on remote nodes. For example, say a Person object stored under key “husband” has a reference to an Address, while another Person object stored under key “wife” has a reference to that same Address object.  When the “husband” and “wife” attributes are separately deserialized  on the remote nodes, each Person object will now have a reference to its own Address object; the Address object will no longer be shared.</para>
           </listitem>
-          <listitem>
-            <para><emphasis role="bold">ATTRIBUTE</emphasis>: Replication is only for the dirty attributes
-                            in the session plus some session data, like, lastAccessTime. For session that carries large
-                            amount of data, this option can increase replication performance.</para>
+	  <listitem>
+		  <para><emphasis role="bold">SESSION</emphasis>: The entire session object is replicated if any attribute is dirty. The entire session is serialized in one unit, so shared object references are maintained on remote nodes. This is the default setting.</para>
           </listitem>
-          <listitem>
-            <para><emphasis role="bold">FIELD</emphasis>: Replication is only for data fields inside session
-                            attribute objects (see more later).</para>
+	  
+	  
+	  <listitem>
+		  <para><emphasis role="bold">FIELD</emphasis>: Replication is only for individual changed data fields inside session attribute objects. Shared object references will be preserved across the cluster. Potentially most performant, but requires changes to your application (this will be discussed later).</para>
           </listitem>
         </itemizedlist>
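The shared-reference breakage described under ATTRIBUTE follows directly from how Java serialization works: references are only shared within a single serialized stream. The class names below are invented for the example:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Demonstrates per-attribute vs whole-session serialization of a shared object.
public class SharedRefDemo {
    public static class Address implements Serializable { String city = "Boston"; }
    public static class Person implements Serializable {
        public Address addr;
        public Person(Address a) { addr = a; }
    }

    // Serialize and deserialize one object graph in a single stream.
    public static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            ObjectOutputStream oos = new ObjectOutputStream(bos);
            oos.writeObject(o);
            oos.flush();
            return new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray())).readObject();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Address shared = new Address();
        Person husband = new Person(shared);
        Person wife = new Person(shared);

        // ATTRIBUTE-style: each attribute serialized in its own stream.
        Person h2 = (Person) roundTrip(husband);
        Person w2 = (Person) roundTrip(wife);
        System.out.println(h2.addr == w2.addr); // false: Address is duplicated

        // SESSION-style: the whole graph serialized as one unit.
        Person[] session = (Person[]) roundTrip(new Person[] { husband, wife });
        System.out.println(session[0].addr == session[1].addr); // true: sharing kept
    }
}
```

This is why SESSION (and FIELD) granularity preserve shared references across the cluster while ATTRIBUTE granularity may not.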
-        <para>The <literal>replication-field-batch-mode</literal> element indicates whether you want to have
-                    batch update between each http request or not. Default is <literal>true</literal>.</para>
+	<para>The <literal>replication-field-batch-mode</literal> element indicates whether you want all replication messages associated with a request to be batched into one message.  Only applicable if replication-granularity is FIELD. Default is <literal>true</literal>.</para>
         <para>If your sessions are generally small, SESSION is the better policy. If your session is larger and
                     some parts are infrequently accessed, ATTRIBUTE replication will be more effective. If your
                     application has very big data objects in session attributes and only fields in those objects are
-                    frequently modified, the FIELD policy would be the best. In the next section, let's discuss exactly
+                    frequently modified, the FIELD policy would be the best. In the next section, we will discuss exactly
                     how the FIELD level replication works.</para>
       </section>
+      
+      
+      
+      
+      
       <section id="clustering-http-field">
-        <title>Use FIELD level replication</title>
-        <para>FIELD-level replication only replicates modified data fields inside objects stored in the session.
-                    It could potentially drastically reduce the data traffic between clustered nodes, and hence improve
-                    the performance of the whole cluster. To use FIELD-level replication, you have to first prepare your
-                    Java class to indicate which fields are to be replicated. This is done via JDK 1.4 style annotations
-                    embedded in JavaDocs:</para>
-        <para>To annotate your POJO, we provide two annotations:
-                        <literal>@@org.jboss.web.tomcat.tc5.session.AopMarker</literal> and
-                        <literal>@@org.jboss.web.tomcat.tc5.session.InstanceAopMarker</literal>. When you annotate your
-                    class with <literal>AopMarker</literal>, you indicate that instances of this class will be used in
-                    FIELD-level replication. For exmaple,</para>
-        <programlisting>
+        <title>Using FIELD level replication</title>
+	<para>FIELD-level replication only replicates modified data fields inside objects stored in the session. Its use could potentially drastically reduce the data traffic between clustered nodes, and hence improve the performance of the whole cluster. To use FIELD-level replication, you have to first prepare (i.e., bytecode enhance) your Java classes to allow the session cache to detect when fields in cached objects have been changed and need to be replicated.
+	</para>
+	<para>
+		The first step in doing this is to identify the classes that need to be prepared.  This is done via annotations. For example:
+	</para>
+	
+<programlisting><![CDATA[
+@org.jboss.cache.aop.AopMarker
+public class Address 
+{
+...
+}]]> 
+</programlisting>
+	
+<para>
+If you annotate a class with InstanceOfAopMarker instead, then all of its subclasses will be automatically annotated as well. Similarly, you can annotate an interface with InstanceOfAopMarker and all of its implementing classes will be annotated. For example:
+</para>
+<programlisting><![CDATA[
+@org.jboss.cache.aop.InstanceOfAopMarker
+public class Person 
+{
+...
+}
+// then when you have a sub-class like:
+public class Student extends Person
+{
+...
+}
+]]></programlisting>
+	
+<para>There will be no need to annotate <literal>Student</literal>. It will be annotated automatically because it is a sub-class of <literal>Person</literal>.
+JBoss AS 4.2 requires JDK 5 at runtime, but some users may still need to build their projects using JDK 1.4. In this case, annotating classes can be done via JDK 1.4 style annotations embedded in JavaDocs. For example:
+</para>
+
+
+<programlisting>
 /*
  * My usual comments here first.
  * @@org.jboss.web.tomcat.tc5.session.AopMarker
@@ -1680,9 +2058,11 @@
 ...
 }
 </programlisting>
-        <para>If you annotate it with <literal>InstanceAopMarker</literal> instead, then all of its sub-class
-                    will be automatically annotated as well. For example,</para>
-        <programlisting>
+        
+
+<para>The analogue for <literal>@InstanceOfAopMarker</literal> is:</para>
+
+<programlisting>
 /*
  *
  * @@org.jboss.web.tomcat.tc5.session.InstanceOfAopMarker
@@ -1692,42 +2072,32 @@
 ...
 }
 </programlisting>
-        <para>then when you have a sub-class like</para>
+
+
+
+<para>
+	Once you have annotated your classes, you will need to perform a pre-processing step to bytecode enhance your classes for use by TreeCacheAop. You need to use the JBoss AOP pre-compiler <literal>annotationc</literal> and post-compiler <literal>aopc</literal> to process the above source code before and after it is compiled by the Java compiler. The annotationc step is only needed if the JDK 1.4 style annotations are used; if JDK 5 annotations are used it is not necessary. Here is an example of how to invoke those commands from the command line.
+</para>
+
         <programlisting>
-public class Student extends Person
-{
-...
-}
-</programlisting>
-        <para>there will be no need to annotate <literal>Student</literal>. It will be annotated automatically
-                    because it is a sub-class of <literal>Person</literal>.</para>
-        <para>However, since we only support JDK 1.4 style annotation (provided by JBoss Aop) now, you will need
-                    to perform a pre-processing step. You need to use the JBoss AOP pre-compiler
-                    <literal>annotationc</literal> and post-compiler <literal>aopc</literal> to process the above source
-                    code before and after they are compiled by the Java compiler. Here is an example on how to invoke
-                    those commands from command line.</para>
-        <programlisting>
 $ annotationc [classpath] [source files or directories]
 $ javac -cp [classpath] [source files or directories]
 $ aopc [classpath] [class files or directories]            
             </programlisting>
-        <para>Please see the JBoss AOP documentation for the usage of the pre- and post-compiler. The JBoss AOP
-                    project also provides easy to use ANT tasks to help integrate those steps into your application
-                    build process. In the next AS release, JDK 5.0 annotation support will be provided for greater
-                    transparency. But for now, it is important that you perform the pre- and post-compilation steps for
-                    your source code.</para>
+	    
+<para>
+	Please see the JBoss AOP documentation for the usage of the pre- and post-compiler. The JBoss AOP project also provides easy to use ANT tasks to help integrate those steps into your application build process. 
+</para>
         <note>
-          <para>Or, you can see a complete example on how to build, deploy, and validate a FIELD-level
-                        replicated web application from this page:
-                            <literal>http://wiki.jboss.org/wiki/Wiki.jsp?page=Http_session_field_level_example</literal>.
-                        The example bundles the pre- and post-compile tools so you do not need to download JBoss AOP
-                        separately.</para>
+		<para>
+			You can see a complete example of how to build, deploy, and validate a FIELD-level replicated web application from this page: <ulink url="http://wiki.jboss.org/wiki/Wiki.jsp?page=Http_session_field_level_example"/>. The example bundles the pre- and post-compile tools so you do not need to download JBoss AOP separately.
+		</para>
         </note>
         <para>When you deploy the web application into JBoss AS, make sure that the following configurations are
                     correct:</para>
         <itemizedlist>
           <listitem>
-            <para>In the server's <literal>deploy/tc5-cluster.sar/META-INF/jboss-service.xml</literal> file,
+            <para>In the server's <literal>deploy/jboss-web-cluster.sar/META-INF/jboss-service.xml</literal> file,
                             the <literal>inactiveOnStartup</literal> and <literal>useMarshalling</literal> attributes
                             must both be <literal>true</literal>.</para>
           </listitem>
@@ -1770,6 +2140,8 @@
                         <literal>printDetails</literal> operation. You should see output resembling the following.</para>
         <programlisting>/JSESSION
 
+/localhost
+
 /quote
 
 /FB04767C454BAB3B2E462A27CB571330
@@ -1779,69 +2151,275 @@
 /AxCI8Ovt5VQTfNyYy9Bomw**
 VERSION: 4
 AxCI8Ovt5VQTfNyYy9Bomw**: org.jboss.invocation.MarshalledValue at e076e4c8</programlisting>
-        <para>This output shows two separate web sessions, in one application named <emphasis>quote</emphasis>,
-                    that are being shared via JBossCache. This example uses a <literal>replication-granularity</literal>
-                    of <literal>session</literal>. Had <literal>attribute</literal> level replication been used, there
-                    would be additional entries showing each replicated session attribute. In either case, the
-                    replicated values are stored in an opaque <literal>MarshelledValue</literal> container. There aren't
-                    currently any tools that allow you to inspect the contents of the replicated session values. If you
-                    don't see any output, either the application was not correctly marked as
-                    <literal>distributable</literal> or you haven't accessed a part of application that places values in
-                    the HTTP session. The <literal>org.jboss.cache</literal> and <literal>org.jboss.web</literal>
-                    logging categories provide additional insight into session replication useful for debugging
-                    purposes. </para>
+        
+<para>
+	This output shows two separate web sessions, in one application named <emphasis>quote</emphasis>, that are being shared via JBossCache. This example uses a <literal>replication-granularity</literal> of <literal>SESSION</literal>. Had <literal>ATTRIBUTE</literal> level replication been used, there would be additional entries showing each replicated session attribute. In either case, the replicated values are stored in an opaque <literal>MarshalledValue</literal> container. There aren't currently any tools that allow you to inspect the contents of the replicated session values. If you do not see any output, either the application was not correctly marked as <literal>distributable</literal> or you haven't accessed a part of the application that places values in the HTTP session. The <literal>org.jboss.cache</literal> and <literal>org.jboss.web</literal> logging categories provide additional insight into session replication useful for debugging purposes.
+</para>
       </section>
+      
+      
+      
       <section id="clustering-http-sso">
-        <title>Using Single Sign On</title>
-        <para> JBoss supports clustered single sign-on, allowing a user to authenticate to one application on a
-                    JBoss server and to be recognized on all applications, on that same machine or on another node in
-                    the cluster, that are deployed on the same virtual host. Authentication replication is handled by
-                    the HTTP session replication service. Although session replication does not need to be explicitly
-                    enabled for the applications in question, the <literal>tc5-cluster-service.xml</literal> file does
-                    need to be deployed. </para>
-        <para> To enable single sign-on, you must add the <literal>ClusteredSingleSignOn</literal> valve to the
-                    appropriate <literal>Host</literal> elements of the tomcat <literal>server.xml</literal> file. The
-                    valve configuration is shown here: </para>
+        <title>Using Clustered Single Sign On</title>
+	
+	<para>JBoss supports clustered single sign-on, allowing a user to authenticate to one web application on a JBoss server and to be recognized on all web applications, on that same machine or on another node in the cluster, that are deployed on the same virtual host. Authentication replication is handled by the same JBoss Cache MBean that is used by the HTTP session replication service. Although session replication does not need to be explicitly enabled for the applications in question, the <literal>jboss-web-cluster.sar</literal> file needs to be deployed.
+	</para> 
+	<para>
+		To enable single sign-on, you must add the <literal>ClusteredSingleSignOn</literal> valve to the appropriate <literal>Host</literal> elements of the tomcat <literal>server.xml</literal> file. The valve configuration is shown here: 
+	</para>
         <programlisting>&lt;Valve className="org.jboss.web.tomcat.tc5.sso.ClusteredSingleSignOn" /&gt;</programlisting>
       </section>
-    </section>
-    <section id="clustering-jms">
+      
+      <section><title>Clustered Singleton Services</title>
+	      <para>A clustered singleton service (also known as an HA singleton) is a service that is deployed on multiple nodes in a cluster, but provides its service on only one of the nodes. The node running the singleton service is typically called the master node. When the master fails or is shut down, another master is selected from the remaining nodes and the service is restarted on the new master. Thus, other than a brief interval when one master has stopped and another has yet to take over, the service is always provided by one and only one node.
+	      </para>
+	      <figure id="master_node_fail.fig">
+		      <title>Topology after the Master Node fails</title>
+		      <mediaobject>
+			      <imageobject>
+				      <imagedata align="center" fileref="images/master_node_fail.png"/>
+			      </imageobject>
+		      </mediaobject>
+        </figure>    
+	      
+	<para>
+		The JBoss Application Server (AS) provides support for a number of strategies for helping you deploy clustered singleton services. In this section we will explore the different strategies. All of the strategies are built on top of the HAPartition service described in the introduction.  They rely on the <literal>HAPartition</literal> to provide notifications when different nodes in the cluster start and stop; based on those notifications each node in the cluster can independently (but consistently) determine if it is now the master node and needs to begin providing a service.
+	      </para>
+	
+	      
+	      <section><title>HASingletonDeployer service</title>
+		      <para>
+			      The simplest and most commonly used strategy for deploying an HA singleton is to take an ordinary deployment (war, ear, jar, whatever you would normally put in deploy) and deploy it in the <literal>$JBOSS_HOME/server/all/deploy-hasingleton</literal> directory instead of in <literal>deploy</literal>. The <literal>deploy-hasingleton</literal> directory does not lie under deploy or farm, so its contents are not automatically deployed when an AS instance starts. Instead, deploying the contents of this directory is the responsibility of a special service, the <literal>jboss.ha:service=HASingletonDeployer</literal> MBean (which itself is deployed via the deploy/deploy-hasingleton-service.xml file). The HASingletonDeployer service is itself an HA singleton, one whose provided service, when it becomes master, is to deploy the contents of deploy-hasingleton, and whose service, when it stops being the master (typically at server shutdown), is to undeploy the contents of <literal>deploy-hasingleton</literal>.
+		      </para>
+		      <para>
+			      So, by placing your deployments in <literal>deploy-hasingleton</literal> you know that they will be deployed only on the master node in the cluster. If the master node cleanly shuts down, they will be cleanly undeployed as part of shutdown.  If the master node fails or is shut down, they will be deployed on whatever node takes over as master.
+		      </para>
+		      <para>Using  deploy-hasingleton is very simple, but it does have two drawbacks:</para>
+		      
+		      <itemizedlist>
+			      <listitem>
+				      <para>There is no hot-deployment feature for services in <literal>deploy-hasingleton</literal>. Redeploying a service that has been deployed to <literal>deploy-hasingleton</literal> requires a server restart.
+				      </para>
+			      </listitem>
+			      
+			      <listitem>
+				      <para>If the master node fails and another node takes over as master, your singleton service needs to go through the entire deployment process before it will be providing services. Depending on how complex the deployment of your service is and what sorts of startup activities it engages in, this could take a while, during which time the service is not being provided.
+				      </para>
+			      </listitem>
+			      
+			     <!-- <listitem>
+				      <para>
+				      </para>
+			      </listitem>-->
+		      </itemizedlist>
+		      	      
+      </section>
+	      
+      <section><title>MBean deployments using HASingletonController</title>
+	      <para>
+		      If your service is an MBean (i.e., not a J2EE deployment like an ear or war or jar), you can deploy it along with a service called an HASingletonController in order to turn it into an HA singleton. It is the job of the HASingletonController to work with the HAPartition service to monitor the cluster and determine if it is now the master node for its service. If it determines it has become the master node, it invokes a method on your service telling it to begin providing service. If it determines it is no longer the master node, it invokes a method on your service telling it to stop providing service. Let's walk through an illustration.
+	      </para>
+	      <para>First, we have an MBean service that we want to make an HA singleton. The only thing special about it is it needs to expose in its MBean interface a method that can be called when it should begin providing service, and another that can be called when it should stop providing service:
+	      </para>
+	      
+<programlisting><![CDATA[ 
+public class HASingletonExample
+    implements HASingletonExampleMBean {
+
+    private boolean isMasterNode = false;
+
+    public void startSingleton() {
+        isMasterNode = true;
+    }
+
+    public boolean isMasterNode() {
+        return isMasterNode;
+    }
+
+    public void stopSingleton() {
+        isMasterNode = false;
+    }
+}]]>
+</programlisting>
+	      
+<para>
+	We used <literal>startSingleton</literal> and <literal>stopSingleton</literal> in the above example, but you could name the methods anything.
+</para>
+<para>
+	Next, we deploy our service, along with an HASingletonController to control it, most likely packaged in a .sar file, with the following <literal>META-INF/jboss-service.xml</literal>:
+</para>
+<programlisting><![CDATA[
+ <server> 
+	 <!-- This MBean is an example of a clustered singleton --> 
+	 <mbean code="org.jboss.ha.examples.HASingletonExample" 
+		name="jboss:service=HASingletonExample"/> 
+	 
+	 <!-- This HASingletonController manages the cluster Singleton --> 
+	 <mbean code="org.jboss.ha.singleton.HASingletonController" 
+		name="jboss:service=ExampleHASingletonController"> 
+		 
+		 <!-- Inject a ref to the HAPartition -->
+		 <depends optional-attribute-name="ClusterPartition" proxy-type="attribute">
+			 jboss:service=${jboss.partition.name:DefaultPartition}
+		 </depends>  
+		 <!-- Inject a ref to the service being controlled -->
+		 <depends optional-attribute-name="TargetName">
+			 jboss:service=HASingletonExample
+		 </depends>
+		 <!-- Methods to invoke when become master / stop being master -->
+		 <attribute name="TargetStartMethod">startSingleton</attribute> 
+		 <attribute name="TargetStopMethod">stopSingleton</attribute> 
+	 </mbean> 
+</server> ]]>
+</programlisting>
+
+<para>Voila! A clustered singleton service.
+</para>
+<para>
+	The obvious downside to this approach is that it only works for MBeans. The upside is that the above example can be placed in <literal>deploy</literal> or <literal>farm</literal> and thus can be hot deployed and farm deployed. Also, if our example service had complex, time-consuming startup requirements, those could potentially be implemented in create() or start() methods. JBoss will invoke create() and start() as soon as the service is deployed; it doesn't wait until the node becomes the master node. So, the service could be primed and ready to go, just waiting for the controller to invoke startSingleton(), at which point it can immediately provide service.
+</para>
+<para>
+	The jboss.ha:service=HASingletonDeployer service discussed above is itself an interesting example of using an HASingletonController.  Here is its deployment descriptor (extracted from the <literal>deploy/deploy-hasingleton-service.xml</literal> file):
+</para>
+<programlisting><![CDATA[ 
+<mbean code="org.jboss.ha.singleton.HASingletonController" 
+name="jboss.ha:service=HASingletonDeployer"> 
+ <depends optional-attribute-name="ClusterPartition" proxy-type="attribute">
+  jboss:service=${jboss.partition.name:DefaultPartition}
+ </depends>  
+ <depends optional-attribute-name="TargetName">
+  jboss.system:service=MainDeployer
+ </depends> 
+ <attribute name="TargetStartMethod">deploy</attribute> 
+ <attribute name="TargetStartMethodArgument">
+  ${jboss.server.home.url}/deploy-hasingleton
+ </attribute> 
+ <attribute name="TargetStopMethod">undeploy</attribute> 
+ <attribute name="TargetStopMethodArgument">
+  ${jboss.server.home.url}/deploy-hasingleton
+ </attribute> 
+</mbean> ]]>
+</programlisting>
+	
+<para>
+	A few interesting things here. First, the service being controlled is the <literal>MainDeployer</literal> service, which is the core deployment service in JBoss. That is, it's a service that wasn't written with any intent that it be controlled by an <literal>HASingletonController</literal>. But it still works! Second, the target start and stop methods are “deploy” and “undeploy”. There is no requirement that they have particular names, or even that they logically have “start” and “stop” functionality. Here the functionality of the invoked methods is more like “do” and “undo”. Finally, note the “<literal>TargetStart(Stop)MethodArgument</literal>” attributes. Your singleton service's start/stop methods can take an argument; in this case, the location of the directory the <literal>MainDeployer</literal> should deploy/undeploy.
+</para>
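As an illustration of the argument-passing, here is a hedged sketch of a controlled service whose start/stop methods take a location argument, in the spirit of the deploy/undeploy pair above (all class, interface and method names below are invented for the example, not JBoss APIs):

```java
// Hypothetical MBean interface: the controller only needs the start/stop
// methods to exist with the names wired into Target(Start|Stop)Method.
interface DirectoryWatcherServiceMBean {
    void activate(String url);
    void deactivate(String url);
    String getActiveUrl();
}

public class DirectoryWatcherService implements DirectoryWatcherServiceMBean {

    private String activeUrl;

    // Wired as TargetStartMethod; TargetStartMethodArgument supplies 'url'.
    public void activate(String url) {
        activeUrl = url;
        // ... begin providing service for the given location ...
    }

    // Wired as TargetStopMethod; TargetStopMethodArgument supplies 'url'.
    public void deactivate(String url) {
        if (url != null && url.equals(activeUrl)) {
            activeUrl = null;
        }
        // ... stop providing service ...
    }

    public String getActiveUrl() {
        return activeUrl;
    }

    public static void main(String[] args) {
        DirectoryWatcherService svc = new DirectoryWatcherService();
        svc.activate("file:/tmp/deploy-hasingleton");   // path is a placeholder
        System.out.println("active: " + svc.getActiveUrl());
        svc.deactivate("file:/tmp/deploy-hasingleton");
        System.out.println("active: " + svc.getActiveUrl());
    }
}
```

The controller simply reflects on the target MBean and invokes the configured method with the configured argument, which is why arbitrary method names like deploy/undeploy work.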
+
+
+      </section>
+      
+      
+      <section><title>HASingleton deployments using a Barrier</title>
+	      <para>Services deployed normally inside <literal>deploy</literal> or <literal>farm</literal> that need to be started/stopped whenever the content of deploy-hasingleton gets deployed/undeployed (i.e., whenever the current node becomes the master) need only specify a dependency on the Barrier MBean: 
+	</para>
+<programlisting><![CDATA[<depends>jboss.ha:service=HASingletonDeployer,type=Barrier</depends>]]> 
+</programlisting>
+		
+<para>
+	The way it works is that a BarrierController is deployed along with the jboss.ha:service=HASingletonDeployer MBean and listens for JMX notifications from it. A BarrierController is a relatively simple MBean that can subscribe to receive any JMX notification in the system. It uses the received notifications to control the lifecycle of a dynamically created MBean called the Barrier. The Barrier is instantiated, registered and brought to the CREATE state when the BarrierController is deployed. After that, the BarrierController starts and stops the Barrier when matching JMX notifications are received. Thus, other services need only depend on the Barrier MBean using the usual &lt;depends&gt; tag, and they will be started and stopped in tandem with the Barrier. When the BarrierController is undeployed, the Barrier is destroyed too. 
+</para>
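The notification wiring can be illustrated with the standard JMX API alone (a simplified sketch, not the actual BarrierController code; the MBean name and notification type below are invented for the example):

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicInteger;
import javax.management.MBeanServer;
import javax.management.Notification;
import javax.management.NotificationBroadcasterSupport;
import javax.management.ObjectName;

public class NotificationSketch {

    // A trivial notification-emitting standard MBean (hypothetical).
    public interface EmitterMBean {}

    public static class Emitter extends NotificationBroadcasterSupport
            implements EmitterMBean {
        private long seq = 0;
        public void fire(String type) {
            sendNotification(new Notification(type, this, seq++));
        }
    }

    // Registers the emitter, subscribes a listener (as a
    // BarrierController-style component would), and counts how many
    // "start" notifications arrive.
    public static int demo() throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("example:service=Emitter");
        Emitter emitter = new Emitter();
        server.registerMBean(emitter, name);

        AtomicInteger started = new AtomicInteger();
        server.addNotificationListener(name,
            (notification, handback) -> {
                if ("example.start".equals(notification.getType())) {
                    started.incrementAndGet(); // here: start the Barrier
                }
            }, null, null);

        emitter.fire("example.start");
        server.unregisterMBean(name);
        return started.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("start notifications received: " + demo());
    }
}
```

The real BarrierController additionally matches the notification against configured types and drives the Barrier through its create/start/stop lifecycle rather than just counting.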
+
+<para> This provides an alternative to the deploy-hasingleton approach in that we can use farming to distribute the service, while content in deploy-hasingleton must be copied manually on all nodes.
+</para>
+
+<para>
+	On the other hand, the barrier-dependent service will be instantiated/created (i.e., any create() method invoked) on all nodes, but only started on the master node. This differs from the deploy-hasingleton approach, which will only deploy (instantiate/create/start) the contents of the deploy-hasingleton directory on one of the nodes. 
+</para>
+
+<para>So services depending on the barrier will need to make sure they do minimal or no work inside their create() step; rather, they should use start() to do the work. 
+</para>
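A minimal sketch of that division of labor (the class name and state flags are invented for illustration; only the create()/start()/stop() names correspond to the JBoss service lifecycle):

```java
// Illustrative only: a barrier-dependent service keeps create() cheap
// because create() runs on EVERY node, while start() runs only where
// the Barrier starts, i.e. on the master node.
public class BarrierDependentServiceSketch {

    private boolean resourcesAllocated;
    private boolean serving;

    // Runs on all nodes when the service is deployed: keep it minimal.
    public void create() {
        resourcesAllocated = false;
    }

    // Runs only when the Barrier starts (this node is the master):
    // do the real, possibly expensive, work here.
    public void start() {
        resourcesAllocated = true;  // e.g. open connections, load state
        serving = true;
    }

    // Runs when the Barrier stops (this node is no longer the master).
    public void stop() {
        serving = false;
        resourcesAllocated = false;
    }

    public boolean isServing() {
        return serving;
    }

    public static void main(String[] args) {
        BarrierDependentServiceSketch svc = new BarrierDependentServiceSketch();
        svc.create();
        System.out.println("serving after create: " + svc.isServing());
        svc.start();
        System.out.println("serving after start: " + svc.isServing());
    }
}
```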
+
+<note><title>Note</title>
+	<para>The Barrier controls the start/stop of dependent services, but not their destruction, which happens only when the <literal>BarrierController</literal> is itself destroyed/undeployed. Thus using the <literal>Barrier</literal> to control services that need to be "destroyed" as part of their normal “undeploy” operation (like, for example, an <literal>EJBContainer</literal>) will not have the desired effect. 
+</para>
+</note>
+
+
+
+</section>
+      
+      
+<section><title>Determining the master node</title>
+	<para>The various clustered singleton management strategies all depend on the fact that each node in the cluster can independently react to changes in cluster membership and correctly decide whether it is now the “master node”. How is this done?
+	</para>
+	
+	<para>Prior to JBoss AS 4.2.0, the methodology for this was fixed and simple. For each member of the cluster, the HAPartition MBean maintains an attribute called the CurrentView, which is basically an ordered list of the current members of the cluster. As nodes join and leave the cluster, JGroups ensures that each surviving member of the cluster gets an updated view. You can see the current view by going into the JMX console and looking at the CurrentView attribute in the <literal>jboss:service=DefaultPartition</literal> MBean. Every member of the cluster will have the same view, with the members in the same order.  
+	</para>
+	
+	<para>Let's say, for example, that we have a 4 node cluster, nodes A through D, and the current view can be expressed as {A, B, C, D}.  Generally speaking, the order of nodes in the view will reflect the order in which they joined the cluster (although this is not always the case, and should not be assumed to be the case.)
+	</para>
+	
+	<para>
+		To further our example, let's say there is a singleton service (i.e., an <literal>HASingletonController</literal>) named Foo that's deployed around the cluster, except, for whatever reason, on B.  The <literal>HAPartition</literal> service maintains across the cluster a registry of what services are deployed where, in view order. So, on every node in the cluster, the <literal>HAPartition</literal> service knows that the view with respect to the Foo service is {A, C, D} (no B).
+	</para>
+	
+	<para>
+		Whenever there is a change in the cluster topology of the Foo service, the <literal>HAPartition</literal> service invokes a callback on Foo notifying it of the new topology. So, for example, when Foo started on D, the Foo service running on A, C and D all got callbacks telling them the new view for Foo was {A, C, D}. That callback gives each node enough information to independently decide if it is now the master. The Foo service on each node does this by checking if they are the first member of the view – if they are, they are the master; if not, they're not.  Simple as that.
+	</para>
+	
+	<para>
+		If A were to fail or shutdown, Foo on C and D would get a callback with a new view for Foo of {C, D}. C would then become the master.  If A restarted, A, C and D would get a callback with a new view for Foo of {C, D, A}.  C would remain the master – there's nothing magic about A that would cause it to become the master again just because it was before.
+	</para>
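The election rule is simple enough to sketch in a few lines (an illustration only: the class and callback names are invented, and a real HAPartition view carries member address objects rather than strings):

```java
import java.util.List;

// Minimal sketch of the "first member of the view is the master" rule.
public class MasterElectionSketch {

    private final String localAddress;
    private boolean master;

    public MasterElectionSketch(String localAddress) {
        this.localAddress = localAddress;
    }

    // Invoked whenever a new topology for the service is delivered.
    // Every node runs the same deterministic check independently.
    public void onTopologyChange(List<String> serviceView) {
        master = !serviceView.isEmpty()
                 && serviceView.get(0).equals(localAddress);
    }

    public boolean isMaster() {
        return master;
    }

    public static void main(String[] args) {
        MasterElectionSketch nodeC = new MasterElectionSketch("C");
        nodeC.onTopologyChange(List.of("A", "C", "D"));
        System.out.println(nodeC.isMaster());  // false: A is first
        nodeC.onTopologyChange(List.of("C", "D"));       // A fails
        System.out.println(nodeC.isMaster());  // true: C is now first
        nodeC.onTopologyChange(List.of("C", "D", "A"));  // A rejoins at the end
        System.out.println(nodeC.isMaster());  // true: C remains master
    }
}
```

Because every node applies the same rule to the same ordered view, no extra coordination protocol is needed to agree on the master.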
+
+</section>
+
+
+
+      
+      
+      </section>
+      
+   
+      
+    </chapter>
+    
+    
+    
+    
+    <chapter id="clustering-jms">
       <title>Clustered JMS Services</title>
+      
       <para>JBoss AS 3.2.4 and above support high availability JMS (HA-JMS) services in the <literal>all</literal>
                 server configuration. In the current production release of JBoss AS, the HA-JMS service is implemented
-                as a clustered singleton fail-over service. <note><para>If you are willing to configure HA-JMS yourself, you can get it to work with earlier versions
-                        of JBoss AS. We have a customer who uses HA-JMS successfully in JBoss AS 3.0.7. Please contact
-                        JBoss support for more questions.</para></note>
-                <!-- TBD: Since the JBoss HA-JMS architecture has evolved significantly since JBoss AS 4.5.0, we will discuss two different HA-JMS architectures in separate sections below.--></para>
-      <section id="clustering-jms-singleton">
+                as a clustered singleton fail-over service. 
+	</para>
+<note>
+	<para>If you are willing to configure HA-JMS yourself, you can get it to work with earlier versions of JBoss AS. We have a customer who uses HA-JMS successfully in JBoss AS 3.0.7. Please contact JBoss support with further questions.
+	</para>
+</note>
+	<note>
+		<para>
+			The HA-JMS in JBoss AS 4.2.2 and earlier is based on the JBoss MQ messaging product. In later releases of the AS, JBoss MQ will be replaced by the newer JBoss Messaging project. JBoss Messaging's clustering implementation is considerably different from HA-JMS based on JBoss MQ; most notably it is not based on a singleton service only running on one node in the cluster.
+		</para>
+	</note>
+	
+       
+      
+	
+<section id="clustering-jms-singleton">
         <title>High Availability Singleton Fail-over</title>
-        <para>The JBoss HA-JMS service (i.e., message queues and topics) only runs on a single node (i.e., the
-                    master node) in the cluster at any given time. If that node fails, the cluster simply elects another
-                    node to run the JMS service (fail-over). This setup provides redundancy against server failures but
-                    does not reduce the work load on the JMS server node.</para>
+	<para>The JBoss HA-JMS service (i.e., message queues, topics and supporting services) only runs on a single node (i.e., the master node) in the cluster at any given time. If that node fails, the cluster simply elects another node to run the JMS service, and the queues, topics and supporting services are deployed on that server (fail-over). This setup provides redundancy against server failures but does not reduce the work load on the JMS server node.</para>
         <note>
           <para>While you cannot load balance HA-JMS queues (there is only one master node that runs the
                         queues), you can load balance the MDBs that process messages from those queues (see <xref linkend="clustering-jms-loadbalanced"/>).</para>
         </note>
-<!-- 
-                Adrian mentioned that this example needs some work
-                
-            <note>
-                <para>A JBoss user contributed a custom HA-JMS provider to load balance Message Driven Bean (MDB)
-                    applications across nodes. You can download the code from the JBoss wiki at <ulink
-                        url="http://wiki.jboss.org/wiki/Wiki.jsp?page=LoadBalancedFaultTolerantMDBs"/> and following the
-                    instructions in the <literal>readme.txt</literal> file in the zip file.</para>
-            </note>
-            -->
-        <section id="clustering-jms-singleton-server">
+
+<section id="clustering-jms-singleton-server">
           <title>Server Side Configuration</title>
-          <para>To use the singleton fail-over HA-JMS service, you must configure JMS services identically on
-                        all nodes in the cluster. That includes all JMS related service MBeans and all deployed JMS
-                        applications.</para>
-          <para>The JMS server is configured to persist its data in the <literal>DefaultDS</literal>. By
-                        default, that is the embedded HSQLDB. In most cluster environments, however, all nodes need to
-                        persist data against a shared database. So, the first thing to do before you start clustered JMS
-                        is to setup a shared database for JMS. You need to do the following:</para>
+	  
+          <para>
+		  The biggest configuration difference between HA-JMS in the all configuration and the non-HA version found in the default configuration is the location of most configuration files. For HA-JMS, most configuration files are found in the deploy-hasingleton/jms directory, not in deploy/jms. Your queues and topics must be deployed in deploy-hasingleton (or a subdirectory of it, like deploy-hasingleton/jms). Application components that act as clients to HA-JMS (e.g., MDBs and other JMS clients) do not need to be deployed in deploy-hasingleton; they should only be deployed there if you want them running on one node in the cluster at a time.
+	  </para>
+	  <para>
+		  To use the singleton fail-over HA-JMS service, you must configure JMS services identically on all nodes in the cluster. That includes all JMS related service MBeans and all deployed queues and topics.  However, applications that use JMS (e.g., MDBs and other JMS clients) do not need to be deployed identically across the cluster.
+	  </para>
+          
+	  
+	  
+	  <para>
+		  The JMS server is configured to persist its data in the <literal>DefaultDS</literal>. By default, that is the embedded HSQLDB. However, for HA-JMS fail-over to work, the newly started HA-JMS server needs to be able to find the data persisted by the old server. That's not likely to happen if the data is persisted in files written by the old server's HSQLDB. In almost any cluster environment, all nodes need to persist data against a shared database. So, the first thing to do before you start clustered JMS is to set up a shared database for JMS. You need to do the following:
+	  </para>
+	  
           <itemizedlist>
             <listitem>
               <para>Configure <literal>DefaultDS</literal> to point to the database server of your choice.
@@ -1864,29 +2442,261 @@
                             under the <literal>server/all/deploy-hasingleton/jms</literal> directory. Despite the
                                 <literal>hsql</literal> in its name, it works with all SQL92 compliant databases,
                             including HSQL, MySQL, SQL Server, and more. It automatically uses the
-                            <literal>DefaultDS</literal> for storage, as we configured above.</para>
+                            <literal>DefaultDS</literal> for storage, which we configured above.</para>
           </note>
         </section>
+	
+	
+	
         <section id="clustering-jms-singleton-client">
-          <title>HA-JMS Client</title>
+		<title>Non-MDB HA-JMS Clients </title>
+		
           <para>The HA-JMS client is different from regular JMS clients in two important aspects.</para>
-          <itemizedlist>
+          
+	  <itemizedlist>
             <listitem>
-              <para>The HA-JMS client must obtain JMS connection factories from the HA-JNDI (the default
-                                port is 1100).</para>
+		    <para>
+			    The HA-JMS client must look up JMS connection factories, as well as queues and topics, using HA-JNDI (the default port is 1100). This ensures the factory/queue/topic can be found no matter which cluster node is running the HA-JMS server.
+		    </para>
             </listitem>
+    </itemizedlist>
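For a standalone client, the HA-JNDI environment might be assembled along these lines (a sketch; the naming factory and package-prefix values are the usual JBoss ones, but the host names are placeholders, and the comma-separated provider URL format should be verified against your JBoss version):

```java
import java.util.Properties;
import javax.naming.Context;

public class HaJndiEnvSketch {

    // Builds a JNDI environment targeting HA-JNDI on port 1100.
    // 'hostList' is a comma-separated list of cluster nodes, so the
    // naming lookup itself can fail over if one node is down.
    public static Properties createHaJndiEnv(String hostList) {
        Properties env = new Properties();
        env.put(Context.INITIAL_CONTEXT_FACTORY,
                "org.jnp.interfaces.NamingContextFactory");
        env.put(Context.URL_PKG_PREFIXES,
                "org.jboss.naming:org.jnp.interfaces");
        env.put(Context.PROVIDER_URL, hostList);
        return env;
    }

    public static void main(String[] args) {
        Properties env = createHaJndiEnv("node1:1100,node2:1100");
        // With a running cluster you would then do (not executed here):
        // InitialContext ctx = new InitialContext(env);
        // ConnectionFactory cf =
        //     (ConnectionFactory) ctx.lookup("ConnectionFactory");
        // Destination queue =
        //     (Destination) ctx.lookup("queue/FailoverTestQueue");
        System.out.println(env.getProperty(Context.PROVIDER_URL));
    }
}
```

The same properties can instead be placed in a jndi.properties file on the client classpath, which is what the FailoverJMSClient example later in this section assumes.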
+    
+	    <itemizedlist>
+		    <listitem>
+			    <para>
+				    If the client is a J2EE component (session bean or web application) running inside the AS, the lookup via HA-JNDI can be configured using the component's deployment descriptors. In the standard deployment descriptor (ejb-jar.xml or web.xml):
+			    </para>
+		    </listitem>
+	    
+    </itemizedlist>
+    
+	    
+<programlisting><![CDATA[
+<resource-ref>
+	 <res-ref-name>jms/ConnectionFactory</res-ref-name>
+	 <res-type>javax.jms.QueueConnectionFactory</res-type>
+	 <res-auth>Container</res-auth>
+</resource-ref>
+	 
+<resource-ref>
+	 <res-ref-name>jms/Queue</res-ref-name>
+	 <res-type>javax.jms.Queue</res-type>
+	 <res-auth>Container</res-auth>
+</resource-ref>
+]]></programlisting>
+
+<para>
+And in the JBoss-specific descriptor (jboss.xml or jboss-web.xml):
+</para>
+
+<programlisting><![CDATA[ 
+<resource-ref>
+ 	<res-ref-name>jms/ConnectionFactory</res-ref-name>
+	<!-- Use the JMS Resource Adapter, let it deal
+	 with knowing where the JMS server is -->
+	<jndi-name>java:/JmsXA</jndi-name>
+ </resource-ref>
+ 
+<resource-ref>
+	 <res-ref-name>jms/Queue</res-ref-name>
+	 <!-- Use HA-JNDI so we can find the queue on any node -->
+	 <jndi-name>jnp://localhost:1100/queue/A</jndi-name>
+</resource-ref>]]>
+</programlisting>
+
+    			
+	    <itemizedlist>
             <listitem>
-              <para>The client connection must listens for server exceptions. When the cluster fail-over
-                                to a different master node, all client operations on the current connection fails with
-                                exceptions. The client must know to re-connect.</para>
+		    <para>
+			    The HA-JMS client must deal with exceptions that will occur on the JMS connection if server failover occurs.  Unlike, for example, clustered EJB proxies, the JMS connection object does not include automatic failover logic. If the HA-JMS service fails over to a different master node, all client operations on the current connection will fail with a JMSException. To deal with this:
+	    		</para>
             </listitem>
-          </itemizedlist>
-          <note>
-            <para>While the HA-JMS connection factory knows the current master node that runs JMS services,
-                            there is no smart client side interceptor. The client stub only knows the fixed master node
-                            and cannot adjust to server topography changes.</para>
-          </note>
+    </itemizedlist>
+    
+	    <itemizedlist>
+		    <listitem>
+			    <para>If the client is running inside the application server, the client should obtain the ConnectionFactory by looking up <literal>java:/JmsXA</literal> in JNDI. This will find the JBoss JMS Resource Adapter; the resource adapter will handle the task of detecting server failover and reconnecting to the new server when it starts.
+			    </para>
+		    </listitem>
+		    <listitem>
+			    <para>For clients outside the application server, the best approach is to register an ExceptionListener with the connection; the listener will get a callback if there is an exception on the connection. The callback should then handle the task of closing the old connection and reconnecting. Following is an example application that continuously sends messages to a queue, handling any exceptions that occur: 
+			    </para>
+		    </listitem>
+	    </itemizedlist>
+	    
+	    
+<programlisting><![CDATA[
+package com.test.hajms.client;
+
+import javax.naming.InitialContext;
+import javax.jms.ConnectionFactory;
+import javax.jms.Destination;
+import javax.jms.Connection;
+import javax.jms.Session;
+import javax.jms.MessageProducer;
+import javax.jms.Message;
+import javax.jms.ExceptionListener;
+import javax.jms.JMSException;
+import javax.jms.DeliveryMode;
+
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+
+public class FailoverJMSClient
+{
+   private static final Log log = LogFactory.getLog(FailoverJMSClient.class);
+
+   public static final int NUM_RETRIES = 3;
+
+   volatile boolean doSend = true;
+   ConnectionFactory connectionFactory;
+   Destination queue;
+   Connection connection;
+   Session session;
+   MessageProducer producer;
+
+   public static void main(String[] args) throws Exception
+   {
+      FailoverJMSClient jmsClient = new FailoverJMSClient();
+      jmsClient.setUpJMS();
+      jmsClient.sendMessages();
+   }
+
+   public boolean setUpJMS()
+   {
+      InitialContext ic;
+      try
+      {
+         // assume jndi.properties is configured for HA-JNDI
+         ic = new InitialContext();
+         connectionFactory = (ConnectionFactory) ic.lookup("ConnectionFactory");
+         queue = (Destination) ic.lookup("queue/FailoverTestQueue");
+         connection = connectionFactory.createConnection();
+         try
+         {
+            log.debug("Connection created ...");
+
+            // KEY - register for exception callbacks
+            connection.setExceptionListener(new ExceptionListenerImpl());
+
+            session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
+            log.debug("Session created ...");
+            producer = session.createProducer(queue);
+            producer.setDeliveryMode(DeliveryMode.NON_PERSISTENT);
+            log.debug("Producer created ...");
+
+            return true;
+         }
+         catch (Exception e)
+         {
+            // We failed, so close the connection
+            try
+            {
+               connection.close();
+            }
+            catch (JMSException ignored)
+            {
+               // Pointless
+            }
+            // Rethrow the initial problem to where we will log it
+            throw e;
+         }
+         finally
+         {
+            // And close the initial context.
+            // We don't want to wait for the garbage collector to close it,
+            // otherwise we'll have useless hanging network connections.
+            ic.close();
+         }
+      }
+      catch (Exception e)
+      {
+         log.error("Error setting up JMS", e);
+         return false;
+      }
+   }
+
+   public void sendMessages()
+   {
+      int cnt = 0;
+      while (doSend)
+      {
+         try
+         {
+            Thread.sleep(100);
+
+            Message m = session.createObjectMessage(new Integer(cnt++));
+            producer.send(m);
+
+            log.trace("message " + cnt + " sent");
+         }
+         catch (Exception e)
+         {
+            cnt--;
+            log.error(e.getMessage());
+         }
+      }
+   }
+
+   private class ExceptionListenerImpl implements ExceptionListener
+   {
+      public void onException(JMSException e)
+      {
+         for (int i = 0; i < NUM_RETRIES; i++)
+         {
+            log.warn("Connection has problems, trying to re-create it, attempt " +
+                     (i + 1) + " ...");
+
+            try
+            {
+               connection.close();  // unregisters the ExceptionListener
+            }
+            catch (Exception e2)
+            {
+               // We will get an Exception anyway, since the connection to the
+               // server is broken, but close() frees up resources associated
+               // with the connection
+            }
+
+            boolean setupOK = setUpJMS();
+
+            if (setupOK)
+            {
+               log.info("Connection re-established");
+               return;
+            }
+            else
+            {
+               log.warn("Re-creating connection failed, retrying ...");
+            }
+         }
+
+         log.error("Cannot re-establish connection, giving up ...");
+         doSend = false;
+      }
+   }
+}
+]]>
+</programlisting>
+			 
+
+
+<section><title>MDBs and HA-JMS Failover</title>
+	<para>
+		When you deploy an MDB in JBoss, JBoss' MDB container handles for you all issues associated with finding the cluster singleton HA-JMS server and with reconnecting to it if it fails over.
+	</para>
+</section>
+
         </section>
+	
+	
+	
         <section id="clustering-jms-loadbalanced">
           <title>Load Balanced HA-JMS MDBs</title>
           <para>While the HA-JMS queues and topics only run on a single node at a time, MDBs on multiple nodes
@@ -1900,7 +2710,7 @@
                                 using a <literal>HashSet</literal>.</para>
             </listitem>
             <listitem>
-              <para>The <literal>org.jboss.mq.server.ReceiversImplArrayList</literal> is theimplementation
+              <para>The <literal>org.jboss.mq.server.ReceiversImplArrayList</literal> is the implementation
                                 using an <literal>ArrayList</literal>.</para>
             </listitem>
             <listitem>
@@ -1912,22 +2722,23 @@
                         defines the permanent JMS <literal>Queue</literal> or <literal>DestinationManager</literal> on
                         each node. For best load balancing performance, we suggest you
                         to use the <literal>ReceiversImplArrayList</literal> or
-                        <literal>ReceiversImplArrayList</literal> implementations due to an undesirable implementation
+                        <literal>ReceiversImplLinkedList</literal> implementations due to an undesirable implementation
                         detail of <literal>HashSet</literal> in the JVM.</para>
         </section>
       </section>
-    </section>
+
   </chapter>
+  
   <chapter id="jbosscache.chapt">
     <title>JBossCache and JGroups Services</title>
     <para>JGroups and JBossCache provide the underlying communication, node replication and caching services, for
             JBoss AS clusters. Those services are configured as MBeans. There is a set of JBossCache and JGroups MBeans
-            for each type of clustering applications (e.g., the Stateful Session EJBs, the distributed entity EJBs
-            etc.). <!-- May not be true from version XXX -->
+	    for each type of clustering applications (e.g., the Stateful Session EJBs, HTTP session replication etc.).
         </para>
     <para>The JBoss AS ships with a reasonable set of default JGroups and JBossCache MBean configurations. Most
             applications just work out of the box with the default MBean configurations. You only need to tweak them
             when you are deploying an application that has special network or performance requirements.</para>
+    
     <section id="jbosscache-jgroups">
       <title>JGroups Configuration</title>
       <para>The JGroups framework provides services to enable peer-to-peer communications between nodes in a
@@ -1946,48 +2757,87 @@
                     <literal>ClusterConfig</literal> attribute in the <literal>TreeCache</literal> MBean. You can
                 configure the behavior and properties of each protocol in JGroups via those MBean attributes. Below is
                 an example JGroups configuration in the <literal>ClusterPartition</literal> MBean.</para>
-      <programlisting>
-&lt;mbean code="org.jboss.ha.framework.server.ClusterPartition"
-    name="jboss:service=DefaultPartition"&gt;
+<programlisting><![CDATA[
+<mbean code="org.jboss.ha.framework.server.ClusterPartition"
+	name="jboss:service=${jboss.partition.name:DefaultPartition}">
+	 
+	 ... ...
+	 
+	 <attribute name="PartitionConfig">
+		 <Config>
+			 
+			 <UDP mcast_addr="${jboss.partition.udpGroup:228.1.2.3}" 
+			      mcast_port="${jboss.hapartition.mcast_port:45566}"
+			      tos="8"
+			      ucast_recv_buf_size="20000000"
+			      ucast_send_buf_size="640000"
+			      mcast_recv_buf_size="25000000"
+			      mcast_send_buf_size="640000"
+			      loopback="false"
+			      discard_incompatible_packets="true"
+			      enable_bundling="false"
+			      max_bundle_size="64000"
+			      max_bundle_timeout="30"
+			      use_incoming_packet_handler="true"
+			      use_outgoing_packet_handler="false"
+			      ip_ttl="${jgroups.udp.ip_ttl:2}"
+			      down_thread="false" up_thread="false"/>
+			 
+			 <PING timeout="2000"
+			       down_thread="false" up_thread="false" num_initial_members="3"/>
+			 
+			 <MERGE2 max_interval="100000"
+				 down_thread="false" up_thread="false" min_interval="20000"/>
+			 <FD_SOCK down_thread="false" up_thread="false"/>
+			 
+			 <FD timeout="10000" max_tries="5" 
+			     down_thread="false" up_thread="false" shun="true"/>
+			 <VERIFY_SUSPECT timeout="1500" down_thread="false" up_thread="false"/>
+			 <pbcast.NAKACK max_xmit_size="60000"
+					use_mcast_xmit="false" gc_lag="0"
+					retransmit_timeout="300,600,1200,2400,4800"
+					down_thread="false" up_thread="false"
+					discard_delivered_msgs="true"/>
+			 <UNICAST timeout="300,600,1200,2400,3600"
+				  down_thread="false" up_thread="false"/>
+			 <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
+					down_thread="false" up_thread="false"
+					max_bytes="400000"/>
+			 <pbcast.GMS print_local_addr="true" join_timeout="3000"
+				     down_thread="false" up_thread="false"
+				     join_retry_timeout="2000" shun="true"
+				     view_bundling="true"/>
+			 <FRAG2 frag_size="60000" down_thread="false" up_thread="false"/>
+			 <pbcast.STATE_TRANSFER down_thread="false" 
+						up_thread="false" use_flush="false"/>
+		 </Config>
+	 </attribute>
+</mbean> ]]>
+</programlisting>
+<para>
+	All the JGroups configuration data is contained in the &lt;Config&gt; element under the JGroups config MBean attribute. This information is used to configure a JGroups Channel; the Channel is conceptually similar to a socket, and manages communication between peers in a cluster. Each element inside the &lt;Config&gt; element defines a particular JGroups Protocol; each Protocol performs one function, and the combination of those functions is what defines the characteristics of the overall Channel. In the next several sections, we will dig into the commonly used protocols and their options and explain exactly what they mean.
+</para>
+</section>
 
-    ... ...
-    
-    &lt;attribute name="PartitionConfig"&gt;
-        &lt;Config&gt;
-            &lt;UDP mcast_addr="228.1.2.3" mcast_port="45566"
-               ip_ttl="8" ip_mcast="true"
-               mcast_send_buf_size="800000" mcast_recv_buf_size="150000"
-               ucast_send_buf_size="800000" ucast_recv_buf_size="150000"
-               loopback="false"/&gt;
-            &lt;PING timeout="2000" num_initial_members="3"
-               up_thread="true" down_thread="true"/&gt;
-            &lt;MERGE2 min_interval="10000" max_interval="20000"/&gt;
-            &lt;FD shun="true" up_thread="true" down_thread="true"
-               timeout="2500" max_tries="5"/&gt;
-            &lt;VERIFY_SUSPECT timeout="3000" num_msgs="3"
-               up_thread="true" down_thread="true"/&gt;
-            &lt;pbcast.NAKACK gc_lag="50"
-               retransmit_timeout="300,600,1200,2400,4800"
-               max_xmit_size="8192"
-               up_thread="true" down_thread="true"/&gt;
-            &lt;UNICAST timeout="300,600,1200,2400,4800" 
-               window_size="100" min_threshold="10"
-               down_thread="true"/&gt;
-            &lt;pbcast.STABLE desired_avg_gossip="20000"
-               up_thread="true" down_thread="true"/&gt;
-            &lt;FRAG frag_size="8192"
-               down_thread="true" up_thread="true"/&gt;
-            &lt;pbcast.GMS join_timeout="5000" join_retry_timeout="2000"
-               shun="true" print_local_addr="true"/&gt;
-            &lt;pbcast.STATE_TRANSFER up_thread="true" down_thread="true"/&gt;
-        &lt;/Config&gt;
-    &lt;/attribute&gt;
-&lt;/mbean&gt;
-        </programlisting>
-      <para>All the JGroups configuration data is contained in the <literal>&lt;Config&gt;</literal>
-                element under the JGroups config MBean attribute. In the next several sections, we will dig into the
-                options in the <literal>&lt;Config&gt;</literal> element and explain exactly what they mean.</para>
-      <section id="jbosscache-jgroups-transport">
+<section><title>Common Configuration Properties</title>
+	<para>The following common properties are exposed by all of the JGroups protocols discussed below:
+</para>
+<itemizedlist>
+	<listitem>
+		<para><literal>down_thread</literal>: whether the protocol should create an internal queue and a queue processing thread (aka the down_thread) for messages passed down from higher layers. The higher layer could be another protocol higher in the stack, or the application itself, if the protocol is the top one on the stack. If true (the default), when a message is passed down from a higher layer, the calling thread places the message in the protocol's queue, and then returns immediately. The protocol's down_thread is responsible for reading messages off the queue, doing whatever protocol-specific processing is required, and passing the message on to the next protocol in the stack. 
+		</para>
+	</listitem>
+	<listitem>
+		<para><literal>up_thread</literal> is conceptually similar to down_thread, but here the queue and thread are for messages received from lower layers in the protocol stack. 
+		</para>
+	</listitem>
+</itemizedlist>
+<para>Generally speaking, <literal>up_thread</literal> and <literal>down_thread</literal> should be set to false.
+</para>
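+<para>As an illustration, these properties appear as attributes on any protocol element. The fragment below is a sketch only (the protocol and its <literal>desired_avg_gossip</literal> value are just an example taken from a typical stack):
+</para>
+<programlisting><![CDATA[
+<pbcast.STABLE desired_avg_gossip="20000"
+               down_thread="false" up_thread="false"/>]]>
+</programlisting>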
+
+</section>
+
+<section id="jbosscache-jgroups-transport">
         <title>Transport Protocols</title>
         <para>The transport protocols send messages from one cluster node to another (unicast) or from cluster
                     node to all other nodes in the cluster (mcast). JGroups supports UDP, TCP, and TUNNEL as transport
@@ -2003,64 +2853,73 @@
                         receive messages. If you choose UDP as the transport protocol for your cluster service, you need
                         to configure it in the <literal>UDP</literal> sub-element in the JGroups
                         <literal>Config</literal> element. Here is an example.</para>
-          <programlisting>
-&lt;UDP mcast_send_buf_size="32000"
-    mcast_port="45566"
-    ucast_recv_buf_size="64000"
-    mcast_addr="228.8.8.8"
-    bind_to_all_interfaces="true"
-    loopback="true"
-    mcast_recv_buf_size="64000"
-    max_bundle_size="30000"
-    max_bundle_timeout="30"
-    use_incoming_packet_handler="false"
-    use_outgoing_packet_handler="false"
-    ucast_send_buf_size="32000"
-    ip_ttl="32"
-    enable_bundling="false"/&gt;
-                </programlisting>
+<programlisting><![CDATA[
+<UDP mcast_addr="${jboss.partition.udpGroup:228.1.2.3}" 
+     mcast_port="${jboss.hapartition.mcast_port:45566}"
+     tos="8"
+     ucast_recv_buf_size="20000000"
+     ucast_send_buf_size="640000"
+     mcast_recv_buf_size="25000000"
+     mcast_send_buf_size="640000"
+     loopback="false"
+     discard_incompatible_packets="true"
+     enable_bundling="false"
+     max_bundle_size="64000"
+     max_bundle_timeout="30"
+     use_incoming_packet_handler="true"
+     use_outgoing_packet_handler="false"
+     ip_ttl="${jgroups.udp.ip_ttl:2}"
+     down_thread="false" up_thread="false"/>]]>
+</programlisting>
+
+
           <para>The available attributes in the above JGroups configuration are listed below.</para>
           <itemizedlist>
             <listitem>
               <para><emphasis role="bold">ip_mcast</emphasis> specifies whether or not to use IP
-                                multicasting. The default is <literal>true</literal>.</para>
+		      multicasting. The default is <literal>true</literal>. If set to false, it will send n unicast packets rather than 1 multicast packet. Either way, packets are UDP datagrams.
+	      </para>
             </listitem>
             <listitem>
-              <para><emphasis role="bold">mcast_addr</emphasis> specifies the multicast address (class D)
-                                for joining a group (i.e., the cluster). The default is
-                            <literal>228.8.8.8</literal>.</para>
+              <para><emphasis role="bold">mcast_addr</emphasis> specifies the multicast address (class D) for joining a group (i.e., the cluster). If omitted, the default is <literal>228.8.8.8</literal>.</para>
             </listitem>
             <listitem>
-              <para><emphasis role="bold">mcast_port</emphasis> specifies the multicast port number. The
+              <para><emphasis role="bold">mcast_port</emphasis> specifies the multicast port number. If omitted, the
                                 default is <literal>45566</literal>.</para>
             </listitem>
             <listitem>
-              <para><emphasis role="bold">bind_addr</emphasis> specifies the interface on which to receive
-                                and send multicasts (uses the <literal>bind.address</literal> system property, if
-                                present). If you have a multihomed machine, set the <literal>bind_addr</literal>
-                                attribute to the appropriate NIC IP address. Ignored if the
-                                <literal>ignore.bind.address</literal> property is true.</para>
+		    <para><emphasis role="bold">bind_addr</emphasis> specifies the interface on which to receive and send multicasts (uses the <literal>-Djgroups.bind_address</literal> system property, if present). If you have a multihomed machine, set the <literal>bind_addr</literal> attribute or system property to the appropriate NIC IP address. By default, system property setting takes priority over XML attribute unless -Djgroups.ignore.bind_addr system property is set.</para>
             </listitem>
             <listitem>
-              <para><emphasis role="bold">bind_to_all_interfaces</emphasis> specifies whether this node
+		    <para><emphasis role="bold">receive_on_all_interfaces </emphasis> specifies whether this node
                                 should listen on all interfaces for multicasts. The default is <literal>false</literal>.
                                 It overrides the <literal>bind_addr</literal> property for receiving multicasts.
                                 However, <literal>bind_addr</literal> (if set) is still used to send multicasts.</para>
             </listitem>
+	    <listitem><para><emphasis role="bold">send_on_all_interfaces</emphasis> specifies whether this node send UDP packets via all the NICs if you have a multi NIC machine. This means that the same multicast message is sent N times, so use with care.
+			    </para>
+		    </listitem>
+	    
+		    <listitem>
+			    <para><emphasis role="bold">receive_interfaces</emphasis> specifies a list of of interfaces to receive multicasts on. The multicast receive socket will listen on all of these interfaces. This is a comma-separated list of IP addresses or interface names. E.g. "<literal>192.168.5.1,eth1,127.0.0.1</literal>".
+			    </para>
+		    </listitem>   
+		    
+	    
             <listitem>
-              <para><emphasis role="bold">ip_ttl</emphasis> specifies the TTL for multicast
-                            packets.</para>
+		    <para><emphasis role="bold">ip_ttl</emphasis> specifies time-to-live for IP Multicast packets. TTL is the commonly used term in multicast networking, but is actually something of a misnomer, since the value here refers to how many network hops a packet will be allowed to travel before networking equipment will drop it.
+		    </para>
             </listitem>
             <listitem>
-              <para><emphasis role="bold">use_incoming_packet_handler</emphasis> specifies whether to use
-                                a separate thread to process incoming messages.</para>
+		    <para><emphasis role="bold">use_incoming_packet_handler</emphasis> specifies whether to use a separate thread to process incoming messages. Sometimes receivers are overloaded (they have to handle de-serialization etc). Packet handler is a separate thread taking care of de-serialization, receiver thread(s) simply put packet in queue and return immediately. Setting this to true adds one more thread. The default is <literal>true</literal>.</para>
             </listitem>
             <listitem>
-              <para><emphasis role="bold">use_outgoing_packet_handler</emphasis> specifies whether to use
-                                a separate thread to process outgoing messages.</para>
+		    <para><emphasis role="bold">use_outgoing_packet_handler</emphasis> specifies whether to use a separate thread to process outgoing messages. The default is false.</para>
             </listitem>
             <listitem>
-              <para><emphasis role="bold">enable_bundling</emphasis> specifies whether to enable bundling.
+              <para><emphasis role="bold">enable_bundling</emphasis> specifies whether to enable message bundling.
                                 If it is <literal>true</literal>, the node would queue outgoing messages until
                                     <literal>max_bundle_size</literal> bytes have accumulated, or
                                     <literal>max_bundle_time</literal> milliseconds have elapsed, whichever occurs
@@ -2070,13 +2929,13 @@
             <listitem>
               <para><emphasis role="bold">loopback</emphasis> specifies whether to loop outgoing message
                                 back up the stack. In <literal>unicast</literal> mode, the messages are sent to self. In
-                                    <literal>mcast</literal> mode, a copy of the mcast message is sent.</para>
+				<literal>mcast</literal> mode, a copy of the mcast message is sent. The default is <literal>false</literal>.</para>
             </listitem>
             <listitem>
               <para><emphasis role="bold">discard_incompatibe_packets</emphasis> specifies whether to
                                 discard packets from different JGroups versions. Each message in the cluster is tagged
                                 with a JGroups version. When a message from a different version of JGroups is received,
-                                it will be discarded if set to true, otherwise a warning will be logged.</para>
+				it will be discarded if set to true, otherwise a warning will be logged. The default is <literal>false</literal>.</para>
             </listitem>
             <listitem>
               <para><emphasis role="bold">mcast_send_buf_size, mcast_recv_buf_size, ucast_send_buf_size,
@@ -2084,6 +2943,11 @@
                                 have a large receiver buffer size, so packets are less likely to get dropped due to
                                 buffer overflow.</para>
             </listitem>
+	    <listitem>
+		    <para><emphasis role="bold">tos</emphasis> specifies the traffic class for sending unicast and multicast datagrams.
+		    </para>
+	    </listitem>
+	    
           </itemizedlist>
           <note>
             <para>On Windows 2000 machines, because of the media sense feature being broken with multicast
@@ -2091,10 +2955,12 @@
                             <literal>loopback</literal> attribute to <literal>true</literal>.</para>
           </note>
         </section>
+	
+	
         <section id="jbosscache-jgroups-transport-tcp">
           <title>TCP configuration</title>
           <para>Alternatively, a JGroups-based cluster can also work over TCP connections. Compared with UDP,
-                        TCP generates more network traffic when the cluster size increases but TCP is more reliable. TCP
+                        TCP generates more network traffic when the cluster size increases. TCP
                         is fundamentally a unicast protocol. To send multicast messages, JGroups uses multiple TCP
                         unicasts. To use TCP as a transport protocol, you should define a <literal>TCP</literal> element
                         in the JGroups <literal>Config</literal> element. Here is an example of the
@@ -2102,31 +2968,29 @@
           <programlisting>
 &lt;TCP start_port="7800"
     bind_addr="192.168.5.1"
-    loopback="true"/&gt;
+    loopback="true"
+    down_thread="false" up_thread="false"/&gt;
                 </programlisting>
           <para>Below are the attributes available in the <literal>TCP</literal> element.</para>
           <itemizedlist>
             <listitem>
               <para><emphasis role="bold">bind_addr</emphasis> specifies the binding address. It can also
-                                be set with the <literal>-Dbind.address</literal> command line option at server
+		      be set with the <literal>-Djgroups.bind_address</literal> command line option at server
                             startup.</para>
             </listitem>
             <listitem>
               <para><emphasis role="bold">start_port, end_port</emphasis> define the range of TCP ports
                                 the server should bind to. The server socket is bound to the first available port from
                                     <literal>start_port</literal>. If no available port is found (e.g., because of a
-                                firewall) before the <literal>end_port</literal>, the server throws an exception.</para>
+				    firewall) before the <literal>end_port</literal>, the server throws an exception. If no <literal>end_port</literal> is provided, or <literal>end_port &lt; start_port</literal>, then there is no upper limit on the port range. If <literal>start_port == end_port</literal>, JGroups is forced to use the given port (startup fails if the port is not available). The default is 7800. If set to 0, the operating system will pick a port. Bear in mind that setting it to 0 only works if MPING or TCPGOSSIP is used as the discovery protocol, because <literal>TCPPING</literal> requires listing the nodes and their corresponding ports.</para>
             </listitem>
             <listitem>
               <para><emphasis role="bold">loopback</emphasis> specifies whether to loop outgoing message
                                 back up the stack. In <literal>unicast</literal> mode, the messages are sent to self. In
-                                    <literal>mcast</literal> mode, a copy of the mcast message is sent.</para>
                                    <literal>mcast</literal> mode, a copy of the mcast message is sent. The default is <literal>false</literal>.</para>
             </listitem>
             <listitem>
-              <para><emphasis role="bold">mcast_send_buf_size, mcast_recv_buf_size, ucast_send_buf_size,
-                                    ucast_recv_buf_size</emphasis> define receive and send buffer sizes. It is good to
-                                have a large receiver buffer size, so packets are less likely to get dropped due to
-                                buffer overflow.</para>
+		    <para><emphasis role="bold">recv_buf_size, send_buf_size</emphasis> define receive and send buffer sizes. It is good to have a large receiver buffer size, so packets are less likely to get dropped due to buffer overflow.</para>
             </listitem>
             <listitem>
               <para><emphasis role="bold">conn_expire_time</emphasis> specifies the time (in milliseconds)
@@ -2136,25 +3000,47 @@
             <listitem>
               <para><emphasis role="bold">reaper_interval</emphasis> specifies interval (in milliseconds)
                                 to run the reaper. If both values are 0, no reaping will be done. If either value is
-                                &gt; 0, reaping will be enabled.</para>
+				&gt; 0, reaping will be enabled. By default, reaper_interval is 0, which means no reaper.</para>
             </listitem>
-          </itemizedlist>
+	    <listitem>
+		    <para><emphasis role="bold">sock_conn_timeout</emphasis> specifies max time in millis for a socket creation. When doing the initial discovery, and a peer hangs, don't wait forever but go on after the timeout to ping other members. Reduces chances of *not* finding any members at all. The default is 2000.</para>
+    </listitem>
+	    <listitem>
+		    <para><emphasis role="bold">use_send_queues</emphasis> specifies whether to use separate send queues for each connection. This prevents blocking on write if the peer hangs. The default is true.</para>
+		      </listitem>
+      <listitem>
+	      <para><emphasis role="bold">external_addr</emphasis> specifies external IP address to broadcast to other group members (if different to local address). This is useful when you have use (Network Address Translation) NAT, e.g. a node on a private network, behind a firewall, but you can only route to it via an externally visible address, which is different from the local address it is bound to. Therefore, the node can be configured to broadcast its external address, while still able to bind to the local one. This avoids having to use the TUNNEL protocol, (and hence a requirement for a central gossip router) because nodes outside the firewall can still route to the node inside the firewall, but only on its external address. Without setting the external_addr, the node behind the firewall will broadcast its private address to the other nodes which will not be able to route to it.</para>
+		      </listitem>
+      <listitem>
+	      <para><emphasis role="bold">skip_suspected_members</emphasis> specifies whether unicast messages should not be sent to suspected members. The default is true.</para>
+		     </listitem>
+	<listitem>
+		<para><emphasis role="bold">tcp_nodelay</emphasis> specifies TCP_NODELAY. TCP by default nagles messages, that is, conceptually, smaller messages are bundled into larger ones. If we want to invoke synchronous cluster method calls, then we need to disable nagling in addition to disabling message bundling (by setting <literal>enable_bundling</literal> to false). Nagling is disabled by setting <literal>tcp_nodelay</literal> to true. The default is false.
+		</para>
+		     </listitem>
+	    
+    </itemizedlist>
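+<para>Tying several of the attributes above together, here is a sketch of a <literal>TCP</literal> element tuned for synchronous cluster method calls, with nagling and bundling both disabled (the address and port values are illustrative only):
+</para>
+<programlisting><![CDATA[
+<TCP start_port="7800"
+     bind_addr="192.168.5.1"
+     tcp_nodelay="true"
+     enable_bundling="false"
+     sock_conn_timeout="2000"
+     down_thread="false" up_thread="false"/>]]>
+</programlisting>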
         </section>
         <section id="jbosscache-jgroups-transport-tunnel">
           <title>TUNNEL configuration</title>
           <para>The TUNNEL protocol uses an external router to send messages. The external router is known as
-                        a <literal>GossipRouter</literal>. Each node has to register with the router. All messages are
-                        sent to the router and forwarded on to their destinations. The TUNNEL approach can be used to
-                        setup communication with nodes behind firewalls. A node can establish a TCP connection to the
-                        GossipRouter through the firewall (you can use port 80). The same connection is used by the
-                        router to send messages to nodes behind the firewall. The TUNNEL configuration is defined in the
-                            <literal>TUNNEL</literal> element in the JGroups <literal>Config</literal> element. Here is
-                        an example.</para>
-          <programlisting>
+		  a <literal>GossipRouter</literal>. Each node has to register with the router. All messages are sent to the router and forwarded on to their destinations. The TUNNEL approach can be used to set up communication with nodes behind firewalls. A node can establish a TCP connection to the GossipRouter through the firewall (you can use port 80). The same connection is used by the router to send messages to nodes behind the firewall, as most firewalls do not permit outside hosts to initiate a TCP connection to a host inside the firewall. The TUNNEL configuration is defined in the <literal>TUNNEL</literal> element in the JGroups <literal>Config</literal> element. Here is an example.
+	  </para>
+          
+	  
+	  <programlisting>
 &lt;TUNNEL router_port="12001"
-    router_host="192.168.5.1"/&gt;
+    router_host="192.168.5.1"
+    down_thread="false" up_thread="false/&gt;
                 </programlisting>
-          <para>The available attributes in the <literal>TUNNEL</literal> element are listed below.</para>
+
+		
+<para>The available attributes in the <literal>TUNNEL</literal> element are listed below.</para>
           <itemizedlist>
             <listitem>
               <para><emphasis role="bold">router_host</emphasis> specifies the host on which the
@@ -2171,34 +3057,57 @@
           </itemizedlist>
         </section>
       </section>
       <section id="jbosscache-jgroups-discovery">
         <title>Discovery Protocols</title>
-        <para>The cluster need to maintain a list of current member nodes at all times so that the load balancer
-                    and client interceptor know how to route their requests. The discovery protocols are used to
-                    discover active nodes in the cluster. All initial nodes are discovered when the cluster starts up.
+	<para>
+		The cluster needs to maintain a list of current member nodes at all times so that the load balancer and client interceptor know how to route their requests. Discovery protocols are used to discover active nodes in the cluster and detect the oldest member of the cluster, which is the coordinator. All initial nodes are discovered when the cluster starts up.
                     When a new node joins the cluster later, it is only discovered after the group membership protocol
                     (GMS, see <xref linkend="jbosscache-jgroups-other-gms"/>) admits it into the group.</para>
-        <para>Since the discovery protocols sit on top of the transport protocol. You can choose to use
-                    different discovery protocols based on your transport protocol. The discovery protocols are also
-                    configured as sub-elements in the JGroups MBean <literal>Config</literal> element.</para>
+	    <para>Since the discovery protocols sit on top of the transport protocol, you can choose to use different discovery protocols based on your transport protocol. These are also configured as sub-elements in the JGroups MBean <literal>Config</literal> element.</para>
         <section id="jbosscache-jgroups-discovery-ping">
           <title>PING</title>
-          <para>The PING discovery protocol normally sits on top of the UDP transport protocol. Each node
-                        responds with a unicast UDP datagram back to the sender. Here is an example PING configuration
-                        under the JGroups <literal>Config</literal> element.</para>
+	  <para>
+		  PING is a discovery protocol that works by either multicasting PING requests to an IP multicast address or connecting to a gossip router. As such, PING normally sits on top of the UDP or TUNNEL transport protocols. Each node responds with a packet {C, A}, where C is the coordinator's address and A is the responder's own address. After <literal>timeout</literal> milliseconds or <literal>num_initial_members</literal> replies, the joiner determines the coordinator from the responses and sends a JOIN request to it (handled by GMS). If nobody responds, the node assumes it is the first member of the group.
+	  </para>
+	  <para>Here is an example PING configuration for IP multicast. 
+	  </para>
+	  
+	  
           <programlisting>
 &lt;PING timeout="2000"
-    num_initial_members="2"/&gt;
+    num_initial_members="2"
+    down_thread="false" up_thread="false"/&gt;
                 </programlisting>
+<para>
+	Here is another example PING configuration for contacting a Gossip Router.
+<programlisting><![CDATA[
+<PING gossip_host="localhost"
+      gossip_port="1234"
+      timeout="3000"
+      num_initial_members="3"
+      down_thread="false" up_thread="false"/>]]>
+</programlisting>
+
+	</para>
+		
+		
           <para>The available attributes in the <literal>PING</literal> element are listed below.</para>
           <itemizedlist>
             <listitem>
               <para><emphasis role="bold">timeout</emphasis> specifies the maximum number of milliseconds
-                                to wait for any responses.</para>
+		      to wait for any responses. The default is 3000.</para>
             </listitem>
             <listitem>
               <para><emphasis role="bold">num_initial_members</emphasis> specifies the maximum number of
-                                responses to wait for.</para>
+		      responses to wait for unless timeout has expired. The default is 2.</para>
             </listitem>
             <listitem>
               <para><emphasis role="bold">gossip_host</emphasis> specifies the host on which the
@@ -2210,7 +3119,7 @@
             </listitem>
             <listitem>
               <para><emphasis role="bold">gossip_refresh</emphasis> specifies the interval (in
-                                milliseconds) for the lease from the GossipRouter.</para>
+		      milliseconds) for the lease from the GossipRouter. The default is 20000.</para>
             </listitem>
             <listitem>
               <para><emphasis role="bold">initial_hosts</emphasis> is a comma-seperated list of addresses
@@ -2227,26 +3136,31 @@
                                 <literal>num_initial_members</literal> responses have been received.</para>
           </note>
         </section>
+	
+	
+	
         <section id="jbosscache-jgroups-discovery-tcpgossip">
           <title>TCPGOSSIP</title>
           <para>The TCPGOSSIP protocol only works with a GossipRouter. It works essentially the same way as
                         the PING protocol configuration with valid <literal>gossip_host</literal> and
-                            <literal>gossip_port</literal> attributes. It works on top of both UDP and TCP transport
-                        protocols. Here is an example.</para>
-          <programlisting>
-&lt;PING timeout="2000"
-    initial_hosts="192.168.5.1[12000],192.168.0.2[12000]"
-    num_initial_members="3"/&gt;
-                </programlisting>
+			<literal>gossip_port</literal> attributes. It works on top of both UDP and TCP transport protocols. Here is an example.</para>
+<programlisting><![CDATA[
+<TCPGOSSIP timeout="2000"
+	    initial_hosts="192.168.5.1[12000],192.168.0.2[12000]"
+	    num_initial_members="3"
+  down_thread="false" up_thread="false"/>]]>
+</programlisting>
+
+
           <para>The available attributes in the <literal>TCPGOSSIP</literal> element are listed below.</para>
           <itemizedlist>
             <listitem>
               <para><emphasis role="bold">timeout</emphasis> specifies the maximum number of milliseconds
-                                to wait for any responses.</para>
+		      to wait for any responses. The default is 3000.</para>
             </listitem>
             <listitem>
               <para><emphasis role="bold">num_initial_members</emphasis> specifies the maximum number of
-                                responses to wait for.</para>
+		      responses to wait for unless timeout has expired. The default is 2.</para>
             </listitem>
             <listitem>
               <para><emphasis role="bold">initial_hosts</emphasis> is a comma-seperated list of addresses
@@ -2255,6 +3169,9 @@
             </listitem>
           </itemizedlist>
         </section>
+	
+	
+	
         <section id="jbosscache-jgroups-discovery-tcpping">
           <title>TCPPING</title>
          <para>The TCPPING protocol takes a set of known members and pings them for discovery. This is
@@ -2263,55 +3180,64 @@
                         element.</para>
           <programlisting>
 &lt;TCPPING timeout="2000"
-    initial_hosts="192.168.5.1[7800],192.168.0.2[7800]"
-    port_range="2"
-    num_initial_members="3"/&gt;
-                </programlisting>
+	initial_hosts="hosta[2300],hostb[3400],hostc[4500]"
+	port_range="3"
+	num_initial_members="3"
+         down_thread="false" up_thread="false"/&gt;
+</programlisting>
+
+
           <para>The available attributes in the <literal>TCPPING</literal> element are listed below.</para>
           <itemizedlist>
             <listitem>
               <para><emphasis role="bold">timeout</emphasis> specifies the maximum number of milliseconds
-                                to wait for any responses.</para>
+		      to wait for any responses. The default is 3000.</para>
             </listitem>
             <listitem>
               <para><emphasis role="bold">num_initial_members</emphasis> specifies the maximum number of
-                                responses to wait for.</para>
+		      responses to wait for unless timeout has expired. The default is 2.</para>
             </listitem>
             <listitem>
               <para><emphasis role="bold">initial_hosts</emphasis> is a comma-seperated list of addresses
                                 (e.g., <literal>host1[12345],host2[23456]</literal>) for pinging.</para>
             </listitem>
             <listitem>
-              <para><emphasis role="bold">port_range</emphasis> specifies the range of ports to ping on
-                                each host in the <literal>initial_hosts</literal> list. That is because multiple nodes
-                                can run on the same host. In the above example, the cluster would ping ports 7800, 7801,
-                                and 7802 on both hosts.</para>
+		    <para>
+			    <emphasis role="bold">port_range</emphasis> specifies the number of consecutive ports to be probed when getting the initial membership, starting with the port specified in the initial_hosts parameter. Given the current values of port_range and initial_hosts above, the TCPPING layer will try to connect to hosta:2300, hosta:2301, hosta:2302, hostb:3400, hostb:3401, hostb:3402, hostc:4500, hostc:4501, hostc:4502. The configuration options allows for multiple nodes on the same host to be pinged.
+		    </para>
             </listitem>
           </itemizedlist>
         </section>
         <section id="jbosscache-jgroups-discovery-mping">
           <title>MPING</title>
-          <para>The MPING protocol is a multicast ping over TCP. It works almost the same way as PING works on
-                        UDP. It does not require external processes (GossipRouter) or static configuration (initial host
-                        list). Here is an example of the <literal>MPING</literal> configuration element in the JGroups
-                            <literal>Config</literal> element.</para>
-          <programlisting>
+	  <para>
+		  MPING uses IP multicast to discover the initial membership. It can be used with all transports, but usually it is used in combination with TCP. TCP usually requires TCPPING, which has to list all group members explicitly; MPING doesn't have this requirement. The typical use case is when we want TCP as the transport, but multicasting for discovery, so we don't have to define a static list of initial hosts in TCPPING or require an external Gossip Router. 
+	</para>
+
+<programlisting>
 &lt;MPING timeout="2000"
     bind_to_all_interfaces="true"
     mcast_addr="228.8.8.8"
     mcast_port="7500"
     ip_ttl="8"
-    num_initial_members="3"/&gt;
-                </programlisting>
+    num_initial_members="3"
+    down_thread="false" up_thread="false"/&gt;
+</programlisting>
+
+
           <para>The available attributes in the <literal>MPING</literal> element are listed below.</para>
           <itemizedlist>
             <listitem>
               <para><emphasis role="bold">timeout</emphasis> specifies the maximum number of milliseconds
-                                to wait for any responses.</para>
+		      to wait for any responses. The default is 3000.</para>
             </listitem>
             <listitem>
               <para><emphasis role="bold">num_initial_members</emphasis> specifies the maximum number of
-                                responses to wait for.</para>
+		      responses to wait for unless timeout has expired. The default is 2.</para>
             </listitem>
             <listitem>
               <para><emphasis role="bold">bind_addr</emphasis> specifies the interface on which to send
@@ -2328,37 +3254,44 @@
           </itemizedlist>
         </section>
       </section>
+      
+      
+      
       <section id="jbosscache-jgroups-fd">
         <title>Failure Detection Protocols</title>
-        <para>The failure detection protocols are used to detect failed nodes. Once a failed node is detected,
-                    the cluster updates its view so that the load balancer and client interceptors know to avoid the
-                    dead node. The failure detection protocols are configured as sub-elements in the JGroups MBean
+	<para>The failure detection protocols are used to detect failed nodes. Once a failed node is detected, a suspect verification phase can occur; if the node is then still considered dead, the cluster updates its view so that the load balancer and client interceptors know to avoid the dead node. The failure detection protocols are configured as sub-elements in the JGroups MBean
                         <literal>Config</literal> element.</para>
-        <section id="jbosscache-jgroups-fd-fd">
+        
+		
+		
+<section id="jbosscache-jgroups-fd-fd">
           <title>FD</title>
-          <para>The FD discovery protocol requires each node periodically sends are-you-alive messages to its
-                        neighbor. If the neighbor fails to respond, the calling node sends a SUSPECT message to the
-                        cluster. The current group coordinator double checks that the suspect node is indeed dead and
-                        updates the cluster's view. Here is an example FD configuration.</para>
-          <programlisting>
+	  <para>
+		  FD is a failure detection protocol based on heartbeat messages. This protocol requires each node to periodically send are-you-alive messages to its neighbor. If the neighbor fails to respond, the calling node sends a SUSPECT message to the cluster. The current group coordinator can optionally double-check whether the suspected node is indeed dead and, if it is still considered dead, update the cluster's view. Here is an example FD configuration.
+	  </para>
+
+<programlisting>
 &lt;FD timeout="2000"
     max_tries="3"
-    shun="true"/&gt;
-                </programlisting>
+    shun="true"
+    down_thread="false" up_thread="false"/&gt;
+</programlisting>
+		
+		
           <para>The available attributes in the <literal>FD</literal> element are listed below.</para>
           <itemizedlist>
             <listitem>
               <para><emphasis role="bold">timeout</emphasis> specifies the maximum number of milliseconds
-                                to wait for the responses to the are-you-alive messages.</para>
+		      to wait for the responses to the are-you-alive messages. The default is 3000.</para>
             </listitem>
             <listitem>
               <para><emphasis role="bold">max_tries</emphasis> specifies the number of missed
-                                are-you-alive messages from a node before the node is suspected.</para>
+		      are-you-alive messages from a node before the node is suspected. The default is 2.</para>
             </listitem>
             <listitem>
               <para><emphasis role="bold">shun</emphasis> specifies whether a failed node will be shunned.
                                 Once shunned, the node will be expelled from the cluster even if it comes back later.
-                                The shunned node would have to re-join the cluster through the discovery process.</para>
+				The shunned node would have to re-join the cluster through the discovery process. JGroups can be configured so that shunning leads to automatic rejoin and state transfer, which is the default behavior within JBoss Application Server.</para>
             </listitem>
           </itemizedlist>
           <note>
@@ -2366,96 +3299,177 @@
                            only sent when there is no regular traffic to the node for some time.</para>
           </note>
         </section>
+	
+	
+	
+	
         <section id="jbosscache-jgroups-fd-fdsock">
           <title>FD_SOCK</title>
-          <para>The are-you-alive messages in the FD protocol could increase the network load when there are
-                        many nodes. It could also produce false suspicions. For instance, if the network is too busy and
-                        the timeout is too short, nodes could be falsely suspected. Also, if one node is suspended in a
-                        debugger or profiler, it could also be suspected and shunned. The FD_SOCK protocol addresses the
-                        above issues by suspecting node failures only when a regular TCP connection to the node fails.
-                        However, the problem with such passive detection is that hung nodes will not be detected until
-                        it is accessed and the TCP timeouts after several minutes. FD_SOCK works best in high load
-                        networks where all nodes are frequently accessed. The simplest FD_SOCK configuration does not
-                        take any attribute. You can just declare an empty <literal>FD_SOCK</literal> element in
-                        JGroups's <literal>Config</literal> element.</para>
-          <programlisting>
-&lt;FD_SOCK/&gt;
-                </programlisting>
-          <para>There is only one optional attribute in the <literal>FD_SOCK</literal> element.</para>
+	  <para>
+FD_SOCK is a failure detection protocol based on a ring of TCP sockets created between group members. Each member in a group connects to its neighbor (the last member connects to the first), thus forming a ring. Member B is suspected when its neighbor A detects an abnormally closed TCP socket (presumably due to a crash of node B). However, if member B is about to leave gracefully, it lets its neighbor A know, so that it does not become suspected. The simplest FD_SOCK configuration does not take any attribute. You can just declare an empty <literal>FD_SOCK</literal> element in JGroups's <literal>Config</literal> element.</para>
+          
+
+<programlisting>
+&lt;FD_SOCK down_thread="false" up_thread="false"/&gt;
+</programlisting>
+          
+<para>The available attributes in the <literal>FD_SOCK</literal> element are listed below.</para>
           <itemizedlist>
             <listitem>
-              <para><emphasis role="bold">srv_sock_bind_addr</emphasis> specifies the interface to which
-                                the server socket should bind to. If it is omitted, the <literal>-D
-                                bind.address</literal> property from the server startup command line is used.</para>
+		    <para><emphasis role="bold">bind_addr</emphasis> specifies the interface to which the server socket should bind. If the -Djgroups.bind_address system property is defined, the XML value will be ignored. This behavior can be reversed by setting the -Djgroups.ignore.bind_addr=true system property.</para>
             </listitem>
           </itemizedlist>
+	  
         </section>
-        <section id="jbosscache-jgroups-fd-fdsimple">
-          <title>FD_SIMPLE</title>
-          <para>The FD_SIMPLE protocol is a more tolerant (less false suspicions) protocol based on
-                        are-you-alive messages. Each node periodically sends are-you-alive messages to a randomly
-                        choosen node and wait for a response. If a response has not been received within a certain
-                        timeout time, a counter associated with that node will be incremented. If the counter exceeds a
-                        certain value, that node will be suspected. When a response to an are-you-alive message is
-                        received, the counter resets to zero. Here is an example configuration for the
-                            <literal>FD_SIMPLE</literal> protocol.</para>
-          <programlisting>
-&lt;FD_SIMPLE timeout="2000"
-    max_missed_hbs="10"/&gt;
-                </programlisting>
-          <para>The available attributes in the <literal>FD_SIMPLE</literal> element are listed below.</para>
-          <itemizedlist>
-            <listitem>
-              <para><emphasis role="bold">timeout</emphasis> specifies the timeout (in milliseconds) for
-                                the are-you-alive message. If a response is not received within timeout, the counter for
-                                the target node is increased.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">max_missed_hbs</emphasis> specifies maximum number of
-                                are-you-alive messages (i.e., the counter value) a node can miss before it is suspected
-                                failure.</para>
-            </listitem>
-          </itemizedlist>
-        </section>
-<!-- This algorithm is not recommended: Bela
-            <section id="jbosscache-jgroups-fd-fdprob">
-                <title>FD_PROB</title>
-                <para>The FD_PROB protocol uses a probailistic failure detection algorithm. Each node in the clluster
-                    maintains a list of all other nodes. For each node, 2 data points are maintained: a heartbeat
-                    counter and the time of the last increment of the counter. Each member (P) periodically sends its
-                    own heartbeat counter list to a randomly chosen member (Q). Q updates its own heartbeat counter list
-                    and the associated time (if counter was incremented). Each member periodically increments its own
-                    counter. If, when sending its heartbeat counter list, a member P detects that another member Q's
-                    heartbeat counter was not incremented for timeout seconds, Q will be suspected. Here is an example
-                    configuration for the <literal>FD_PROB</literal> protocol.</para>
-                <programlisting>
-&lt;FD_PROB timeout="2000"/>
-                </programlisting>
-                <para>The available attributes in the <literal>FD_SIMPLE</literal> element are listed below.</para>
-                <itemizedlist>
-                    <listitem><para><emphasis role="bold">timeout</emphasis> specifies the timeout (in milliseconds) for each
-                        node to increase its heartbeat counter. If a node does not increase its counter before timeout,
-                        the node is suspected of failure.</para></listitem>
-                </itemizedlist>
-            </section>
-            -->
-      </section>
+        
+	
+	
+	<section><title>VERIFY_SUSPECT</title>
+		<para>
+			This protocol verifies whether a suspected member is really dead by pinging that member once again. This verification is performed by the coordinator of the cluster. The suspected member is dropped from the cluster group if confirmed to be dead. The aim of this protocol is to minimize false suspicions. Here's an example.
+		</para>
+
+<programlisting><![CDATA[			
+<VERIFY_SUSPECT timeout="1500"
+	down_thread="false" up_thread="false"/>]]>
+</programlisting>
+
+	<para>
+		The available attributes in the <literal>VERIFY_SUSPECT</literal> element are listed below.
+	</para>
+	<itemizedlist>
+		<listitem>
+			<para>
+				<emphasis role="bold">timeout</emphasis> specifies how long to wait for a response from the suspected member before considering it dead.
+			</para>
+</listitem>
+</itemizedlist>
+		
+	</section>
+	
+	
+	
+	
+	<section><title>FD versus FD_SOCK</title>
+		<para>
+			FD and FD_SOCK, each taken individually, do not provide a solid failure detection layer. Let's look at the differences between these failure detection protocols to understand how they complement each other:
+		</para>
+		<itemizedlist>
+			<listitem><para><emphasis>FD</emphasis></para>
+				</listitem>
+			</itemizedlist>
+				<itemizedlist>
+					<listitem>
+						<para>
+							An overloaded machine might be slow in sending are-you-alive responses.
+						</para>
+				</listitem>
+				<listitem>
+						<para>
+			A member will be suspected when suspended in a debugger/profiler.
+		</para>
+	</listitem>
+	<listitem>
+						<para>
+			Low timeouts lead to higher probability of false suspicions and higher network traffic.
+		</para>
+	</listitem>
+	<listitem>
+						<para>
+			High timeouts will not detect and remove crashed members for some time.
+		</para>
+	</listitem>
+</itemizedlist>
+
+<itemizedlist>
+<listitem><para><emphasis>FD_SOCK</emphasis>:</para>
+</listitem>
+</itemizedlist>
+
+<itemizedlist>
+	<listitem>
+		<para>
+			Suspended in a debugger is no problem because the TCP connection is still open.
+		</para>
+	</listitem>
+	<listitem>
+						<para>
+			High load is no problem either, for the same reason.
+		</para>
+	</listitem>
+	<listitem>
+						<para>
+			Members will only be suspected when the TCP connection breaks.
+		</para>
+	</listitem>
+</itemizedlist>
+
+
+	<itemizedlist>
+		<listitem>
+			<para>
+			Hung members will therefore not be detected.
+		</para>
+	</listitem>
+	<listitem>
+		<para>
+
+			Also, a crashed switch will not be detected until the connection runs into the TCP timeout (between 2-20 minutes, depending on TCP/IP stack implementation).
+		</para>
+	</listitem>
+</itemizedlist>
+
+<para>
+			The aim of a failure detection layer is to report real failures and therefore avoid false suspicions. There are two solutions:
+		</para>
+		<orderedlist>
+			<listitem>
+		<para>			
+			By default, JGroups configures the FD_SOCK socket with KEEP_ALIVE, which means that TCP sends a heartbeat on a socket on which no traffic has been received for 2 hours. If a host crashed (or an intermediate switch or router crashed) without closing the TCP connection properly, we would detect this after 2 hours (plus a few minutes). This is of course better than never closing the connection (if KEEP_ALIVE is off), but may not be of much help. So, the first solution would be to lower the timeout value for KEEP_ALIVE. In most operating systems this can only be done for the entire kernel, so if it is lowered to 15 minutes, this will affect all TCP sockets.
+		</para>
+	</listitem>
+	<listitem>
+		<para>
+			The second solution is to combine FD_SOCK and FD; the timeout in FD can be set such that it is much lower than the TCP timeout, and this can be configured individually per process. FD_SOCK will already generate a suspect message if the socket was closed abnormally. However, in the case of a crashed switch or host, FD will make sure the socket is eventually closed and the suspect message generated. Example:
+		</para>
+	</listitem>
+</orderedlist>
+<programlisting><![CDATA[
+<FD_SOCK down_thread="false" up_thread="false"/>
+<FD timeout="10000" max_tries="5" shun="true" 
+down_thread="false" up_thread="false" /> ]]>
+</programlisting>
+
+<para>
+			This suspects a member when the socket to the neighbor has been closed abnormally (e.g. a process crash, because the OS closes all sockets). However, if a host or switch crashes, the sockets won't be closed; therefore, as a second line of defense, FD will suspect the neighbor after 50 seconds. Note that with this example, if your system is stopped at a breakpoint in the debugger, the node you're debugging will be suspected after about 50 seconds.
+	</para>
+<para>
+			A combination of FD and FD_SOCK provides a solid failure detection layer, and for this reason this technique is used across the JGroups configurations included within JBoss Application Server.
+	</para>
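+	<para>
+		As a sketch of the first solution above, on Linux the kernel-wide KEEP_ALIVE interval can be lowered via sysctl. The 15-minute value below is illustrative, not a recommendation, and the setting affects every TCP socket on the host:
+	</para>
+<screen># Linux only; requires root. 900 seconds = 15 minutes.
+sysctl -w net.ipv4.tcp_keepalive_time=900</screen>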
+	</section>
+    </section>
+    
+    
+    
       <section id="jbosscache-jgroups-reliable">
         <title>Reliable Delivery Protocols</title>
-        <para>The reliable delivery protocols in the JGroups stack ensure that data pockets are actually
-                    delivered in the right order (FIFO) to the destination node. The basis for reliable message delivery
-                    is positive and negative delivery acknowledgments (ACK and NAK). In the ACK mode, the sender resends
-                    the message until the acknowledgment is received from the receiver. In the NAK mode, the receiver
-                    requests retransmission when it discovers a gap.</para>
+	<para>
+		Reliable delivery protocols within the JGroups stack ensure that data packets are actually delivered in the right order (FIFO) to the destination node. The basis for reliable message delivery is positive and negative delivery acknowledgments (ACK and NAK). In the ACK mode, the sender resends the message until the acknowledgment is received from the receiver. In the NAK mode, the receiver requests retransmission when it discovers a gap.
+	</para>
+	
+	
+	
         <section id="jbosscache-jgroups-reliable-unicast">
           <title>UNICAST</title>
-          <para>The UNICAST protocol is used for unicast messages. It uses ACK. It is configured as a
-                        sub-element under the JGroups <literal>Config</literal> element. Here is an example
-                        configuration for the <literal>UNICAST</literal> protocol.</para>
-          <programlisting>
-&lt;UNICAST timeout="100,200,400,800"/&gt;
-                </programlisting>
-          <para>There is only one configurable attribute in the <literal>UNICAST</literal> element.</para>
+          <para>
+		  The UNICAST protocol is used for unicast messages. It uses ACK. It is configured as a sub-element under the JGroups Config element. If the JGroups stack is configured with TCP transport protocol, UNICAST is not necessary because TCP itself guarantees FIFO delivery of unicast messages. Here is an example configuration for the <literal>UNICAST</literal> protocol.</para>
+
+<programlisting>
+&lt;UNICAST timeout="100,200,400,800"
+down_thread="false" up_thread="false"/&gt;
+</programlisting>
+          
+<para>There is only one configurable attribute in the <literal>UNICAST</literal> element.</para>
           <itemizedlist>
             <listitem>
               <para><emphasis role="bold">timeout</emphasis> specifies the retransmission timeout (in
@@ -2465,6 +3479,8 @@
             </listitem>
           </itemizedlist>
         </section>
+	
+	
         <section id="jbosscache-jgroups-reliable-nakack">
           <title>NAKACK</title>
           <para>The NAKACK protocol is used for multicast messages. It uses NAK. Under this protocol, each
@@ -2473,13 +3489,17 @@
                         the sender to retransmit the missing message. The NAKACK protocol is configured as the
                             <literal>pbcast.NAKACK</literal> sub-element under the JGroups <literal>Config</literal>
                         element. Here is an example configuration.</para>
-          <programlisting>
-&lt;pbcast.NAKACK
-    max_xmit_size="8192"
-    use_mcast_xmit="true" 
-    retransmit_timeout="600,1200,2400,4800"/&gt;
-                </programlisting>
-          <para>The configurable attributes in the <literal>pbcast.NAKACK</literal> element are as follows.</para>
+
+<programlisting>
+&lt;pbcast.NAKACK max_xmit_size="60000" use_mcast_xmit="false" 
+   
+   retransmit_timeout="300,600,1200,2400,4800" gc_lag="0"
+   discard_delivered_msgs="true"
+   down_thread="false" up_thread="false"/&gt;
+</programlisting>
+          
+
+<para>The configurable attributes in the <literal>pbcast.NAKACK</literal> element are as follows.</para>
           <itemizedlist>
             <listitem>
               <para><emphasis role="bold">retransmit_timeout</emphasis> specifies the retransmission
@@ -2502,13 +3522,23 @@
                                 However, if we only ask the sender to resend their messages, we can enable this option
                                 and discard delivered messages.</para>
             </listitem>
+	    
+	    <listitem>
+		    <para><emphasis role="bold">gc_lag</emphasis> specifies the number of messages garbage collection lags behind.
+		    </para>
+	    </listitem>
+	    
           </itemizedlist>
         </section>
-      </section>
+     </section>
+
+      
+      
       <section id="jbosscache-jgroups-other">
         <title>Other Configuration Options</title>
-        <para>In addition to the protocol stacks, you can also configure JGroups network services in the
-                        <literal>Config</literal> element.</para>
+        <para>In addition to the protocol stacks, you can also configure JGroups network services in the <literal>Config</literal> element.</para>
+		
+		
         <section id="jbosscache-jgroups-other-gms">
           <title>Group Membership</title>
           <para>The group membership service in the JGroups stack maintains a list of active nodes. It handles
@@ -2517,14 +3547,18 @@
                         interceptors, are notified if the group membership changes. The group membership service is
                         configured in the <literal>pbcast.GMS</literal> sub-element under the JGroups
                         <literal>Config</literal> element. Here is an example configuration.</para>
-          <programlisting>
+<programlisting>
 &lt;pbcast.GMS print_local_addr="true"
     join_timeout="3000"
-    down_thread="false" 
+    down_thread="false" up_thread="false"
     join_retry_timeout="2000"
-    shun="true"/&gt;
-                </programlisting>
-          <para>The configurable attributes in the <literal>pbcast.GMS</literal> element are as follows.</para>
+    shun="true"
+    view_bundling="true"/&gt;
+</programlisting>
+          
+
+
+<para>The configurable attributes in the <literal>pbcast.GMS</literal> element are as follows.</para>
           <itemizedlist>
             <listitem>
               <para><emphasis role="bold">join_timeout</emphasis> specifies the maximum number of
@@ -2546,8 +3580,15 @@
               <para><emphasis role="bold">disable_initial_coord</emphasis> specifies whether to prevent
                                 this node as the cluster coordinator.</para>
             </listitem>
+	    <listitem>
+		    <para><emphasis role="bold">view_bundling</emphasis> specifies whether multiple JOIN or LEAVE requests arriving at the same time are bundled and handled together, sending out only one new view per bundle. This is more efficient than handling each request separately.
+		      </para>
+            </listitem>
+	    
           </itemizedlist>
         </section>
+	
+	
         <section id="jbosscache-jgroups-other-fc">
           <title>Flow Control</title>
           <para>The flow control service tries to adapt the sending data rate and the receiving data among
@@ -2560,12 +3601,15 @@
                         used up, the sender blocks until it receives credits from the receiver. The flow control service
                         is configured in the <literal>FC</literal> sub-element under the JGroups
                         <literal>Config</literal> element. Here is an example configuration.</para>
-          <programlisting>
+
+<programlisting>
 &lt;FC max_credits="1000000"
-    down_thread="false" 
+    down_thread="false" up_thread="false"
     min_threshold="0.10"/&gt;
-                </programlisting>
-          <para>The configurable attributes in the <literal>FC</literal> element are as follows.</para>
+</programlisting>
+          
+
+<para>The configurable attributes in the <literal>FC</literal> element are as follows.</para>
           <itemizedlist>
             <listitem>
               <para><emphasis role="bold">max_credits</emphasis> specifies the maximum number of credits
@@ -2580,34 +3624,72 @@
                                 threshold. It overrides the <literal>min_credits</literal> attribute.</para>
             </listitem>
           </itemizedlist>
+
+<note><title>Note</title>
+	<para>
+		Applications that primarily use synchronous group RPC calls do not require the FC protocol in their JGroups protocol stack, because synchronous communication, where the thread that makes the call blocks waiting for responses from all members of the group, already slows the overall rate of calls. Even though TCP provides flow control by itself, FC is still required in TCP-based JGroups stacks because of group communication, where we essentially have to send group messages at the highest speed the slowest receiver can keep up with. TCP flow control only takes individual node communications into account and has no notion of which member is the slowest in the group, which is why FC is required.
+	</para>
+</note>
+	  
+	  
         </section>
+	
+	
+<section><title>Fragmentation</title>
+	<para>
+		This protocol fragments messages larger than a certain size and unfragments them at the receiver's side. It works for both unicast and multicast messages. It is configured in the FRAG2 sub-element under the JGroups Config element. Here is an example configuration.
+	</para>
+<programlisting><![CDATA[	
+		<FRAG2 frag_size="60000" down_thread="false" up_thread="false"/>]]>
+</programlisting>
+
+<para>
+The configurable attributes in the FRAG2 element are as follows.
+</para>
+
+<itemizedlist>
+	<listitem><para><emphasis role="bold">frag_size</emphasis> specifies the max frag size in bytes. Messages larger than that are fragmented.</para></listitem>
+</itemizedlist>
+
+<note><title>Note</title>
+	<para>
+		The TCP protocol already provides fragmentation, but a JGroups fragmentation protocol is still needed if FC is used. The reason for this is that if you send a message larger than FC.max_credits, the FC protocol would block. So, frag_size within FRAG2 needs to be set to always be less than FC.max_credits.
+	</para>
+</note>
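+<para>
+	For instance, a stack that honors this constraint could configure the two protocols together as sketched below. The values are illustrative, not recommendations; they simply keep frag_size well below max_credits:
+</para>
+<programlisting><![CDATA[
+<FC max_credits="2000000"
+    min_threshold="0.10"
+    down_thread="false" up_thread="false"/>
+<FRAG2 frag_size="60000" down_thread="false" up_thread="false"/>]]>
+</programlisting>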
+
+
+</section>
+	
+	
+	
         <section id="jbosscache-jgroups-other-st">
           <title>State Transfer</title>
           <para>The state transfer service transfers the state from an existing node (i.e., the cluster
                         coordinator) to a newly joining node. It is configured in the
                         <literal>pbcast.STATE_TRANSFER</literal> sub-element under the JGroups <literal>Config</literal>
                         element. It does not have any configurable attribute. Here is an example configuration.</para>
-          <programlisting>
-&lt;pbcast.STATE_TRANSFER 
-    down_thread="false"
-    up_thread="false"/&gt;
-                </programlisting>
+<programlisting>
+&lt;pbcast.STATE_TRANSFER down_thread="false" up_thread="false"/&gt;
+</programlisting>
         </section>
-        <section id="jbosscache-jgroups-other-gc">
+        
+	
+	
+	
+	<section id="jbosscache-jgroups-other-gc">
           <title>Distributed Garbage Collection</title>
-          <para>In a JGroups cluster, all nodes have to store all messages received for potential
-                        retransmission in case of a failure. However, if we store all messages forever, we will run out
-                        of memory. So, the distributed garbage collection service in JGroups periodically purges
-                        messages that have seen by all nodes from the memory in each node. The distributed garbage
-                        collection service is configured in the <literal>pbcast.STABLE</literal> sub-element under the
-                        JGroups <literal>Config</literal> element. Here is an example configuration.</para>
-          <programlisting>
+          <para>
+		  In a JGroups cluster, all nodes have to store all messages received for potential retransmission in case of a failure. However, if we store all messages forever, we will run out of memory. So, the distributed garbage collection service in JGroups periodically purges messages that have been seen by all nodes from the memory in each node. The distributed garbage collection service is configured in the <literal>pbcast.STABLE</literal> sub-element under the JGroups <literal>Config</literal> element. Here is an example configuration.
+	  </para>
+
+<programlisting>
 &lt;pbcast.STABLE stability_delay="1000"
     desired_avg_gossip="5000" 
-    down_thread="false"
-    max_bytes="250000"/&gt;
-                </programlisting>
-          <para>The configurable attributes in the <literal>pbcast.STABLE</literal> element are as follows.</para>
+    down_thread="false" up_thread="false"
+    max_bytes="400000"/&gt;
+</programlisting>
+          
+<para>The configurable attributes in the <literal>pbcast.STABLE</literal> element are as follows.</para>
           <itemizedlist>
             <listitem>
               <para><emphasis role="bold">desired_avg_gossip</emphasis> specifies intervals (in
@@ -2620,9 +3702,7 @@
                                 <literal>0</literal> disables this service.</para>
             </listitem>
             <listitem>
-              <para><emphasis role="bold">max_gossip_runs</emphasis> specifies the maximum garbage
-                                collections runs before any changes. After this number is reached, there is no garbage
-                                collection until the message is received.</para>
+		    <para><emphasis role="bold">stability_delay</emphasis> specifies the delay (in milliseconds) before sending a STABILITY message, to give others a chance to send first. If used together with max_bytes, this attribute should be set to a small number.</para>
             </listitem>
           </itemizedlist>
           <note>
@@ -2632,15 +3712,17 @@
         </section>
         <section id="jbosscache-jgroups-other-merge">
           <title>Merging</title>
-          <para>When a network error occurs, the cluster might be partitioned into several different
-                        partitions. JGroups has a MERGE service that allows the coordinators in partitions to
-                        communicate with each other and form a single cluster back again. The flow control service is
-                        configured in the <literal>MERGE2</literal> sub-element under the JGroups
-                        <literal>Config</literal> element. Here is an example configuration.</para>
-          <programlisting>
+          <para>
+		  When a network error occurs, the cluster might be partitioned into several different partitions. JGroups has a MERGE service that allows the coordinators of the partitions to communicate with each other and form a single cluster again. The merge service is configured in the <literal>MERGE2</literal> sub-element under the JGroups <literal>Config</literal> element. Here is an example configuration.
+		</para>
+
+<programlisting>
 &lt;MERGE2 max_interval="10000"
-    min_interval="2000"/&gt;
-                </programlisting>
+    min_interval="2000"
+    down_thread="false" up_thread="false"/&gt;
+</programlisting>
+		
+		
          <para>The configurable attributes in the <literal>MERGE2</literal> element are as follows.</para>
           <itemizedlist>
             <listitem>
@@ -2655,215 +3737,201 @@
           <para>JGroups chooses a random value between <literal>min_interval</literal> and
                             <literal>max_interval</literal> to send out the MERGE message.</para>
           <note>
-            <para>The cluster states are not merged in a merger. This has to be done by the
-                        application.</para>
+		  <para>
+			  The cluster states are not merged in a merger. This has to be done by the application. If <literal>MERGE2</literal> is used in conjunction with TCPPING, the <literal>initial_hosts</literal> attribute must contain all the nodes that could potentially be merged back, in order for the merge process to work properly. Otherwise, the merge process would not merge all the nodes even though shunning is disabled. Alternatively use MPING, which is commonly used with TCP to provide multicast member discovery capabilities, instead of TCPPING to avoid having to specify all the nodes.
+		  </para>
           </note>
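+	  <para>
+		  As a sketch of the TCPPING variant mentioned above, the <literal>initial_hosts</literal> attribute would list every node that could potentially need to be merged back. The host names and ports below are hypothetical:
+	  </para>
+<programlisting><![CDATA[
+<TCPPING initial_hosts="hostA[7800],hostB[7800],hostC[7800]"
+    port_range="3"
+    timeout="3000" num_initial_members="3"
+    down_thread="false" up_thread="false"/>]]>
+</programlisting>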
         </section>
-      </section>
-    </section>
-    <section id="jbosscache-cache">
-      <title>JBossCache Configuration</title>
-      <para>JBoss Cache provides distributed cache and state replication services for the JBoss cluster. A JBoss
-                cluster can have multiple JBoss Cache MBeans (known as the <literal>TreeCache</literal> MBean), one for
-                HTTP session replication, one for stateful session beans, one for cached entity beans, etc. A generic
-                    <literal>TreeCache</literal> MBean configuration is listed below. Application specific
-                    <literal>TreeCache</literal> MBean configurations are covered in later chapters when those
-                applications are discussed.</para>
-      <programlisting>
-&lt;mbean code="org.jboss.cache.TreeCache" 
-        name="jboss.cache:service=TreeCache"&gt;
-    
-    &lt;depends&gt;jboss:service=Naming&lt;/depends&gt; 
-    &lt;depends&gt;jboss:service=TransactionManager&lt;/depends&gt; 
+	
+	<section><title>Binding JGroups Channels to a particular interface</title>
+		<para>
+			In the Transport Protocols section above, we briefly touched on how the interface to which JGroups will bind sockets is configured. Let's get into this topic in more depth:
+		</para>
+		<para>
+			First, it is important to understand that the value set in any <literal>bind_addr</literal> element in an XML configuration file will be ignored by JGroups if it finds that the system property <literal>jgroups.bind_addr</literal> (or <literal>bind.address</literal>, a deprecated earlier name for the same thing) has been set. The system property trumps the XML. If JBoss AS is started with the -b (a.k.a. --host) switch, the AS will set <literal>jgroups.bind_addr</literal> to the specified value.
+		</para>
+		<para>
+			Beginning with AS 4.2.0, for security reasons the AS binds most services to localhost if -b is not set. The practical effect is that in most cases users will set -b, so <literal>jgroups.bind_addr</literal> will be set and any XML setting will be ignored.
+		</para>
+		<para>
+			So, what are <emphasis>best practices</emphasis> for managing how JGroups binds to interfaces?
+		</para>
+		<itemizedlist>
+			<listitem>
+				<para>
+			Binding JGroups to the same interface as other services.  Simple, just use -b:
+		<screen>./run.sh -b 192.168.1.100 -c all</screen>
+	</para>
+</listitem>
+<listitem>
+	<para>
+			Binding services (e.g., JBoss Web) to one interface, but using a different one for JGroups:
+		<screen>./run.sh -b 10.0.0.100 -Djgroups.bind_addr=192.168.1.100 -c all</screen>
+		
+			Specifically setting the system property overrides the -b value. This is a common usage pattern; put client traffic on one network, with intra-cluster traffic on another.
+		</para>
+	</listitem>
+	<listitem>
+	<para>
+			
+			Binding services (e.g., JBoss Web) to all interfaces.  This can be done like this:
+		<screen>./run.sh -b 0.0.0.0 -c all</screen>
+		However, doing this will not cause JGroups to bind to all interfaces! Instead, JGroups will bind to the machine's default interface. See the Transport Protocols section for how to tell JGroups to receive or send on all interfaces, if that is what you really want.
+	</para>
+	</listitem>
+	<listitem>
+	<para>	
+		Binding services (e.g., JBoss Web) to all interfaces, but specify the JGroups interface:
+		<screen>./run.sh -b 0.0.0.0 -Djgroups.bind_addr=192.168.1.100 -c all</screen>
+		
+			Again, specifically setting the system property overrides the -b value.
+		</para>
+	</listitem>
+	<listitem>
+	<para>	
+		Using different interfaces for different channels:
+		<screen>./run.sh -b 10.0.0.100 -Djgroups.ignore.bind_addr=true -c all</screen>
+	</para>
+	</listitem>
+</itemizedlist>
 
-    &lt;! -- Configure the TransactionManager --&gt; 
-    &lt;attribute name="TransactionManagerLookupClass"&gt;
-        org.jboss.cache.DummyTransactionManagerLookup
-    &lt;/attribute&gt; 
+<para>
+This setting tells JGroups to ignore the <literal>jgroups.bind_addr</literal> system property and instead use whatever is specified in XML. You would need to edit the various XML configuration files to set the <literal>bind_addr</literal> attribute to the desired interfaces.
+		</para>
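+		<para>
+			For example, in a protocol stack configuration you could set the interface directly on the transport protocol element (the address and other values shown are illustrative only):
+		</para>
+<programlisting><![CDATA[<UDP mcast_addr="228.1.2.3" mcast_port="45566"
+     bind_addr="192.168.1.100"
+     ip_ttl="8" ip_mcast="true"/>]]></programlisting>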
+	</section>
+	
+	<section><title>Isolating JGroups Channels</title>
+		<para>
+			Within JBoss AS, there are a number of services that independently create JGroups channels: three different JBoss Cache services (used for HttpSession replication, EJB3 SFSB replication and EJB3 entity replication) along with the general purpose clustering service called HAPartition that underlies most other JBossHA services.
+		</para>
+		<para>
+			It is critical that these channels only communicate with their intended peers; not with the channels used by other services and not with channels for the same service opened on machines not meant to be part of the group. Nodes improperly communicating with each other is one of the most common issues users have with JBoss AS clustering.
+		</para>
+		<para>
+			Which peers a JGroups channel will communicate with is defined by its group name, multicast address, and multicast port, so isolating JGroups channels comes down to ensuring different channels use different values for the group name, multicast address and multicast port.
+		</para>
+		<para>
+			To isolate JGroups channels for different services on the same set of AS instances from each other, you MUST change the group name and the multicast port. In other words, each channel must have its own set of values.
+		</para>
+		<para>
+			For example, say we have a production cluster of 3 machines, each of which has an HAPartition deployed along with a JBoss Cache used for web session clustering. The HAPartition channels should not communicate with the JBoss Cache channels. They should use a different group name and multicast port. They can use the same multicast address, although they don't need to.
+		</para>
+		<para>
+			To isolate JGroups channels for the same service from other instances of the service on the network, you MUST change ALL three values. Each channel must have its own group name, multicast address, and multicast port.
+		</para>
+		<para>
+			For example, say we have a production cluster of 3 machines, each of which has an HAPartition deployed. On the same network there is also a QA cluster of 3 machines, which also has an HAPartition deployed. The HAPartition group name, multicast address, and multicast port for the production machines must be different from those used on the QA machines. 
+		</para>
+	</section>
+	
+	
+	<section><title>Changing the Group Name</title>
+		<para>
+			The group name for a JGroups channel is configured via the service that starts the channel. Unfortunately, different services use different attribute names for configuring this. For HAPartition and related services configured in the <literal>deploy/cluster-service.xml</literal> file, this is configured via a <literal>PartitionName</literal> attribute. For JBoss Cache services, the name of the attribute is <literal>ClusterName</literal>.
+		</para>
+		<para>
+			Starting with JBoss AS 4.0.4, for the HAPartition and all the standard JBoss Cache services, we make it easy for you to create unique group names simply by using the -g (a.k.a. --partition) switch when starting JBoss:
+			<screen>./run.sh -g QAPartition -b 192.168.1.100 -c all</screen> 
+			This switch sets the jboss.partition.name system property, which is used as a component in the configuration of the group name in all the standard clustering configuration files. For example, 
+<screen><![CDATA[<attribute name="ClusterName">Tomcat-${jboss.partition.name:Cluster}</attribute>]]></screen>
+		</para>
+	</section>
+	
+	
+	
+<section><title>Changing the multicast address and port</title>
+	<para>
+		The -u (a.k.a. --udp) command line switch may be used to control the multicast address used by the JGroups channels opened by all standard AS services.
+<screen><![CDATA[./run.sh -u 230.1.2.3 -g QAPartition -b 192.168.1.100 -c all]]></screen>
+		This switch sets the jboss.partition.udpGroup system property, which you can see referenced in all of the standard protocol stack configs in JBoss AS: 
+	</para>
+	
+<programlisting><![CDATA[<Config>
+<UDP mcast_addr="${jboss.partition.udpGroup:228.1.2.3}"
+ ....]]>
+</programlisting>
+<para>
+		     Unfortunately, setting the multicast ports is not so simple. As described above, by default there are four separate JGroups channels in the standard JBoss AS all configuration, and each should be given a unique port. There are no command line switches to set these, but the standard configuration files do use system properties to set them.  So, they can be configured from the command line by using -D. For example,
+	     </para>
+<programlisting>
+	./run.sh -u 230.1.2.3 -g QAPartition -Djboss.hapartition.mcast_port=12345 -Djboss.webpartition.mcast_port=23456 -Djboss.ejb3entitypartition.mcast_port=34567 -Djboss.ejb3sfsbpartition.mcast_port=45678 -b 192.168.1.100 -c all
+</programlisting>
+	
+<para><emphasis>Why isn't it sufficient to change the group name?</emphasis></para>
+<para>
+			     If channels with different group names share the same multicast address and port, the lower level JGroups protocols in each channel will see, process and eventually discard messages intended for the other group. This will at a minimum hurt performance and can lead to anomalous behavior.
+		     </para>
+			     
+	<para><emphasis>Why do I need to change the multicast port if I change the address?</emphasis></para>
+		<para>
+			It should be sufficient to just change the address, but there is a problem on several operating systems whereby packets addressed to a particular multicast port are delivered to all listeners on that port, regardless of the multicast address they are listening on. So the recommendation is to change both the address and the port.
+		</para>
+</section>
+	
+	
+<section><title>JGroups Troubleshooting</title>
+	<para><emphasis>Nodes do not form a cluster</emphasis></para>
+	
+	<para>
+		Make sure your machine is set up correctly for IP multicast. There are two test programs that can be used to check this: <literal>McastReceiverTest</literal> and <literal>McastSenderTest</literal>. Go to the <literal>$JBOSS_HOME/server/all/lib</literal> directory and start <literal>McastReceiverTest</literal>, for example:
+<screen>java -cp jgroups.jar org.jgroups.tests.McastReceiverTest -mcast_addr 224.10.10.10 -port 5555</screen>
+</para>
 
-    &lt;! -- 
-            Node locking level : SERIALIZABLE
-                                 REPEATABLE_READ (default)
-                                 READ_COMMITTED
-                                 READ_UNCOMMITTED
-                                 NONE        
-    --&gt; 
-    &lt;attribute name="IsolationLevel"&gt;REPEATABLE_READ&lt;/attribute&gt; 
+<para>
+Then in another window start <literal>McastSenderTest</literal>:
+<screen>java -cp jgroups.jar org.jgroups.tests.McastSenderTest -mcast_addr 224.10.10.10 -port 5555</screen>
+</para>
 
-    &lt;! --     Valid modes are LOCAL
-                             REPL_ASYNC
-                             REPL_SYNC
-    --&gt; 
-    &lt;attribute name="CacheMode"&gt;LOCAL&lt;/attribute&gt;
- 
-    &lt;! -- Name of cluster. Needs to be the same for all clusters, in order
-             to find each other --&gt; 
-    &lt;attribute name="ClusterName"&gt;TreeCache-Cluster&lt;/attribute&gt; 
+<para>
+	If you want to bind to a specific network interface card (NIC), use <literal>-bind_addr 192.168.0.2</literal>, where 192.168.0.2 is the IP address of the NIC to which you want to bind. Use this parameter in both the sender and the receiver.
+</para>
+<para>
+	You should be able to type in the <literal>McastSenderTest</literal> window and see the output in the <literal>McastReceiverTest</literal> window. If not, try using -ttl 32 in the sender. If this still fails, consult a system administrator to help you set up IP multicast correctly, and ask the admin to make sure that multicast will work on the interface you have chosen or, if the machines have multiple interfaces, to tell you which is the correct one.
+	Once you know multicast is working properly on each machine in your cluster, you can repeat the above test to test the network, putting the sender on one machine and the receiver on another.
+	</para>
+</section>
+	
+	
 
-    &lt;! --    The max amount of time (in milliseconds) we wait until the
-            initial state (ie. the contents of the cache) are 
-            retrieved from existing members in a clustered environment
-    --&gt; 
-    &lt;attribute name="InitialStateRetrievalTimeout"&gt;5000&lt;/attribute&gt; 
 
-    &lt;! --    Number of milliseconds to wait until all responses for a
-            synchronous call have been received.
-    --&gt; 
-    &lt;attribute name="SyncReplTimeout"&gt;10000&lt;/attribute&gt; 
 
-    &lt;! --  Max number of milliseconds to wait for a lock acquisition --&gt; 
-    &lt;attribute name="LockAcquisitionTimeout"&gt;15000&lt;/attribute&gt; 
 
-    &lt;! --  Name of the eviction policy class. --&gt; 
-    &lt;attribute name="EvictionPolicyClass"&gt;
-        org.jboss.cache.eviction.LRUPolicy
-    &lt;/attribute&gt; 
+<section><title>Causes of missing heartbeats in FD</title>
+	<para>
+		Sometimes a member is suspected by FD because a heartbeat ack has not been received for some time T (defined by the <literal>timeout</literal> and <literal>max_tries</literal> attributes). This can have multiple causes. For example, in a cluster of A, B, C and D (where A pings B, B pings C, C pings D and D pings A), C can be suspected if:
+	</para>
+	
+	<itemizedlist>
+		<listitem>
+			<para>
+			B or C are running at 100% CPU for more than T seconds. So even if C sends a heartbeat ack to B, B may not be able to process it because it is running at 100% CPU.
+			</para>
+		</listitem>
+		<listitem>
+			<para>
+			B or C are garbage collecting, same as above.
+			</para>
+		</listitem>
+		<listitem>
+			<para>
+			A combination of the two cases above.
+			</para>
+		</listitem>
+		<listitem>
+			<para>
+			The network loses packets. This usually happens when there is a lot of traffic on the network, and the switch starts dropping packets (usually broadcasts first, then IP multicasts, TCP packets last).
+			</para>
+		</listitem>
+		<listitem>
+			<para>
+			B or C are processing a callback. Let's say C received a remote method call over its channel and takes T+1 seconds to process it. During this time, C will not process any other messages, including heartbeats, and therefore B will not receive the heartbeat ack and will suspect C.
+		</para>
+	</listitem>
+</itemizedlist>
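+<para>
+	The time T referred to above is governed by the FD protocol's <literal>timeout</literal> and <literal>max_tries</literal> attributes, which are set on the FD element in the protocol stack configuration; the values below are illustrative only:
+</para>
+<programlisting><![CDATA[<FD timeout="10000" max_tries="5" shun="true"
+    up_thread="true" down_thread="true"/>]]></programlisting>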
 
-    &lt;! --  Specific eviction policy configurations. This is LRU --&gt; 
-    &lt;attribute name="EvictionPolicyConfig"&gt;
-        &lt;config&gt;
-            &lt;attribute name="wakeUpIntervalSeconds"&gt;5&lt;/attribute&gt; 
-            &lt;!--  Cache wide default --&gt; 
-            &lt;region name="/_default_"&gt;
-                &lt;attribute name="maxNodes"&gt;5000&lt;/attribute&gt; 
-                &lt;attribute name="timeToLiveSeconds"&gt;1000&lt;/attribute&gt; 
-            &lt;/region&gt;
+</section>
+</section>
 
-            &lt;region name="/org/jboss/data"&gt;
-                &lt;attribute name="maxNodes"&gt;5000&lt;/attribute&gt; 
-                &lt;attribute name="timeToLiveSeconds"&gt;1000&lt;/attribute&gt; 
-            &lt;/region&gt;
-
-            &lt;region name="/org/jboss/test/data"&gt;
-                &lt;attribute name="maxNodes"&gt;5&lt;/attribute&gt; 
-                &lt;attribute name="timeToLiveSeconds"&gt;4&lt;/attribute&gt; 
-            &lt;/region&gt;
-        &lt;/config&gt;
-    &lt;/attribute&gt;
-
-    &lt;attribute name="CacheLoaderClass"&gt;
-        org.jboss.cache.loader.bdbje.BdbjeCacheLoader
-    &lt;/attribute&gt;
-    
-    &lt;attribute name="CacheLoaderConfig"&gt;
-       location=c:\\tmp
-    &lt;/attribute&gt;
-    &lt;attribute name="CacheLoaderShared"&gt;true&lt;/attribute&gt;
-    &lt;attribute name="CacheLoaderPreload"&gt;
-        /a/b/c,/all/my/objects
-    &lt;/attribute&gt;
-    &lt;attribute name="CacheLoaderFetchTransientState"&gt;false&lt;/attribute&gt;
-    &lt;attribute name="CacheLoaderFetchPersistentState"&gt;true&lt;/attribute&gt;
-    
-    &lt;attribute name="ClusterConfig"&gt;
-        ... JGroups config for the cluster ...
-    &lt;/attribute&gt;
-&lt;/mbean&gt;
-        </programlisting>
-      <para>The JGroups configuration element (i.e., the <literal>ClusterConfig</literal> attribute) is omitted
-                from the above listing. You have learned how to configure JGroups earlier in this chapter (<xref linkend="jbosscache-jgroups"/>). The <literal>TreeCache</literal> MBean takes the following
-                attributes.</para>
-      <itemizedlist>
-        <listitem>
-          <para><emphasis role="bold">CacheLoaderClass</emphasis> specifies the fully qualified class name of
-                        the <literal>CacheLoader</literal> implementation.</para>
-        </listitem>
-        <listitem>
-          <para><emphasis role="bold">CacheLoaderConfig</emphasis> contains a set of properties from which the
-                        specific CacheLoader implementation can configure itself.</para>
-        </listitem>
-        <listitem>
-          <para><emphasis role="bold">CacheLoaderFetchPersistentState</emphasis> specifies whether to fetch
-                        the persistent state from another node. The persistence is fetched only if
-                            <literal>CacheLoaderShared</literal> is <literal>false</literal>. This attribute is only
-                        used if <literal>FetchStateOnStartup</literal> is <literal>true</literal>.</para>
-        </listitem>
-        <listitem>
-          <para><emphasis role="bold">CacheLoaderFetchTransientState</emphasis> specifies whether to fetch the
-                        in-memory state from another node. This attribute is only used if
-                        <literal>FetchStateOnStartup</literal> is <literal>true</literal>.</para>
-        </listitem>
-        <listitem>
-          <para><emphasis role="bold">CacheLoaderPreload</emphasis> contains a list of comma-separate nodes
-                        that need to be preloaded (e.g., <literal>/aop</literal>,
-                    <literal>/productcatalogue</literal>).</para>
-        </listitem>
-        <listitem>
-          <para><emphasis role="bold">CacheLoaderShared</emphasis> specifies whether we want to shared a
-                        datastore, or whether each node wants to have its own local datastore.</para>
-        </listitem>
-        <listitem>
-          <para><emphasis role="bold">CacheMode</emphasis> specifies how to synchronize cache between nodes.
-                        The possible values are <literal>LOCAL</literal>, <literal>REPL_SYNC</literal>, or
-                            <literal>REPL_ASYNC</literal>. <!-- May need a sublist here to explain the modes --></para>
-        </listitem>
-        <listitem>
-          <para><emphasis role="bold">ClusterName</emphasis> specifies the name of the cluster. This value
-                        needs to be the same for all nodes in a cluster in order for them to find each other.</para>
-        </listitem>
-        <listitem>
-          <para><emphasis role="bold">ClusterConfig</emphasis> contains the configuration of the underlying
-                        JGroups stack (see <xref linkend="jbosscache-jgroups"/>.</para>
-        </listitem>
-        <listitem>
-          <para><emphasis role="bold">EvictionPolicyClass</emphasis> specifies the name of a class
-                        implementing <literal>EvictionPolicy</literal>. You can use a JBoss Cache provided
-                            <literal>EvictionPolicy</literal> class or provide your own policy implementation. If this
-                        attribute is empty, no eviction policy is enabled.</para>
-        </listitem>
-        <listitem>
-          <para><emphasis role="bold">EvictionPolicyConfig</emphasis> contains the configuration parameter for
-                        the specified eviction policy. Note that the content is provider specific.
-                        <!-- Add an example?? --></para>
-        </listitem>
-        <listitem>
-          <para><emphasis role="bold">FetchStateOnStartup</emphasis> specifies whether or not to acquire the
-                        initial state from existing members. It allows for warm/hot caches
-                        (<literal>true/false</literal>). This can be further defined by
-                            <literal>CacheLoaderFetchTransientState</literal> and
-                            <literal>CacheLoaderFetchPersistentState</literal>.</para>
-        </listitem>
-        <listitem>
-          <para><emphasis role="bold">InitialStateRetrievalTimeout</emphasis> specifies the time in
-                        milliseconds to wait for initial state retrieval.</para>
-        </listitem>
-        <listitem>
-          <para><emphasis role="bold">IsolationLevel</emphasis> specifies the node locking level. Possible
-                        values are <literal>SERIALIZABLE</literal>, <literal>REPEATABLE_READ</literal> (default),
-                            <literal>READ_COMMITTED</literal>, <literal>READ_UNCOMMITTED</literal>, and
-                        <literal>NONE</literal>. <!-- more docs needed --></para>
-        </listitem>
-        <listitem>
-          <para><emphasis role="bold">LockAcquisitionTimeout</emphasis> specifies the time in milliseconds to
-                        wait for a lock to be acquired. If a lock cannot be acquired an exception will be thrown.</para>
-        </listitem>
-        <listitem>
-          <para><emphasis role="bold">ReplQueueInterval</emphasis> specifies the time in milliseconds for
-                        elements from the replication queue to be replicated.</para>
-        </listitem>
-        <listitem>
-          <para><emphasis role="bold">SyncReplTimeout</emphasis> specifies the time in milliseconds to wait
-                        until replication ACKs have been received from all nodes in the cluster. This attribute applies
-                        to synchronous replication mode only (i.e., <literal>CacheMode</literal> attribute is
-                            <literal>REPL_SYNC</literal>).</para>
-        </listitem>
-        <listitem>
-          <para><emphasis role="bold">UseReplQueue</emphasis> specifies whether or not to use a replication
-                        queue (<literal>true/false</literal>). This attribute applies to synchronous replication mode
-                        only (i.e., <literal>CacheMode</literal> attribute is <literal>REPL_ASYNC</literal>).</para>
-        </listitem>
-        <listitem>
-          <para><emphasis role="bold">ReplQueueMaxElements</emphasis> specifies the maximum number of elements
-                        in the replication queue until replication kicks in.</para>
-        </listitem>
-        <listitem>
-          <para><emphasis role="bold">TransactionManagerLookupClass</emphasis> specifies the fully qualified
-                        name of a class implementing <literal>TransactionManagerLookup</literal>. The default is
-                            <literal>JBossTransactionManagerLookup</literal> for the transaction manager inside the
-                        JBoss AS. There is also an option of <literal>DummyTransactionManagerLookup</literal> for simple
-                        standalone examples.</para>
-        </listitem>
-      </itemizedlist>
-    </section>
   </chapter>
 </book>