[jboss-cvs] JBossAS SVN: r99868 - projects/docs/enterprise/EWP_5.0/Administration_And_Configuration_Guide/en-US.

jboss-cvs-commits at lists.jboss.org
Sun Jan 24 21:53:44 EST 2010


Author: laubai
Date: 2010-01-24 21:53:44 -0500 (Sun, 24 Jan 2010)
New Revision: 99868

Modified:
   projects/docs/enterprise/EWP_5.0/Administration_And_Configuration_Guide/en-US/Administration_And_Configuration_Guide.xml
   projects/docs/enterprise/EWP_5.0/Administration_And_Configuration_Guide/en-US/Clustering_Guide_JBoss_Cache_JGroups.xml
   projects/docs/enterprise/EWP_5.0/Administration_And_Configuration_Guide/en-US/Clustering_Guide_JGroups.xml
Log:
Edited JGroups chapter.

Modified: projects/docs/enterprise/EWP_5.0/Administration_And_Configuration_Guide/en-US/Administration_And_Configuration_Guide.xml
===================================================================
--- projects/docs/enterprise/EWP_5.0/Administration_And_Configuration_Guide/en-US/Administration_And_Configuration_Guide.xml	2010-01-25 02:26:32 UTC (rev 99867)
+++ projects/docs/enterprise/EWP_5.0/Administration_And_Configuration_Guide/en-US/Administration_And_Configuration_Guide.xml	2010-01-25 02:53:44 UTC (rev 99868)
@@ -30,7 +30,7 @@
 		<xi:include href="Transactions.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
 		<!-- <xi:include href="JGroups.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />-->
 		<xi:include href="Remoting.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
-		<xi:include href="Messaging.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
+<!--		<xi:include href="Messaging.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />-->
 		
 		<xi:include href="Alternative_DBs.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
 		<xi:include href="Pooling.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
@@ -50,7 +50,7 @@
 		<xi:include href="Clustering_Guide_EJBs.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
 		<xi:include href="Clustering_Guide_Entity_EJBs.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
 		<xi:include href="Clustering_Guide_HTTP.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
-		<xi:include href="Clustering_Guide_JMS.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
+<!--		<xi:include href="Clustering_Guide_JMS.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />-->
       <xi:include href="Clustering_Guide_Deployment.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
 		<xi:include href="Clustering_Guide_JGroups.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
 		<xi:include href="Clustering_Guide_JBoss_Cache.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />

Modified: projects/docs/enterprise/EWP_5.0/Administration_And_Configuration_Guide/en-US/Clustering_Guide_JBoss_Cache_JGroups.xml
===================================================================
--- projects/docs/enterprise/EWP_5.0/Administration_And_Configuration_Guide/en-US/Clustering_Guide_JBoss_Cache_JGroups.xml	2010-01-25 02:26:32 UTC (rev 99867)
+++ projects/docs/enterprise/EWP_5.0/Administration_And_Configuration_Guide/en-US/Clustering_Guide_JBoss_Cache_JGroups.xml	2010-01-25 02:53:44 UTC (rev 99868)
@@ -2,7 +2,7 @@
 <!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN" "http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd">
 
 <chapter id="jbosscache.chapt">
-    <title>JBossCache and JGroups Services</title>
+    <title>JBoss Cache and JGroups Services</title>
     <para>
        <indexterm><primary>Clustering</primary><secondary>JBossCache and JGroups Services</secondary></indexterm>
 
@@ -646,9 +646,10 @@
 	
         <section id="jbosscache-jgroups-discovery-tcpgossip">
           <title>TCPGOSSIP</title>
-          <para>The <literal>TCPGOSSIP</literal> protocol only works with a <classname>GossipRouter</classname>. 
-          It works similarly to the <literal>PING</literal> protocol configuration with valid 
-          <varname>gossip_host</varname> and <varname>gossip_port</varname> attributes. It works on top of both 
+          <para>The <literal>TCPGOSSIP</literal> protocol only works with a 
+          <classname>GossipRouter</classname>. It works similarly to the <literal>PING</literal> 
+          protocol configuration with valid <varname>gossip_host</varname> and 
+          <varname>gossip_port</varname> attributes. It works on top of both 
           UDP and TCP transport protocols, like so:</para>
 <programlisting><![CDATA[
 <TCPGOSSIP timeout="2000"
@@ -876,7 +877,7 @@
           
 <programlisting><![CDATA[<FD_SOCK down_thread="false" up_thread="false"/>]]></programlisting>
           
-<para>There available attributes in the <literal>FD_SOCK</literal> element are listed below.</para>
+<para>The available attributes in the <literal>FD_SOCK</literal> element are listed below.</para>
     <variablelist>
       <varlistentry>
         <term><varname>bind_addr</varname></term>
@@ -1395,7 +1396,7 @@
           </variablelist>
           <para>
             JGroups selects a random value between <varname>min_interval</varname> and 
-            <varname>max_interval</varname> to send out the <literal>MERGE</literal> message.
+            <varname>max_interval</varname> to send the <literal>MERGE</literal> message.
           </para>
           <note>
             <title><literal>MERGE</literal> does not merge cluster states</title>

Modified: projects/docs/enterprise/EWP_5.0/Administration_And_Configuration_Guide/en-US/Clustering_Guide_JGroups.xml
===================================================================
--- projects/docs/enterprise/EWP_5.0/Administration_And_Configuration_Guide/en-US/Clustering_Guide_JGroups.xml	2010-01-25 02:26:32 UTC (rev 99867)
+++ projects/docs/enterprise/EWP_5.0/Administration_And_Configuration_Guide/en-US/Clustering_Guide_JGroups.xml	2010-01-25 02:53:44 UTC (rev 99868)
@@ -61,7 +61,7 @@
       
       <para>The JGroups configurations used in JBoss Enterprise Web Platform 
       appear as nested elements in the 
-      <filename>$JBOSS_HOME/server/all/cluster/jgroups-channelfactory.sar/META-INF/jgroups-channelfactory-stacks.xml</filename> 
+      <filename>$JBOSS_HOME/server/production/cluster/jgroups-channelfactory.sar/META-INF/jgroups-channelfactory-stacks.xml</filename> 
       file. This file is parsed by the <literal>ChannelFactory</literal> service, 
       which uses the contents to provide correctly configured channels to the clustered services 
       that require them. See <xref linkend="clustering-blocks-jgroups-channelfactory"/>
@@ -157,10 +157,9 @@
   </listitem>
 </itemizedlist>
 <note>
-   <para>All of the protocols in the versions of JGroups used in JBoss Application Server 3.x and 4.x exposed <literal>down_thread</literal> and <literal>up_thread</literal> attributes.
-   The JGroups version included in JBoss Application Server 5 and later no longer uses those attributes,
-   and a <literal>WARN</literal> message will be written to the server log if they are configured
-   for any protocol.</para>
+   <para>Past versions of JGroups used <literal>down_thread</literal> and <literal>up_thread</literal>
+     attributes. These attributes are no longer used. A <literal>WARN</literal> message will be written to
+     the server log if they are configured for any protocol.</para>
 </note>
 
 </section>
@@ -170,7 +169,8 @@
         <para>
             The transport protocols send and receive messages to and from the network. They also 
             manage the thread pools used to deliver incoming messages to addresses higher in 
-            the protocol stack. JGroups supports <literal>UDP</literal>, <literal>TCP</literal> and <literal>TUNNEL</literal> as transport protocols.
+            the protocol stack. JGroups supports <literal>UDP</literal>, <literal>TCP</literal> and
+            <literal>TUNNEL</literal> as transport protocols.
         </para>
         
         <note>
@@ -232,297 +232,345 @@
           <para>
             The attributes particular to the <literal>UDP</literal> protocol are:
           </para>
-          <itemizedlist>
-            <listitem>
-              <para><emphasis role="bold">ip_mcast</emphasis> specifies whether or not to use IP
-		      multicasting. The default is <literal>true</literal>. If set to <literal>false</literal>, 
-              multiple unicast packets will be sent instead of one multicast packet. Any packet sent 
-              via <literal>UDP</literal> protocol are UDP datagrams.
-         </para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">mcast_addr</emphasis> specifies the 
-              multicast address (class D) for communicating with the group (i.e., the cluster).
-              The standard protocol stack configurations in JBoss AS use the
-              value of system property <literal>jboss.partition.udpGroup</literal>,
-              if set, as the value for this attribute. Using the <literal>-u</literal>
-              command line switch when starting JBoss Application Server sets that value.
-              See <xref linkend="clustering-jgroups-isolation"/> for information about using 
-              this configuration attribute to ensure that JGroups channels are properly 
-              isolated from one another. 
-              If this attribute is omitted, the default value is <literal>228.11.11.11</literal>.
-         </para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">mcast_port</emphasis> specifies the port 
-              to use for multicast communication with the group. 
-              See <xref linkend="clustering-jgroups-isolation"/> for how to use this configuration attribute
-              to ensure JGroups channels are properly isolated from one another.
-              If this attribute is omitted, the default is <literal>45688</literal>.</para>
-            </listitem>   
-            <listitem>
-              <para>
-                <literal>mcast_send_buf_size</literal>, <literal>mcast_recv_buf_size</literal>, 
-                <literal>ucast_send_buf_size</literal> and <literal>ucast_recv_buf_size</literal> 
-                define the socket send and receive buffer sizes that JGroups will request 
-                from the operating system. A large buffer size helps to ensure that packets are 
-                not dropped due to buffer overflow. However, socket buffer sizes are limited at the 
-                operating system level, so obtaining the desired buffer may require configuration 
-                at the operating system level. See <xref linkend="jgroups-perf-udpbuffer"/> for further
-                details.</para>
-            </listitem>  
-           <listitem>
-          <para><emphasis role="bold">bind_port</emphasis> specifies the port to
-              which the unicast receive socket should be bound. The default is
-              <literal>0</literal>; i.e. use an ephemeral port.</para>
-            </listitem>
-            <listitem>
-		    <para><emphasis role="bold">port_range</emphasis> specifies the number of
-          ports to try if the port identified by <literal>bind_port</literal> 
-          is not available. The default is <literal>1</literal>, which specifies that only 
-          <literal>bind_port</literal> will be tried.</para>
-            </listitem>
-            <listitem>
-		    <para><emphasis role="bold">ip_ttl</emphasis> specifies time-to-live (TTL) for IP Multicast packets. TTL is the commonly used term in multicast networking, but is actually something of a misnomer, since the value here refers to how many network hops a packet will be allowed to travel before networking equipment will drop it.
-          </para>
-            </listitem>
-            <listitem>
-          <para><emphasis role="bold">tos</emphasis> specifies the traffic class for sending unicast and multicast datagrams.</para>
-             </listitem>
-          </itemizedlist>
+          <variablelist>
+            <varlistentry>
+              <term><varname>ip_mcast</varname></term>
+              <listitem><para>Specifies whether to use IP multicasting. The default value is
+              <literal>true</literal>. If set to <literal>false</literal>, multiple unicast
+              packets will be sent instead of one multicast packet. Any packet sent via
+              <literal>UDP</literal> is sent as a UDP datagram.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>mcast_addr</varname></term>
+              <listitem><para>Specifies the multicast address (class D) for communicating with
+              the group. The value of system property <varname>jboss.partition.udpGroup</varname>
+              is used as the value for this attribute, if set. (Set this property at startup with
+              the <code>-u</code> command line switch.) If omitted, the default value is 
+              <literal>228.11.11.11</literal>.</para></listitem>
+            </varlistentry> 
+            <varlistentry>
+              <term><varname>mcast_port</varname></term>
+              <listitem><para>Specifies the port to use for multicast communication with the
+              group. If omitted, the default value is <literal>45688</literal>. See
+              <xref linkend="clustering-jgroups-isolation"/> to ensure JGroups channels are 
+              properly isolated from each other.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>mcast_send_buf_size</varname>, <varname>mcast_recv_buf_size</varname>,
+                    <varname>ucast_send_buf_size</varname>, <varname>ucast_recv_buf_size</varname></term>
+              <listitem><para>Define the socket send and receive buffer sizes that JGroups 
+              requests from the operating system. A large buffer helps prevent packets being 
+              dropped due to buffer overflow. However, socket buffer sizes are limited at the 
+              operating system level, so obtaining the requested size may require operating system-level configuration. See
+              <xref linkend="jgroups-perf-udpbuffer"/> for details.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>bind_port</varname></term>
+              <listitem><para>Specifies the port to which the unicast receive socket is bound.
+              The default value is <literal>0</literal> (use an ephemeral port).</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>port_range</varname></term>
+              <listitem><para>Specifies the number of ports to try if <varname>bind_port</varname>
+              is not available. The default is <literal>1</literal>, which specifies that only
+              <varname>bind_port</varname> will be tried.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>ip_ttl</varname></term>
+              <listitem><para>Specifies the time-to-live (TTL) for IP multicast packets.
+              The value here refers to the number of network hops a packet is allowed to make
+              before it is dropped.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>tos</varname></term>
+              <listitem><para>Specifies the traffic class for sending unicast and multicast
+              datagrams.</para></listitem>
+            </varlistentry>
+          </variablelist>
           
-          <para>The attributes that are common to all transport protocols,
-          and thus have the same meanings when used with <literal>TCP</literal> or 
-          <literal>TUNNEL</literal>, are:
-          </para>
-          <itemizedlist>
-
-            <listitem>
-		    <para><emphasis role="bold">singleton_name</emphasis> provides
-            a unique name for this transport protocol configuration. Used by the application server's  <literal>ChannelFactory</literal>
-            to support sharing of a transport protocol instance by different channels 
-            that use the same transport protocol configuration. See 
-            <xref linkend="clustering-blocks-jgroups-sharedtransport"/>.</para>
-            </listitem>
-            <listitem>
-		    <para><emphasis role="bold">bind_addr</emphasis> specifies the interface 
-              on which to receive and send messages.
-              By default, JGroups uses the value of system property <literal>jgroups.bind_addr</literal>. 
-              This can also be set with the <literal>-b</literal> command line switch.
-              See <xref linkend="jgroups-other"/> for more on binding JGroups 
-              sockets.</para>
-            </listitem>
-            <listitem>
-              <para>
-                <emphasis role="bold">receive_on_all_interfaces</emphasis> specifies whether this node
-                should listen on all interfaces for multicasts. The default is <literal>false</literal>.
-                It overrides the <literal>bind_addr</literal> property for receiving multicasts.
-                However, <literal>bind_addr</literal> (if set) is still used to send multicasts.</para>
-            </listitem>
-            <listitem><para><emphasis role="bold">send_on_all_interfaces</emphasis> specifies whether this node sends UDP packets via all available network interface controllers, if your machine has 
-            multiple network interface controllers available. This means that the same multicast message is sent N times, so use with care.
-             </para>
-          </listitem>
-       
-          <listitem>
-             <para><emphasis role="bold">receive_interfaces</emphasis> specifies a list of of interfaces on which to receive multicasts. The multicast receive socket will listen on all of these interfaces. This is a comma-separated list of IP addresses or interface names, for example, <literal>192.168.5.1,eth1,127.0.0.1</literal>.
-             </para>
-          </listitem>  
-       
-          <listitem>
-             <para><emphasis role="bold">send_interfaces</emphasis> specifies a 
-             list of of interfaces via which to send multicasts. The multicast 
-             sender socket will send on all of these interfaces. This is a 
-             comma-separated list of IP addresses or interface names, for example, 
-             <literal>192.168.5.1,eth1,127.0.0.1</literal>.This means that the 
-             same multicast message is sent N times, so use with care.
-             </para>
-          </listitem>   
-            <listitem>
-              <para><emphasis role="bold">enable_bundling</emphasis> specifies 
-              whether to enable message bundling.
-              If <literal>true</literal>, the tranpsort protocol queues outgoing messages until
-              <literal>max_bundle_size</literal> bytes have accumulated, or
-              <literal>max_bundle_time</literal> milliseconds have elapsed, whichever occurs
-              first. Then the transport protocol bundles queued messages into one 
-              large message and sends it. The messages are
-              unbundled at the receiver. The default is <literal>false</literal>.</para>
-              <para>Message bundling can have significant performance benefits for channels
-              that are used for high volume sending of messages where the sender does
-              not block waiting for a response from recipients (for example, a JBoss Cache
-              instance configured for <literal>REPL_ASYNC</literal>.) It can add considerable latency
-              to applications where senders need to block waiting for responses, so
-              it is not recommended for certain situations, such as where a JBoss Cache 
-              instance is configured for <literal>REPL_SYNC</literal>.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">loopback</emphasis> specifies whether the thread sending a message
-              to the group should itself carry the message back up the stack for delivery. (Messages sent to
-              the group are always delivered to the sending node as well.) If 
-              <literal>false</literal>, the sending thread does not carry the message;
-              the transport protocol waits to read the message off the network
-              and uses one of the message delivery pool threads for delivery.
-              The default is <literal>false</literal>, but <literal>true</literal> is recommended 
-              to ensure that the channel receives its own messages, in case the network interface 
-              goes down.</para>
-            </listitem>
-            
-            
-            <listitem>
-              <para><emphasis role="bold">discard_incompatible_packets</emphasis> specifies 
-              whether to discard packets sent by peers that use a different version of JGroups. 
-              Each message in the cluster is tagged with a JGroups version. If <literal>discard_incompatible_packets</literal> is set to <literal>true</literal>, 
-              messages received from different versions of JGroups will be silently discarded. 
-              Otherwise, a warning will be logged. <emphasis>In no case will the message be delivered.</emphasis> The default value is <literal>false</literal>.
-              </para>
-            </listitem>
-	    <listitem>
-          <para><emphasis role="bold">enable_diagnostics</emphasis> specifies that the transport 
-          should open a multicast socket on address <literal>diagnostics_addr</literal> and port 
-          <literal>diagnostics_port</literal> to listen for diagnostic requests
-          sent by the JGroups <ulink url="http://www.jboss.org/community/wiki/Probe"><emphasis role="bold">Probe</emphasis> utility</ulink>.
-          </para>
-       </listitem>
-       
-       <listitem>
-         <para>The various <emphasis role="bold">thread_pool</emphasis> 
-         attributes configure the behavior of the pool of threads JGroups uses
-         to carry ordinary incoming messages up the stack. The various attributes
-         provide the constructor arguments for an instance of 
-         <literal>java.util.concurrent.ThreadPoolExecutorService</literal>.
-         In the example above, the pool will have a minimum or <emphasis>core size</emphasis> 
-         of 8 threads, and a maximum size of 200. If more than
-         8 pool threads have been created, a thread returning from carrying
-         a message will wait for up to 5000 milliseconds to be assigned a new message to
-         carry, after which it will terminate. If no threads are available to
-         carry a message, the (separate) thread reading messages off the socket
-         will place messages in a queue; the queue will hold up to 1000 messages.
-         If the queue is full, the thread reading messages off the socket will
-         discard the message.</para>
-       </listitem>
-       
-       <listitem>
-          <para>The various <emphasis role="bold">oob_thread_pool</emphasis> attributes
-          are similar to the <emphasis role="bold">thread_pool</emphasis> attributes in that
-          they configure a <literal>java.util.concurrent.ThreadPoolExecutorService</literal>
-          used to carry incoming messages up the protocol stack. In this case,
-          the pool is used to carry a special type of message known as an Out-Of-Band (OOB)
-          message. OOB messages are exempt from the ordered-delivery
-          requirements of protocols like NAKACK and UNICAST and thus can be delivered
-          up the stack even if NAKACK or UNICAST are queueing up messages from
-          a particular sender. OOB messages are often used internally by JGroups
-          protocols and can be used by applications as well. For example, when JBoss Cache is in <literal>REPL_SYNC</literal> mode, it uses OOB messages for the second phase of its
-          two-phase-commit protocol.</para>
-       </listitem>
-	    
-          </itemizedlist>
+          <para>The following attributes are common to all transport protocols:</para>
+          <variablelist>
+            <varlistentry>
+              <term><varname>singleton_name</varname></term>
+              <listitem><para>The unique name of this transport protocol configuration. The
+              <classname>ChannelFactory</classname> uses this to share transport protocol
+              instances between different channels with the same transport protocol configuration.
+              See <xref linkend="clustering-blocks-jgroups-sharedtransport"/> for details.
+              </para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>bind_addr</varname></term>
+              <listitem><para>Specifies the interface that sends and receives messages. By default,
+              JGroups uses the value of system property <varname>jgroups.bind_addr</varname>.
+              This can be set with the <code>-b</code> command line switch. See
+              <xref linkend="jgroups-other"/> for more about binding JGroups sockets.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>receive_on_all_interfaces</varname></term>
+              <listitem><para>Specifies whether this node should listen on all interfaces for
+              multicasts. The default value is <literal>false</literal>. Setting this to <literal>true</literal>
+              overrides the <varname>bind_addr</varname> property for receiving multicasts.
+              (It does not override <varname>bind_addr</varname> for sending multicasts.)</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>send_on_all_interfaces</varname></term>
+              <listitem><para>Specifies whether this node sends UDP packets via all available network
+              interface controllers (NICs). If enabled, the same multicast message is sent once per
+              interface, so use this attribute with care.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>receive_interfaces</varname></term>
+              <listitem><para>A comma-separated list of interfaces on which to receive multicasts, 
+              for example, <literal>192.168.5.1,eth1,127.0.0.1</literal>. The multicast receive
+              socket will listen on all listed interfaces.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>send_interfaces</varname></term>
+              <listitem><para>A comma-separated list of interfaces on which to send multicasts,
+              for example, <literal>192.168.5.1,eth1,127.0.0.1</literal>. The multicast sender 
+              socket will send on all listed interfaces. The same multicast message will be sent
+              multiple times, so use with care.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>enable_bundling</varname></term>
+              <listitem><para>Specifies whether to enable message bundling. If 
+              <literal>true</literal>, the transport protocol queues outgoing messages
+              until <varname>max_bundle_size</varname> bytes have accumulated or 
+              <varname>max_bundle_time</varname> milliseconds have elapsed. The transport
+              protocol then bundles queued messages into one large message and sends it.
+              Messages are unbundled at the receiver. The default value is <literal>false</literal>.</para>
+              <para>Message bundling can improve performance where senders do not block waiting
+              for a response from recipients, for example, a JBoss Cache instance configured for
+              <literal>REPL_ASYNC</literal>. It adds latency to applications where senders must
+              block waiting for responses, so it is not recommended in some circumstances, for
+              example, a JBoss Cache instance configured for <literal>REPL_SYNC</literal>.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>loopback</varname></term>
+              <listitem><para>Specifies whether the thread sending a message to the group should itself
+              carry the message back up the stack for delivery. (Messages sent to the group are always
+              delivered to the sending node as well.) If <literal>false</literal>, the transport protocol
+              waits to read the message off the network and uses a message delivery pool thread instead of the
+              sending thread. <literal>false</literal> is the default, but <literal>true</literal> is recommended
+              to ensure that a channel receives its own messages should the network interface fail.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>discard_incompatible_packets</varname></term>
+              <listitem><para>Specifies whether to discard packets sent by peers that use a
+              different version of JGroups. If <literal>true</literal>, messages tagged with
+              a different JGroups version are silently discarded. If <literal>false</literal>,
+              a warning is logged. <emphasis>In neither case will the message be delivered.</emphasis>
+              The default is <literal>false</literal>.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>enable_diagnostics</varname></term>
+              <listitem><para>Specifies that the transport should open a multicast socket on
+              <varname>diagnostics_addr</varname> and <varname>diagnostics_port</varname>
+              to listen for diagnostic requests sent by the JGroups 
+              <application>Probe</application> utility.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>thread_pool</varname></term>
+              <listitem><para>The various <varname>thread_pool</varname> attributes configure the
+              behavior of the pool of threads JGroups uses to carry incoming messages up the stack.
+              They provide the constructor arguments for an instance of 
+              <classname>java.util.concurrent.ThreadPoolExecutor</classname>.</para>
+<screen>thread_pool.enabled="true"
+thread_pool.min_threads="8"
+thread_pool.max_threads="200"
+thread_pool.keep_alive_time="5000"
+thread_pool.queue_enabled="true"
+thread_pool.queue_max_size="1000"
+thread_pool.rejection_policy="discard"</screen>
+              <para>Here, the pool will have a minimum or <emphasis>core size</emphasis> of 8 threads, 
+              and a maximum size of 200. If more than 8 pool threads have been created, a thread 
+              returning from carrying a message will wait for up to 5000 milliseconds to be assigned
+              a new message to carry, after which it will terminate. If no thread is available to be
+              assigned a new message, the (separate) thread reading messages from the socket will
+              place messages in a queue (<varname>thread_pool.queue_enabled</varname>). This queue
+              will hold up to 1000 messages. If the queue is full, the thread reading messages 
+              from the socket will discard new messages.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>oob_thread_pool</varname></term>
+              <listitem><para>The various <varname>oob_thread_pool</varname> attributes are
+              similar to the <varname>thread_pool</varname> attributes in that they configure
+              an instance of <classname>java.util.concurrent.ThreadPoolExecutor</classname>
+              used to carry messages up the protocol stack. In this case, the pool carries a 
+              special type of message known as an Out-of-Band (OOB) message.</para>
+              <para>OOB messages are exempt from the ordered delivery requirements of 
+              protocols such as NAKACK and UNICAST, and can be delivered up the stack even if
+              messages are queued ahead of them. OOB messages are often used internally by
+              JGroups protocols. They can also be used by applications. For example, when
+              JBoss Cache is in <literal>REPL_SYNC</literal> mode, it uses OOB messages for the
+              second phase of its two-phase commit protocol.</para></listitem>
+            </varlistentry>
+          </variablelist>
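+          <para>As a sketch of how the OOB pool might be configured alongside the regular pool,
+          the <varname>oob_thread_pool</varname> attributes below mirror the
+          <varname>thread_pool</varname> example. The attribute names are assumed to follow the
+          same pattern as their <varname>thread_pool</varname> counterparts, and the values are
+          illustrative only, not recommended settings.</para>
+<screen>oob_thread_pool.enabled="true"
+oob_thread_pool.min_threads="8"
+oob_thread_pool.max_threads="200"
+oob_thread_pool.keep_alive_time="1000"
+oob_thread_pool.queue_enabled="false"
+oob_thread_pool.rejection_policy="discard"</screen>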
         </section>
 	
 	
         <section id="jgroups-transport-tcp">
           <title>TCP configuration</title>
-          <para>Alternatively, a JGroups-based cluster can also work over TCP connections. Compared with UDP,
-                        TCP generates more network traffic when the cluster size increases. TCP
-                        is fundamentally a unicast protocol. To send multicast messages, JGroups uses multiple TCP
-                        unicasts. To use TCP as a transport protocol, you should define a <literal>TCP</literal> element
-                        in the JGroups <literal>config</literal> element. Here is an example of the
-                        <literal>TCP</literal> element.</para>
-          <programlisting>
-&lt;TCP singleton_name="tcp" 
-        start_port="7800" end_port="7800"/&gt;
-                </programlisting>
+          <para>
+            Alternatively, a JGroups-based cluster can also work over TCP connections. Compared with UDP,
+            TCP generates more network traffic when the cluster size increases. TCP is fundamentally 
+            a unicast protocol. To send multicast messages, JGroups uses multiple TCP unicasts. To 
+            use TCP as a transport protocol, you should define a <literal>TCP</literal> element
+            in the JGroups <literal>Config</literal> element, like so:</para>
+          <programlisting><![CDATA[<TCP singleton_name="tcp" 
+        start_port="7800" end_port="7800"/>]]></programlisting>
           <para>The following attributes are specific to the <literal>TCP</literal> element:</para>
-          <itemizedlist>
-            <listitem>
-              <para>
-                <literal>start_port</literal> and <literal>end_port</literal> define the range of 
-                TCP ports to which the server should bind. The server socket is bound to the first 
-                available port beginning with <literal>start_port</literal>. If no available port is 
-                found (for example, because the ports are in use by other sockets) before the 
-                <literal>end_port</literal>, the server throws an exception. If no 
-                <literal>end_port</literal> is provided, or <literal>end_port</literal> is lower than 
-                <literal>start_port</literal>, no upper limit is applied to the port range. If 
-                <literal>start_port</literal> is equal to <literal>end_port</literal>, JGroups is 
-                forced to use the specified port, since <literal>start_port</literal> fails if the 
-                specified port in not available. The default value is <literal>7800</literal>. 
-                If set to <literal>0</literal>, the operating system will select a port. (This will only work for <literal>MPING</literal> or <literal>TCPGOSSIP</literal> discovery protocols. 
-                <literal>TCCPING</literal> requires that nodes and their required ports are listed.)
-              </para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">bind_port</emphasis> in TCP acts as an alias for <literal>start_port</literal>. If configured internally, it sets 
-              <literal>start_port</literal>.</para>
-            </listitem>
-            <listitem>
-		    <para><emphasis role="bold">recv_buf_size, send_buf_size</emphasis> define receive and send buffer sizes. It is good to have a large receiver buffer size, so packets are less likely to get dropped due to buffer overflow.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">conn_expire_time</emphasis> specifies the time (in milliseconds)
-                                after which a connection can be closed by the reaper if no traffic has been
-                            received.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">reaper_interval</emphasis> specifies interval (in milliseconds)
-                                to run the reaper. If both values are 0, no reaping will be done. If either value is
-				&gt; 0, reaping will be enabled. By default, reaper_interval is 0, which means no reaper.</para>
-            </listitem>
-	    <listitem>
-		    <para><emphasis role="bold">sock_conn_timeout</emphasis> specifies max time in millis for a socket creation. When doing the initial discovery, and a peer hangs, don't wait forever but go on after the timeout to ping other members. Reduces chances of *not* finding any members at all. The default is 2000.</para>
-    </listitem>
-	    <listitem>
-		    <para><emphasis role="bold">use_send_queues</emphasis> specifies whether to use separate send queues for each connection. This prevents blocking on write if the peer hangs. The default is true.</para>
-		      </listitem>
-      <listitem>
-	      <para><emphasis role="bold">external_addr</emphasis> specifies external IP address to broadcast to other group members (if different to local address). This is useful when you have use (Network Address Translation) NAT, e.g. a node on a private network, behind a firewall, but you can only route to it via an externally visible address, which is different from the local address it is bound to. Therefore, the node can be configured to broadcast its external address, while still able to bind to the local one. This avoids having to use the TUNNEL protocol, (and hence a requirement for a central gossip router) because nodes outside the firewall can still route to the node inside the firewall, but only on its external address. Without setting the external_addr, the node behind the firewall will broadcast its private address to the other nodes which will not be able to route to it.</para>
-		      </listitem>
-      <listitem>
-	      <para><emphasis role="bold">skip_suspected_members</emphasis> specifies whether unicast messages should not be sent to suspected members. The default is true.</para>
-		     </listitem>
-	<listitem>
-		<para><emphasis role="bold">tcp_nodelay</emphasis> specifies TCP_NODELAY. TCP by default nagles messages, that is, conceptually, smaller messages are bundled into larger ones. If we want to invoke synchronous cluster method calls, then we need to disable nagling in addition to disabling message bundling (by setting <literal>enable_bundling</literal> to false). Nagling is disabled by setting <literal>tcp_nodelay</literal> to true. The default is false.
-		</para>
-		     </listitem>
-	    
-    </itemizedlist>
+          <variablelist>
+            <varlistentry>
+              <term><varname>start_port</varname>, <varname>end_port</varname></term>
+              <listitem><para>Define the range of TCP ports to which the server should bind.
+              The server socket is bound to the first available port, beginning with
+              <varname>start_port</varname>. If no available port is found before the server
+              reaches <varname>end_port</varname>, the server throws an exception. If no
+              <varname>end_port</varname> is provided, or <varname>end_port</varname> is lower
+              than <varname>start_port</varname>, no upper limit is applied to the port range.
+              If <varname>start_port</varname> is equal to <varname>end_port</varname>, JGroups
+              is forced to use the port specified, and will fail if it is unavailable.
+              The default value of <varname>start_port</varname> is <literal>7800</literal>. If
+              <varname>start_port</varname> is set to <literal>0</literal>, the operating system
+              will select a port. (This works only for MPING or TCPGOSSIP.
+              TCPPING requires that nodes and their required ports are listed.)</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>bind_port</varname></term>
+              <listitem><para>Acts as an alias for <varname>start_port</varname>. If configured
+              internally, sets <varname>start_port</varname>.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>recv_buf_size</varname>, <varname>send_buf_size</varname></term>
+              <listitem><para>Define receive and send buffer sizes. A large buffer size means 
+              packets are less likely to be dropped due to buffer overflow.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>conn_expire_time</varname></term>
+              <listitem><para>Specifies the time in milliseconds after which a connection can be
+              closed by the reaper if no traffic has been received.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>reaper_interval</varname></term>
+              <listitem><para>Specifies the interval in milliseconds at which to run the reaper.
+              If both <varname>reaper_interval</varname> and <varname>conn_expire_time</varname>
+              are <literal>0</literal>, no reaping will be done. If either value is
+              greater than zero, reaping will be enabled. The reaper is disabled by default.
+              </para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>sock_conn_timeout</varname></term>
+              <listitem><para>Specifies the maximum time in milliseconds for socket creation.
+              When a peer hangs during initial discovery, instead of waiting forever, other members 
+              will be pinged after this timeout period. This reduces the chances of not finding
+              any members at all. The default value is <literal>2000</literal>.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>use_send_queues</varname></term>
+              <listitem><para>Specifies whether to use separate send queues for each connection.
+              This prevents blocking on write if the peer hangs. The default value is
+              <literal>true</literal>.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>external_addr</varname></term>
+              <listitem><para>Specifies an external IP address to broadcast to other group members
+              (if not the local address). This is useful for Network Address Translation (NAT).
+              Say a node on a private network exists behind a firewall, but can only be routed to
+              via an externally visible address, not the local address to which it is bound.
+              The node can be configured to broadcast its external address while remaining bound to 
+              the local one. This lets you avoid using the TUNNEL protocol and a central gossip 
+              router. Without setting the <varname>external_addr</varname>, the node behind the 
+              firewall broadcasts its private address to the other nodes, which will not be able to 
+              route to it.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>skip_suspected_members</varname></term>
+              <listitem><para>Specifies whether to skip sending unicast messages to suspected members.
+              The default value is <literal>true</literal>.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>tcp_nodelay</varname></term>
+              <listitem><para>
+                Specifies <literal>TCP_NODELAY</literal>. By default, TCP <emphasis>nagles</emphasis>
+                messages (bundles smaller messages together into a larger message). To invoke 
+                synchronous cluster method calls, we must disable nagling in addition to disabling 
+                message bundling. To do this, set <varname>tcp_nodelay</varname> to <literal>true</literal>
+                and <varname>enable_bundling</varname> to <literal>false</literal>. The default value
+                for <varname>tcp_nodelay</varname> is <literal>false</literal>.
+              </para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname></varname></term>
+              <listitem><para></para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname></varname></term>
+              <listitem><para></para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname></varname></term>
+              <listitem><para></para></listitem>
+            </varlistentry>
+          </variablelist>
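+          <para>As an illustration of how several of these attributes combine, the following
+          sketch configures a node behind NAT for synchronous cluster calls. The address
+          <literal>203.0.113.10</literal> is a placeholder, and the values shown are assumptions
+          chosen for illustration rather than recommended settings.</para>
+          <programlisting><![CDATA[<TCP singleton_name="tcp"
+        start_port="7800" end_port="7800"
+        external_addr="203.0.113.10"
+        sock_conn_timeout="2000"
+        tcp_nodelay="true" enable_bundling="false"/>]]></programlisting>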
+
        <note>
           <para>All of the attributes common to all protocols discussed in
-          the UDP protocol section also apply to TCP.</para>
+            the UDP protocol section also apply to TCP.</para>
        </note>
-        </section>
+      </section>
         
         
 	
         <section id="jgroups-transport-tunnel">
           <title>TUNNEL configuration</title>
-          <para>The <literal>TUNNEL</literal> protocol uses an external router process to send messages. The external router is a Java process that runs the            <literal>org.jgroups.stack.GossipRouter</literal> main class. Each node has to register with the router. All messages are sent to the router and forwarded on to their destinations. The TUNNEL approach can be used to set up communication with nodes behind firewalls. A node can establish a TCP connection to the <classname>GossipRouter</classname> through the firewall (you can use port 80). This connection is also used by the router to send messages to nodes behind the firewall, as most firewalls do not permit outside hosts to initiate a TCP connection to a host inside the firewall. The <literal>TUNNEL</literal> configuration is defined in the <literal>TUNNEL</literal> element within the JGroups <literal>&lt;config&gt;</literal> element, like so:
-	  </para>
+          <para>
+            The TUNNEL protocol uses an external router known as the <classname>GossipRouter</classname>
+            to send messages. Each node must register with this router. All messages are sent to the
+            router and forwarded to their destinations. The TUNNEL approach can be used to set up
+            communication with nodes behind firewalls. A node can establish a TCP connection to the
+            <classname>GossipRouter</classname> through the firewall, for example on port 80. This connection is
+            also used by the router to send messages to nodes behind the firewall, since most firewalls
+            do not permit outside hosts to initiate a TCP connection to a host inside the firewall.
+            The TUNNEL configuration is defined in the <literal>TUNNEL</literal> sub-element in the
+            JGroups <literal>Config</literal> element, like so:
+      	  </para>
           
 	  
-	  <programlisting>
-&lt;TUNNEL  singleton_name="tunnel"
+	  <programlisting>&lt;TUNNEL  singleton_name="tunnel"
             router_port="12001"
-            router_host="192.168.5.1"/&gt;
-                </programlisting>
+            router_host="192.168.5.1"/&gt;</programlisting>
 
 		
 <para>The available attributes in the <literal>TUNNEL</literal> element are listed below.</para>
-          <itemizedlist>
-            <listitem>
-              <para><emphasis role="bold">router_host</emphasis> specifies the host on which the
-                                GossipRouter is running.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">router_port</emphasis> specifies the port on which the
-                                GossipRouter is listening.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">reconnect_interval</emphasis> specifies the interval of time (in milliseconds) for which <literal>TUNNEL</literal> will attempt to connect to the <classname>GossipRouter</classname> if the connection is not established. The default value is <literal>5000</literal>.</para>
-            </listitem>
-          </itemizedlist>
+<variablelist>
+  <varlistentry>
+    <term><varname>router_host</varname></term>
+    <listitem>
+      <para>
+        Specifies the host on which the <classname>GossipRouter</classname> runs.
+      </para>
+    </listitem>
+  </varlistentry>
+  <varlistentry>
+    <term><varname>router_port</varname></term>
+    <listitem>
+      <para>
+        Specifies the port on which the <classname>GossipRouter</classname> listens.
+      </para>
+    </listitem>
+  </varlistentry>
+  <varlistentry>
+    <term><varname>reconnect_interval</varname></term>
+    <listitem><para>
+      Specifies the interval in milliseconds at which <literal>TUNNEL</literal> will attempt to connect
+      to the <classname>GossipRouter</classname> if the connection is not established.
+    </para></listitem>
+  </varlistentry>
+</variablelist>
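+<para>A sketch of the <literal>TUNNEL</literal> example above, extended with
+<varname>reconnect_interval</varname>. The value of <literal>5000</literal> milliseconds is
+chosen for illustration only.</para>
+<programlisting><![CDATA[<TUNNEL singleton_name="tunnel"
+        router_host="192.168.5.1"
+        router_port="12001"
+        reconnect_interval="5000"/>]]></programlisting>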
+
           <note>
-          <para>All of the attributes common to all protocols discussed in
-          the UDP protocol section also apply to <literal>TUNNEL</literal>.</para>
-       </note>
+            <para>All of the attributes common to all protocols discussed in
+            the UDP protocol section also apply to <literal>TUNNEL</literal>.</para>
+          </note>
         </section>
       </section>
       
@@ -538,19 +586,27 @@
       the group). Discovery protocols are used to find active nodes in the cluster and 
       to determine which is the coordinator. This information is then provided to the 
       group membership protocol (GMS), which communicates with the coordinator's GMS to 
-      add the newly-connecting node to the group. (For more information about group membership protocols, see <xref linkend="jgroups-other-gms"/>.)
+      add the newly-connecting node to the group. (For more information about group 
+      membership protocols, see <xref linkend="jgroups-other-gms"/>.)
     </para>
     <para>
       Discovery protocols also assist merge protocols (see <xref linkend="jgroups-other-merge"/>) 
       to detect cluster-split situations.</para>
     <para>
-      The discovery protocols sit on top of the transport protocol, so you can choose to use different discovery protocols depending on your transport protocol. These are also configured as sub-elements in the JGroups <literal>&lt;config&gt;</literal> element.
+      The discovery protocols sit on top of the transport protocol, so you can choose to use different discovery protocols depending on your transport protocol. These are also configured as sub-elements in the JGroups <literal>&lt;Config&gt;</literal> element.
     </para>
 	    
         <section id="jgroups-discovery-ping">
           <title>PING</title>
-	  <para>
-		  PING is a discovery protocol that works by either multicasting PING requests to an IP multicast address or connecting to a gossip router. As such, PING normally sits on top of the UDP or TUNNEL transport protocols. Each node responds with a packet {C, A}, where C=coordinator's address and A=own address. After timeout milliseconds or num_initial_members replies, the joiner determines the coordinator from the responses, and sends a JOIN request to it (handled by). If nobody responds, we assume we are the first member of a group.
+<para>
+		  <literal>PING</literal> is a discovery protocol that works by either multicasting 
+      <literal>PING</literal> requests to an IP multicast address or connecting to a gossip router. 
+      As such, <literal>PING</literal> normally sits on top of the UDP or TUNNEL transport protocols. 
+      Each node responds with a packet <literal>{C, A}</literal>, where <literal>C</literal> is the 
+      coordinator's address, and <literal>A</literal> is the node's own address. After 
+      <varname>timeout</varname> milliseconds or <varname>num_initial_members</varname> replies, the 
+      joiner determines the coordinator from the responses, and sends a JOIN request to it (handled by GMS). 
+      If no node responds, it assumes it is the first member of a group.
 	  </para>
 	  <para>Here is an example PING configuration for IP multicast. 
 	  </para>
@@ -567,41 +623,46 @@
       num_initial_members="3"/>]]>
 </programlisting>		
           <para>The available attributes in the <literal>PING</literal> element are listed below.</para>
-          <itemizedlist>
-            <listitem>
-              <para><emphasis role="bold">timeout</emphasis> specifies the maximum number of milliseconds
-		      to wait for any responses. The default is 3000.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">num_initial_members</emphasis> specifies the maximum number of
-		      responses to wait for unless timeout has expired. The default is 2.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">gossip_host</emphasis> specifies the host on which the
-                                GossipRouter is running.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">gossip_port</emphasis> specifies the port on which the
-                                GossipRouter is listening on.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">gossip_refresh</emphasis> specifies the interval (in
-		      milliseconds) for the lease from the GossipRouter. The default is 20000.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">initial_hosts</emphasis> is a comma-separated list of addresses or ports (for example, <literal>host1[12345],host2[23456]</literal>) which are pinged for
-              discovery. Default is <literal>null</literal>, meaning multicast
-              discovery should be used. If <literal>initial_hosts</literal>
-              is specified, you must list all possible cluster members, not just a few well-known hosts, or <literal>MERGE2</literal> cluster split discovery will not work reliably.</para>
-            </listitem>
-          </itemizedlist>
-          <para>If both <literal>gossip_host</literal> and <literal>gossip_port</literal> are defined, the
-          cluster uses the GossipRouter for the initial discovery. If the <literal>initial_hosts</literal>
-          is specified, the cluster pings that static list of addresses for discovery. Otherwise, the
-          cluster uses IP multicasting for discovery.</para>
+          <variablelist>
+        <varlistentry>
+          <term><varname>timeout</varname></term>
+          <listitem><para>Specifies the maximum number of milliseconds to wait for any responses.
+          The default value is <literal>3000</literal>.</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term><varname>num_initial_members</varname></term>
+          <listitem><para>Specifies the maximum number of responses to wait for unless the 
+          <varname>timeout</varname> has expired. The default value is <literal>2</literal>.</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term><varname>gossip_host</varname></term>
+          <listitem><para>Specifies the host on which the <classname>GossipRouter</classname> is running.
+          </para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term><varname>gossip_port</varname></term>
+          <listitem><para>Specifies the port on which the <classname>GossipRouter</classname> is listening.
+          </para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term><varname>gossip_refresh</varname></term>
+          <listitem><para>Specifies the interval, in milliseconds, for the lease from the 
+          <classname>GossipRouter</classname>. The default value is <literal>20000</literal>.</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term><varname>initial_hosts</varname></term>
+          <listitem><para>A comma-separated list of addresses to ping for discovery, for example,
+          <literal>host1[12345],host2[23456]</literal>.</para></listitem>
+        </varlistentry>
+      </variablelist>
+
+          <para>If both <varname>gossip_host</varname> and <varname>gossip_port</varname> are defined, the
+          cluster uses the <classname>GossipRouter</classname> for the initial discovery. If  
+          <varname>initial_hosts</varname> is specified, the cluster pings that static list of addresses 
+          for discovery. Otherwise, the cluster uses IP multicasting for discovery.</para>
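+          <para>For the static-list case, a configuration might look like the following sketch.
+          The host names and ports are placeholders taken from the <varname>initial_hosts</varname>
+          description, and the timing values are illustrative.</para>
+          <programlisting><![CDATA[<PING timeout="2000"
+      num_initial_members="3"
+      initial_hosts="host1[12345],host2[23456]"/>]]></programlisting>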
           <note>
-            <para>The discovery phase returns when the <literal>timeout</literal> ms have elapsed or the
-                                <literal>num_initial_members</literal> responses have been received.</para>
+            <para>The discovery phase returns when the <varname>timeout</varname> period has elapsed or 
+            <varname>num_initial_members</varname> responses have been received.</para>
           </note>
         </section>
 	
@@ -609,87 +670,95 @@
 	
         <section id="jgroups-discovery-tcpgossip">
           <title>TCPGOSSIP</title>
-          <para>The TCPGOSSIP protocol only works with a GossipRouter. It works essentially the same way as
-                        the PING protocol configuration with valid <literal>gossip_host</literal> and
-			<literal>gossip_port</literal> attributes. It works on top of both UDP and TCP transport protocols. Here is an example.</para>
+          <para>The <literal>TCPGOSSIP</literal> protocol only works with a 
+          <classname>GossipRouter</classname>. It works similarly to the <literal>PING</literal> 
+          protocol configuration with valid <varname>gossip_host</varname> and 
+          <varname>gossip_port</varname> attributes. It works on top of both 
+          UDP and TCP transport protocols. Here is an example:</para>
 <programlisting><![CDATA[<TCPGOSSIP timeout="2000"
        num_initial_members="3"
        initial_hosts="192.168.5.1[12000],192.168.0.2[12000]"/>]]>
 </programlisting>
 
 
-          <para>The available attributes in the <literal>TCPGOSSIP</literal> element are listed below.</para>
-          <itemizedlist>
-            <listitem>
-              <para><emphasis role="bold">timeout</emphasis> specifies the maximum number of milliseconds
-		      to wait for any responses. The default is 3000.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">num_initial_members</emphasis> specifies the maximum number of
-		      responses to wait for unless timeout has expired. The default is 2.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">initial_hosts</emphasis> is a comma-separated list of addresses/ports
-              (for example, <literal>host1[12345],host2[23456]</literal>) of <literal>GossipRouter</literal>s to register</para>
-            </listitem>
-          </itemizedlist>
+<para>The available attributes in the <literal>TCPGOSSIP</literal> element are listed below.</para>
+          <variablelist>
+            <varlistentry><term><varname>timeout</varname></term>
+              <listitem>
+                <para>Specifies the maximum number of milliseconds to wait for any responses. 
+                The default value is <literal>3000</literal>.</para>
+              </listitem>
+            </varlistentry>
+            <varlistentry><term><varname>num_initial_members</varname></term>
+              <listitem>
+                <para>Specifies the maximum number of responses to wait for unless <varname>timeout</varname> 
+                has expired. The default value is <literal>2</literal>.</para>
+              </listitem>
+            </varlistentry>
+            <varlistentry><term><varname>initial_hosts</varname></term>
+              <listitem>
+                <para>A comma-separated list of addresses of <classname>GossipRouter</classname>s to register
+                with, for example, <literal>host1[12345],host2[23456]</literal>.</para>
+              </listitem>
+            </varlistentry>
+          </variablelist>
         </section>
 	
 	
 	
         <section id="jgroups-discovery-tcpping">
           <title>TCPPING</title>
-          <para>The TCPPING protocol takes a set of known members and pings them for discovery. This is
-                        essentially a static configuration. It works on top of TCP. Here is an example of the
-                            <literal>TCPPING</literal> configuration element in the JGroups <literal>config</literal>
-                        element.</para>
+          <para>The <literal>TCPPING</literal> protocol takes a set of known members and pings them for 
+          discovery. This is a static configuration. It works on top of TCP. Here is an example of the
+          <literal>TCPPING</literal> configuration sub-element in the JGroups <literal>Config</literal>
+          element.</para>
           <programlisting>&lt;TCPPING timeout="2000"
      num_initial_members="3"
      initial_hosts="hosta[2300],hostb[3400],hostc[4500]"
      port_range="3"/&gt;
 </programlisting>
 
-
-          <para>The available attributes in the <literal>TCPPING</literal> element are listed below.</para>
-          <itemizedlist>
-            <listitem>
-              <para><emphasis role="bold">timeout</emphasis> specifies the maximum number of milliseconds
-		      to wait for any responses. The default is 3000.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">num_initial_members</emphasis> specifies the maximum number of
-		      responses to wait for unless timeout has expired. The default is 2.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">initial_hosts</emphasis> is a comma-seperated list of addresses
-                    (for example, <literal>host1[12345],host2[23456]</literal>) for pinging.</para>
-            </listitem>
-            <listitem>
-		    <para>
-			    <emphasis role="bold">port_range</emphasis> specifies the number of consecutive ports to be probed when getting the initial membership, starting with the port specified in the <varname>initial_hosts</varname> parameter. Given the current values of <literal>port_range</literal> and <literal>initial_hosts</literal> above, the <literal>TCPPING</literal> layer will try to connect to <literal>hosta[2300]</literal>, <literal>hosta[2301]</literal>, <literal>hosta[2302]</literal>, <literal>hostb[3400]</literal>, <literal>hostb[3401]</literal>, <literal>hostb[3402]</literal>, <literal>hostc[4500]</literal>, <literal>hostc[4501]</literal>, and <literal>hostc[4502]</literal>. This configuration option allows for multiple possible ports on the same host to be pinged without having to spell out all possible combinations.
-                If in your TCP protocol configuration your <literal>end_port</literal> is greater than your <literal>start_port</literal>, we recommend using a TCPPING <literal>port_range</literal> equal to the difference, to ensure
-                a node is pinged no matter which port it is bound to within the allowed range.
-		    </para>
-            </listitem>
-          </itemizedlist>
+<para>The available attributes in the <literal>TCPPING</literal> element are listed below.</para>
+          <variablelist>
+            <varlistentry>
+              <term><varname>timeout</varname></term>
+              <listitem><para>Specifies the maximum number of milliseconds to wait for any responses.
+              The default value is <literal>3000</literal>.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>num_initial_members</varname></term>
+              <listitem><para>Specifies the maximum number of responses to wait for unless the
+              <varname>timeout</varname> has expired. The default value is <literal>2</literal>.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>initial_hosts</varname></term>
+              <listitem><para>A comma-separated list of addresses to ping for discovery, for example,
+              <literal>host1[12345],host2[23456]</literal>.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>port_range</varname></term>
+              <listitem><para>Specifies the number of consecutive ports to be probed when getting the
+              initial membership, starting from the port specified in the <varname>initial_hosts</varname>
+              parameter. With the values of <varname>port_range</varname> and <varname>initial_hosts</varname>
+              shown in the example code, the <literal>TCPPING</literal> layer will try to connect to
+              <literal>hosta:2300</literal>, <literal>hosta:2301</literal>, <literal>hosta:2302</literal>, 
+              <literal>hostb:3400</literal>, <literal>hostb:3401</literal>, <literal>hostb:3402</literal>,
+              <literal>hostc:4500</literal>, <literal>hostc:4501</literal> and <literal>hostc:4502</literal>.
+              This option allows multiple possible ports on the same host to be pinged without listing every combination.
+              </para></listitem>
+            </varlistentry>
+          </variablelist>
         </section>
 	
-	
-	
-	
         <section id="jgroups-discovery-mping">
           <title>MPING</title>
-	  <para>
-        <literal>MPING</literal> uses IP multicast to discover the initial membership. Unlike the 
-        other discovery protocols, which delegate the sending and receiving of 
-        discovery messages on the network to the transport protocol, <literal>MPING</literal>
-        opens its own sockets to send and receive multicast discovery messages.
-        As a result it can be used with all transports, but it is most often used 
-        with <literal>TCP</literal>. <literal>TCP</literal> usually requires 
-        <literal>TCPPING</literal>, which must explicitly list all possible group members. 
-        <literal>MPING</literal> does not have this requirement, and is typically used where 
-        <literal>TCP</literal> is required for regular message transport, and UDP multicasting 
-        is allowed for discovery.
+<para>
+		  <literal>MPING</literal> uses IP multicast to discover the initial membership. It can be used 
+      with all transports, but it is usually used in combination with TCP. TCP usually requires 
+      <literal>TCPPING</literal>, which has to list all group members explicitly, but <literal>MPING</literal>
+      does not have this requirement. The typical use case is a cluster that needs TCP as its transport 
+      but can use multicast for discovery, so that you do not have to define a static list of initial hosts in 
+      <literal>TCPPING</literal> or require an external <classname>GossipRouter</classname>. 
 	</para>
 
 <programlisting>
@@ -701,34 +770,57 @@
     ip_ttl="8"/&gt;
 </programlisting>
 
-
           <para>The available attributes in the <literal>MPING</literal> element are listed below.</para>
-          <itemizedlist>
+        <variablelist>
+          <varlistentry>
+            <term><varname>timeout</varname></term>
+            <listitem><para>Specifies the maximum number of milliseconds to wait for any responses.
+            The default value is <literal>3000</literal>.</para></listitem>
+          </varlistentry>
+          <varlistentry>
+            <term><varname>num_initial_members</varname></term>
+            <listitem><para>Specifies the maximum number of responses to wait for unless
+            <varname>timeout</varname> has expired. The default value is <literal>2</literal>.</para></listitem>
+          </varlistentry>
+          <varlistentry>
+            <term><varname>bind_addr</varname></term>
+            <listitem><para>Specifies the interface on which to send and receive multicast packets.</para></listitem>
+          </varlistentry>
+          <varlistentry>
+            <term><varname>bind_to_all_interfaces</varname></term>
+            <listitem><para>Overrides the <varname>bind_addr</varname> value and uses all interfaces
+            on multihomed nodes.</para></listitem>
+          </varlistentry>
+          <varlistentry>
+            <term><varname>mcast_addr</varname></term>
             <listitem>
-              <para><emphasis role="bold">timeout</emphasis> specifies the maximum number of milliseconds
-		      to wait for any responses. The default is 3000.</para>
+              <para>
+                Specifies the multicast address <!--(class D)--> for joining a cluster. If omitted,
+                the default is <literal>228.8.8.8</literal>.
+              </para>
             </listitem>
+          </varlistentry>
+          <varlistentry>
+            <term><varname>mcast_port</varname></term>
             <listitem>
-              <para><emphasis role="bold">num_initial_members</emphasis> specifies the maximum number of
-		      responses to wait for unless timeout has expired. The default is 2..</para>
+              <para>
+                Specifies the multicast port number. If omitted, the default is
+                <literal>45566</literal>.
+              </para>
             </listitem>
+          </varlistentry>
+          <varlistentry>
+            <term><varname>ip_ttl</varname></term>
             <listitem>
               <para>
-                <emphasis role="bold">bind_addr</emphasis> specifies the interface on which to send
-                and receive multicast packets. By default JGroups uses the value of the system property <literal>jgroups.bind_addr</literal>, which can be set with the <code>-b</code> 
-                command line switch. See <xref linkend="jgroups-other"/> for more on binding JGroups 
-                sockets.
+                Specifies the <emphasis>time to live</emphasis> (TTL) for IP multicast packets.
+                TTL is the common term in multicast networking, but the value actually refers to
+                how many network hops a packet will be allowed to travel before networking
+                equipment drops it.
               </para>
             </listitem>
-            <listitem>
-              <para><emphasis role="bold">bind_to_all_interfaces</emphasis> overrides the
-                                    <literal>bind_addr</literal> and uses all interfaces in multihome nodes.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">mcast_addr, mcast_port, ip_ttl</emphasis> attributes are the
-                                same as related attributes in the UDP protocol configuration.</para>
-            </listitem>
-          </itemizedlist>
+          </varlistentry>
+        </variablelist>
         </section>
       </section>
       
@@ -736,16 +828,24 @@
       
       <section id="jgroups-fd">
         <title>Failure Detection Protocols</title>
-          <para>
-            The failure detection protocols are used to detect failed nodes. Once a failed node is detected, a <emphasis>suspect verification</emphasis> phase can occur. If the node is still considered dead after this phase is complete, the cluster updates its membership view so that further messages are not sent to the failed node. The service using JGroups is informed that the node is no longer part of the cluster. Failure detection protocols are configured as sub-elements in the JGroups <literal>&lt;config&gt;</literal> element.
-          </para>
+	<para>The failure detection protocols are used to detect failed nodes. Once a failed node is detected, 
+  a suspect verification phase can occur. If the node is still considered dead after this phase, the 
+  cluster updates its view so that the load balancer and client interceptors know to avoid the dead node. 
+  The failure detection protocols are configured as sub-elements in the JGroups MBean 
+  <literal>Config</literal> element.</para>
         
 		
 		
 <section id="jgroups-fd-fd">
           <title>FD</title>
 	  <para>
-		  <literal>FD</literal> is a failure detection protocol based on 'heartbeat' messages. This protocol requires that eat node periodically ping its neighbour. If the neighbour fails to respond, the calling node sends a <literal>SUSPECT</literal> message to the cluster. The current group coordinator can optionally verify that the suspected node is dead (<literal>VERIFY_SUSPECT</literal>). If the node is still considered dead after this verification step, the coordinator updates the cluster's membership view. The following is an example of <literal>FD</literal> configuration:
+		  <literal>FD</literal> is a failure detection protocol based on <emphasis>heartbeat</emphasis>
+      messages. This protocol requires each node to periodically send messages to its neighbour to check 
+      that the neighbour is alive. If the neighbour fails to respond, the calling node sends a SUSPECT 
+      message to the cluster. 
+      The current group coordinator can optionally double check whether the suspected node is indeed 
+      dead. If the node is still considered dead after this check, the group coordinator updates the
+      cluster's view. Here is an example FD configuration:
 	  </para>
 
 <programlisting>
@@ -755,23 +855,30 @@
 </programlisting>
 		
 		
-          <para>The available attributes in the <literal>FD</literal> element are listed below.</para>
-          <itemizedlist>
-            <listitem>
-              <para><emphasis role="bold">timeout</emphasis> specifies the maximum number of milliseconds
-		      to wait for the responses to the are-you-alive messages. The default is 3000.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">max_tries</emphasis> specifies the number of missed
-		      are-you-alive messages from a node before the node is suspected. The default is 2.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">shun</emphasis> specifies whether a failed node will be forbidden from sending messages to the group without formally rejoining. A shunned node would need to rejoin the cluster via the discovery process. JGroups allows applications to configure a channel such that, when a channel is shunned, the process of rejoining the cluster and transferring state. (This is default behavior for JBoss Application Server.)</para>
-            </listitem>
-          </itemizedlist>
+             <para>The available attributes in the <literal>FD</literal> element are listed below.</para>
+    <variablelist>
+      <varlistentry>
+        <term><varname>timeout</varname></term>
+        <listitem><para>Specifies the maximum number of milliseconds to wait for a response to
+        the heartbeat messages. The default value is <literal>3000</literal>.</para></listitem>
+      </varlistentry>
+      <varlistentry>
+        <term><varname>max_tries</varname></term>
+        <listitem><para>Specifies the number of heartbeat messages that a node can fail to reply to
+        before the node is suspected. The default value is <literal>2</literal>.</para></listitem>
+      </varlistentry>
+      <varlistentry>
+        <term><varname>shun</varname></term>
+        <listitem><para>Specifies whether a failed node will be shunned. Once shunned, the node will
+        be expelled from the cluster even if it is later revived. The shunned node would have to rejoin
+        the cluster through the discovery process. You can configure JGroups so that shunning leads to 
+        automatic rejoins and state transfer (the default behavior).</para></listitem>
+      </varlistentry>
+    </variablelist>
+
           <note>
-            <para>
-              Regular traffic from a node is proof of life, so heartbeat messages are only sent when no regular traffic is detected on the node for a long period of time.</para>
+            <para>Normal node traffic is considered proof of life, so heartbeat messages are
+            sent only when there is no normal traffic to the node for some time.</para>
           </note>
         </section>
 	
@@ -779,30 +886,39 @@
         <section id="jgroups-fd-fdsock">
           <title>FD_SOCK</title>
 	  <para>
-        <literal>FD_SOCK</literal> is a failure detection protocol based on a ring of TCP sockets created between group members. Each member in a group connects to its neighbor, with the final member connecting to the first, forming a ring. Node B becomes suspected when its neighbour, Node A, detects an abnormally closed TCP socket, presumably due to a crash in Node B. (When nodes intend to leave the group, they inform their neighbours so that they do not become suspected.)
-      </para>
-      <para>
-        The simplest <literal>FD_SOCK</literal> configuration does not take any attribute. You can declare an empty <literal>FD_SOCK</literal> element in the JGroups <literal>&lt;config&gt;</literal> element.
-      </para>
+      <literal>FD_SOCK</literal> is a failure detection protocol based on a ring of TCP sockets 
+      created between group members. Each member in a group connects to its neighbour (the last member 
+      connects to the first), thus forming a ring. Member B is suspected when its neighbour A detects an 
+      abnormally closed TCP socket (presumably due to a crash of node B). However, if member B is 
+      about to leave gracefully, it lets its neighbour A know, so that B does not become suspected. 
+      The simplest <literal>FD_SOCK</literal> configuration does not take any attributes. You can 
+      just declare an empty <literal>FD_SOCK</literal> element in the JGroups 
+      <literal>Config</literal> element.</para>
           
 <programlisting>
 &lt;FD_SOCK/&gt;
 </programlisting>
           
-<para>The attributes available to the <literal>FD_SOCK</literal> element are listed below.</para>
-          <itemizedlist>
-            <listitem>
-		    <para><emphasis role="bold">bind_addr</emphasis> specifies the interface to which the server socket should be bound. By default, JGroups uses the value of the system property <literal>jgroups.bind_addr</literal>. This system property can be set with the <code>-b</code> command line switch. For more information about binding JGroups sockets, see <xref linkend="jgroups-other"/>.</para>
-            </listitem>
-          </itemizedlist>
-	  
+<para>The available attributes in the <literal>FD_SOCK</literal> element are listed below.</para>
+    <variablelist>
+      <varlistentry>
+        <term><varname>bind_addr</varname></term>
+        <listitem><para>Specifies the interface to which the server socket should bind. If the
+        <code>-Djgroups.bind_addr</code> system property is defined, this XML value is ignored. 
+        This behavior can be reversed by setting the <code>-Djgroups.ignore.bind_addr=true</code> 
+        system property.</para></listitem>
+      </varlistentry>
+    </variablelist>	  
         </section>
         
 	
 	
 	<section><title>VERIFY_SUSPECT</title>
 		<para>
-			This protocol verifies whether a suspected member is really dead by pinging that member once again. This verification is performed by the coordinator of the cluster. The suspected member is dropped from the cluster group if confirmed to be dead. The aim of this protocol is to minimize false suspicions. Here's an example.
+			This protocol verifies that a suspected member is dead by pinging it a second time.
+      This verification is performed by the coordinator of the cluster. The suspected member 
+      is dropped from the cluster group if confirmed dead. The aim of this protocol is to 
+      minimize false suspicions. See the following code for an example:
 		</para>
 
 <programlisting><![CDATA[			
@@ -812,109 +928,104 @@
 	<para>
 		The available attributes in the <literal>VERIFY_SUSPECT</literal> element are listed below. 
 	</para>
-	<itemizedlist>
-		<listitem>
-			<para>
-              <emphasis role="bold">timeout</emphasis> specifies how long to wait for a response from the suspected member before considering it dead.
-            </para>
-        </listitem>
-        </itemizedlist>
-		
+	<variablelist>
+    <varlistentry>
+      <term><varname>timeout</varname></term>
+      <listitem><para>Specifies how long to wait for a response from the suspected member before 
+      considering it dead.</para></listitem>
+    </varlistentry>
+  </variablelist>		
 	</section>
 	
 	
 	<section><title>FD versus FD_SOCK</title>
-		<para>
-			FD and FD_SOCK, each taken individually, do not provide a solid failure detection layer. Let's look at the the differences between these failure detection protocols to understand how they complement each other:
+					<para><literal>FD</literal> and <literal>FD_SOCK</literal> do not individually provide a solid
+      failure detection layer. Their differences are outlined below to show how they complement
+      each other.
 		</para>
-		<itemizedlist>
-			<listitem><para><emphasis>FD</emphasis></para>
-				<itemizedlist>
-					<listitem>
-						<para>
-							An overloaded machine might be slow in sending are-you-alive responses.
-						</para>
-				</listitem>
-				<listitem>
-						<para>
-			A member will be suspected when suspended in a debugger/profiler.
-		</para>
-	</listitem>
-	<listitem>
-        <para>
-			Low timeouts lead to higher probability of false suspicions and higher network traffic.
-		</para>
-	</listitem>
-	<listitem>
-        <para>
-			High timeouts will not detect and remove crashed members for some time.
-		</para>
-	</listitem>
-</itemizedlist>
-</listitem>
+    <itemizedlist>
+      <title><literal>FD</literal></title>
+      <listitem>
+        <para>An overloaded machine might be slow in sending heartbeat responses.</para>
+      </listitem>
+      <listitem>
+        <para>A member will become suspected when suspended in a debugger or profiler.</para>
+      </listitem>
+      <listitem>
+        <para>Low timeouts lead to a higher probability of false suspicions and
+              higher network traffic.</para>
+      </listitem>
+      <listitem>
+        <para>High timeouts will not detect and remove crashed members for a long
+              period of time.</para>
+      </listitem>
+    </itemizedlist>
 
-<listitem><para><emphasis>FD_SOCK</emphasis>:</para>
-<itemizedlist>
-	<listitem>
-		<para>
-			Suspended in a debugger is no problem because the TCP connection is still open.
-		</para>
-	</listitem>
-	<listitem>
-						<para>
-			High load no problem either for the same reason.
-		</para>
-	</listitem>
-	<listitem>
-						<para>
-			Members will only be suspected when TCP connection breaks
-		</para>
-	</listitem>
-</itemizedlist>
+    <itemizedlist>
+      <title><literal>FD_SOCK</literal></title>
+      <listitem>
+        <para>Suspension in a debugger does not mean a member will become suspected
+              because the TCP connection remains open.</para>
+      </listitem>
+      <listitem>
+        <para>High load is not a problem for the same reason.</para>
+      </listitem>
+      <listitem>
+        <para>Members will be suspected only when the TCP connection breaks, so hung 
+              members will not be detected.</para>
+      </listitem>
+      <listitem>
+        <para>A crashed switch will not be detected until the connection encounters
+              the TCP timeout (between two and twenty minutes, depending on TCP/IP
+              stack implementation).</para>
+      </listitem>
+    </itemizedlist>
 
+<para>
+      A failure detection layer aims to report real failures and avoid reporting false
+      suspicions. Two methods of achieving this are outlined in the following paragraphs.
+    </para>
 
-	<itemizedlist>
-		<listitem>
-			<para>
-			So hung members will not be detected.
-		</para>
-	</listitem>
-	<listitem>
-		<para>
+    <para>
+      By default, JGroups configures the <literal>FD_SOCK</literal> socket with 
+      <literal>KEEP_ALIVE</literal>, which means that TCP sends a heartbeat on a socket
+      that has received no traffic in two hours. If a host or an intermediate switch or router
+      crashed without closing the TCP connection properly, the failure would be detected shortly
+      after two hours. This is better than never closing the connection (where
+      <literal>KEEP_ALIVE</literal> is off), but may not be helpful. The first solution,
+      therefore, is to lower the timeout value for <literal>KEEP_ALIVE</literal>. This is
+      a kernel-wide value on most operating systems and therefore affects all TCP sockets.
+    </para>
 
-			Also, a crashed switch will not be detected until the connection runs into the TCP timeout (between 2-20 minutes, depending on TCP/IP stack implementation).
-		</para>
-	</listitem>
-</itemizedlist>
-</listitem>
-</itemizedlist>
-
-        <para>
-          A failure detection layer is intended to report real failures promptly, while avoiding false suspicions. There are two solutions:
-		</para>
-		<orderedlist>
-			<listitem>
-		<para>			
-			By default, JGroups configures the FD_SOCK socket with KEEP_ALIVE, which means that TCP sends a heartbeat on socket on which no traffic has been received in 2 hours. If a host crashed (or an intermediate switch or router crashed) without closing the TCP connection properly, we would detect this after 2 hours (plus a few minutes). This is of course better than never closing the connection (if KEEP_ALIVE is off), but may not be of much help. So, the first solution would be to lower the timeout value for KEEP_ALIVE. This can only be done for the entire kernel in most operating systems, so if this is lowered to 15 minutes, this will affect all TCP sockets.
-		</para>
-	</listitem>
-	<listitem>
-		<para>
-			The second solution is to combine FD_SOCK and FD; the timeout in FD can be set such that it is much lower than the TCP timeout, and this can be configured individually per process. FD_SOCK will already generate a suspect message if the socket was closed abnormally. However, in the case of a crashed switch or host, FD will make sure the socket is eventually closed and the suspect message generated. Example:
-		</para>
-	</listitem>
-</orderedlist>
+    <para>
+      Alternatively, you can combine <literal>FD_SOCK</literal> and <literal>FD</literal>.
+      The <varname>timeout</varname> in <literal>FD</literal> can be set much lower than the TCP
+      timeout, and it can be configured on a per-process basis.
+      <literal>FD_SOCK</literal> generates a SUSPECT message if the socket closes abnormally,
+      but in the case of a crashed switch or host, <literal>FD</literal> ensures that the
+      socket is eventually closed and a SUSPECT message is generated.
+    </para>
+    <para>
+      The following code shows how the two could be combined:
+    </para>
 <programlisting><![CDATA[<FD_SOCK/>
 <FD timeout="6000" max_tries="5" shun="true"/>
 <VERIFY_SUSPECT timeout="1500"/>]]>
 </programlisting>
+<para>
+      With this configuration, a member is suspected when the socket to its neighbour has been closed abnormally
+      (for example, in a process crash, since the operating system closes all sockets).
+      However, if a host or switch crashes, the sockets would not be closed. As a secondary
+      line of defense, <literal>FD</literal> suspects the neighbour after <literal>30</literal>
+      seconds (five missed heartbeats at <literal>6000</literal> milliseconds each). Note that if you use this
+      example and your system is stopped at a debugging breakpoint, the node you are debugging will be suspected after those thirty seconds.
+    </para>
 
     <para>
-      In this example, a member becomes suspected when the neighbouring socket has been closed abnormally, in a process crash, for instance, since the operating system closes all sockets. However, if a host or switch crashes, the sockets will not be closed. <literal>FD</literal> will suspect the neighbour after sixty seconds (<literal>6000</literal> milliseconds). Note that if this example system were stopped in a breakpoint in the debugger, the node being debugged will be suspected once the <varname>timeout</varname> has elapsed.
+			Combining <literal>FD</literal> and <literal>FD_SOCK</literal> provides a solid failure detection 
+      layer. This technique is used across the JGroups configurations included in
+      JBoss Enterprise Web Platform.
 	</para>
-    <para>
-        A combination of <literal>FD</literal> and <literal>FD_SOCK</literal> provides a solid failure detection layer, which is why this technique is used across the JGroups configurations included with JBoss Application Server.
-	</para>
 	</section>
     </section>
     
@@ -922,35 +1033,55 @@
     
       <section id="jgroups-reliable">
         <title>Reliable Delivery Protocols</title>
-	<para>
-		Reliable delivery protocols within the JGroups stack ensure that messages are actually delivered, and delivered in the correct order (First In, First Out, or FIFO) to the destination node. The basis for reliable message delivery is positive and negative delivery acknowledgments (ACK and NAK). In <literal>ACK</literal> mode, the sender resends the message until acknowledgment is received from the receiver. In <literal>NAK</literal> mode, the receiver requests retransmission when it discovers a gap.
-	</para>
+      	<para>
+      		Reliable delivery protocols within the JGroups stack ensure that data packets are delivered 
+          to the destination node, and in the correct order (FIFO). The basis for reliable message delivery 
+          is positive and negative delivery acknowledgements (<literal>ACK</literal>
+          and <literal>NAK</literal>, respectively). In <literal>ACK</literal> mode, the sender resends the message
+          until an acknowledgement is received from the receiver. In <literal>NAK</literal> mode, the receiver requests
+          retransmission when it discovers a gap.
+	      </para>
 	
 	
 	
         <section id="jgroups-reliable-unicast">
           <title>UNICAST</title>
           <para>
-		  The <literal>UNICAST</literal> protocol is used for unicast messages. It uses positive acknowlegements (<literal>ACK</literal>). It is configured as a sub-element under the JGroups <literal>config</literal> element. If the JGroups stack is configured with the TCP transport protocol, <literal>UNICAST</literal> is not necessary because TCP itself guarantees FIFO delivery of unicast messages. Here is an example configuration for the <literal>UNICAST</literal> protocol:</para>
+		        The <literal>UNICAST</literal> protocol is used for unicast messages. It uses 
+            positive acknowledgements (<literal>ACK</literal>). It is configured as a sub-element under the JGroups 
+            <literal>Config</literal> element. <literal>UNICAST</literal> is not required for 
+            JGroups stacks configured with the TCP transport protocol, since TCP guarantees 
+            FIFO delivery of unicast messages. The following is an example configuration for the 
+            <literal>UNICAST</literal> protocol:</para>
 
 <programlisting>
 &lt;UNICAST timeout="300,600,1200,2400,3600"/&gt;
 </programlisting>
           
 <para>There is only one configurable attribute in the <literal>UNICAST</literal> element.</para>
-          <itemizedlist>
-            <listitem>
-              <para><emphasis role="bold">timeout</emphasis> specifies the retransmission timeout (in
-                  milliseconds). For instance, if the timeout is <literal>100,200,400,800</literal>, the sender resends the message if it has not received an <literal>ACK</literal> after 100 milliseconds the first time, and the second time it waits for 200 milliseconds before resending, and so on. A low value for the first timeout allows for prompt retransmission of dropped messages, but means that messages may be transmitted more than once if they have not actually been lost (that is, the message has been sent, but the <literal>ACK</literal> has not been received before the timeout). High values (<literal>1000,2000,3000</literal>) can improve performance if the network is tuned such that UDP datagram loss is infrequent. High values on networks with frequent losses will be harmful to performance, since later messages will not be delivered until lost messages have been retransmitted.
-              </para>
-            </listitem>
-          </itemizedlist>
+          <variablelist>
+            <varlistentry>
+              <term><varname>timeout</varname></term>
+              <listitem><para>Specifies the retransmission timeout in milliseconds. For example,
+              if the timeout is <literal>"100,200,400,800"</literal>, the sender resends the message
+              if it has not received an <literal>ACK</literal> after 100 milliseconds the first time,
+              200 milliseconds the second time, and so on. A low value for the first timeout allows
+              for prompt retransmission of dropped messages, but means that a message can be sent more
+              than once when it was delivered and only its acknowledgement failed to arrive before the timeout. High values
+              can improve performance if the network is tuned such that datagram loss is infrequent.</para></listitem>
+            </varlistentry>
+          </variablelist>
         </section>
 	
 	
         <section id="jgroups-reliable-nakack">
           <title>NAKACK</title>
-          <para>The <literal>NAKACK</literal> protocol is used for multicast messages. It uses negative acknowlegements (<literal>NAK</literal>). Under this protocol, each message is tagged with a sequence number. The receiver keeps track of the received sequence numbers and delivers the messages in order. When a gap in the series of received sequence numbers is detected, the receiver schedules a task to periodically ask the sender to retransmit the missing message. The task is cancelled if the missing message is received. <literal>NAKACK</literal> protocol is configured as the <literal>pbcast.NAKACK</literal> sub-element under the JGroups <literal>&lt;config&gt;</literal> element. Here is an example configuration:</para>
+          <para>The <literal>NAKACK</literal> protocol is used for multicast messages. It uses 
+          negative acknowledgements (<literal>NAK</literal>). Under this protocol, each message is tagged with a sequence number.
+          The receiver tracks the sequence numbers to deliver the messages in order. When a gap in the
+          sequence is detected, the receiver asks the sender to retransmit the missing message. The
+          <literal>NAKACK</literal> protocol is configured as the <literal>pbcast.NAKACK</literal>
+          sub-element under the JGroups <literal>Config</literal> element, like so:</para>
 
 <programlisting>
 &lt;pbcast.NAKACK max_xmit_size="60000" use_mcast_xmit="false" 
@@ -960,168 +1091,222 @@
           
 
 <para>The configurable attributes in the <literal>pbcast.NAKACK</literal> element are as follows.</para>
-          <itemizedlist>
-            <listitem>
-              <para><emphasis role="bold">retransmit_timeout</emphasis> specifies the series of timeouts (in milliseconds) after which retransmission
-              is requested if a missing message has not yet been received.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">use_mcast_xmit</emphasis> determines whether the sender should
-                send the retransmission to the entire cluster rather than just to the node requesting it.
-                This is useful when the <emphasis>sender</emphasis>'s network layer tends to drop packets,
-                avoiding the need to individually retransmit to each node.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">max_xmit_size</emphasis> specifies the maximum size (in bytes) for a bundled retransmission, if multiple messages are reported missing.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">discard_delivered_msgs</emphasis> specifies whether to discard
-                delivered messages on the receiver nodes. By default, nodes save delivered messages so
-                any node can retransmit a lost message in case the original sender has crashed
-                or left the group. However, if we only ask the sender to resend its messages, we can enable this option and discard delivered messages.</para>
-            </listitem>
-	    
-	    <listitem>
-		    <para><emphasis role="bold">gc_lag</emphasis> specifies the number of messages to keep in memory for retransmission, even after the periodic cleanup protocol (see <xref linkend="jgroups-other-gc"/>) indicates all peers have received the message. The default value is <literal>20</literal>.
-		    </para>
-	    </listitem>
-	    
-          </itemizedlist>
+          <variablelist>
+            <varlistentry>
+              <term><varname>retransmit_timeout</varname></term>
+              <listitem><para>Specifies the retransmission timeout in milliseconds. This is the same as
+              the <varname>timeout</varname> attribute in the <literal>UNICAST</literal> protocol.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>use_mcast_xmit</varname></term>
+              <listitem><para>Determines whether the sender should send the retransmission to the entire
+              cluster rather than just to the node requesting it. This is useful when the sender's network
+              layer tends to drop packets, as it avoids retransmitting to each node individually.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>max_xmit_size</varname></term>
+              <listitem><para>Specifies the maximum size (in bytes) for a bundled retransmission, if
+              multiple messages are reported missing.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>discard_delivered_msgs</varname></term>
+              <listitem><para>Specifies whether to discard delivered messages on receiver nodes. By
+              default, we save all delivered messages. If the sender can resend the message, we can
+              enable this option and discard delivered messages.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>gc_lag</varname></term>
+              <listitem><para>Specifies the number of messages to keep in memory for retransmission,
+              even after the periodic cleanup protocol (see <xref linkend="jgroups-other-gc"/>) indicates
+              that all peers have received the message. The default value is <literal>20</literal>.</para></listitem>
+            </varlistentry>
+          </variablelist>
         </section>
-     </section><section id="jgroups-other-gms">
+      </section>
+      
+      <section id="jgroups-other-gms">
           <title>Group Membership (GMS)</title>
-          <para>The group membership service (GMS) protocol in the JGroups stack 
-          maintains a list of active nodes. It handles the requests to join and 
-          leave the cluster. It also handles the SUSPECT messages sent by failure
-          detection protocols. All nodes in the cluster, as well as any interested
-          services like JBoss Cache or HAPartition, are notified if the group membership changes. The group membership service is configured in the <literal>pbcast.GMS</literal> sub-element under the JGroups <literal>config</literal> element. Here is an example configuration.</para>
-<programlisting>
-&lt;pbcast.GMS print_local_addr="true"
+          <para>The group membership service in the JGroups stack maintains a list of active nodes. 
+          It handles requests to join and leave the cluster. It also handles the SUSPECT messages 
+          sent by failure detection protocols. All nodes in the cluster, as well as the load balancer 
+          and client side interceptors, are notified if the group membership changes. The group 
+          membership service is configured in the <literal>pbcast.GMS</literal> sub-element under the 
+          JGroups <literal>Config</literal> element, like so:</para>
+<programlisting>&lt;pbcast.GMS print_local_addr="true"
     join_timeout="3000"
     join_retry_timeout="2000"
     shun="true"
-    view_bundling="true"/&gt;
-</programlisting>
+    view_bundling="true"/&gt;</programlisting>
           
 
 
 <para>The configurable attributes in the <literal>pbcast.GMS</literal> element are as follows.</para>
-          <itemizedlist>
-            <listitem>
-              <para><emphasis role="bold">join_timeout</emphasis> specifies the maximum number of
-                                milliseconds to wait for a new node JOIN request to succeed. Retry afterwards.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">join_retry_timeout</emphasis> specifies the number of
-                                milliseconds to wait after a failed JOIN before trying again.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">print_local_addr</emphasis> specifies whether to dump the node's
-                                own address to the standard output when started.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">shun</emphasis> specifies whether a node should shun 
-              (that is, disconnect) itself if it receives a cluster view in which it is not a member node.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">disable_initial_coord</emphasis> specifies whether to prevent
-                    this node from becoming the cluster coordinator during the initial connection of the channel. This flag does not prevent a node becoming the coordinator after the initial channel connection, if the current coordinator leaves the group.
-              </para>
-            </listitem>
-       <listitem>
-          <para><emphasis role="bold">view_bundling</emphasis> specifies whether multiple JOIN or LEAVE requests arriving at the same time are bundled and handled together at the same time, resulting in only one new view that incorporates all changes. This is is more efficient than handling each request separately.
-            </para>
-            </listitem>
-       
-          </itemizedlist>
+
+          <variablelist>
+            <varlistentry>
+              <term><varname>join_timeout</varname></term>
+              <listitem><para>Specifies the maximum number of milliseconds to wait for a new node
+              <literal>JOIN</literal> request to succeed. Retries afterward.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>join_retry_timeout</varname></term>
+              <listitem><para>Specifies the number of milliseconds to wait after a failed
+              <literal>JOIN</literal> request before resubmitting the request.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>print_local_addr</varname></term>
+              <listitem><para>Specifies whether to dump the node's own address to the output
+              when started.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>shun</varname></term>
+              <listitem><para>Specifies whether a node should shun (that is, disconnect) itself if it
+              receives a cluster view in which it is not a member node.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>disable_initial_coord</varname></term>
+              <listitem><para>Specifies whether to prevent this node from becoming the cluster
+              coordinator.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>view_bundling</varname></term>
+              <listitem><para>Specifies whether multiple <literal>JOIN</literal> or 
+              <literal>LEAVE</literal> requests arriving at the same time are bundled together
+              and handled at the same time. This is more efficient than handling each request
+              separately, as it sends only one new view.
+              </para></listitem>
+            </varlistentry>
+          </variablelist>
         </section>
    
    
         <section id="jgroups-other-fc">
           <title>Flow Control (FC)</title>
-          <para>The flow control (FC) protocol tries to adapt the data sending rate 
-          to the data receipt rate among nodes. If a sender node is too fast, it 
-          might overwhelm the receiver node and result in out-of-memory conditions 
-          or dropped packets that have to be retransmitted. In JGroups, flow control is implemented via a
+          <para>The flow control service tries to adapt the data sending rate to the data receipt rate among
+          nodes. If a sender node is too fast, it might overwhelm the receiver node and result in dropped
+          packets that have to be retransmitted. In JGroups, the flow control is implemented via a
           credit-based system. The sender and receiver nodes have the same number of credits (bytes) to
           start with. The sender subtracts credits by the number of bytes in messages it sends. The
           receiver accumulates credits for the bytes in the messages it receives. When the sender's credit
-          drops to a threshold, the receivers send some credit to the sender. If the sender's credit is
-          used up, the sender blocks until it receives credits from the receiver. The flow control protocol
+          drops to a threshold, the receivers send some credit to the sender. If the sender's credit is
+          used up, the sender blocks until it receives credits from the receiver. The flow control service
           is configured in the <literal>FC</literal> sub-element under the JGroups
-          <literal>config</literal> element. Here is an example configuration.</para>
+          <literal>Config</literal> element. Here is an example configuration.</para>
 
-<programlisting>
-&lt;FC max_credits="2000000"
+<programlisting>&lt;FC max_credits="2000000"
     min_threshold="0.10" 
-    ignore_synchronous_response="true"/&gt;
-</programlisting>
+    ignore_synchronous_response="true"/&gt;</programlisting>
           
 
 <para>The configurable attributes in the <literal>FC</literal> element are as follows.</para>
-          <itemizedlist>
-            <listitem>
-              <para><emphasis role="bold">max_credits</emphasis> specifies the maximum number of credits
-                                (in bytes). This value should be smaller than the JVM heap size.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">min_credits</emphasis> specifies the minimum number of bytes that must be received before the receiver will send more credits to the sender.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">min_threshold</emphasis> specifies the percentage of the
-                                <literal>max_credits</literal> that should be used to calculate <literal>min_credits</literal>. 
-                                Setting this overrides the <literal>min_credits</literal> attribute.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">ignore_synchronous_response</emphasis> specifies whether threads that have carried messages up to the application should be allowed to carry outgoing messages back down through FC without blocking for credits. <emphasis>Synchronous response</emphasis> refers to the fact that these messages are generally responses to incoming RPC-type messages. Forbidding JGroups threads to carry messages up to block in FC can help prevent certain deadlock scenarios, so we recommend setting this to <literal>true</literal>.</para>
-            </listitem>
-          </itemizedlist>
+          <variablelist>
+            <varlistentry>
+              <term><varname>max_credits</varname></term>
+              <listitem><para>Specifies the maximum number of credits in bytes. This value
+              should be smaller than the JVM heap size.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>min_credits</varname></term>
+              <listitem><para>Specifies the threshold credit on the sender, below which the
+              receiver should send more credits.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>min_threshold</varname></term>
+              <listitem><para>Specifies the threshold as a percentage of <varname>max_credits</varname>.
+              This attribute overrides <varname>min_credits</varname>.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>ignore_synchronous_response</varname></term>
+              <listitem><para>
+                Specifies whether threads that have carried messages to the application are allowed
+                to carry outgoing messages back down through flow control without blocking for
+                credits. <emphasis>Synchronous response</emphasis> refers to these messages usually
+                being responses to incoming RPC-type messages. We recommend setting this to 
+                <literal>true</literal> to help prevent certain deadlock scenarios.
+              </para></listitem>
+            </varlistentry>
+          </variablelist>
     <note>
-   <title>Why is FC needed on top of TCP ? TCP has its own flow control!</title>
-   <para>
-      FC is required for group communication where group messages must be sent at the highest speed that the slowest receiver can handle. For example, say we have a cluster comprised of nodes A, B, C and D. D is slow (perhaps overloaded), while the rest are fast. When A sends a group message, it does so via TCP connections: A-A (theoretically), A-B, A-C and A-D.
+	    <title>Why is FC needed on top of TCP? TCP has its own flow control!</title>
+	    <para>
+	      The <literal>FC</literal> element is required for group communication where group messages must be 
+        sent at the highest speed that the slowest receiver can handle.
+      </para>
+      <para>
+        Say we have a cluster, <literal>{A,B,C,D}</literal>. Node <literal>D</literal> is slow, and the
+        other nodes are fast. When <literal>A</literal> sends a group message, it establishes the 
+        following TCP connections: <literal>A-A</literal>, <literal>A-B</literal>, <literal>A-C</literal>, 
+        and <literal>A-D</literal>.
+      </para>
+      <para>
+        <literal>A</literal> sends 100 million messages to the cluster. TCP's flow control applies to the
+        connections between <literal>A-B</literal>, <literal>A-C</literal> and <literal>A-D</literal>
+        individually, but not to <literal>A-{B,C,D}</literal>, where <literal>{B,C,D}</literal> is the
+        group. It is therefore possible that nodes <literal>A</literal>, <literal>B</literal> and 
+        <literal>C</literal> receive the 100 million messages, but that node <literal>D</literal> will 
+        only receive one million messages. This is also the reason we need <literal>NAKACK</literal>,
+        even though TCP does its own retransmission.
+	    </para>
+	    <para>
+        JGroups has to buffer all messages in memory in case the original sender dies and a node asks for
+        retransmission of a message. Because all members buffer all messages they receive, they must
+        occasionally purge <emphasis>stable</emphasis> messages (messages seen by all nodes). This is done
+        with the <literal>STABLE</literal> protocol, which can be configured to run the stability protocol
+        based on either time (for example, every fifty seconds) or size (every 400 kilobytes of data
+        received).
+	    </para>
+	    <para>
+        In the example case, the slow node <literal>D</literal> will prevent the group from purging
+        messages other than the one million seen by <literal>D</literal>. In most cases this leads to
+        out-of-memory exceptions, so messages must be sent at a rate that the slowest receiver can handle.
+	    </para>
+    </note>
+
+  <note>
+		<title>So do I always need FC?</title>
+	  <para>
+      This depends on the application's use of the JGroups channel. If node <literal>A</literal> from
+      the previous example was able to slow its send rate because <literal>D</literal> was not keeping up,
+      <literal>FC</literal> would not be required.
     </para>
     <para>
-      Say A sends 100 million messages to the cluster. TCP's flow control applies to A-B, A-C and A-D individually, but not to A-BCD as a group. Therefore, A, B and C will receive the 100 million messages, but D will receive only 1 million. (This is also why <literal>NAKACK</literal> is required, even though TCP handles its own retransmission.)
-   </para>
-   <para>
-      JGroups must buffer all messages in memory in case an original sender <emphasis>S</emphasis> dies and a node requests retransmission of a message sent by <emphasis>S</emphasis>. Since all members buffer all messages that they receive, stable messages (messages seen by every node) must sometimes be purged. (The purging process is managed by the <literal>STABLE</literal> protocol. For more information, see <xref linkend="jgroups-other-gc"/>.)
-   </para>
-   <para>      
-      In the above case, the slow node D will prevent the group from purging messages above 1M, so every member will buffer 99M messages ! This in most cases leads to OOM exceptions. Note that - although the sliding window protocol in TCP will cause writes to block if the window is full - we assume in the above case that this is still much faster for A-B and A-C than for A-D.
-   </para>
-   <para>
-      So, in summary, even with TCP we need to FC to ensure we send messages at a rate the slowest receiver (D) can handle.
-   </para>
-</note>
+      Applications that make synchronous group RPC calls are unlikely to require <literal>FC</literal>.
+      In synchronous applications, the thread that makes the call blocks waiting for responses from all
+      group members. This means that the threads on node <literal>A</literal> that make the calls would
+      block waiting for responses from node <literal>D</literal>, naturally slowing the overall rate
+      of calls.
+    </para>
+    <para>
+      A JBoss Cache cluster configured for <varname>REPL_SYNC</varname> is one example of an application
+      that makes synchronous group RPC calls. If a channel is used only for a cache configured for
+      <varname>REPL_SYNC</varname>, we recommend removing <literal>FC</literal> from its protocol stack.
+    </para>
+    <para>
+      If your cluster consists of two nodes, including <literal>FC</literal> in a TCP-based protocol
+      stack is unnecessary, since TCP's internal flow control can handle one peer-to-peer
+      relationship.
+    </para>
+    <para>
+      <literal>FC</literal> may also be omitted where a channel is used by a JBoss Cache configured for
+      buddy replication with a single buddy. Such a channel acts much like a two-node cluster, where
+      messages are only exchanged with one other node. Other messages related to data gravitation will be
+      sent to all members, but these should be infrequent.
+    </para>
+  </note>
 
-<note>
-      <title>So do I always need FC?</title>
-   <para>
-      This depends on how the application uses the JGroups channel. Referring to the example above, if there was something about the application that would naturally cause A to slow down its rate of sending because D wasn't keeping up, then FC would not be needed.
-   </para>
-   <para>
-      A good example of such an application is one that uses JGroups to make synchronous group RPC calls. By synchronous, we mean the thread that makes the call blocks waiting for responses from all the members of the group. In that kind of application, the threads on A that are making calls would block waiting for responses from D, thus naturally slowing the overall rate of calls.
-   </para>
-   <para>
-      A JBoss Cache cluster configured for REPL_SYNC is a good example of an application that makes synchronous group RPC calls. If a channel is only used for a cache configured for REPL_SYNC, we recommend you remove FC from its protocol stack.
-   </para>
-   <para>
-      And, of course, if your cluster only consists of two nodes, including FC in a TCP-based protocol stack is unnecessary. There is no group beyond the single peer-to-peer relationship, and TCP's internal flow control will handle that just fine.
-   </para>
-   <para>
-      Another case where FC may not be needed is for a channel used by a JBoss Cache configured for buddy replication and a single buddy. Such a channel will in many respects act like a two node cluster, where messages are only exchanged with one other node, the buddy. (There may be other messages related to data gravitation that go to all members, but in a properly engineered buddy replication use case these should be infrequent. But if you remove FC be sure to load test your application.)
-   </para>  </note>
+  <important>
+    <title>If you remove <literal>FC</literal></title>
+    <para>Be sure to load test your application if you remove the <literal>FC</literal> element.</para>
+  </important>
 </section>
-   
-
-     
         </section>
    
 
 <section><title>Fragmentation (FRAG2)</title>
    <para>
-      This protocol fragments messages that are larger than a certain size, and reassembles them at the receiver's side. It works for both unicast and multicast messages. It is configured with the <literal>FRAG2</literal> sub-element in the JGroups <literal>config</literal> element. Here is an example configuration:
+		This protocol fragments messages larger than a certain size. Messages are rejoined at the
+    receiving end. This works for both unicast and multicast messages. It is configured in the
+    <literal>FRAG2</literal> sub-element under the JGroups <literal>Config</literal> element, like so:
    </para>
 <programlisting><![CDATA[  
       <FRAG2 frag_size="60000"/>]]>
@@ -1131,25 +1316,36 @@
 The configurable attributes in the FRAG2 element are as follows.
 </para>
 
-<itemizedlist>
-   <listitem><para><emphasis role="bold">frag_size</emphasis> specifies the maximum message size (in bytes) before fragmentation occurs. Messages larger than this size are fragmented. For stacks that use the UDP transport, this value must be lower than 64 kilobytes (the maximum UDP datagram size). For TCP-based stacks, it must be lower than the value of <varname>max_credits</varname> in the FC protocol.</para></listitem>
-</itemizedlist>
+<variablelist>
+  <varlistentry>
+    <term><varname>frag_size</varname></term>
+    <listitem><para>Specifies the maximum size of a fragment, in bytes. Messages larger than
+    this value are fragmented. For stacks that use the UDP transport, this value must be lower
+    than 64 kilobytes (the maximum UDP datagram size). For TCP-based stacks, it must be lower than
+    the value of <varname>max_credits</varname> in the FC protocol.</para></listitem>
+  </varlistentry>
+</variablelist>
 
-<note>
-   <para>
-      TCP protocol already provides fragmentation, but a JGroups fragmentation protocol is still required if FC is used. The reason for this is that if you send a message larger than <literal>FC.max_credits</literal>, the FC protocol will block forever. So, <literal>frag_size</literal> within FRAG2 must always be set to a value lower than that of  <literal>FC.max_credits</literal>.
-   </para>
-</note>
+<important>
+	<para>
+		The TCP protocol provides fragmentation, but a JGroups fragmentation protocol is still
+    required if <literal>FC</literal> is used, because if you send a message larger than
+    <literal>FC.max_credits</literal>, the <literal>FC</literal> protocol blocks. The
+    <varname>frag_size</varname> within <literal>FRAG2</literal> must always be less than
+    <literal>FC.max_credits</literal>.
+	</para>
+</important>
 
 
 </section>
    
         <section id="jgroups-other-st">
           <title>State Transfer</title>
-          <para>The state transfer service transfers the state from an existing node (i.e., the cluster
-                        coordinator) to a newly joining node. It is configured in the
-                        <literal>pbcast.STATE_TRANSFER</literal> sub-element under the JGroups <literal>Config</literal>
-                        element. It does not have any configurable attribute. Here is an example configuration.</para>
+          <para>The state transfer service transfers the state from an existing node (that is, the 
+          cluster coordinator) to a newly joining node. It is configured with the 
+          <literal>pbcast.STATE_TRANSFER</literal> sub-element under the JGroups 
+          <literal>Config</literal> element, as seen in the following code example. It has no 
+          configurable attributes.</para>
 <programlisting>
 &lt;pbcast.STATE_TRANSFER/&gt;
 </programlisting>
@@ -1157,9 +1353,14 @@
 
    <section id="jgroups-other-gc">
           <title>Distributed Garbage Collection (STABLE)</title>
-          <para>
-            In a JGroups cluster, all nodes must store all messages received for potential retransmission in case of a failure. However, if we store all messages forever, we will run out of memory. The distributed garbage collection service periodically purges messages that have been seen by all nodes, removing them from the memory in each node. The distributed garbage collection service is configured in the <literal>pbcast.STABLE</literal> sub-element under the JGroups  <literal>config</literal> element. Here is an example configuration.
-        </para>
+    <para>
+      In a JGroups cluster, all nodes must store all messages received for potential 
+      retransmission in case of a failure. However, if we store all messages forever, we will 
+      run out of memory. The distributed garbage collection service in JGroups periodically 
+      purges messages that have been seen by all nodes from the memory in each node. The distributed 
+      garbage collection service is configured in the <literal>pbcast.STABLE</literal> sub-element 
+      under the JGroups  <literal>Config</literal> element, like so:
+	  </para>
 
 <programlisting>
 &lt;pbcast.STABLE stability_delay="1000"
@@ -1168,19 +1369,27 @@
 </programlisting>
           
 <para>The configurable attributes in the <literal>pbcast.STABLE</literal> element are as follows.</para>
-          <itemizedlist>
-            <listitem>
-              <para><emphasis role="bold">desired_avg_gossip</emphasis> specifies intervals (in
-                                milliseconds) of garbage collection runs. Set this to <literal>0</literal> to disable interval-based garbage collection.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">max_bytes</emphasis> specifies the maximum number of bytes
-                                received before the cluster triggers a garbage collection run. Set to <literal>0</literal> to disable garbage collection based on the bytes received.</para>
-            </listitem>
-            <listitem>
-          <para><emphasis role="bold">stability_delay</emphasis> specifies the maximum time period (in milliseconds) of a random delay introduced before a node sends its <literal>STABILITY</literal> message at the end of a garbage collection run. The delay gives other nodes concurrently running a <literal>STABLE</literal> task a chance to send first. If used together with <literal>max_bytes</literal>, this attribute should be set to a small number.</para>
-            </listitem>
-          </itemizedlist>
+          <variablelist>
+      <varlistentry>
+        <term><varname>desired_avg_gossip</varname></term>
+        <listitem><para>Specifies the interval (in milliseconds) between garbage collection runs.
+        Setting this parameter to <literal>0</literal> disables this service.</para></listitem>
+      </varlistentry>
+      <varlistentry>
+        <term><varname>max_bytes</varname></term>
+        <listitem><para>Specifies the maximum number of bytes to receive before triggering a 
+        garbage collection run. Setting this parameter to <literal>0</literal> disables this 
+        service.</para>
+          <note><para>Set <varname>max_bytes</varname> when you have a high-traffic cluster.</para></note>
+        </listitem>
+      </varlistentry>
+      <varlistentry>
+        <term><varname>stability_delay</varname></term>
+        <listitem><para>Specifies the delay period before a <literal>STABILITY</literal> message
+        is sent. If used together with <varname>max_bytes</varname>, this attribute should be set
+        to a small number.</para></listitem>
+      </varlistentry>
+    </variablelist>
           <note>
             <para>Set the <literal>max_bytes</literal> attribute when you have a high traffic
                         cluster.</para>
@@ -1189,7 +1398,10 @@
         <section id="jgroups-other-merge">
           <title>Merging (MERGE2)</title>
           <para>
-        When a network error occurs, the cluster might be partitioned into several different partitions. JGroups has a MERGE service that allows the coordinators in partitions to communicate with each other and form a single cluster back again. The merging service is configured in the <literal>MERGE2</literal> sub-element under the JGroups <literal>Config</literal> element. Here is an example configuration.
+		        When a network error occurs, a cluster may be divided into several partitions. The JGroups
+            <literal>MERGE</literal> service lets partitions communicate with each other and reform into
+            a single cluster. Merging is configured in the <literal>MERGE2</literal> sub-element in the
+            JGroups <literal>Config</literal> element, like so:
       </para>
 
 <programlisting>
@@ -1199,24 +1411,34 @@
       
       
           <para>The configurable attributes in the <literal>MERGE2</literal> element are as follows.</para>
-          <itemizedlist>
-            <listitem>
-              <para><emphasis role="bold">max_interval</emphasis> specifies the maximum number of
-                                milliseconds to wait before sending a MERGE message.</para>
-            </listitem>
-            <listitem>
-              <para><emphasis role="bold">min_interval</emphasis> specifies the minimum number of
-                                milliseconds to wait before sending a MERGE message.</para>
-            </listitem>
-          </itemizedlist>
-          <para>JGroups chooses a random value between <literal>min_interval</literal> and
-                            <literal>max_interval</literal> to periodically send the MERGE message.</para>
+          <variablelist>
+            <varlistentry>
+              <term><varname>max_interval</varname></term>
+              <listitem><para>Specifies the maximum number of milliseconds between 
+              <literal>MERGE</literal> messages.</para></listitem>
+            </varlistentry>
+            <varlistentry>
+              <term><varname>min_interval</varname></term>
+              <listitem><para>Specifies the minimum number of milliseconds between
+              <literal>MERGE</literal> messages.</para></listitem>
+            </varlistentry>
+          </variablelist>
+          <para>
+            JGroups selects a random value between <varname>min_interval</varname> and 
+            <varname>max_interval</varname> to send the <literal>MERGE</literal> message.
+          </para>
           <note>
-        <para>
-           The application state maintained by the application using a channel is not merged by JGroups during a merge. This must be done by the application.</para> 
-           </note>
-        <note>
-           <para>If <literal>MERGE2</literal> is used in conjunction with <literal>TCPPING</literal>, the <literal>initial_hosts</literal> attribute must contain all the nodes that could potentially be merged back, in order for the merge process to work properly. Otherwise, the merge process may not detect all sub-groups, and may miss those comprised solely of unlisted members.</para>
+            <title><literal>MERGE</literal> does not merge cluster states</title>
+            <para>
+              Merging the application state maintained by the application using a channel must be done
+              by the application. If <literal>MERGE2</literal>
+              is used in conjunction with <literal>TCPPING</literal>, the <varname>initial_hosts</varname>
+              attribute must list all nodes to be merged. Only the listed nodes will be included
+              in the merge process.</para> 
+            <para>
+              Alternatively, <literal>MPING</literal> can be used with TCP to provide multicast member
+              discovery capabilities without needing to specify all nodes.
+            </para>
           </note>
         </section>
 
@@ -1226,66 +1448,100 @@
 	
 	<section><title>Binding JGroups Channels to a Particular Interface</title>
 		<para>
-			In the Transport Protocols section above, we briefly touched on how the interface to which JGroups will bind sockets is configured. Let's get into this topic in more depth:
+			<xref linkend="jgroups-transport"/> briefly described the interface to which JGroups binds sockets. 
+      Read this section to understand how to configure this interface.
 		</para>
 		<para>
-			First, it is important to understand that the value set in any <literal>bind_addr</literal> element in an XML configuration file will be ignored by JGroups if it finds that the system property <literal>jgroups.bind_addr</literal> (or a deprecated earlier name for the same thing, <literal>bind.address</literal>) has been set. The system property has a higher priority level than the XML property. If JBoss Application Server is started with the <literal>-b</literal> (or <literal>--host</literal>) switch, the application server will set <literal>jgroups.bind_addr</literal> to the specified value. If <literal>-b</literal> is not set, the application server will bind most services to <literal>localhost</literal> by default.
+      The value set in any <varname>bind_addr</varname> element in an XML configuration file is
+      ignored by JGroups if the <varname>jgroups.bind_addr</varname> (or deprecated 
+      <varname>bind.address</varname>) system property is already set. The system property will
+      always override the XML configuration. The <code>-b</code> (or <code>--host</code>) switch
+      is used to set the <varname>jgroups.bind_addr</varname> system property at server startup.
 		</para>
 		<para>
-			So, what are <emphasis>best practices</emphasis> for managing how JGroups binds to interfaces?
+      By default, JBoss Enterprise Web Platform binds most services to the local host if the
+      <code>-b</code> switch is not set. Therefore, most users need to set <code>-b</code>,
+      and any XML configuration will be ignored.
 		</para>
-		<itemizedlist>
-			<listitem>
-				<para>
-			Binding JGroups to the same interface as other services.  Simple, just use <literal>-b</literal>:</para>
-		<screen>./run.sh -b 192.168.1.100 -c all</screen>
-</listitem>
-<listitem>
-	<para>
-			Binding services (e.g., JBoss Web) to one interface, but use a different one for JGroups:</para>
-		<screen>./run.sh -b 10.0.0.100 -Djgroups.bind_addr=192.168.1.100 -c all</screen>
 		<para>
-			Specifically setting the system property overrides the <literal>-b</literal> value. This is a common usage pattern; put client traffic on one network, with intra-cluster traffic on another.
+			So, what are <emphasis>best practices</emphasis> for managing how JGroups binds to interfaces?
 		</para>
-	</listitem>
-	<listitem>
-	<para>
-			
-			Binding services (e.g., JBoss Web) to all interfaces.  This can be done like this:
-		<screen>./run.sh -b 0.0.0.0 -c all</screen>
-		However, doing this will not cause JGroups to bind to all interfaces! Instead , JGroups will bind to the machine's default interface.  See the Transport Protocols section for how to tell JGroups to receive or send on all interfaces, if that is what you really want.
-	</para>
-	</listitem>
-	<listitem>
-	<para>	
-		Binding services (e.g., JBoss Web) to all interfaces, but specify the JGroups interface:</para>
-		<screen>./run.sh -b 0.0.0.0 -Djgroups.bind_addr=192.168.1.100 -c all</screen>
-		<para>
-			Again, specifically setting the system property overrides the <literal>-b</literal> value.
-		</para>
-	</listitem>
-	<listitem>
-	<para>	
-		Using different interfaces for different channels:</para>
-		<screen>./run.sh -b 10.0.0.100 -Djgroups.ignore.bind_addr=true -c all</screen>
-	</listitem>
-</itemizedlist>
-
-<para>
-This setting tells JGroups to ignore the <literal>jgroups.bind_addr</literal> system property, and instead use whatever is specfied in XML. You would need to edit the various XML configuration files to set the various <literal>bind_addr</literal> attributes to the desired interfaces.    
-		</para>
+    <variablelist>
+      <varlistentry>
+        <term>Bind JGroups to the same interface as other services</term>
+        <listitem>
+          <para>Use the <code>-b</code> switch, like so:</para>
+          <screen>./run.sh -b 192.168.1.100 -c production</screen>
+        </listitem>
+      </varlistentry>
+      <varlistentry>
+        <term>Bind services to one interface and JGroups to another</term>
+        <listitem>
+          <para>Specifically setting the system property with <code>-D</code> overrides the
+          value specified by <code>-b</code>:</para>
+          <screen>./run.sh -b 10.0.0.100 -Djgroups.bind_addr=192.168.1.100 -c production</screen>
+          <para>This is a common usage pattern: it places client traffic on one
+          network and intra-cluster traffic on another.</para>
+        </listitem>
+      </varlistentry>
+      <varlistentry>
+        <term>Bind services to all interfaces</term>
+        <listitem>
+          <para>Bind services to all interfaces with the following command on startup:</para>
+          <screen>./run.sh -b 0.0.0.0 -c production</screen>
+          <note>
+            <title>This will not bind JGroups to all interfaces</title>
+            <para>JGroups will bind to the machine's default interface. See 
+            <xref linkend="jgroups-transport"/> to learn how to tell JGroups to send and receive
+            on all interfaces.</para>
+          </note>
+        </listitem>
+      </varlistentry>
+      <varlistentry>
+        <term>Bind services to all interfaces and specify a JGroups interface</term>
+        <listitem>
+          <para>Specifically setting the system property with <code>-D</code> overrides the
+          value specified by <code>-b</code>:</para>
+          <screen>./run.sh -b 0.0.0.0 -Djgroups.bind_addr=192.168.1.100 -c production</screen>
+        </listitem>
+      </varlistentry>
+      <varlistentry>
+        <term>Use different interfaces for different channels</term>
+        <listitem>
+          <para>
+            Set the <varname>jgroups.ignore.bind_addr</varname> property to <literal>true</literal>
+            on server startup, like so:
+          </para>
+          <screen>./run.sh -b 10.0.0.100 -Djgroups.ignore.bind_addr=true -c production</screen>
+          <para>This setting tells JGroups to ignore the <varname>jgroups.bind_addr</varname>
+          system property and use the value specified in the XML. You would then edit the XML
+          configuration files to specify the <varname>bind_addr</varname> to the desired
+          interface.</para>
+        </listitem>
+      </varlistentry>
+    </variablelist>
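+    <para>
+      As a minimal sketch of that last case, assuming a UDP-based protocol stack, the transport
+      element of one channel might be edited as follows. The address is a placeholder, and the
+      element's other attributes are omitted here and would stay as they are in your file:
+    </para>
+<programlisting><![CDATA[
+<!-- Illustrative fragment only: 192.168.1.100 is an assumed address. -->
+<UDP bind_addr="192.168.1.100"/>
+]]></programlisting>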
 	</section>
 	
 	<section id="clustering-jgroups-isolation"><title>Isolating JGroups Channels</title>
 		<para>
-			Within JBoss Application Server, there are a number of services that independently create JGroups channels &#8212; possibly multiple different JBoss Cache services (used for <literal>HttpSession</literal> replication, EJB3 stateful session bean replication and EJB3 entity replication), two JBoss Messaging channels, and <application>HAPartition</application>, the general purpose clustering service that underlies most other JBossHA services.
-		</para>
+      A number of services independently create JGroups channels: three JBoss Cache services
+      (used for HTTP session replication, EJB3 SFSB replication, and EJB3 entity replication),
+      and <classname>HAPartition</classname>, a clustering service that underlies most 
+      JBoss high availability services.
+    </para>
+    <important><title>These channels must only communicate with their intended peers</title>
+      <para>
+        They must not communicate with channels used by other services, or channels for the same 
+        service opened on machines outside the group. Nodes communicating improperly is one of
+        the most common issues for users attempting to cluster JBoss Enterprise Web Platform.
+      </para>
+    </important>
 		<para>
-			It is critical that these channels only communicate with their intended peers; not with the channels used by other services and not with channels for the same service opened on machines not meant to be part of the group. Nodes improperly communicating with each other is one of the most common issues users have with JBoss Enterprise Web Platform clustering.
+      JGroups channels communicate based on group name, multicast address and multicast port.
+      Isolating a JGroups channel means ensuring that different channels use different values
+      for the group name, multicast address, and multicast port.
 		</para>
-		<para>
-			Whom a JGroups channel will communicate with is defined by its group name and, for UDP-based channels, its multicast address and port. Isolating a JGroups channel means ensuring that different channels use different values for the group name, the multicast address and, in some cases, the multicast port.
-		</para>
+
 		<section>
       <title>Isolating sets of Application Server instances from each other</title>
       
@@ -1296,8 +1552,8 @@
         To isolate JGroups clusters from other clusters on the network, you must:
       </para>
       <itemizedlist>
-        <listitem><para>Make sure the channels in the various clusters use different group names. This can be controlled with the command line arguments used to start JBoss; see <xref linkend="clustering-jgroups-isolation-group-name"/> for more information.</para></listitem>
-        <listitem><para>Make sure the channels in the various clusters use different multicast addresses. This is also easy to control with the command line arguments used to start JBoss<!--; see <xref linkend="clustering-jgroups-isolation-mcast_addr"/> for more information-->.</para></listitem>
+        <listitem><para>Make sure the channels in the various clusters use different group names. This can be controlled with the command line arguments used to start the server; see <xref linkend="clustering-jgroups-isolation-group-name"/> for more information.</para></listitem>
+        <listitem><para>Make sure the channels in the various clusters use different multicast addresses. This is also easy to control with the command line arguments used to start the server<!--; see <xref linkend="clustering-jgroups-isolation-mcast_addr"/> for more information-->.</para></listitem>
              
         <listitem><para>If you are not running on Linux, Windows, Solaris or HP-UX, you may 
           also need to ensure that the channels in each cluster use different 
@@ -1336,8 +1592,8 @@
 		<para>
 			The group name for a JGroups channel is configured via the service that 
          starts the channel. For all the standard clustered services, we make it easy 
-         for you to create unique groups names by simply using the <literal>-g</literal> (or <literal>--partition</literal>) switch when starting JBoss:</para>
-			<screen>./run.sh -g QAPartition -b 192.168.1.100 -c all</screen> 
+         for you to create unique group names by simply using the <literal>-g</literal> (or <literal>--partition</literal>) switch when starting the server:</para>
+			<screen>./run.sh -g QAPartition -b 192.168.1.100 -c production</screen> 
 			<para>This switch sets the <literal>jboss.partition.name</literal> system property, 
          which is used as a component in the configuration of the group name in 
          all the standard clustering configuration files. For example, 
@@ -1350,7 +1606,7 @@
 <section><title>Changing the multicast address and port</title>
 	<para>
 		The <literal>-u</literal> (or <literal>--udp</literal>) command line switch may be used to control the multicast address used by the JGroups channels opened by all standard AS services.
-<screen><![CDATA[/run.sh -u 230.1.2.3 -g QAPartition -b 192.168.1.100 -c all]]></screen>
+<screen><![CDATA[./run.sh -u 230.1.2.3 -g QAPartition -b 192.168.1.100 -c production]]></screen>
 This switch sets the <literal>jboss.partition.udpGroup</literal> system property, which is referenced in all of the standard protocol stack configurations in JBoss AS: 
 	</para>
 	
@@ -1363,15 +1619,15 @@
     </para>
     </note>
    </section>
-   
+
    <section  id="clustering-jgroups-isolation-mcast_port">
    <title>Changing the Multicast Port</title>
       <para>
         On some operating systems (Mac OS X for example), using different 
-        <literal>-g</literal> and <literal>-u</literal> values is not sufficient 
+        <code>-g</code> and <code>-u</code> values is not sufficient 
         to isolate clusters; the channels running in the different clusters 
         must also use different multicast ports.  Unfortunately, setting the 
-        multicast ports is not as simple as <literal>-g</literal> and 
+        multicast ports is not as simple as <code>-g</code> and 
         <literal>-u</literal>. By default, a JBoss AS instance
         running the <literal>all</literal> configuration will use up to two different instances of
         the JGroups UDP transport protocol, and will therefore open two
@@ -1379,17 +1635,18 @@
         by using system properties on the command line. For example,
       </para>
 <programlisting>
-/run.sh -u 230.1.2.3 -g QAPartition -b 192.168.1.100 -c all \\
-        -Djboss.jgroups.udp.mcast_port=12345 -Djboss.messaging.datachanneludpport=23456
+./run.sh -u 230.1.2.3 -g QAPartition -b 192.168.1.100 -c production \
+        -Djboss.jgroups.udp.mcast_port=12345
 </programlisting>
+<!--  -Djboss.messaging.datachanneludpport=23456-->
 
-   <para>The <literal>jboss.messaging.datachanneludpport</literal> property controls
-   the multicast port used by the <literal>MPING</literal> protocol in JBoss Messaging's <literal>DATA</literal> channel.
+   <para><!--The <varname>jboss.messaging.datachanneludpport</varname> property controls
+   the multicast port used by the <literal>MPING</literal> protocol in JBoss Messaging's <literal>DATA</literal> channel.-->
    The <literal>jboss.jgroups.udp.mcast_port</literal> property controls the
    multicast port used by the UDP transport protocol shared by all other clustered services.</para>
    
    <para>The set of JGroups protocol stack configurations included in the 
-   <literal>$JBOSS_HOME/server/all/cluster/jgroups-channelfactory.sar/META-INF/jgroups-channelfactory-stacks.xml</literal>
+   <filename>$JBOSS_HOME/server/production/cluster/jgroups-channelfactory.sar/META-INF/jgroups-channelfactory-stacks.xml</filename>
    file includes a number of other example protocol stack configurations that
    the standard JBoss AS distribution doesn't actually use. Those configurations also
    use system properties to set any multicast ports. So, if you reconfigure some
@@ -1400,9 +1657,11 @@
 		<para>
 			It should be sufficient to just change the address, but unfortunately the 
          handling of multicast sockets is one area where the JVM fails to hide 
-         operating system behavior differences from the application. The <literal>java.net.MulticastSocket</literal> 
+         operating system behavior differences from the application. The 
+          <classname>java.net.MulticastSocket</classname> 
          class provides different overloaded constructors. On some operating 
-        systems, if you use one constructor variant, packets addressed to a particular multicast port are delivered to all 
+        systems, if you use one constructor variant, packets addressed to a 
+        particular multicast port are delivered to all 
          listeners on that port, regardless of the multicast address on which they are 
          listening. We refer to this as the <emphasis>promiscuous traffic</emphasis> problem.
          On most operating systems that exhibit the promiscuous traffic problem
@@ -1429,8 +1688,8 @@
        by limiting UDP datagram loss.</para>
        
        <para>One of the most common causes of lost UDP datagrams is an undersized receive
-       buffer on the socket. The UDP protocol's <literal>mcast_recv_buf_size</literal>
-       and <literal>ucast_recv_buf_size</literal> configuration attributes
+       buffer on the socket. The UDP protocol's <varname>mcast_recv_buf_size</varname>
+       and <varname>ucast_recv_buf_size</varname> configuration attributes
        are used to specify the amount of receive buffer JGroups <emphasis>requests</emphasis>
        from the operating system, but the actual size of the buffer the operating system provides is limited by operating system-level maximums. These maximums are often very low:</para>
        
@@ -1468,65 +1727,92 @@
     </section>
 	
 <section><title>JGroups Troubleshooting</title>
-   <section>
-	<title>Nodes do not form a cluster</title>
-	
-	<para>
-		Make sure your machine is set up correctly for IP multicast. There are 2 test programs that can be used to detect this: McastReceiverTest and McastSenderTest. Go to the <literal>$JBOSS_HOME/server/all/lib</literal> directory and start McastReceiverTest, for example:
-<screen>java -cp jgroups.jar org.jgroups.tests.McastReceiverTest -mcast_addr 224.10.10.10 -port 5555 </screen>
-</para>
-
-<para>
-Then in another window start <literal>McastSenderTest</literal>:
-<screen>java -cp jgroups.jar org.jgroups.tests.McastSenderTest -mcast_addr 224.10.10.10 -port 5555</screen>
-</para>
-
-<para>
-	If you want to bind to a specific network interface card (NIC), use <literal>-bind_addr 192.168.0.2</literal>, where 192.168.0.2 is the IP address of the NIC to which you want to bind. Use this parameter in both the sender and the receiver.
-</para>
-<para>
-	You should be able to type in the <literal>McastSenderTest</literal> window and see the output in the <literal>McastReceiverTest</literal> window. If not, try to use -ttl 32 in the sender. If this still fails, consult a system administrator to help you setup IP multicast correctly, and ask the admin to make sure that multicast will work on the interface you have chosen or, if the machines have multiple interfaces, ask to be told the correct interface.
-Once you know multicast is working properly on each machine in your cluster, you can repeat the above test to test the network, putting the sender on one machine and the receiver on another.
-		
-	</para>
+    <section><title>Nodes do not form a cluster</title>	
+	    <para>
+		    Ensure that your machine is set up correctly for IP multicast. There are two test programs that
+        can detect this: <application>McastReceiverTest</application> and 
+        <application>McastSenderTest</application>.
+      </para>
+  <orderedlist>
+    <listitem>
+      <para>Start <application>McastReceiverTest</application> from the 
+      <filename>$JBOSS_HOME/server/production/lib</filename> directory, like so:</para>
+      <screen>java -cp jgroups.jar org.jgroups.tests.McastReceiverTest -mcast_addr 224.10.10.10 -port 5555</screen>
+    </listitem>
+    <listitem>
+      <para>In another window, start <application>McastSenderTest</application> from the same
+      directory:</para>
+      <screen>java -cp jgroups.jar org.jgroups.tests.McastSenderTest -mcast_addr 224.10.10.10 -port 5555</screen>
+      <note>
+        <para>
+          Use the <code>-bind_addr</code> switch to bind to a specific network interface card (NIC). To bind
+          to an NIC with an IP address of <literal>192.168.0.2</literal>, you would use
+          <code>-bind_addr 192.168.0.2</code>. This parameter can be used in both senders and receivers.
+        </para>
+      </note>
+    </listitem>
+    <listitem>
+      <para>
+        Type in the <application>McastSenderTest</application> window. You should be able to see the
+        output in the <application>McastReceiverTest</application> window.
+      </para>
+    </listitem>
+  </orderedlist>
+  <para>
+    If you cannot, try using <literal>-ttl 32</literal> in the sender. If this still fails, consult
+    a system administrator to help you set up IP multicast correctly. Check that multicast will
+    work on the interface you have chosen. If the machines have multiple interfaces, ask which
+    interface is correct for multicasting.
+  </para>
+  <para>
+    When multicast is working correctly on each machine in your cluster, verify that your network is
+    working correctly by repeating this test with <application>McastReceiverTest</application> on
+    one machine and <application>McastSenderTest</application> on another.
+  </para>
 </section>
 	
 <section><title>Causes of missing heartbeats in FD</title>
 	<para>
-		Sometimes a member is suspected by FD because a heartbeat ack has not been received for some time T (defined by timeout and max_tries). This can have multiple reasons, e.g. in a cluster of A,B,C,D; C can be suspected if (note that A pings B, B pings C, C pings D and D pings A):
+		Sometimes a member is suspected by FD because a heartbeat acknowledgement has not been received for 
+    some time (defined by <varname>timeout</varname> and <varname>max_tries</varname>). This may occur for 
+    several reasons. As an example, say you have a cluster consisting of nodes A, B, C and D.
+    In this cluster, A pings B, B pings C, C pings D, and D pings A.
+  </para>
+  <para>
+    C may be suspected in any of the following situations.
 	</para>
-	
 	<itemizedlist>
 		<listitem>
 			<para>
-			B or C are running at 100% CPU for more than T seconds. So even if C sends a heartbeat ack to B, B may not be able to process it because it is at 100%
+			  If B or C is running at 100% CPU for longer than the time defined by <varname>timeout</varname>
+        and <varname>max_tries</varname>. Even if C sends a heartbeat acknowledgement to B, B may not be
+        able to process the acknowledgement.
 			</para>
 		</listitem>
 		<listitem>
 			<para>
-			B or C are garbage collecting, same as above.
+			  If B or C is garbage collecting, it may not send or process a heartbeat
+        acknowledgement in time.
 			</para>
 		</listitem>
 		<listitem>
 			<para>
-			A combination of the 2 cases above
+  			If the network loses packets, heartbeat messages or acknowledgements may be lost. This can occur
+        when a network has high traffic. Packets are usually dropped in the following order:
+        broadcasts, IP multicasts, then TCP packets.
 			</para>
 		</listitem>
 		<listitem>
 			<para>
-			The network loses packets. This usually happens when there is a lot of traffic on the network, and the switch starts dropping packets (usually broadcasts first, then IP multicasts, TCP packets last).
-			</para>
-		</listitem>
-		<listitem>
-			<para>
-			B or C are processing a callback. Let's say C received a remote method call  over its channel and takes T+1 seconds to process it. During this time, C will not process any other messages, including heartbeats, and therefore B will not receive the heartbeat ack and will suspect C.
+        If B or C is processing a callback. Say C receives a remote method call and takes longer than the
+        period defined by <varname>timeout</varname> and <varname>max_tries</varname> to process it. During this time,
+        C does not process any other message, including heartbeats. Therefore B will not receive a
+        heartbeat acknowledgement, and will suspect C.
 		</para>
 	</listitem>
 </itemizedlist>
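+	<para>
+		As a rough worked example, assuming the suspicion period is approximately the product of the
+		two attributes and using illustrative values that are not taken from any shipped configuration:
+		with <varname>timeout</varname> set to <literal>6000</literal> milliseconds and
+		<varname>max_tries</varname> set to <literal>5</literal>, a member is suspected after roughly
+		6000 * 5 = 30000 milliseconds (30 seconds) without a heartbeat acknowledgement. A garbage
+		collection pause or callback that lasts longer than this can be enough to trigger suspicion.
+	</para>
+<programlisting><![CDATA[
+<!-- Hypothetical values, for illustration only. -->
+<FD timeout="6000" max_tries="5"/>
+]]></programlisting>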
-
 </section>
 </section>
 </section>
 
-
-  </chapter>
+</chapter>



