[jboss-cvs] JBossAS SVN: r94704 - projects/docs/enterprise/5.0/Administration_And_Configuration_Guide/en-US.
jboss-cvs-commits at lists.jboss.org
Mon Oct 12 19:56:55 EDT 2009
Author: laubai
Date: 2009-10-12 19:56:55 -0400 (Mon, 12 Oct 2009)
New Revision: 94704
Modified:
projects/docs/enterprise/5.0/Administration_And_Configuration_Guide/en-US/Clustering_Guide_Building_Blocks.xml
projects/docs/enterprise/5.0/Administration_And_Configuration_Guide/en-US/Clustering_Guide_JGroups.xml
projects/docs/enterprise/5.0/Administration_And_Configuration_Guide/en-US/Feedback.xml
projects/docs/enterprise/5.0/Administration_And_Configuration_Guide/en-US/Introduction.xml
projects/docs/enterprise/5.0/Administration_And_Configuration_Guide/en-US/Microcontainer.xml
projects/docs/enterprise/5.0/Administration_And_Configuration_Guide/en-US/Virtual_Deployment_Framework.xml
Log:
Applied changes to Microcontainer.xml.
Modified: projects/docs/enterprise/5.0/Administration_And_Configuration_Guide/en-US/Clustering_Guide_Building_Blocks.xml
===================================================================
--- projects/docs/enterprise/5.0/Administration_And_Configuration_Guide/en-US/Clustering_Guide_Building_Blocks.xml 2009-10-12 23:30:35 UTC (rev 94703)
+++ projects/docs/enterprise/5.0/Administration_And_Configuration_Guide/en-US/Clustering_Guide_Building_Blocks.xml 2009-10-12 23:56:55 UTC (rev 94704)
@@ -182,7 +182,7 @@
<title>Services using a Shared Transport</title>
<mediaobject>
<imageobject>
- <imagedata align="center" fileref="images/clustering-SharedTransport.png" />
+ <imagedata scalefit="1" align="center" fileref="images/clustering-SharedTransport.png" />
</imageobject>
</mediaobject>
</figure>
Modified: projects/docs/enterprise/5.0/Administration_And_Configuration_Guide/en-US/Clustering_Guide_JGroups.xml
===================================================================
--- projects/docs/enterprise/5.0/Administration_And_Configuration_Guide/en-US/Clustering_Guide_JGroups.xml 2009-10-12 23:30:35 UTC (rev 94703)
+++ projects/docs/enterprise/5.0/Administration_And_Configuration_Guide/en-US/Clustering_Guide_JGroups.xml 2009-10-12 23:56:55 UTC (rev 94704)
@@ -33,7 +33,7 @@
<para>
The JGroups framework provides services to enable peer-to-peer communications between nodes in a
cluster. Communication occurs over a communication channel. The channel is built up from
- a stack of network communication <emphasis>protocols<emphasis>, each of which is responsible
+ a stack of network communication <emphasis>protocols</emphasis>, each of which is responsible
for adding a particular capability to the overall behavior of the channel.
Key capabilities provided by various protocols include transport,
cluster discovery, message ordering, lossless message delivery, detection
@@ -46,7 +46,7 @@
<title>Protocol stack in JGroups</title>
<mediaobject>
<imageobject>
- <imagedata align="center" fileref="images/JGroupsStack.png"/>
+ <imagedata scalefit="1" align="center" fileref="images/jbosscache-JGroupsStack.png"/>
</imageobject>
</mediaobject>
</figure>
@@ -180,7 +180,7 @@
</note>
<section id="jgroups-transport-udp">
<title>UDP configuration</title>
- <para><para>UDP is the preferred protocol for JGroups. UDP uses multicast (or,
+ <para>UDP is the preferred protocol for JGroups. UDP uses multicast (or,
in an unusual configuration, multiple unicasts) to send and
receive messages. If you choose UDP as the transport protocol for your cluster service, you need
to configure it in the <literal>UDP</literal> sub-element in the JGroups
@@ -286,6 +286,7 @@
<para><emphasis role="bold">ip_ttl</emphasis> specifies time-to-live (TTL) for IP Multicast packets. TTL is the commonly used term in multicast networking, but is actually something of a misnomer, since the value here refers to how many network hops a packet will be allowed to travel before networking equipment will drop it.
</para>
</listitem>
+ <listitem>
<para><emphasis role="bold">tos</emphasis> specifies the traffic class for sending unicast and multicast datagrams.</para>
</listitem>
</itemizedlist>
@@ -366,25 +367,55 @@
goes down.</para>
</listitem>
- <!--hajime-->
<listitem>
- <para><emphasis role="bold">mcast_send_buf_size, mcast_recv_buf_size, ucast_send_buf_size,
- ucast_recv_buf_size</emphasis> define receive and send buffer sizes. It is good to
- have a large receiver buffer size, so packets are less likely to get dropped due to
- buffer overflow.</para>
+ <para><emphasis role="bold">discard_incompatible_packets</emphasis> specifies
+ whether to discard packets sent by peers that use a different version of JGroups.
+ Each message in the cluster is tagged with a JGroups version. If <literal>discard_incompatible_packets</literal> is set to <literal>true</literal>,
+ messages received from different versions of JGroups will be silently discarded.
+ Otherwise, a warning will be logged. <emphasis>In no case will the message be delivered.</emphasis> The default value is <literal>false</literal>.
+ </para>
</listitem>
<listitem>
- <para><literal>tos</literal> specifies traffic class for sending unicast and multicast datagrams.
- </para>
- </listitem>
+ <para><emphasis role="bold">enable_diagnostics</emphasis> specifies that the transport
+ should open a multicast socket on address <literal>diagnostics_addr</literal> and port
+ <literal>diagnostics_port</literal> to listen for diagnostic requests
+ sent by the JGroups <ulink url="http://www.jboss.org/community/wiki/Probe"><emphasis role="bold">Probe</emphasis> utility</ulink>.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>The various <emphasis role="bold">thread_pool</emphasis>
+ attributes configure the behavior of the pool of threads JGroups uses
+ to carry ordinary incoming messages up the stack. The various attributes
+ provide the constructor arguments for an instance of
+ <literal>java.util.concurrent.ThreadPoolExecutor</literal>.
+ In the example above, the pool will have a minimum or <emphasis>core size</emphasis>
+ of 8 threads, and a maximum size of 200. If more than
+ 8 pool threads have been created, a thread returning from carrying
+ a message will wait for up to 5000 milliseconds to be assigned a new message to
+ carry, after which it will terminate. If no threads are available to
+ carry a message, the (separate) thread reading messages off the socket
+ will place messages in a queue; the queue will hold up to 1000 messages.
+ If the queue is full, the thread reading messages off the socket will
+ discard the message.</para>
+ </listitem>
+
+ <listitem>
+ <para>The various <emphasis role="bold">oob_thread_pool</emphasis> attributes
+ are similar to the <emphasis role="bold">thread_pool</emphasis> attributes in that
+ they configure a <literal>java.util.concurrent.ThreadPoolExecutor</literal>
+ used to carry incoming messages up the protocol stack. In this case,
+ the pool is used to carry a special type of message known as an Out-Of-Band (OOB)
+ message. OOB messages are exempt from the ordered-delivery
+ requirements of protocols like NAKACK and UNICAST and thus can be delivered
+ up the stack even if NAKACK or UNICAST are queueing up messages from
+ a particular sender. OOB messages are often used internally by JGroups
+ protocols and can be used by applications as well. For example, when JBoss Cache is in <literal>REPL_SYNC</literal> mode, it uses OOB messages for the second phase of its
+ two-phase-commit protocol.</para>
+ </listitem>
</itemizedlist>
- <note>
- <para>On Windows 2000 machines, because of the media sense feature being broken with multicast
- (even after disabling media sense), you need to set the UDP protocol's
- <literal>loopback</literal> attribute to <literal>true</literal>.</para>
- </note>
</section>
@@ -394,33 +425,35 @@
TCP generates more network traffic when the cluster size increases. TCP
is fundamentally a unicast protocol. To send multicast messages, JGroups uses multiple TCP
unicasts. To use TCP as a transport protocol, you should define a <literal>TCP</literal> element
- in the JGroups <literal>Config</literal> element. Here is an example of the
+ in the JGroups <literal>config</literal> element. Here is an example of the
<literal>TCP</literal> element.</para>
<programlisting>
-<TCP start_port="7800"
- bind_addr="192.168.5.1"
- loopback="true"
- down_thread="false" up_thread="false"/>
+<TCP singleton_name="tcp"
+ start_port="7800" end_port="7800"/>
</programlisting>
- <para>Below are the attributes available in the <literal>TCP</literal> element.</para>
+ <para>The following attributes are specific to the <literal>TCP</literal> element:</para>
<itemizedlist>
<listitem>
- <para><emphasis role="bold">bind_addr</emphasis> specifies the binding address. It can also
- be set with the <literal>-Djgroups.bind_address</literal> command line option at server
- startup.</para>
+ <para>
+ <literal>start_port</literal> and <literal>end_port</literal> define the range of
+ TCP ports to which the server should bind. The server socket is bound to the first
+ available port beginning with <literal>start_port</literal>. If no available port is
+ found (for example, because the ports are in use by other sockets) before the
+ <literal>end_port</literal>, the server throws an exception. If no
+ <literal>end_port</literal> is provided, or <literal>end_port</literal> is lower than
+ <literal>start_port</literal>, no upper limit is applied to the port range. If
+ <literal>start_port</literal> is equal to <literal>end_port</literal>, JGroups is
+ forced to use the specified port, since startup fails if the
+ specified port is not available. The default value is <literal>7800</literal>.
+ If set to <literal>0</literal>, the operating system will select a port. (This only works with the <literal>MPING</literal> or <literal>TCPGOSSIP</literal> discovery protocols.
+ <literal>TCPPING</literal> requires that nodes and their corresponding ports are listed explicitly.)
+ </para>
</listitem>
<listitem>
- <para><emphasis role="bold">start_port, end_port</emphasis> define the range of TCP ports
- the server should bind to. The server socket is bound to the first available port from
- <literal>start_port</literal>. If no available port is found (e.g., because of a
- firewall) before the <literal>end_port</literal>, the server throws an exception. If no <literal>end_port</literal> is provided or <literal>end_port < start_port</literal> then there is no upper limit on the port range. If <literal>start_port == end_port</literal>, then we force JGroups to use the given port (start fails if port is not available). The default is 7800. If set to 0, then the operating system will pick a port. Please, bear in mind that setting it to 0 will work only if we use MPING or TCPGOSSIP as discovery protocol because <literal>TCCPING</literal> requires listing the nodes and their corresponding ports.</para>
+ <para><emphasis role="bold">bind_port</emphasis> in TCP acts as an alias for <literal>start_port</literal>. If configured internally, it sets
+ <literal>start_port</literal>.</para>
</listitem>
<listitem>
- <para><emphasis role="bold">loopback</emphasis> specifies whether to loop outgoing message
- back up the stack. In <literal>unicast</literal> mode, the messages are sent to self. In
- <literal>mcast</literal> mode, a copy of the mcast message is sent. The default is false.</para>
- </listitem>
- <listitem>
<para><emphasis role="bold">recv_buf_size, send_buf_size</emphasis> define receive and send buffer sizes. It is good to have a large receiver buffer size, so packets are less likely to get dropped due to buffer overflow.</para>
</listitem>
<listitem>
@@ -451,19 +484,24 @@
</listitem>
</itemizedlist>
+ <note>
+ <para>All of the attributes common to all protocols discussed in
+ the UDP protocol section also apply to TCP.</para>
+ </note>
</section>
+
+
<section id="jgroups-transport-tunnel">
<title>TUNNEL configuration</title>
- <para>The TUNNEL protocol uses an external router to send messages. The external router is known as
- a <literal>GossipRouter</literal>. Each node has to register with the router. All messages are sent to the router and forwarded on to their destinations. The TUNNEL approach can be used to setup communication with nodes behind firewalls. A node can establish a TCP connection to the GossipRouter through the firewall (you can use port 80). The same connection is used by the router to send messages to nodes behind the firewall as most firewalls do not permit outside hosts to initiate a TCP connection to a host inside the firewall. The TUNNEL configuration is defined in the TUNNEL element in the JGroups Config element. Here is an example..
+ <para>The <literal>TUNNEL</literal> protocol uses an external router process to send messages. The external router is a Java process that runs the <literal>org.jgroups.stack.GossipRouter</literal> main class. Each node has to register with the router. All messages are sent to the router and forwarded on to their destinations. The TUNNEL approach can be used to set up communication with nodes behind firewalls. A node can establish a TCP connection to the <classname>GossipRouter</classname> through the firewall (you can use port 80). This connection is also used by the router to send messages to nodes behind the firewall, as most firewalls do not permit outside hosts to initiate a TCP connection to a host inside the firewall. The <literal>TUNNEL</literal> configuration is defined in the <literal>TUNNEL</literal> element within the JGroups <literal><config></literal> element, like so:
</para>
<programlisting>
-<TUNNEL router_port="12001"
- router_host="192.168.5.1"
- down_thread="false" up_thread="false/>
+<TUNNEL singleton_name="tunnel"
+ router_port="12001"
+ router_host="192.168.5.1"/>
</programlisting>
@@ -478,10 +516,13 @@
GossipRouter is listening.</para>
</listitem>
<listitem>
- <para><emphasis role="bold">loopback</emphasis> specifies whether to loop messages back up
- the stack. The default is <literal>true</literal>.</para>
+ <para><emphasis role="bold">reconnect_interval</emphasis> specifies the interval (in milliseconds) at which <literal>TUNNEL</literal> will attempt to connect to the <classname>GossipRouter</classname> if the connection is not established. The default value is <literal>5000</literal>.</para>
</listitem>
</itemizedlist>
+ <note>
+ <para>All of the attributes common to all protocols discussed in
+ the UDP protocol section also apply to <literal>TUNNEL</literal>.</para>
+ </note>
</section>
</section>
@@ -491,14 +532,21 @@
<section id="jgroups-discovery">
<title>Discovery Protocols</title>
<para>
- The cluster needs to maintain a list of current member nodes at all times so that the load balancer and client interceptor know how to route their requests. Discovery protocols are used to discover active nodes in the cluster and detect the oldest member of the cluster, which is the coordinator. All initial nodes are discovered when the cluster starts up.
- When a new node joins the cluster later, it is only discovered after the group membership protocol
- (GMS, see <xref linkend="jgroups-other-gms"/>) admits it into the group.</para>
- <para>Since the discovery protocols sit on top of the transport protocol, you can choose to use different discovery protocols based on your transport protocol. These are also configured as sub-elements in the JGroups MBean <literal>Config</literal> element.</para>
+ When a channel on a node first connects, it must determine which other nodes are
+ running compatible channels, and which of these nodes is currently acting as the
+ <emphasis>coordinator</emphasis> (the node responsible for letting new nodes join
+ the group). Discovery protocols are used to find active nodes in the cluster and
+ to determine which is the coordinator. This information is then provided to the
+ group membership protocol (GMS), which communicates with the coordinator's GMS to
+ add the newly-connecting node to the group. (For more information about group membership protocols, see <xref linkend="jgroups-other-gms"/>.)
+ </para>
+ <para>
+ Discovery protocols also assist merge protocols (see <xref linkend="jgroups-other-merge"/>)
+ to detect cluster-split situations.</para>
+ <para>
+ The discovery protocols sit on top of the transport protocol, so you can choose to use different discovery protocols depending on your transport protocol. These are also configured as sub-elements in the JGroups <literal><config></literal> element.
+ </para>
-
-
-
<section id="jgroups-discovery-ping">
<title>PING</title>
<para>
@@ -506,26 +554,18 @@
</para>
<para>Here is an example PING configuration for IP multicast.
</para>
-
-
- <programlisting>
-<PING timeout="2000"
- num_initial_members="2"
- down_thread="false" up_thread="false"/>
- </programlisting>
+
+<programlisting><PING timeout="2000"
+ num_initial_members="3"/>
+</programlisting>
<para>
Here is another example PING configuration for contacting a Gossip Router.
-<programlisting><![CDATA[
-<PING gossip_host="localhost"
+</para>
+<programlisting><![CDATA[<PING gossip_host="localhost"
gossip_port="1234"
- timeout="3000"
- num_initial_members="3"
- down_thread="false" up_thread="false"/>]]>
-</programlisting>
-
- </para>
-
-
+ timeout="2000"
+ num_initial_members="3"/>]]>
+</programlisting>
<para>The available attributes in the <literal>PING</literal> element are listed below.</para>
<itemizedlist>
<listitem>
@@ -549,15 +589,16 @@
milliseconds) for the lease from the GossipRouter. The default is 20000.</para>
</listitem>
<listitem>
- <para><emphasis role="bold">initial_hosts</emphasis> is a comma-seperated list of addresses
- (e.g., <literal>host1[12345],host2[23456]</literal>), which are pinged for
- discovery.</para>
+ <para><emphasis role="bold">initial_hosts</emphasis> is a comma-separated list of address/port pairs (for example, <literal>host1[12345],host2[23456]</literal>) which are pinged for
+ discovery. Default is <literal>null</literal>, meaning multicast
+ discovery should be used. If <literal>initial_hosts</literal>
+ is specified, you must list all possible cluster members, not just a few well-known hosts, or <literal>MERGE2</literal> cluster split discovery will not work reliably.</para>
</listitem>
</itemizedlist>
<para>If both <literal>gossip_host</literal> and <literal>gossip_port</literal> are defined, the
- cluster uses the GossipRouter for the initial discovery. If the <literal>initial_hosts</literal>
- is specified, the cluster pings that static list of addresses for discovery. Otherwise, the
- cluster uses IP multicasting for discovery.</para>
+ cluster uses the GossipRouter for the initial discovery. If the <literal>initial_hosts</literal>
+ is specified, the cluster pings that static list of addresses for discovery. Otherwise, the
+ cluster uses IP multicasting for discovery.</para>
<note>
<para>The discovery phase returns when the <literal>timeout</literal> ms have elapsed or the
<literal>num_initial_members</literal> responses have been received.</para>
@@ -571,11 +612,9 @@
<para>The TCPGOSSIP protocol only works with a GossipRouter. It works essentially the same way as
the PING protocol configuration with valid <literal>gossip_host</literal> and
<literal>gossip_port</literal> attributes. It works on top of both UDP and TCP transport protocols. Here is an example.</para>
-<programlisting><![CDATA[
-<TCPGOSSIP timeout="2000"
- initial_hosts="192.168.5.1[12000],192.168.0.2[12000]"
- num_initial_members="3"
- down_thread="false" up_thread="false"/>]]>
+<programlisting><![CDATA[<TCPGOSSIP timeout="2000"
+ num_initial_members="3"
+ initial_hosts="192.168.5.1[12000],192.168.0.2[12000]"/>]]>
</programlisting>
@@ -590,9 +629,8 @@
responses to wait for unless timeout has expired. The default is 2.</para>
</listitem>
<listitem>
- <para><emphasis role="bold">initial_hosts</emphasis> is a comma-seperated list of addresses
- (e.g., <literal>host1[12345],host2[23456]</literal>) for GossipRouters to register
- with.</para>
+ <para><emphasis role="bold">initial_hosts</emphasis> is a comma-separated list of address/port pairs
+ (for example, <literal>host1[12345],host2[23456]</literal>) of <literal>GossipRouter</literal>s to register with.</para>
</listitem>
</itemizedlist>
</section>
@@ -601,16 +639,14 @@
<section id="jgroups-discovery-tcpping">
<title>TCPPING</title>
- <para>The TCPPING protocol takes a set of known members and ping them for discovery. This is
+ <para>The TCPPING protocol takes a set of known members and pings them for discovery. This is
essentially a static configuration. It works on top of TCP. Here is an example of the
- <literal>TCPPING</literal> configuration element in the JGroups <literal>Config</literal>
+ <literal>TCPPING</literal> configuration element in the JGroups <literal>config</literal>
element.</para>
- <programlisting>
-<TCPPING timeout="2000"
- initial_hosts="hosta[2300],hostb[3400],hostc[4500]"
- port_range="3"
- num_initial_members="3"
- down_thread="false" up_thread="false"/>
+ <programlisting><TCPPING timeout="2000"
+ num_initial_members="3"
+ initial_hosts="hosta[2300],hostb[3400],hostc[4500]"
+ port_range="3"/>
</programlisting>
@@ -626,11 +662,13 @@
</listitem>
<listitem>
<para><emphasis role="bold">initial_hosts</emphasis> is a comma-separated list of addresses
- (e.g., <literal>host1[12345],host2[23456]</literal>) for pinging.</para>
+ (for example, <literal>host1[12345],host2[23456]</literal>) for pinging.</para>
</listitem>
<listitem>
<para>
- <emphasis role="bold">port_range</emphasis> specifies the number of consecutive ports to be probed when getting the initial membership, starting with the port specified in the initial_hosts parameter. Given the current values of port_range and initial_hosts above, the TCPPING layer will try to connect to hosta:2300, hosta:2301, hosta:2302, hostb:3400, hostb:3401, hostb:3402, hostc:4500, hostc:4501, hostc:4502. The configuration options allows for multiple nodes on the same host to be pinged.
+ <emphasis role="bold">port_range</emphasis> specifies the number of consecutive ports to be probed when getting the initial membership, starting with the port specified in the <varname>initial_hosts</varname> parameter. Given the current values of <literal>port_range</literal> and <literal>initial_hosts</literal> above, the <literal>TCPPING</literal> layer will try to connect to <literal>hosta[2300]</literal>, <literal>hosta[2301]</literal>, <literal>hosta[2302]</literal>, <literal>hostb[3400]</literal>, <literal>hostb[3401]</literal>, <literal>hostb[3402]</literal>, <literal>hostc[4500]</literal>, <literal>hostc[4501]</literal>, and <literal>hostc[4502]</literal>. This configuration option allows for multiple possible ports on the same host to be pinged without having to spell out all possible combinations.
+ If in your TCP protocol configuration your <literal>end_port</literal> is greater than your <literal>start_port</literal>, we recommend using a TCPPING <literal>port_range</literal> equal to the difference, to ensure
+ a node is pinged no matter which port it is bound to within the allowed range.
</para>
</listitem>
</itemizedlist>
@@ -642,17 +680,25 @@
<section id="jgroups-discovery-mping">
<title>MPING</title>
<para>
- MPING uses IP multicast to discover the initial membership. It can be used with all transports, but usually this is used in combination with TCP. TCP usually requires TCPPING, which has to list all group members explicitly, but MPING doesn't have this requirement. The typical use case for this is when we want TCP as transport, but multicasting for discovery so we don't have to define a static list of initial hosts in TCPPING or require external Gossip Router.
+ <literal>MPING</literal> uses IP multicast to discover the initial membership. Unlike the
+ other discovery protocols, which delegate the sending and receiving of
+ discovery messages on the network to the transport protocol, <literal>MPING</literal>
+ opens its own sockets to send and receive multicast discovery messages.
+ As a result it can be used with all transports, but it is most often used
+ with <literal>TCP</literal>. <literal>TCP</literal> usually requires
+ <literal>TCPPING</literal>, which must explicitly list all possible group members.
+ <literal>MPING</literal> does not have this requirement, and is typically used where
+ <literal>TCP</literal> is required for regular message transport, and UDP multicasting
+ is allowed for discovery.
</para>
<programlisting>
<MPING timeout="2000"
+ num_initial_members="3"
bind_to_all_interfaces="true"
mcast_addr="228.8.8.8"
mcast_port="7500"
- ip_ttl="8"
- num_initial_members="3"
- down_thread="false" up_thread="false"/>
+ ip_ttl="8"/>
</programlisting>
@@ -667,8 +713,12 @@
responses to wait for unless timeout has expired. The default is 2.</para>
</listitem>
<listitem>
- <para><emphasis role="bold">bind_addr</emphasis> specifies the interface on which to send
- and receive multicast packets.</para>
+ <para>
+ <emphasis role="bold">bind_addr</emphasis> specifies the interface on which to send
+ and receive multicast packets. By default JGroups uses the value of the system property <literal>jgroups.bind_addr</literal>, which can be set with the <code>-b</code>
+ command line switch. See <xref linkend="jgroups-other"/> for more on binding JGroups
+ sockets.
+ </para>
</listitem>
<listitem>
<para><emphasis role="bold">bind_to_all_interfaces</emphasis> overrides the
@@ -686,22 +736,22 @@
<section id="jgroups-fd">
<title>Failure Detection Protocols</title>
- <para>The failure detection protocols are used to detect failed nodes. Once a failed node is detected, a suspect verification phase can occur after which, if the node is still considered dead, the cluster updates its view so that the load balancer and client interceptors know to avoid the dead node. The failure detection protocols are configured as sub-elements in the JGroups MBean
- <literal>Config</literal> element.</para>
+ <para>
+ The failure detection protocols are used to detect failed nodes. Once a failed node is detected, a <emphasis>suspect verification</emphasis> phase can occur. If the node is still considered dead after this phase is complete, the cluster updates its membership view so that further messages are not sent to the failed node. The service using JGroups is informed that the node is no longer part of the cluster. Failure detection protocols are configured as sub-elements in the JGroups <literal><config></literal> element.
+ </para>
<section id="jgroups-fd-fd">
<title>FD</title>
<para>
- FD is a failure detection protocol based on heartbeat messages. This protocol requires each node to periodically send are-you-alive messages to its neighbour. If the neighbour fails to respond, the calling node sends a SUSPECT message to the cluster. The current group coordinator can optionally double check whether the suspected node is indeed dead after which, if the node is still considered dead, updates the cluster's view. Here is an example FD configuration.
+ <literal>FD</literal> is a failure detection protocol based on 'heartbeat' messages. This protocol requires that each node periodically ping its neighbour. If the neighbour fails to respond, the calling node sends a <literal>SUSPECT</literal> message to the cluster. The current group coordinator can optionally verify that the suspected node is dead (<literal>VERIFY_SUSPECT</literal>). If the node is still considered dead after this verification step, the coordinator updates the cluster's membership view. The following is an example of <literal>FD</literal> configuration:
</para>
<programlisting>
-<FD timeout="2000"
- max_tries="3"
- shun="true"
- down_thread="false" up_thread="false"/>
+<FD timeout="6000"
+ max_tries="5"
+ shun="true"/>
</programlisting>
@@ -716,34 +766,33 @@
are-you-alive messages from a node before the node is suspected. The default is 2.</para>
</listitem>
<listitem>
- <para><emphasis role="bold">shun</emphasis> specifies whether a failed node will be shunned.
- Once shunned, the node will be expelled from the cluster even if it comes back later.
- The shunned node would have to re-join the cluster through the discovery process. JGroups allows to configure itself such that shunning leads to automatic rejoins and state transfer, which is the default behaivour within JBoss Application Server.</para>
+ <para><emphasis role="bold">shun</emphasis> specifies whether a failed node will be forbidden from sending messages to the group without formally rejoining. A shunned node would need to rejoin the cluster via the discovery process. JGroups allows applications to configure a channel such that, when the channel is shunned, it automatically rejoins the cluster and transfers state. (This is the default behavior for JBoss Application Server.)</para>
</listitem>
</itemizedlist>
<note>
- <para>Regular traffic from a node counts as if it is a live. So, the are-you-alive messages are
- only sent when there is no regular traffic to the node for sometime.</para>
+ <para>
+ Regular traffic from a node is proof of life, so heartbeat messages are only sent when no regular traffic has been received from the node for some time.</para>
</note>
</section>
-
-
<section id="jgroups-fd-fdsock">
<title>FD_SOCK</title>
<para>
-FD_SOCK is a failure detection protocol based on a ring of TCP sockets created between group members. Each member in a group connects to its neighbor (last member connects to first) thus forming a ring. Member B is suspected when its neighbor A detects abnormally closed TCP socket (presumably due to a node B crash). However, if a member B is about to leave gracefully, it lets its neighbor A know, so that it does not become suspected. The simplest FD_SOCK configuration does not take any attribute. You can just declare an empty <literal>FD_SOCK</literal> element in JGroups's <literal>Config</literal> element.</para>
+ <literal>FD_SOCK</literal> is a failure detection protocol based on a ring of TCP sockets created between group members. Each member in a group connects to its neighbor, with the final member connecting to the first, forming a ring. Node B becomes suspected when its neighbour, Node A, detects an abnormally closed TCP socket, presumably due to a crash in Node B. (When nodes intend to leave the group, they inform their neighbours so that they do not become suspected.)
+ </para>
+ <para>
+ The simplest <literal>FD_SOCK</literal> configuration does not take any attributes. You can declare an empty <literal>FD_SOCK</literal> element in the JGroups <literal><config></literal> element.
+ </para>
-
<programlisting>
-<FD_SOCK_down_thread="false" up_thread="false"/>
+<FD_SOCK/>
</programlisting>
-<para>There available attributes in the <literal>FD_SOCK</literal> element are listed below.</para>
+<para>The attributes available to the <literal>FD_SOCK</literal> element are listed below.</para>
<itemizedlist>
<listitem>
- <para><emphasis role="bold">bind_addr</emphasis> specifies the interface to which the server socket should bind to. If -Djgroups.bind_address system property is defined, XML value will be ignore. This behaivour can be reversed setting -Djgroups.ignore.bind_addr=true system property.</para>
+ <para><emphasis role="bold">bind_addr</emphasis> specifies the interface to which the server socket should be bound. By default, JGroups uses the value of the system property <literal>jgroups.bind_addr</literal>. This system property can be set with the <code>-b</code> command line switch. For more information about binding JGroups sockets, see <xref linkend="jgroups-other"/>.</para>
</listitem>
</itemizedlist>
@@ -757,34 +806,29 @@
</para>
<programlisting><![CDATA[
-<VERIFY_SUSPECT timeout="1500"
- down_thread="false" up_thread="false"/>]]>
+<VERIFY_SUSPECT timeout="1500"/>]]>
</programlisting>
<para>
- The available attributes in the FD_SOCK element are listed below.
+ The available attributes in the <literal>VERIFY_SUSPECT</literal> element are listed below.
</para>
<itemizedlist>
<listitem>
<para>
- timeout specifies how long to wait for a response from the suspected member before considering it dead.
- </para>
-</listitem>
-</itemizedlist>
+ <emphasis role="bold">timeout</emphasis> specifies how long to wait for a response from the suspected member before considering it dead.
+ </para>
+ </listitem>
+ </itemizedlist>
</section>
-
-
<section><title>FD versus FD_SOCK</title>
<para>
FD and FD_SOCK, each taken individually, do not provide a solid failure detection layer. Let's look at the differences between these failure detection protocols to understand how they complement each other:
</para>
<itemizedlist>
<listitem><para><emphasis>FD</emphasis></para>
- </listitem>
- </itemizedlist>
<itemizedlist>
<listitem>
<para>
@@ -797,22 +841,19 @@
</para>
</listitem>
<listitem>
- <para>
+ <para>
Low timeouts lead to higher probability of false suspicions and higher network traffic.
</para>
</listitem>
<listitem>
- <para>
+ <para>
High timeouts will not detect and remove crashed members for some time.
</para>
</listitem>
</itemizedlist>
+</listitem>
-<itemizedlist>
<listitem><para><emphasis>FD_SOCK</emphasis>:</para>
-</listitem>
-</itemizedlist>
-
<itemizedlist>
<listitem>
<para>
@@ -845,9 +886,11 @@
</para>
</listitem>
</itemizedlist>
+</listitem>
+</itemizedlist>
-<para>
- The aim of a failure detection layer is to report real failures and therefore avoid false suspicions. There are two solutions:
+ <para>
+ A failure detection layer is intended to report real failures promptly, while avoiding false suspicions. There are two solutions:
</para>
<orderedlist>
<listitem>
@@ -861,17 +904,16 @@
</para>
</listitem>
</orderedlist>
-<programlisting><![CDATA[
-<FD_SOCK down_thread="false" up_thread="false"/>
-<FD timeout="10000" max_tries="5" shun="true"
-down_thread="false" up_thread="false" /> ]]>
+<programlisting><![CDATA[<FD_SOCK/>
+<FD timeout="6000" max_tries="5" shun="true"/>
+<VERIFY_SUSPECT timeout="1500"/>]]>
</programlisting>
-<para>
- This suspects a member when the socket to the neighbor has been closed abonormally (e.g. process crash, because the OS closes all sockets). However, f a host or switch crashes, then the sockets won't be closed, therefore, as a seond line of defense, FD will suspect the neighbor after 50 seconds. Note that with this example, if you have your system stopped in a breakpoint in the debugger, the node you're debugging will be suspected after ca 50 seconds.
+ <para>
+ In this example, a member becomes suspected when the neighbouring socket has been closed abnormally, in a process crash, for instance, since the operating system closes all sockets. However, if a host or switch crashes, the sockets will not be closed. As a second line of defense, <literal>FD</literal> will suspect the neighbour after thirty seconds (five tries of <literal>6000</literal> milliseconds each). Note that if this example system were stopped at a breakpoint in the debugger, the node being debugged would be suspected once those thirty seconds had elapsed.
</para>
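+ <para>
+ As a rough worked illustration of the arithmetic above (a sketch, not a recommended setting): <literal>FD</literal> suspects a silent member after approximately <varname>timeout</varname> multiplied by <varname>max_tries</varname> milliseconds, so the example values give five tries of 6000 milliseconds, or 30000 milliseconds in total. The fragment below assumes a hypothetical development environment in which nodes are regularly paused in a debugger. The <literal>60000</literal> value is an illustrative assumption, not a default. It widens the window to roughly five minutes, at the cost of detecting genuine host or switch failures correspondingly later.
+ </para>
+<programlisting><![CDATA[<FD_SOCK/>
+<!-- Illustrative only: 60000 ms per try, 5 tries, so about five minutes before FD suspects a member -->
+<FD timeout="60000" max_tries="5" shun="true"/>
+<VERIFY_SUSPECT timeout="1500"/>]]>
+</programlisting>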
-<para>
- A combination of FD and FD_SOCK provides a solid failure detection layer and for this reason, such technique is used accross JGroups configurations included within JBoss Application Server.
+ <para>
+ A combination of <literal>FD</literal> and <literal>FD_SOCK</literal> provides a solid failure detection layer, which is why this technique is used across the JGroups configurations included with JBoss Application Server.
</para>
</section>
</section>
@@ -881,7 +923,7 @@
<section id="jgroups-reliable">
<title>Reliable Delivery Protocols</title>
<para>
- Reliable delivery protocols within the JGroups stack ensure that data pockets are actually delivered in the right order (FIFO) to the destination node. The basis for reliable message delivery is positive and negative delivery acknowledgments (ACK and NAK). In the ACK mode, the sender resends the message until the acknowledgment is received from the receiver. In the NAK mode, the receiver requests retransmission when it discovers a gap.
+ Reliable delivery protocols within the JGroups stack ensure that messages are actually delivered, and delivered in the correct order (First In, First Out, or FIFO) to the destination node. The basis for reliable message delivery is positive and negative delivery acknowledgments (ACK and NAK). In <literal>ACK</literal> mode, the sender resends the message until acknowledgment is received from the receiver. In <literal>NAK</literal> mode, the receiver requests retransmission when it discovers a gap.
</para>
@@ -889,20 +931,18 @@
<section id="jgroups-reliable-unicast">
<title>UNICAST</title>
<para>
- The UNICAST protocol is used for unicast messages. It uses ACK. It is configured as a sub-element under the JGroups Config element. If the JGroups stack is configured with TCP transport protocol, UNICAST is not necessary because TCP itself guarantees FIFO delivery of unicast messages. Here is an example configuration for the <literal>UNICAST</literal> protocol.</para>
+ The <literal>UNICAST</literal> protocol is used for unicast messages. It uses positive acknowledgments (<literal>ACK</literal>). It is configured as a sub-element under the JGroups <literal>config</literal> element. If the JGroups stack is configured with the TCP transport protocol, <literal>UNICAST</literal> is not necessary because TCP itself guarantees FIFO delivery of unicast messages. Here is an example configuration for the <literal>UNICAST</literal> protocol:</para>
<programlisting>
-<UNICAST timeout="100,200,400,800"
-down_thread="false" up_thread="false"/>
+<UNICAST timeout="300,600,1200,2400,3600"/>
</programlisting>
<para>There is only one configurable attribute in the <literal>UNICAST</literal> element.</para>
<itemizedlist>
<listitem>
<para><emphasis role="bold">timeout</emphasis> specifies the retransmission timeout (in
- milliseconds). For instance, if the timeout is "100,200,400,800", the sender resends the
- message if it hasn't received an ACK after 100 ms the first time, and the second time it
- waits for 200 ms before resending, and so on.</para>
+ milliseconds). For instance, if the timeout is <literal>100,200,400,800</literal>, the sender resends the message if it has not received an <literal>ACK</literal> after 100 milliseconds the first time, and the second time it waits for 200 milliseconds before resending, and so on. A low value for the first timeout allows for prompt retransmission of dropped messages, but means that messages may be transmitted more than once if they have not actually been lost (that is, the message has been sent, but the <literal>ACK</literal> has not been received before the timeout). High values (<literal>1000,2000,3000</literal>) can improve performance if the network is tuned such that UDP datagram loss is infrequent. High values on networks with frequent losses will be harmful to performance, since later messages will not be delivered until lost messages have been retransmitted.
+ </para>
</listitem>
</itemizedlist>
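+ <para>
+ As a hedged illustration of the trade-off described above, the following fragment uses the higher timeout series mentioned in the text. It assumes a network tuned so that UDP datagram loss is rare. On a network with frequent losses, these values would delay the delivery of later messages, so treat this as a sketch to be load tested rather than a recommended setting.
+ </para>
+<programlisting>
+<UNICAST timeout="1000,2000,3000"/>
+</programlisting>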
</section>
@@ -910,48 +950,39 @@
<section id="jgroups-reliable-nakack">
<title>NAKACK</title>
- <para>The NAKACK protocol is used for multicast messages. It uses NAK. Under this protocol, each
- message is tagged with a sequence number. The receiver keeps track of the sequence numbers and
- deliver the messages in order. When a gap in the sequence number is detected, the receiver asks
- the sender to retransmit the missing message. The NAKACK protocol is configured as the
- <literal>pbcast.NAKACK</literal> sub-element under the JGroups <literal>Config</literal>
- element. Here is an example configuration.</para>
+ <para>The <literal>NAKACK</literal> protocol is used for multicast messages. It uses negative acknowledgments (<literal>NAK</literal>). Under this protocol, each message is tagged with a sequence number. The receiver keeps track of the received sequence numbers and delivers the messages in order. When a gap in the series of received sequence numbers is detected, the receiver schedules a task to periodically ask the sender to retransmit the missing message. The task is cancelled if the missing message is received. The <literal>NAKACK</literal> protocol is configured as the <literal>pbcast.NAKACK</literal> sub-element under the JGroups <literal><config></literal> element. Here is an example configuration:</para>
<programlisting>
<pbcast.NAKACK max_xmit_size="60000" use_mcast_xmit="false"
-
retransmit_timeout="300,600,1200,2400,4800" gc_lag="0"
- discard_delivered_msgs="true"
- down_thread="false" up_thread="false"/>
+ discard_delivered_msgs="true"/>
</programlisting>
<para>The configurable attributes in the <literal>pbcast.NAKACK</literal> element are as follows.</para>
<itemizedlist>
<listitem>
- <para><emphasis role="bold">retransmit_timeout</emphasis> specifies the retransmission
- timeout (in milliseconds). It is the same as the <literal>timeout</literal> attribute in
- the UNICAST protocol.</para>
+ <para><emphasis role="bold">retransmit_timeout</emphasis> specifies the series of timeouts (in milliseconds) after which retransmission
+ is requested if a missing message has not yet been received.</para>
</listitem>
<listitem>
<para><emphasis role="bold">use_mcast_xmit</emphasis> determines whether the sender should
- send the retransmission to the entire cluster rather than just the node requesting it.
- This is useful when the sender drops the pocket -- so we do not need to retransmit for
- each node.</para>
+ send the retransmission to the entire cluster rather than just to the node requesting it.
+ This is useful when the <emphasis>sender</emphasis>'s network layer tends to drop packets,
+ avoiding the need to individually retransmit to each node.</para>
</listitem>
<listitem>
- <para><emphasis role="bold">max_xmit_size</emphasis> specifies maximum size for a bundled
- retransmission, if multiple packets are reported missing.</para>
+ <para><emphasis role="bold">max_xmit_size</emphasis> specifies the maximum size (in bytes) for a bundled retransmission, if multiple messages are reported missing.</para>
</listitem>
<listitem>
<para><emphasis role="bold">discard_delivered_msgs</emphasis> specifies whether to discard
- delivery messages on the receiver nodes. By default, we save all delivered messages.
- However, if we only ask the sender to resend their messages, we can enable this option
- and discard delivered messages.</para>
+ delivered messages on the receiver nodes. By default, nodes save delivered messages so
+ any node can retransmit a lost message in case the original sender has crashed
+ or left the group. However, if we only ask the sender to resend its messages, we can enable this option and discard delivered messages.</para>
</listitem>
<listitem>
- <para><emphasis role="bold">gc_lag specifies</emphasis> the number of messages garbage collection lags behind.
+ <para><emphasis role="bold">gc_lag</emphasis> specifies the number of messages to keep in memory for retransmission, even after the periodic cleanup protocol (see <xref linkend="jgroups-other-gc"/>) indicates all peers have received the message. The default value is <literal>20</literal>.
</para>
</listitem>
@@ -962,14 +993,11 @@
<para>The group membership service (GMS) protocol in the JGroups stack
maintains a list of active nodes. It handles the requests to join and
leave the cluster. It also handles the SUSPECT messages sent by failure
- detection protocols. All nodes in the cluster, as well as the load balancer and client side
- interceptors, are notified if the group membership changes. The group membership service is
- configured in the <literal>pbcast.GMS</literal> sub-element under the JGroups
- <literal>Config</literal> element. Here is an example configuration.</para>
+ detection protocols. All nodes in the cluster, as well as any interested
+ services like JBoss Cache or HAPartition, are notified if the group membership changes. The group membership service is configured in the <literal>pbcast.GMS</literal> sub-element under the JGroups <literal>config</literal> element. Here is an example configuration.</para>
<programlisting>
<pbcast.GMS print_local_addr="true"
join_timeout="3000"
- down_thread="false" up_thread="false"
join_retry_timeout="2000"
shun="true"
view_bundling="true"/>
@@ -984,23 +1012,24 @@
milliseconds to wait for a new node JOIN request to succeed. Retry afterwards.</para>
</listitem>
<listitem>
- <para><emphasis role="bold">join_retry_timeout</emphasis> specifies the maximum number of
- milliseconds to wait after a failed JOIN to re-submit it.</para>
+ <para><emphasis role="bold">join_retry_timeout</emphasis> specifies the number of
+ milliseconds to wait after a failed JOIN before trying again.</para>
</listitem>
<listitem>
<para><emphasis role="bold">print_local_addr</emphasis> specifies whether to dump the node's
- own address to the output when started.</para>
+ own address to the standard output when started.</para>
</listitem>
<listitem>
- <para><emphasis role="bold">shun</emphasis> specifies whether a node should shun itself if
- it receives a cluster view that it is not a member node.</para>
+ <para><emphasis role="bold">shun</emphasis> specifies whether a node should shun
+ (that is, disconnect) itself if it receives a cluster view in which it is not a member node.</para>
</listitem>
<listitem>
<para><emphasis role="bold">disable_initial_coord</emphasis> specifies whether to prevent
- this node as the cluster coordinator.</para>
+ this node from becoming the cluster coordinator during the initial connection of the channel. This flag does not prevent a node becoming the coordinator after the initial channel connection, if the current coordinator leaves the group.
+ </para>
</listitem>
<listitem>
- <para><emphasis role="bold">view_bundling</emphasis> specifies whether multiple JOIN or LEAVE request arriving at the same time are bundled and handled together at the same time, only sending out 1 new view / bundle. This is is more efficient than handling each request separately.
+ <para><emphasis role="bold">view_bundling</emphasis> specifies whether multiple JOIN or LEAVE requests arriving at the same time are bundled and handled together, resulting in only one new view that incorporates all changes. This is more efficient than handling each request separately.
</para>
</listitem>
@@ -1012,20 +1041,20 @@
<title>Flow Control (FC)</title>
<para>The flow control (FC) protocol tries to adapt the data sending rate
to the data receipt rate among nodes. If a sender node is too fast, it
- might overwhelm the receiver node and result in dropped packets that
- have to be retransmitted. In JGroups, the flow control is implemented via a
+ might overwhelm the receiver node and result in out-of-memory conditions
+ or dropped packets that have to be retransmitted. In JGroups, flow control is implemented via a
credit-based system. The sender and receiver nodes have the same number of credits (bytes) to
start with. The sender subtracts credits by the number of bytes in messages it sends. The
receiver accumulates credits for the bytes in the messages it receives. When the sender's credit
- drops to a threshold, the receivers sends some credit to the sender. If the sender's credit is
+ drops to a threshold, the receivers send some credit to the sender. If the sender's credit is
used up, the sender blocks until it receives credits from the receiver. The flow control protocol
is configured in the <literal>FC</literal> sub-element under the JGroups
- <literal>Config</literal> element. Here is an example configuration.</para>
+ <literal>config</literal> element. Here is an example configuration.</para>
<programlisting>
-<FC max_credits="1000000"
-down_thread="false" up_thread="false"
- min_threshold="0.10"/>
+<FC max_credits="2000000"
+ min_threshold="0.10"
+ ignore_synchronous_response="true"/>
</programlisting>
@@ -1036,45 +1065,43 @@
(in bytes). This value should be smaller than the JVM heap size.</para>
</listitem>
<listitem>
- <para><emphasis role="bold">min_credits</emphasis> specifies the threshold credit on the
- sender, below which the receiver should send in more credits.</para>
+ <para><emphasis role="bold">min_credits</emphasis> specifies the minimum number of bytes that must be received before the receiver will send more credits to the sender.</para>
</listitem>
<listitem>
- <para><emphasis role="bold">min_threshold</emphasis> specifies percentage value of the
- threshold. It overrides the <literal>min_credits</literal> attribute.</para>
+ <para><emphasis role="bold">min_threshold</emphasis> specifies the percentage of the
+ <literal>max_credits</literal> that should be used to calculate <literal>min_credits</literal>.
+ Setting this overrides the <literal>min_credits</literal> attribute.</para>
</listitem>
+ <listitem>
+ <para><emphasis role="bold">ignore_synchronous_response</emphasis> specifies whether threads that have carried messages up to the application should be allowed to carry outgoing messages back down through FC without blocking for credits. <emphasis>Synchronous response</emphasis> refers to the fact that these messages are generally responses to incoming RPC-type messages. Ensuring that the JGroups threads that carry messages up never block in FC can help prevent certain deadlock scenarios, so we recommend setting this to <literal>true</literal>.</para>
+ </listitem>
</itemizedlist>
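+ <para>
+ To make the relationship between these attributes concrete, here is a sketch based on the descriptions above: with <literal>max_credits="2000000"</literal> and <literal>min_threshold="0.10"</literal>, the derived <literal>min_credits</literal> value would be ten percent of 2000000, or 200000 bytes. Assuming that calculation holds, the following fragment should behave like the earlier example. Treat it as an illustration of the arithmetic rather than a verified equivalent.
+ </para>
+<programlisting>
+<FC max_credits="2000000"
+ min_credits="200000"
+ ignore_synchronous_response="true"/>
+</programlisting>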
-
-<note><title>Note</title>
+ <note>
+ <title>Why is FC needed on top of TCP? TCP has its own flow control!</title>
<para>
- Applications that use synchronous group RPC calls primarily do not require FC protocol in their JGroups protocol stack because synchronous communication, where the hread that makes the call blocks waiting for responses from all the members of the group, already slows overall rate of calls. Even though TCP provides flow control by itself, FC is still required in TCP based JGroups stacks because of group communication, where we essentially have to send group messages at the highest speed the slowest receiver can keep up with. TCP flow control only takes into account individual node communications and has not a notion of who's the slowest in the group, which is why FC is required.
+ FC is required for group communication where group messages must be sent at the highest speed that the slowest receiver can handle. For example, say we have a cluster comprised of nodes A, B, C and D. D is slow (perhaps overloaded), while the rest are fast. When A sends a group message, it does so via TCP connections: A-A (theoretically), A-B, A-C and A-D.
+ </para>
+ <para>
+ Say A sends 100 million messages to the cluster. TCP's flow control applies to A-B, A-C and A-D individually, but not to A-BCD as a group. It is therefore possible that A, B and C receive all 100 million messages while D has received only 1 million. (This is also why <literal>NAKACK</literal> is required, even though TCP handles its own retransmission.)
</para>
-</note>
-
-<section>
- <title>Why is FC needed on top of TCP ? TCP has its own flow control !</title>
<para>
-
- The reason is group communication, where we essentially have to send group messages at the highest speed the slowest receiver can keep up with. Let's say we have a cluster {A,B,C,D}. D is slow (maybe overloaded), the rest is fast. When A sends a group message, it establishes TCP connections A-A (conceptually), A-B, A-C and A-D (if they don't yet exist). So let's say A sends 100 million messages to the cluster. Because TCP's flow control only applies to A-B, A-C and A-D, but not to A-{B,C,D}, where {B,C,D} is the group, it is possible that A, B and C receive the 100M, but D only received 1M messages. (BTW: this is also the reason why we need NAKACK, although TCP does its own retransmission).
+ JGroups must buffer all messages in memory in case an original sender <emphasis>S</emphasis> dies and a node requests retransmission of a message sent by <emphasis>S</emphasis>. Since all members buffer all messages that they receive, stable messages (messages seen by every node) must sometimes be purged. (The purging process is managed by the <literal>STABLE</literal> protocol. For more information, see <xref linkend="jgroups-other-gc"/>.)
</para>
- <para>
- Now JGroups has to buffer all messages in memory for the case when the original sender S dies and a node asks for retransmission of a message of S. Because all members buffer all messages they received, they need to purge stable messages (= messages seen by everyone) every now and then. This is done by the STABLE protocol, which can be configured to run the stability protocol round time based (e.g. every 50s) or size based (whenever 400K data has been received).
- </para>
<para>
In the above case, the slow node D will prevent the group from purging messages above 1M, so every member will buffer 99M messages ! This in most cases leads to OOM exceptions. Note that - although the sliding window protocol in TCP will cause writes to block if the window is full - we assume in the above case that this is still much faster for A-B and A-C than for A-D.
</para>
<para>
- So, in summary, we need to send messages at a rate the slowest receiver (D) can handle.
+ So, in summary, even with TCP we need FC to ensure we send messages at a rate the slowest receiver (D) can handle.
</para>
-</section>
+</note>
-<section>
+<note>
<title>So do I always need FC?</title>
<para>
This depends on how the application uses the JGroups channel. Referring to the example above, if there was something about the application that would naturally cause A to slow down its rate of sending because D wasn't keeping up, then FC would not be needed.
</para>
<para>
- A good example of such an application is one that makes synchronous group RPC calls (typically using a JGroups RpcDispatcher.) By synchronous, we mean the thread that makes the call blocks waiting for responses from all the members of the group. In that kind of application, the threads on A that are making calls would block waiting for responses from D, thus naturally slowing the overall rate of calls.
+ A good example of such an application is one that uses JGroups to make synchronous group RPC calls. By synchronous, we mean the thread that makes the call blocks waiting for responses from all the members of the group. In that kind of application, the threads on A that are making calls would block waiting for responses from D, thus naturally slowing the overall rate of calls.
</para>
<para>
A JBoss Cache cluster configured for REPL_SYNC is a good example of an application that makes synchronous group RPC calls. If a channel is only used for a cache configured for REPL_SYNC, we recommend you remove FC from its protocol stack.
@@ -1084,20 +1111,20 @@
</para>
<para>
Another case where FC may not be needed is for a channel used by a JBoss Cache configured for buddy replication and a single buddy. Such a channel will in many respects act like a two node cluster, where messages are only exchanged with one other node, the buddy. (There may be other messages related to data gravitation that go to all members, but in a properly engineered buddy replication use case these should be infrequent. But if you remove FC be sure to load test your application.)
- </para>
+ </para> </note>
</section>
</section>
-
+
<section><title>Fragmentation (FRAG2)</title>
<para>
- This protocol fragments messages larger than certain size. Unfragments at the receiver's side. It works for both unicast and multicast messages. It is configured in the FRAG2 sub-element under the JGroups Config element. Here is an example configuration.
+ This protocol fragments messages that are larger than a certain size, and reassembles them at the receiver's side. It works for both unicast and multicast messages. It is configured with the <literal>FRAG2</literal> sub-element in the JGroups <literal>config</literal> element. Here is an example configuration:
</para>
<programlisting><![CDATA[
- <FRAG2 frag_size="60000" down_thread="false" up_thread="false"/>]]>
+ <FRAG2 frag_size="60000"/>]]>
</programlisting>
<para>
@@ -1105,12 +1132,12 @@
</para>
<itemizedlist>
- <listitem><para><emphasis role="bold">frag_size</emphasis> specifies the max frag size in bytes. Messages larger than that are fragmented.</para></listitem>
+ <listitem><para><emphasis role="bold">frag_size</emphasis> specifies the maximum message size (in bytes) before fragmentation occurs. Messages larger than this size are fragmented. For stacks that use the UDP transport, this value must be lower than 64 kilobytes (the maximum UDP datagram size). For TCP-based stacks, it must be lower than the value of <varname>max_credits</varname> in the FC protocol.</para></listitem>
</itemizedlist>
-<note><title>Note</title>
+<note>
<para>
- TCP protocol already provides fragmentation but a fragmentation JGroups protocol is still needed if FC is used. The reason for this is that if you send a message larger than FC.max_bytes, FC protocol would block. So, frag_size within FRAG2 needs to be set to always be less than FC.max_bytes.
+ The TCP protocol already provides fragmentation, but a JGroups fragmentation protocol is still required if FC is used. The reason for this is that if you send a message larger than <literal>FC.max_credits</literal>, the FC protocol will block forever. So, <literal>frag_size</literal> within FRAG2 must always be set to a value lower than that of <literal>FC.max_credits</literal>.
</para>
</note>
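+<para>
+ A minimal sketch of a consistent pairing, reusing the example values shown earlier in this chapter: <literal>frag_size</literal> (60000 bytes) is well below <literal>max_credits</literal> (2000000 bytes), so no single fragment can exceed the sender's total credits. The same value also stays under the 64 kilobyte UDP datagram limit. The exact values are illustrative and should be tuned for your own stack.
+</para>
+<programlisting><![CDATA[
+<FC max_credits="2000000" min_threshold="0.10"/>
+<FRAG2 frag_size="60000"/>]]>
+</programlisting>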
@@ -1131,30 +1158,27 @@
<section id="jgroups-other-gc">
<title>Distributed Garbage Collection (STABLE)</title>
<para>
- In a JGroups cluster, all nodes have to store all messages received for potential retransmission in case of a failure. However, if we store all messages forever, we will run out of memory. So, the distributed garbage collection service in JGroups periodically purges messages that have seen by all nodes from the memory in each node. The distributed garbage collection service is configured in the <literal>pbcast.STABLE</literal> sub-element under the JGroups <literal>Config</literal> element. Here is an example configuration.
- </para>
+ In a JGroups cluster, all nodes must store all messages received for potential retransmission in case of a failure. However, if we store all messages forever, we will run out of memory. The distributed garbage collection service periodically purges messages that have been seen by all nodes, removing them from the memory in each node. The distributed garbage collection service is configured in the <literal>pbcast.STABLE</literal> sub-element under the JGroups <literal>config</literal> element. Here is an example configuration.
+ </para>
<programlisting>
<pbcast.STABLE stability_delay="1000"
desired_avg_gossip="5000"
- down_thread="false" up_thread="false"
- max_bytes="400000"/>
+ max_bytes="400000"/>
</programlisting>
<para>The configurable attributes in the <literal>pbcast.STABLE</literal> element are as follows.</para>
<itemizedlist>
<listitem>
<para><emphasis role="bold">desired_avg_gossip</emphasis> specifies intervals (in
- milliseconds) of garbage collection runs. Value <literal>0</literal> disables this
- service.</para>
+ milliseconds) of garbage collection runs. Set this to <literal>0</literal> to disable interval-based garbage collection.</para>
</listitem>
<listitem>
<para><emphasis role="bold">max_bytes</emphasis> specifies the maximum number of bytes
- received before the cluster triggers a garbage collection run. Value
- <literal>0</literal> disables this service.</para>
+ received before the cluster triggers a garbage collection run. Set to <literal>0</literal> to disable garbage collection based on the bytes received.</para>
</listitem>
<listitem>
- <para><emphasis role="bold">stability_delay</emphasis> specifies delay before we send STABILITY msg (give others a change to send first). If used together with max_bytes, this attribute should be set to a small number.</para>
+ <para><emphasis role="bold">stability_delay</emphasis> specifies the maximum time period (in milliseconds) of a random delay introduced before a node sends its <literal>STABILITY</literal> message at the end of a garbage collection run. The delay gives other nodes concurrently running a <literal>STABLE</literal> task a chance to send first. If used together with <literal>max_bytes</literal>, this attribute should be set to a small number.</para>
</listitem>
</itemizedlist>
<note>
@@ -1179,54 +1203,50 @@
<itemizedlist>
<listitem>
<para><emphasis role="bold">max_interval</emphasis> specifies the maximum number of
- milliseconds to send out a MERGE message.</para>
+ milliseconds to wait before sending a MERGE message.</para>
</listitem>
<listitem>
<para><emphasis role="bold">min_interval</emphasis> specifies the minimum number of
- milliseconds to send out a MERGE message.</para>
+ milliseconds to wait before sending a MERGE message.</para>
</listitem>
</itemizedlist>
<para>JGroups chooses a random value between <literal>min_interval</literal> and
- <literal>max_interval</literal> to send out the MERGE message.</para>
+ <literal>max_interval</literal> to periodically send the MERGE message.</para>
<note>
<para>
- The cluster states are not merged in a merger. This has to be done by the application. If <literal>MERGE2</literal> is used in conjunction with TCPPING, the <literal>initial_hosts</literal> attribute must contain all the nodes that could potentially be merged back, in order for the merge process to work properly. Otherwise, the merge process would not merge all the nodes even though shunning is disabled. Alternatively use MPING, which is commonly used with TCP to provide multicast member discovery capabilities, instead of TCPPING to avoid having to specify all the nodes.
- </para>
+ The application state maintained by the application using a channel is not merged by JGroups during a merge. This must be done by the application.</para>
+ </note>
+ <note>
+ <para>If <literal>MERGE2</literal> is used in conjunction with <literal>TCPPING</literal>, the <literal>initial_hosts</literal> attribute must contain all the nodes that could potentially be merged back, in order for the merge process to work properly. Otherwise, the merge process may not detect all sub-groups, and may miss those comprised solely of unlisted members.</para>
</note>
</section>
-
-</section>
-
+
<section id="jgroups-other">
<title>Other Configuration Issues</title>
- <section><title>Binding JGroups Channels to a particular interface</title>
+ <section><title>Binding JGroups Channels to a Particular Interface</title>
<para>
In the Transport Protocols section above, we briefly touched on how the interface to which JGroups will bind sockets is configured. Let's get into this topic in more depth:
</para>
<para>
- First, it's important to understand that the value set in any bind_addr element in an XML configuration file will be ignored by JGroups if it finds that system property jgroups.bind_addr (or a deprecated earlier name for the same thing, <literal>bind.address</literal>) has been set. The system property trumps XML. If JBoss Enterprise Application Platform is started with the -b (a.k.a. --host) switch, the Enterprise Application Platform will set <literal>jgroups.bind_addr</literal> to the specified value.
+ First, it is important to understand that the value set in any <literal>bind_addr</literal> element in an XML configuration file will be ignored by JGroups if it finds that the system property <literal>jgroups.bind_addr</literal> (or a deprecated earlier name for the same thing, <literal>bind.address</literal>) has been set. The system property has a higher priority level than the XML property. If JBoss Application Server is started with the <literal>-b</literal> (or <literal>--host</literal>) switch, the application server will set <literal>jgroups.bind_addr</literal> to the specified value. If <literal>-b</literal> is not set, the application server will bind most services to <literal>localhost</literal> by default.
</para>
<para>
- Beginning with Enterprise Application Platform 4.2.0, for security reasons the Enterprise Application Platform will bind most services to localhost if -b is not set. The effect of this is that in most cases users are going to be setting -b and thus jgroups.bind_addr is going to be set and any XML setting will be ignored.
- </para>
- <para>
So, what are <emphasis>best practices</emphasis> for managing how JGroups binds to interfaces?
</para>
<itemizedlist>
<listitem>
<para>
- Binding JGroups to the same interface as other services. Simple, just use -b:
+ Binding JGroups to the same interface as other services. Simple, just use <literal>-b</literal>:</para>
<screen>./run.sh -b 192.168.1.100 -c all</screen>
- </para>
</listitem>
<listitem>
<para>
- Binding services (e.g., JBoss Web) to one interface, but use a different one for JGroups:
+ Binding services (e.g., JBoss Web) to one interface, but use a different one for JGroups:</para>
<screen>./run.sh -b 10.0.0.100 -Djgroups.bind_addr=192.168.1.100 -c all</screen>
-
- Specifically setting the system property overrides the -b value. This is a common usage pattern; put client traffic on one network, with intra-cluster traffic on another.
+ <para>
+ Specifically setting the system property overrides the <literal>-b</literal> value. This is a common usage pattern; put client traffic on one network, with intra-cluster traffic on another.
</para>
</listitem>
<listitem>
@@ -1239,58 +1259,90 @@
</listitem>
<listitem>
<para>
- Binding services (e.g., JBoss Web) to all interfaces, but specify the JGroups interface:
+ Binding services (e.g., JBoss Web) to all interfaces, but specify the JGroups interface:</para>
<screen>./run.sh -b 0.0.0.0 -Djgroups.bind_addr=192.168.1.100 -c all</screen>
-
- Again, specifically setting the system property overrides the -b value.
+ <para>
+ Again, specifically setting the system property overrides the <literal>-b</literal> value.
</para>
</listitem>
<listitem>
<para>
- Using different interfaces for different channels:
+ Using different interfaces for different channels:</para>
<screen>./run.sh -b 10.0.0.100 -Djgroups.ignore.bind_addr=true -c all</screen>
- </para>
</listitem>
</itemizedlist>
<para>
-This setting tells JGroups to ignore the <literal>jgroups.bind_addr</literal> system property, and instead use whatever is specfied in XML. You would need to edit the various XML configuration files to set the <literal>bind_addr</literal> to the desired interfaces.
+This setting tells JGroups to ignore the <literal>jgroups.bind_addr</literal> system property, and instead use whatever is specified in XML. You would need to edit the XML configuration files to set the various <literal>bind_addr</literal> attributes to the desired interfaces.
</para>
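+ <para>
+ As a minimal sketch (the address shown is a placeholder and the remaining attributes are elided), a transport protocol pinned to a particular interface in one of those XML files might look like this:
+ </para>
+<programlisting><![CDATA[<UDP bind_addr="192.168.1.200" ....]]></programlisting>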
</section>
<section id="clustering-jgroups-isolation"><title>Isolating JGroups Channels</title>
<para>
- Within JBoss Enterprise Application Platform, there are a number of services that independently create JGroups channels -- 3 different JBoss Cache services (used for HttpSession replication, EJB3 SFSB replication and EJB3 entity replication) along with the general purpose clustering service called HAPartition that underlies most other JBossHA services.
+ Within JBoss Application Server, there are a number of services that independently create JGroups channels — possibly multiple different JBoss Cache services (used for <literal>HttpSession</literal> replication, EJB3 stateful session bean replication and EJB3 entity replication), two JBoss Messaging channels, and <application>HAPartition</application>, the general purpose clustering service that underlies most other JBossHA services.
</para>
<para>
It is critical that these channels only communicate with their intended peers; not with the channels used by other services and not with channels for the same service opened on machines not meant to be part of the group. Nodes improperly communicating with each other is one of the most common issues users have with JBoss Enterprise Application Platform clustering.
</para>
<para>
- Whom a JGroups channel will communicate with is defined by its group name, multicast address, and multicast port, so isolating JGroups channels comes down to ensuring different channels use different values for the group name, multicast address and multicast port.
+ Whom a JGroups channel will communicate with is defined by its group name and, for UDP-based channels, its multicast address and port. Isolating a JGroups channel means ensuring that different channels use different values for the group name, the multicast address and, in some cases, the multicast port.
</para>
+ <section>
+ <title>Isolating sets of Application Server instances from each other</title>
+
+ <para>
+ This section addresses the issue of having multiple independent clusters running within the same environment. For example, you might have a production cluster, a staging cluster, and a QA cluster, or multiple clusters in a QA test lab or development team environment.
+ </para>
+ <para>
+ To isolate JGroups clusters from other clusters on the network, you must:
+ </para>
+ <itemizedlist>
+ <listitem><para>Make sure the channels in the various clusters use different group names. This can be controlled with the command line arguments used to start JBoss; see <xref linkend="clustering-jgroups-isolation-group-name"/> for more information.</para></listitem>
+ <listitem><para>Make sure the channels in the various clusters use different multicast addresses. This is also easy to control with the command line arguments used to start JBoss<!--; see <xref linkend="clustering-jgroups-isolation-mcast_addr"/> for more information-->.</para></listitem>
+
+ <listitem><para>If you are not running on Linux, Windows, Solaris or HP-UX, you may
+ also need to ensure that the channels in each cluster use different
+ multicast ports. This is more difficult than using different
+ group names, although it can still be controlled from the command line.
+ See <xref linkend="clustering-jgroups-isolation-mcast_port"/>.</para>
+ </listitem>
+ </itemizedlist>
+ </section>
+
+ <section>
+ <title>Isolating Channels for Different Services on the Same Set of AS Instances</title>
+
+ <para>
+ This section addresses the usual case: a cluster of three machines, each of which has, for example, an HAPartition deployed alongside JBoss Cache for web session clustering. The HAPartition channels should not communicate with the JBoss Cache channels. Ensuring proper isolation of these channels is straightforward, and is usually handled by the application server without any alterations on the part of the user.
+ </para>
+ <para>
+ To isolate channels for different services from each other on the same set of application server instances, each channel must have its own group name. The configurations that ship with JBoss Application Server ensure that this is the case. However, if you create a custom service that uses JGroups directly, you must use a unique group name. If you create a custom JBoss Cache configuration, ensure that you provide a unique value in the <literal>clusterName</literal> configuration property.
+ </para>
+
<para>
- To isolate JGroups channels for different services on the same set of Enterprise Application Platform instances from each other, you MUST change the group name and the multicast port. In other words, each channel must have its own set of values.
+ In releases prior to JBoss Application Server 5, different channels running in the same application server also had to use unique multicast ports.
+ With the JGroups shared transport introduced in JBoss AS 5 (see
+ <xref linkend="clustering-blocks-jgroups-sharedtransport"/>), it is
+ now common for multiple channels to use the same transport protocol and its sockets.
+ This makes configuration easier, which is one of the main benefits of the shared
+ transport. However, if you decide to create your own custom JGroups protocol
+ stack configuration, be sure to configure its transport protocols with a multicast port
+ that is different from the ports used in other protocol stacks.
</para>
- <para>
- For example, say we have a production cluster of 3 machines, each of which has an HAPartition deployed along with a JBoss Cache used for web session clustering. The HAPartition channels should not communicate with the JBoss Cache channels. They should use a different group name and multicast port. They can use the same multicast address, although they don't need to.
- </para>
- <para>
- To isolate JGroups channels for the same service from other instances of the service on the network, you MUST change ALL three values. Each channel must have its own group name, multicast address, and multicast port.
- </para>
- <para>
- For example, say we have a production cluster of 3 machines, each of which has an HAPartition deployed. On the same network there is also a QA cluster of 3 machines, which also has an HAPartition deployed. The HAPartition group name, multicast address, and multicast port for the production machines must be different from those used on the QA machines.
- </para>
- <section><title>Changing the Group Name</title>
+ <section id="clustering-jgroups-isolation-group-name"><title>Changing the Group Name</title>
<para>
- The group name for a JGroups channel is configured via the service that starts the channel. Unfortunately, different services use different attribute names for configuring this. For HAPartition and related services configured in the deploy/cluster-service.xml file, this is configured via a PartitionName attribute. For JBoss Cache services, the name of the attribute is ClusterName.
- </para>
- <para>
- The HAPartition and all the standard JBoss Cache services, make it easy for you to create unique groups names simply by using the -g (a.k.a. –partition) switch when starting JBoss:
+ The group name for a JGroups channel is configured via the service that
+ starts the channel. For all the standard clustered services, we make it easy
+ for you to create unique group names by simply using the <literal>-g</literal> (or <literal>--partition</literal>) switch when starting JBoss:</para>
<screen>./run.sh -g QAPartition -b 192.168.1.100 -c all</screen>
- This switch sets the jboss.partition.name system property, which is used as a component in the configuration of the group name in all the standard clustering configuration files. For example,
-<screen><![CDATA[<attribute name="ClusterName">Tomcat-${jboss.partition.name:Cluster}</attribute>]]></screen>
+ <para>This switch sets the <literal>jboss.partition.name</literal> system property,
+ which is used as a component in the configuration of the group name in
+ all the standard clustering configuration files. For example,
+<programlisting><![CDATA[<property name="clusterName">${jboss.partition.name:DefaultPartition}-SFSBCache</property>]]></programlisting>
</para>
</section>
@@ -1298,34 +1350,123 @@
<section><title>Changing the multicast address and port</title>
<para>
- The -u (a.k.a. --udp) command line switch may be used to control the multicast address used by the JGroups channels opened by all standard Enterprise Application Platform services.
+ The <literal>-u</literal> (or <literal>--udp</literal>) command line switch may be used to control the multicast address used by the JGroups channels opened by all standard AS services.
<screen><![CDATA[/run.sh -u 230.1.2.3 -g QAPartition -b 192.168.1.100 -c all]]></screen>
-This switch sets the jboss.partition.udpGroup system property, which you can see referenced in all of the standard protocol stack configs in JBoss Enterprise Application Platform:
+This switch sets the <literal>jboss.partition.udpGroup</literal> system property, which is referenced in all of the standard protocol stack configurations in JBoss AS:
</para>
-<programlisting><![CDATA[<Config>
-<UDP mcast_addr="${jboss.partition.udpGroup:228.1.2.3}"
- ....]]>
-</programlisting>
-<para>
- Unfortunately, setting the multicast ports is not so simple. As described above, by default there are four separate JGroups channels in the standard JBoss Enterprise Application Platform all configuration, and each should be given a unique port. There are no command line switches to set these, but the standard configuration files do use system properties to set them. So, they can be configured from the command line by using -D. For example,
- </para>
+<programlisting><![CDATA[<UDP mcast_addr="${jboss.partition.udpGroup:228.1.2.3}" ....]]></programlisting>
+
+ <note>
+ <title>Why is changing the group name insufficient?</title>
+ <para>
+ If channels with different group names share the same multicast address and port, the lower level JGroups protocols in each channel will see, process and eventually discard messages intended for the other group. This will at a minimum hurt performance and can lead to anomalous behavior.
+ </para>
+ </note>
+ </section>
+
+ <section id="clustering-jgroups-isolation-mcast_port">
+ <title>Changing the Multicast Port</title>
+ <para>
+ On some operating systems (Mac OS X for example), using different
+ <literal>-g</literal> and <literal>-u</literal> values is not sufficient
+ to isolate clusters; the channels running in the different clusters
+ must also use different multicast ports. Unfortunately, setting the
+ multicast ports is not as simple as <literal>-g</literal> and
+ <literal>-u</literal>. By default, a JBoss AS instance
+ running the <literal>all</literal> configuration will use up to two different instances of
+ the JGroups UDP transport protocol, and will therefore open two
+ multicast sockets. You can control the ports those sockets use
+ by using system properties on the command line. For example,
+ </para>
<programlisting>
- /run.sh -u 230.1.2.3 -g QAPartition -Djboss.hapartition.mcast_port=12345 -Djboss.webpartition.mcast_port=23456 -Djboss.ejb3entitypartition.mcast_port=34567 -Djboss.ejb3sfsbpartition.mcast_port=45678 -b 192.168.1.100 -c all
+./run.sh -u 230.1.2.3 -g QAPartition -b 192.168.1.100 -c all \
+ -Djboss.jgroups.udp.mcast_port=12345 -Djboss.messaging.datachanneludpport=23456
</programlisting>
-
-<para><emphasis>Why isn't it sufficient to change the group name?</emphasis></para>
-<para>
- If channels with different group names share the same multicast address and port, the lower level JGroups protocols in each channel will see, process and eventually discard messages intended for the other group. This will at a minimum hurt performance and can lead to anomalous behavior.
- </para>
-
- <para><emphasis>Why do I need to change the multicast port if I change the address?</emphasis></para>
+
+ <para>The <literal>jboss.messaging.datachanneludpport</literal> property controls
+ the multicast port used by the <literal>MPING</literal> protocol in JBoss Messaging's <literal>DATA</literal> channel.
+ The <literal>jboss.jgroups.udp.mcast_port</literal> property controls the
+ multicast port used by the UDP transport protocol shared by all other clustered services.</para>
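+ <para>As an illustration of how these properties are consumed, a transport definition in a protocol stack typically references the property with a fallback default. The fragment below is a sketch only; the default port shown is an assumption, so check the shipped stacks file for the actual value:</para>
+<programlisting><![CDATA[<UDP mcast_port="${jboss.jgroups.udp.mcast_port:45688}" ....]]></programlisting>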
+
+ <para>The set of JGroups protocol stack configurations included in the
+ <literal>$JBOSS_HOME/server/all/cluster/jgroups-channelfactory.sar/META-INF/jgroups-channelfactory-stacks.xml</literal>
+ file includes a number of other example protocol stack configurations that
+ the standard JBoss AS distribution doesn't actually use. Those configurations also
+ use system properties to set any multicast ports. So, if you reconfigure some
+ AS service to use one of those protocol stack configurations, use the
+ appropriate system property to control the port from the command line.
+ </para>
+ <note><title>Why do I need to change the multicast port if I change the address?</title>
<para>
- It should be sufficient to just change the address, but there is a problem on several operating systems whereby packets addressed to a particular multicast port are delivered to all listeners on that port, regardless of the multicast address they are listening on. So the recommendation is to change both the address and the port.
+ It should be sufficient to just change the address, but unfortunately the
+ handling of multicast sockets is one area where the JVM fails to hide
+ operating system behavior differences from the application. The <literal>java.net.MulticastSocket</literal>
+ class provides different overloaded constructors. On some operating
+ systems, if you use one constructor variant, packets addressed to a particular multicast port are delivered to all
+ listeners on that port, regardless of the multicast address on which they are
+ listening. We refer to this as the <emphasis>promiscuous traffic</emphasis> problem.
+ On most operating systems that exhibit the promiscuous traffic problem
+ (Linux, Solaris and HP-UX) JGroups can use a different constructor
+ variant that avoids the problem. However, on some operating systems with the
+ promiscuous traffic problem (Mac OS X), multicast does not work
+ properly if the other constructor variant is used. So, on these
+ operating systems the recommendation is to configure different
+ multicast ports for different clusters.
</para>
-</section>
+ </note>
</section>
</section>
+
+ <section id="jgroups-perf-udpbuffer">
+ <title>Improving UDP Performance by Configuring OS UDP Buffer Limits</title>
+ <para>By default, the JGroups channels in JBoss Enterprise Application Platform use the UDP transport protocol to take advantage of IP multicast. However, one disadvantage
+ of UDP is it does not come with the reliable delivery guarantees
+ provided by TCP. The protocols discussed in
+ <xref linkend="jgroups-reliable"/> allow JGroups to guarantee delivery of
+ UDP messages, but those protocols are implemented in Java, not at the
+ operating system network layer. For peak performance from a UDP-based JGroups
+ channel it is important to limit the need for JGroups to retransmit messages
+ by limiting UDP datagram loss.</para>
+
+ <para>One of the most common causes of lost UDP datagrams is an undersized receive
+ buffer on the socket. The UDP protocol's <literal>mcast_recv_buf_size</literal>
+ and <literal>ucast_recv_buf_size</literal> configuration attributes
+ are used to specify the amount of receive buffer JGroups <emphasis>requests</emphasis>
+ from the operating system, but the actual size of the buffer the operating system provides is limited by operating system-level maximums. These maximums are often very low:</para>
+
+ <table frame="topbot">
+ <title>Default Max UDP Buffer Sizes</title>
+ <tgroup cols="2">
+ <thead><row><entry>Operating System</entry><entry>Default Max UDP Buffer (in bytes)</entry></row></thead>
+ <tbody>
+ <row><entry>Linux</entry><entry>131071</entry></row>
+ <row><entry>Windows</entry><entry>No known limit</entry></row>
+ <row><entry>Solaris</entry><entry>262144</entry></row>
+ <row><entry>FreeBSD, Darwin</entry><entry>262144</entry></row>
+ <row><entry>AIX</entry><entry>1048576</entry></row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ <para>The command used to increase the above limits is operating system-specific. The table
+ below shows the command required to increase the maximum buffer to 25 megabytes.
+ In all cases, root privileges are required:</para>
+
+ <table frame="topbot">
+ <title>Commands to Change Max UDP Buffer Sizes</title>
+ <tgroup cols="2">
+ <thead><row><entry>Operating System</entry><entry>Command</entry></row></thead>
+ <tbody>
+ <row><entry>Linux</entry><entry><literal>sysctl -w net.core.rmem_max=26214400</literal></entry></row>
+ <row><entry>Solaris</entry><entry><literal>ndd -set /dev/udp udp_max_buf 26214400</literal></entry></row>
+ <row><entry>FreeBSD, Darwin</entry><entry><literal>sysctl -w kern.ipc.maxsockbuf=26214400</literal></entry></row>
+ <row><entry>AIX</entry><entry><literal>no -o sb_max=8388608</literal> (AIX will only allow 1 megabyte, 4 megabytes or 8 megabytes).</entry></row>
+ </tbody>
+ </tgroup>
+ </table>
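+ <para>Raising the operating system limit does not by itself change what JGroups requests. As a sketch with placeholder values (chosen here only to stay under the 25 megabyte limit above), the requested sizes are set on the UDP protocol element:</para>
+<programlisting><![CDATA[<UDP mcast_recv_buf_size="25000000" ucast_recv_buf_size="20000000" ....]]></programlisting>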
+ </section>
+ </section>
<section><title>JGroups Troubleshooting</title>
<section>
@@ -1386,5 +1527,7 @@
</section>
</section>
+</section>
+
</chapter>
Modified: projects/docs/enterprise/5.0/Administration_And_Configuration_Guide/en-US/Feedback.xml
===================================================================
--- projects/docs/enterprise/5.0/Administration_And_Configuration_Guide/en-US/Feedback.xml 2009-10-12 23:30:35 UTC (rev 94703)
+++ projects/docs/enterprise/5.0/Administration_And_Configuration_Guide/en-US/Feedback.xml 2009-10-12 23:56:55 UTC (rev 94704)
@@ -20,7 +20,7 @@
<para>The directory structure includes other languages the book will be translated in. For English please edit the files under <emphasis>en-US</emphasis>.
</para>
- <para>To identify the filename you wish to edit, please check the chapter title which will match the file's name. The files are written in Docbook xml. After saving your changes please validate the files you've edited for error's before committing your changes.
+ <para>To identify the filename you wish to edit, please check the chapter title, which will match the file's name. The files are written in DocBook XML. After saving your changes, please validate the files you've edited for errors before committing your changes.
</para>
-</section>
\ No newline at end of file
+</section>
Modified: projects/docs/enterprise/5.0/Administration_And_Configuration_Guide/en-US/Introduction.xml
===================================================================
--- projects/docs/enterprise/5.0/Administration_And_Configuration_Guide/en-US/Introduction.xml 2009-10-12 23:30:35 UTC (rev 94703)
+++ projects/docs/enterprise/5.0/Administration_And_Configuration_Guide/en-US/Introduction.xml 2009-10-12 23:56:55 UTC (rev 94704)
@@ -106,17 +106,17 @@
</listitem>
<listitem>
<para>
- JBoss Messaging is a high performance JMS provider in the JBoss Enterprise Middleware Stack (JEMS), included with JBoss Enterprise Application Platform 5 as the default messaging provider. It is also the backbone of the JBoss ESB infrastructure. JBoss Messaging is a complete rewrite of JBossMQ, which is the default JMS provider for the JBoss Enterprise Application Platform 4.x series.
+ JBoss Messaging is a high performance JMS provider in the JBoss Enterprise Middleware Stack (JEMS), included with JBoss Enterprise Application Platform 5 as the default messaging provider. It is also the backbone of the JBoss ESB infrastructure. JBoss Messaging is a complete rewrite of JBossMQ, which is the default JMS provider for JBoss Enterprise Application Platform 4.2.
</para>
</listitem>
<listitem>
<para>
- JBossCache 2.0 that comes in two flavors. A traditional tree-structured node-based cache and a PojoCache, an in-memory, transactional, and replicated cache system that allows users to operate on simple POJOs transparently without active user management of either replication or persistency aspects.
+ JBoss Cache comes in two flavors: a traditional tree-structured node-based cache, and a PojoCache, an in-memory, transactional, and replicated cache system that allows users to operate on simple POJOs transparently without active user management of either replication or persistency aspects.
</para>
</listitem>
<listitem>
<para>
- JBossWS 2 is the web services stack for JBoss Enterprise Application Platform 5 providing Java EE compatible web services, JAXWS-2.0.
+ JBossWS 2 is the web services stack for JBoss Enterprise Application Platform 5 providing Java EE compatible web services, JAXWS-2.x.
</para>
</listitem>
<listitem>
@@ -130,16 +130,16 @@
</para>
</listitem>
</itemizedlist>
- <para>
+ <!--<para>
JBoss Enterprise Application Platform 5 includes numerous features and bug fixes, many of them carried over from the JBoss Enterprise Application Platform 4.x codebase. See the Detailed Release Notes section for the full details.
-</para>
+</para>-->
<section id="JBossAS_Use_Cases">
<title>JBoss Enterprise Application Platform Use Cases</title>
<itemizedlist>
<listitem>
<para>
- 99% of web apps involve a database
+ 99% of web applications involving a database
</para>
</listitem>
<listitem>
@@ -149,7 +149,7 @@
</listitem>
<listitem>
<para>
- Simple web applications with JSPs/Servlets upgrades to JBoss Enterprise Application Platform with tomcat embedded.
+ Simple web applications with JSPs/Servlets upgrades to JBoss Enterprise Application Platform with Tomcat Embedded.
</para>
</listitem>
<listitem>
Modified: projects/docs/enterprise/5.0/Administration_And_Configuration_Guide/en-US/Microcontainer.xml
===================================================================
--- projects/docs/enterprise/5.0/Administration_And_Configuration_Guide/en-US/Microcontainer.xml 2009-10-12 23:30:35 UTC (rev 94703)
+++ projects/docs/enterprise/5.0/Administration_And_Configuration_Guide/en-US/Microcontainer.xml 2009-10-12 23:56:55 UTC (rev 94704)
@@ -26,6 +26,8 @@
<para>
This section introduces the various Microcontainer modules. The figure below gives an overview of the modules.
+
+</para>
<inlinemediaobject>
<imageobject>
@@ -33,83 +35,70 @@
</imageobject>
</inlinemediaobject>
- <itemizedlist>
- <listitem>
- <para>
- <literal>aop-mc-int</literal> handles integration between the JBossAOP and Microcontainer projects
- </para>
- </listitem>
- <listitem>
- <para>
- <literal>classloader</literal> new peer classloader model, prepared to handle OSGi bundle model.
- </para>
- </listitem>
- <!-- <listitem>
- <para>
- The <literal>container</literal> module contains: reflection, the integration point for manipulating class information at runtime (for example, overriding annotations or obtaining an aop instance advisor), joinpoint (the joinpoint model including the join point factory), classadaptor (the integration and configuration spi) and metadata (the base metadata types and repository).
- </para>
- </listitem> -->
- <listitem>
- <para>
- <literal>dependency</literal> management is handled by the controller. The controller is the core component for keeping track of contexts to make sure the configuration and lifecycle are done in the correct order including dependencies and classloading considerations.
- </para>
- </listitem>
- <listitem>
- <para>
- <literal>deployers</literal> load components from various models (for example, POJOs, JMX, spring, Java EE) into the Microcontainer runtime.
- </para>
- </listitem>
- <listitem>
- <para>
- <literal>kernel</literal> defines the core kernel spi including, boostrap, configuration, POJO deployments, dependency, events, bean metadata, and bean registry.
- </para>
- </listitem>
- <listitem>
- <para>
- The <literal>managed</literal> and <literal>metatype</literal> modules define the base objects defining the management view of a component.
- </para>
- </listitem>
- <listitem>
- <para>
- <literal>guice-int</literal> contains the integration classes for Guice.
- </para>
- </listitem>
- <listitem>
- <para>
- <literal>osgi-int</literal> contains the integration classes that adapt the OSGi model onto the Microcontainer.
- </para>
- </listitem>
- <!-- <listitem>
- <para><literal>reliance-identity</literal> defines identity as a MC POJO service</para>
- </listitem> -->
- <listitem>
- <para><literal>reliance-rules</literal> defines your dependencies with Drools</para>
- </listitem>
- <!-- <listitem>
- <para><literal>reliance-jbpm</literal> defines your dependencies with jBPM</para>
- </listitem> -->
- <listitem>
- <para>
- <literal>reflect</literal> is the integration point for manipulating class information at runtime.
- </para>
- </listitem>
- <listitem>
- <para>
- <literal>mdr</literal> is the generic metadata repository. It handles scoped metadata lookups.
- </para>
- </listitem>
- <listitem>
- <para>
- <literal>vfs</literal> represents Virtual File System. It's an abstraction layer to identify known file system issues in a single module.
- </para>
- </listitem>
- <listitem>
- <para>
- <literal>spring-int</literal> contains the integration classes that adapt the spring model onto the Microcontainer.
- </para>
- </listitem>
- </itemizedlist>
- </para>
+<itemizedlist>
+ <!--<listitem>
+ <para>
+ <literal>Tools</literal> has no description! :D
+ </para>
+ </listitem>-->
+ <listitem>
+ <para>
+ <literal>Integr.</literal> represents the <literal>aop-mc-int</literal> module, which handles integration between JBoss AOP and JBoss Microcontainer.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>OSGi</literal> represents several integration classes that adapt the OSGi model onto the Microcontainer.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>Reliance</literal> represents two modules: <literal>Drools-int</literal> and <literal>jBPM-int</literal>. These modules define Drools dependencies.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>Deployers</literal> load components from various models (such as POJOs, JMX, Spring, Java EE) into the Microcontainer at runtime.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>CL</literal> represents the <emphasis>Classloader</emphasis>, a new peer classloader model that is prepared to handle the OSGi bundle model.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>VFS</literal> represents the <emphasis>Virtual File System</emphasis>. This is an abstraction layer used to identify known file system issues within a single module.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>Kernel</literal> defines the core kernel SPI, including bootstrap, configuration, POJO deployments, dependency, events, bean metadata and bean registry. It contains the following modules:
+ </para>
+ <itemizedlist>
+ <listitem><para><literal>Dependency</literal></para></listitem>
+ <listitem><para><literal>Kernel</literal></para></listitem>
+ <listitem><para><literal>AOP-MC-int</literal></para></listitem>
+ <listitem><para><literal>Spring-int</literal></para></listitem>
+ <listitem><para><literal>Guice-int</literal></para></listitem>
+ </itemizedlist>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>Managed</literal> represents two modules: <literal>managed</literal> and <literal>metatype</literal>. These modules provide the base objects that define the management view of a component.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>MDR</literal> is the generic <emphasis>Metadata Repository</emphasis>. It handles scoped metadata lookups.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>Reflect</literal> is the integration point for manipulating class information at runtime.
+ </para>
+ </listitem>
+</itemizedlist>
</section>
<section><title>Configuration</title>
Modified: projects/docs/enterprise/5.0/Administration_And_Configuration_Guide/en-US/Virtual_Deployment_Framework.xml
===================================================================
--- projects/docs/enterprise/5.0/Administration_And_Configuration_Guide/en-US/Virtual_Deployment_Framework.xml 2009-10-12 23:30:35 UTC (rev 94703)
+++ projects/docs/enterprise/5.0/Administration_And_Configuration_Guide/en-US/Virtual_Deployment_Framework.xml 2009-10-12 23:56:55 UTC (rev 94704)
@@ -13,7 +13,7 @@
<title>The JBoss5 Deployment Framework Classes</title>
<mediaobject>
<imageobject>
- <imagedata align="center" fileref="images/vdf.png" />
+ <imagedata scalefit="1" align="center" fileref="images/vdf.png" />
</imageobject>
</mediaobject>
</figure>
@@ -84,7 +84,6 @@
</itemizedlist>
</listitem>
</itemizedlist>
- .
<itemizedlist>
<listitem>