[jboss-cvs] JBossCache/docs/JBossCache-UserGuide/en/modules ...
Manik Surtani
msurtani at jboss.com
Tue Jan 23 18:40:16 EST 2007
User: msurtani
Date: 07/01/23 18:40:16
Modified: docs/JBossCache-UserGuide/en/modules architecture.xml
replication.xml
Removed: docs/JBossCache-UserGuide/en/modules state_transfer.xml
Log:
Chapter 7
Revision Changes Path
1.4 +30 -0 JBossCache/docs/JBossCache-UserGuide/en/modules/architecture.xml
(In the diff below, changes in quantity of whitespace are not shown.)
Index: architecture.xml
===================================================================
RCS file: /cvsroot/jboss/JBossCache/docs/JBossCache-UserGuide/en/modules/architecture.xml,v
retrieving revision 1.3
retrieving revision 1.4
diff -u -b -r1.3 -r1.4
--- architecture.xml 23 Jan 2007 19:20:32 -0000 1.3
+++ architecture.xml 23 Jan 2007 23:40:15 -0000 1.4
@@ -254,6 +254,14 @@
In addition, reference counting is done to reduce duplication of writing certain objects multiple times, to help
keep the streams small and efficient.
</para>
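The reference counting described above is internal to the JBoss Cache marshaller, but the same space-saving idea exists in standard Java serialization, which writes a compact back-reference (a handle) instead of re-serializing an object it has already written to the stream. A minimal, self-contained illustration:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class RefSharingDemo {

    // Writes the same object twice into one stream. The second write emits
    // only a back-reference, so on deserialization both reads yield the
    // very same instance and the stream stays small.
    public static boolean sameInstanceAfterRoundTrip() {
        try {
            String payload = new String("region-data");
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(payload);
                oos.writeObject(payload); // back-reference, not a second copy
            }
            try (ObjectInputStream ois = new ObjectInputStream(
                    new ByteArrayInputStream(bos.toByteArray()))) {
                Object first = ois.readObject();
                Object second = ois.readObject();
                return first == second; // identity restored from the handle
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameInstanceAfterRoundTrip()); // true
    }
}
```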
+ <para>
+ Also, if <literal>UseRegionBasedMarshalling</literal> is enabled (disabled by default) the marshaller adds region
+ information to the stream before writing any data. This region information is in the form of a <literal>String</literal>
+ representation of an <literal>Fqn</literal>. When unmarshalling, the <literal>RegionManager</literal> can be used to
+ find the relevant <literal>Region</literal>, and use a region-specific <literal>ClassLoader</literal> to unmarshall
+ the stream. This is particularly useful when clustering state in application servers, where each deployment has
+ its own <literal>ClassLoader</literal>. See the section below on <link linkend="architecture.regions">regions</link> for more information.
+ </para>
<section>
<title>JBoss Serialization</title>
<para>
@@ -267,4 +275,26 @@
</section>
</section>
+ <section id="architecture.regions">
+ <title>Class Loading and Regions</title>
+ <para>
+ When used to cluster the state of application servers, applications deployed in the application server tend to put instances
+ of objects specific to their application in the cache (or in an <literal>HttpSession</literal> object) which
+ would require replication. It is common for application servers to assign separate <literal>ClassLoader</literal>
+ instances to each application deployed, but have JBoss Cache libraries referenced by the application server's
+ <literal>ClassLoader</literal>.
+ </para>
+ <para>
+ To enable us to successfully marshall and unmarshall objects from such class loaders, we use a concept called
+ regions. A region is a portion of the cache which shares a common class loader (a region also has other uses -
+ see <link linkend="eviction_policies">eviction policies</link>).
+ </para>
+ <para>
+ A region is created using the <literal>Cache.getRegion(Fqn fqn, boolean createIfNotExists)</literal> method,
+ which returns an implementation of the <literal>Region</literal> interface. Once a region is obtained, a
+ class loader for the region can be set or unset, and the region can be activated/deactivated. By default, regions
+ are active unless the <literal>InactiveOnStartup</literal> configuration attribute is set to <literal>true</literal>.
+ </para>
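Both attributes mentioned here are plain cache configuration attributes; a minimal fragment might look like the following (the enclosing service descriptor is elided and illustrative):

```xml
<!-- Illustrative fragment: enable region-based marshalling and keep
     regions inactive until the application activates them -->
<attribute name="UseRegionBasedMarshalling">true</attribute>
<attribute name="InactiveOnStartup">true</attribute>
```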
+ </section>
+
</chapter>
1.2 +295 -104 JBossCache/docs/JBossCache-UserGuide/en/modules/replication.xml
(In the diff below, changes in quantity of whitespace are not shown.)
Index: replication.xml
===================================================================
RCS file: /cvsroot/jboss/JBossCache/docs/JBossCache-UserGuide/en/modules/replication.xml,v
retrieving revision 1.1
retrieving revision 1.2
diff -u -b -r1.1 -r1.2
--- replication.xml 19 Jan 2007 17:01:00 -0000 1.1
+++ replication.xml 23 Jan 2007 23:40:15 -0000 1.2
@@ -1,83 +1,72 @@
-<chapter id="replication">
- <title>Clustered Caches</title>
-
- <para>The
- <literal>TreeCache</literal>
- can be configured to be either local (standalone) or clustered. If
+<chapter id="clustering">
+ <title>Clustering</title>
+ <para>This chapter covers the clustering aspects of JBoss Cache.</para>
+ <section>
+ <title>Cache Replication Modes</title>
+ <para>
+ JBoss Cache can be configured to be either local (standalone) or clustered. If
in a cluster, the cache can be configured to replicate changes, or to
invalidate changes. A detailed discussion on this follows.
</para>
-
<section>
- <title>Local Cache</title>
-
+ <title>LOCAL</title>
<para>Local caches don't join a cluster and don't communicate with other
- nodes in a cluster. Therefore their elements don't need to be
+ caches in a cluster. Therefore their elements don't need to be
serializable - however, we recommend making them serializable, enabling a
- user to change the cache mode at any time.
- </para>
+ user to change the cache mode at any time. The dependency on the JGroups
+ library is still there, although a JGroups channel is not started.</para>
</section>
-
<section>
- <title>Clustered Cache - Using Replication</title>
-
+ <title>Replicated Caches</title>
<para>Replicated caches replicate all changes to the
- other
- <literal>TreeCache</literal>
- instances in the cluster. Replication can either happen
+ other cache instances in the cluster. Replication can either happen
after each modification (no transactions), or at the end of a
transaction (commit time).
</para>
<para>Replication can be synchronous or asynchronous. Use of either one
of the options is application dependent. Synchronous replication blocks
- the caller (e.g. on a put()) until the modifications have been
+ the caller (e.g. on a <literal>put()</literal>) until the modifications have been
replicated successfully to all nodes in a cluster. Asynchronous
- replication performs replication in the background (the put() returns
+ replication performs replication in the background (the <literal>put()</literal> returns
immediately).
- <literal>TreeCache</literal>
+ JBoss Cache
also offers a replication queue, where
modifications are replicated periodically (i.e. interval-based), or when
the queue size exceeds a number of elements, or a combination
thereof.
</para>
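As a sketch, an asynchronous cache with a replication queue flushed every 100 ms, or whenever 500 modifications have queued up, could be configured along these lines (attribute names assumed from the standard cache configuration; the surrounding service descriptor is elided):

```xml
<attribute name="CacheMode">REPL_ASYNC</attribute>
<attribute name="UseReplQueue">true</attribute>
<!-- flush interval in milliseconds -->
<attribute name="ReplQueueInterval">100</attribute>
<!-- flush early once this many modifications are queued -->
<attribute name="ReplQueueMaxElements">500</attribute>
```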
- <para>Asynchronous replication is faster (no caller blocking), because
+ <para>
+ Asynchronous replication is faster (no caller blocking), because
synchronous replication requires acknowledgments from all nodes in a
cluster that they received and applied the modification successfully
(round-trip time). However, when a synchronous replication returns
successfully, the caller knows for sure that all modifications have been
- applied at all nodes, whereas this may or may not be the case with
applied at all nodes, whereas this is not guaranteed with
asynchronous replication. With asynchronous replication, errors are
- simply written to a log. Even when using transactions, a transaction may succeed but replication may not
- succeed on all
- <literal>TreeCache</literal>
- instances.
+ simply written to a log. Even when using transactions, a transaction may
+ succeed but replication may not succeed on all cache instances.
</para>
<section>
<title>Replicated Caches and Transactions</title>
<para>
When using transactions, replication only occurs at the transaction boundary - i.e., when a transaction
- commits.
- This results in minimising replication traffic since a single modification os broadcast rather than a series
- of
- individual modifications, and can be a lot more efficient than not using transactions. Another effect of
- this
- is that if a transaction were to roll back, nothing is broadcast across a cluster.
+ commits. This results in minimising replication traffic, since a single modification is broadcast rather
+ than a series of individual modifications, and can be a lot more efficient than not using transactions.
+ Another effect of this is that if a transaction were to roll back, nothing is broadcast across a cluster.
</para>
<para>
Depending on whether you are running your cluster in asynchronous or synchronous mode, JBoss Cache will use
- either a single phase or
- <ulink url="http://en.wikipedia.org/wiki/Two-phase_commit_protocol">two phase commit</ulink>
+ either a single phase or <ulink url="http://en.wikipedia.org/wiki/Two-phase_commit_protocol">two phase commit</ulink>
protocol, respectively.
</para>
<section>
<title>One Phase Commits</title>
<para>
Used when your cache mode is REPL_ASYNC. All modifications are replicated in a single call, which
- instructs
- remote caches to apply the changes to their local in-memory state and commit locally. Remote
+ instructs remote caches to apply the changes to their local in-memory state and commit locally. Remote
errors/rollbacks are never fed back to the originator of the transaction since the communication is
asynchronous.
</para>
@@ -86,32 +75,24 @@
<title>Two Phase Commits</title>
<para>
Used when your cache mode is REPL_SYNC. Upon committing your transaction, JBoss Cache broadcasts a
- prepare call,
- which carries all modifications relevant to the transaction. Remote caches then acquire local locks on
- their
- im-memory state and apply the modifications. Once all remote caches respond to the prepare call, the
- originator of the transaction broadcasts a commit. This instructs all remote caches to commit their data.
- If any of the caches fail to respond to the prepare phase, the originator broadcasts a rollback.
+ prepare call, which carries all modifications relevant to the transaction. Remote caches then acquire
+ local locks on their in-memory state and apply the modifications. Once all remote caches respond to the
+ prepare call, the originator of the transaction broadcasts a commit. This instructs all remote caches to
+ commit their data. If any of the caches fail to respond to the prepare phase, the originator broadcasts
+ a rollback.
</para>
<para>
Note that although the prepare phase is synchronous, the commit and rollback phases are asynchronous.
- This
- is because
+ This is because
<ulink url="http://java.sun.com/products/jta/">Sun's JTA specification</ulink>
does not specify how transactional resources should deal with failures
at this stage of a transaction; and other resources participating in the transaction may have
- indeterminate
- state anyway. As such, we do away with the overhead of synchronous communication for this phase of the
- transaction. That said, they can be forced to be synchronous using the
- <literal>SyncCommitPhase</literal>
- and
- <literal>SyncRollbackPhase</literal>
- configuration options.
+ indeterminate state anyway. As such, we do away with the overhead of synchronous communication
+ for this phase of the transaction. That said, they can be forced to be synchronous using the
+ <literal>SyncCommitPhase</literal> and <literal>SyncRollbackPhase</literal> configuration attributes.
</para>
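For example, a synchronous cache that also forces the commit and rollback phases to be synchronous would carry the following attributes (fragment only; other configuration is elided):

```xml
<attribute name="CacheMode">REPL_SYNC</attribute>
<attribute name="SyncCommitPhase">true</attribute>
<attribute name="SyncRollbackPhase">true</attribute>
```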
</section>
-
</section>
-
<section>
<title>Buddy Replication</title>
<para>
@@ -123,31 +104,32 @@
<para>
One of the most common use cases of Buddy Replication is when a replicated cache is used by a servlet
container to store HTTP session data. One of the pre-requisites to buddy replication working well and being
- a real benefit is the use of
- <emphasis>session affinity</emphasis>
- , also known as
- <emphasis>sticky sessions</emphasis>
- in HTTP session replication speak. What this means is that if certain data is frequently accessed, it is
- desirable that this is always accessed on one instance rather than in a round-robin fashion as this helps
- the cache cluster optimise how it chooses buddies, where it stores data, and minimises replication traffic.
+ a real benefit is the use of <emphasis>session affinity</emphasis>, more casually known as
+ <emphasis>sticky sessions</emphasis> in HTTP session replication speak. What this means is that if
+ certain data is frequently accessed, it is desirable that this is always accessed on one instance rather
+ than in a round-robin fashion as this helps the cache cluster optimise how it chooses buddies, where it
+ stores data, and minimises replication traffic.
</para>
<para>
If this is not possible, Buddy Replication may prove to be more of an overhead than a benefit.
</para>
<section>
<title>Selecting Buddies</title>
+ <figure>
+ <title>BuddyLocator</title>
+
+ <mediaobject>
+ <imageobject>
+ <imagedata fileref="images/BuddyReplication.png"/>
+ </imageobject>
+ </mediaobject>
+ </figure>
<para>
- Buddy Replication uses an instance of a
- <literal>org.jboss.cache.buddyreplication.BuddyLocator</literal>
+ Buddy Replication uses an instance of a <literal>BuddyLocator</literal>
which contains the logic used to select buddies in a network. JBoss Cache currently ships with a single
- implementation,
- <literal>org.jboss.cache.buddyreplication.NextMemberBuddyLocator</literal>
- , which is used
- as a default if no implementation is provided. The
- <literal>NextMemberBuddyLocator</literal>
- selects the
- next member in the cluster, as the name suggests, and guarantees an even spread of buddies for each
- instance.
+ implementation, <literal>NextMemberBuddyLocator</literal>, which is used as a default if no implementation
+ is provided. The <literal>NextMemberBuddyLocator</literal> selects the next member in the cluster, as
+ the name suggests, and guarantees an even spread of buddies for each instance.
</para>
<para>
The
@@ -174,17 +156,15 @@
<section>
<title>BuddyPools</title>
<para>
- Also known as replication groups, a buddy pool is an optional construct where each instance in a cluster
+ Also known as <emphasis>replication groups</emphasis>, a buddy pool is an optional construct where each instance in a cluster
may be configured with a buddy pool name. Think of this as an 'exclusive club membership': when
selecting buddies,
- <literal>BuddyLocator</literal>
- s would try and select buddies sharing the same
+ <literal>BuddyLocator</literal>s that support buddy pools would try and select buddies sharing the same
buddy pool name. This allows system administrators a degree of flexibility and control over how buddies
are selected. For example, a sysadmin may put two instances on two separate physical servers that
may be on two separate physical racks in the same buddy pool. So rather than picking an
instance on a different host on the same rack,
- <literal>BuddyLocator</literal>
- s would rather pick
+ <literal>BuddyLocator</literal>s would rather pick
the instance in the same buddy pool, on a separate rack which may add a degree of redundancy.
</para>
</section>
@@ -253,23 +233,6 @@
</para>
</section>
<section>
- <title>Implementation</title>
- <para>
- <figure>
- <title>Class diagram of the classes involved in buddy replication and how they are related to each
- other
- </title>
-
- <mediaobject>
- <imageobject>
- <imagedata fileref="images/PackageOverview-BuddyReplication.png"/>
- </imageobject>
- </mediaobject>
- </figure>
- </para>
-
- </section>
- <section>
<title>Configuration</title>
<para>
<programlisting><![CDATA[
@@ -319,11 +282,13 @@
</para>
</section>
</section>
- </section>
- <section>
- <title>Clustered Cache - Using Invalidation</title>
+
+ </section>
+ </section>
+ <section>
+ <title>Invalidation</title>
<para>If a cache is configured for invalidation rather than replication,
every time data is changed in a cache other caches in the cluster
receive a message informing them that their data is now stale and should
@@ -351,4 +316,230 @@
doesn't block and wait for responses.
</para>
</section>
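A cache is switched to invalidation simply via its cache mode; for instance (fragment only):

```xml
<!-- INVALIDATION_ASYNC sends non-blocking invalidation messages;
     use INVALIDATION_SYNC to wait for all caches to acknowledge -->
<attribute name="CacheMode">INVALIDATION_ASYNC</attribute>
```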
+
+
+
+ <section>
+ <title>State Transfer</title>
+
+ <para><emphasis>State Transfer</emphasis> refers to the process by which a JBoss Cache instance
+ prepares itself to begin providing a service by acquiring the current
+ state from another cache instance and integrating that state into its
+ own state.
+ </para>
+
+ <section>
+ <title>Types of State Transfer</title>
+
+ <para>The state that is acquired and integrated can consist of two basic
+ types:
+ </para>
+
+ <orderedlist>
+ <listitem>
+ <para>"Transient" or "in-memory" state. This consists of the actual
+ in-memory state of another cache instance - the contents of the
+ various in-memory nodes in the cache that is providing state are
+ serialized and transferred; the recipient deserializes the data,
+ creates corresponding nodes in its own in-memory tree, and populates
+ them with the transferred data.
+ </para>
+
+ <para>"In-memory" state transfer is enabled by setting
+ the cache's
+ <literal>FetchInMemoryState</literal>
+ configuration attribute to
+ <literal>true</literal>
+ .
+ </para>
+ </listitem>
+ <listitem>
+ <para>"Persistent" state. Only applicable if a non-shared
+ cache loader is used. The state stored in the state-provider
+ cache's persistent store is deserialized and transferred; the
+ recipient passes the data to its own cache loader, which persists
+ it to the recipient's persistent store.
+ </para>
+
+ <para>"Persistent" state transfer is enabled by setting
+ a cache loader's
+ <literal>fetchPersistentState</literal>
+ attribute to
+ <literal>true</literal>
+ . If multiple cache loaders
+ are configured in a chain, only one can have this property
+ set to true; otherwise you will get an exception at startup.
+ </para>
+
+ <para>Persistent state transfer with a shared cache loader does
+ not make sense, as the same persistent store that provides the
+ data will just end up receiving it. Therefore, if a shared cache
+ loader is used, the cache will not allow a persistent state
+ transfer even if a cache loader has
+ <literal>fetchPersistentState</literal>
+ set to
+ <literal>true</literal>
+ .
+ </para>
+ </listitem>
+ </orderedlist>
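Putting the two flags together, a cache that transfers both in-memory and persistent state (with a single, non-shared cache loader) might be configured as follows; the cache loader class, its properties, and the exact shape of the loader configuration are illustrative:

```xml
<attribute name="FetchInMemoryState">true</attribute>
<attribute name="CacheLoaderConfiguration">
  <config>
    <passivation>false</passivation>
    <shared>false</shared>
    <cacheloader>
      <class>org.jboss.cache.loader.FileCacheLoader</class>
      <properties>location=/tmp/state</properties>
      <!-- only one loader in a chain may fetch persistent state -->
      <fetchPersistentState>true</fetchPersistentState>
    </cacheloader>
  </config>
</attribute>
```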
+
+ <para>Which of these types of state transfer is appropriate depends on the usage
+ of the cache.
+ </para>
+
+ <orderedlist>
+ <listitem>
+ <para>If a write-through cache loader is used, the current cache
+ state is fully represented by the persistent state. Data may
+ have been evicted from the in-memory state, but it will still be
+ in the persistent store. In this case, if the cache loader is not
+ shared, persistent state transfer is used to ensure the new cache
+ has the correct state. In-memory state can be transferred as well
+ if the desire is to have a "hot" cache -- one that has all
+ relevant data in memory when the cache begins providing service.
+ (Note that the "CacheLoaderPreload" configuration parameter can
+ be used as well to provide a "warm" or "hot" cache without
+ requiring an in-memory state transfer. This approach somewhat
+ reduces the burden on the cache instance providing state, but
+ increases the load on the persistent store on the recipient
+ side.)
+ </para>
+ </listitem>
+ <listitem>
+ <para>If a cache loader is used with passivation, the full
+ representation of the state can only be obtained by combining
+ the in-memory (i.e. non-passivated) and persistent (i.e. passivated)
+ states. Therefore an in-memory state transfer is necessary. A
+ persistent state transfer is necessary if the cache loader is
+ not shared.
+ </para>
+ </listitem>
+ <listitem>
+ <para>If no cache loader is used and the cache is solely a
+ write-aside cache (i.e. one that is used to cache data that can
+ also be found in a persistent store, e.g. a database), whether
+ or not in-memory state should be transferred depends on whether
+ or not a "hot" cache is desired.
+ </para>
+ </listitem>
+ </orderedlist>
+ </section>
+ <section>
+ <title>When State Transfer Occurs</title>
+
+ <para>If either in-memory or persistent state transfer is enabled, a full or
+ partial state transfer will be done at various times, depending on how the
+ cache is used. "Full" state transfer refers to the transfer of the state
+ related to the entire tree -- i.e. the root node and all nodes below it.
+ A "partial" state transfer is one where just a portion of the tree is
+ transferred -- i.e. a node at a given Fqn and all nodes below it.
+ </para>
+
+ <para>If either in-memory or persistent state transfer is enabled, state
+ transfer will occur at the following times:
+ </para>
+
+ <orderedlist>
+ <listitem>
+ <para>Initial state transfer. This occurs when the cache is first
+ started (as part of the processing of the
+ <literal>start()</literal>
+ method). This is a full state transfer. The state is retrieved
+ from the cache instance that has been operational the longest. If
+ there is any problem receiving or integrating the state, the cache
+ will not start.
+ </para>
+
+ <para>Initial state transfer will occur unless:</para>
+
+ <orderedlist>
+ <listitem>
+ <para>The cache's
+ <literal>InactiveOnStartup</literal>
+ property
+ is
+ <literal>true</literal>
+ . This property is used in conjunction
+ with region-based marshalling.
+ </para>
+ </listitem>
+ <listitem>
+ <para>Buddy replication is used. See below for more on
+ state transfer with buddy replication.
+ </para>
+ </listitem>
+ </orderedlist>
+ </listitem>
+
+ <listitem>
+ <para>Partial state transfer following region activation. Only
+ relevant when region-based marshalling is used. Here a special
+ classloader is needed to unmarshal the state for a portion of
+ the tree. State transfer cannot succeed until the application
+ registers this classloader with the cache. Once the application
+ registers its classloader, it calls
+ <literal>cache.getRegion(fqn, true).activate()</literal>
+ .
+ As part of the region activation process, a partial state transfer
+ of the relevant subtree's state is performed. The state is
+ requested from the oldest cache instance in the cluster; if that
+ instance responds with no state, state is requested from each
+ instance one by one until one provides state or all instances have
+ been queried.
+ </para>
+
+ <para>Typically when region-based marshalling is used, the cache's
+ <literal>InactiveOnStartup</literal>
+ property is set to
+ <literal>true</literal>
+ . This suppresses initial state transfer,
+ which would fail due to the inability to deserialize the
+ transferred state.
+ </para>
+ </listitem>
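The activation sequence described above can be written as a pseudocode-style Java sketch; it is not self-contained (it requires the JBoss Cache library and a running cache), and <literal>registerContextClassLoader</literal> is an assumed name for "registering the class loader with the region", so check the <literal>Region</literal> interface for the exact method:

```java
// Sketch against the API described in this chapter (not runnable as-is).
Fqn fqn = Fqn.fromString("/myapp");
Region region = cache.getRegion(fqn, true);       // create the region if absent
region.registerContextClassLoader(deploymentCl);  // assumed method name
region.activate();                                // triggers the partial state transfer
```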
+
+ <listitem>
+ <para>Buddy replication. When buddy replication is used, initial
+ state transfer is disabled. Instead, when a cache instance joins the
+ cluster, it becomes the buddy of one or more other instances, and
+ one or more other instances become its buddy. Each time an instance
+ determines it has a new buddy providing backup for it, it pushes
+ it's current state to the new buddy. This "pushing" of state to
+ the new buddy is slightly different from other forms of state
+ transfer, which are based on a "pull" approach (i.e. recipient
+ asks for and receives state). However, the process of preparing
+ and integrating the state is the same.
+ </para>
+
+ <para>This "push" of state upon buddy group formation only occurs
+ if the
+ <literal>InactiveOnStartup</literal>
+ property is set to
+ <literal>false</literal>
+ . If it is
+ <literal>true</literal>
+ ,
+ state transfer amongst the buddies only occurs when the application
+ activates the region
+ on the various
+ members of the group.
+ </para>
+
+ <para>Partial state transfer following a region activation
+ call is slightly different in the buddy replication case as well.
+ Instead of requesting the partial state from one cache instance,
+ and trying all instances until one responds, with buddy replication
+ the instance that is activating a region will request partial
+ state from each instance for which it is serving as a backup.
+ </para>
+ </listitem>
+ </orderedlist>
+
+
+ </section>
+
+ </section>
+
</chapter>
\ No newline at end of file
+
\ No newline at end of file