[jboss-cvs] JBossCache/docs/JBossCache-UserGuide/en/modules ...
Manik Surtani
msurtani at jboss.com
Tue Jan 23 18:40:16 EST 2007
User: msurtani
Date: 07/01/23 18:40:16
Modified: docs/JBossCache-UserGuide/en/modules architecture.xml
replication.xml
Removed: docs/JBossCache-UserGuide/en/modules state_transfer.xml
Log:
Chapter 7
Revision Changes Path
1.4 +30 -0 JBossCache/docs/JBossCache-UserGuide/en/modules/architecture.xml
(In the diff below, changes in quantity of whitespace are not shown.)
Index: architecture.xml
===================================================================
RCS file: /cvsroot/jboss/JBossCache/docs/JBossCache-UserGuide/en/modules/architecture.xml,v
retrieving revision 1.3
retrieving revision 1.4
diff -u -b -r1.3 -r1.4
--- architecture.xml 23 Jan 2007 19:20:32 -0000 1.3
+++ architecture.xml 23 Jan 2007 23:40:15 -0000 1.4
@@ -254,6 +254,14 @@
In addition, reference counting is done to reduce duplication of writing certain objects multiple times, to help
keep the streams small and efficient.
</para>
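The reference counting described above is internal to the JBoss Cache marshaller, but the same space-saving idea exists in standard Java serialization, which writes a compact back-reference (a handle) instead of re-serializing an object it has already written to the stream. A minimal, self-contained illustration:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class RefSharingDemo {

    // Writes the same object twice into one stream. The second write emits
    // only a back-reference, so on deserialization both reads yield the
    // very same instance and the stream stays small.
    public static boolean sameInstanceAfterRoundTrip() {
        try {
            String payload = new String("region-data");
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(payload);
                oos.writeObject(payload); // back-reference, not a second copy
            }
            try (ObjectInputStream ois = new ObjectInputStream(
                    new ByteArrayInputStream(bos.toByteArray()))) {
                Object first = ois.readObject();
                Object second = ois.readObject();
                return first == second; // identity restored from the handle
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameInstanceAfterRoundTrip()); // true
    }
}
```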
+ <para>
+ Also, if <literal>UseRegionBasedMarshalling</literal> is enabled (disabled by default) the marshaller adds region
+ information to the stream before writing any data. This region information is in the form of a <literal>String</literal>
+ representation of an <literal>Fqn</literal>. When unmarshalling, the <literal>RegionManager</literal> can be used to
+ find the relevant <literal>Region</literal>, and use a region-specific <literal>ClassLoader</literal> to unmarshall
+ the stream. This is particularly useful when clustering state in application servers, where each deployment has
+ its own <literal>ClassLoader</literal>. See the section below on <link linkend="architecture.regions">regions</link> for more information.
+ </para>
<section>
<title>JBoss Serialization</title>
<para>
@@ -267,4 +275,26 @@
</section>
</section>
+ <section id="architecture.regions">
+ <title>Class Loading and Regions</title>
+ <para>
+ When used to cluster the state of application servers, applications deployed in the application server tend to put instances
+ of objects specific to their application in the cache (or in an <literal>HttpSession</literal> object) which
+ would require replication. It is common for application servers to assign separate <literal>ClassLoader</literal>
+ instances to each application deployed, but have JBoss Cache libraries referenced by the application server's
+ <literal>ClassLoader</literal>.
+ </para>
+ <para>
+ To enable us to successfully marshall and unmarshall objects from such class loaders, we use a concept called
+ regions. A region is a portion of the cache which shares a common class loader (a region also has other uses -
+ see <link linkend="eviction_policies">eviction policies</link>).
+ </para>
+ <para>
+ A region is created using the <literal>Cache.getRegion(Fqn fqn, boolean createIfNotExists)</literal> method,
+ which returns an implementation of the <literal>Region</literal> interface. Once a region is obtained, a
+ class loader for the region can be set or unset, and the region can be activated/deactivated. By default, regions
+ are active unless the <literal>InactiveOnStartup</literal> configuration attribute is set to <literal>true</literal>.
+ </para>
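Both attributes mentioned here are plain cache configuration attributes; a minimal fragment might look like the following (the enclosing service descriptor is elided and illustrative):

```xml
<!-- Illustrative fragment: enable region-based marshalling and keep
     regions inactive until the application activates them -->
<attribute name="UseRegionBasedMarshalling">true</attribute>
<attribute name="InactiveOnStartup">true</attribute>
```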
+ </section>
+
</chapter>
1.2 +295 -104 JBossCache/docs/JBossCache-UserGuide/en/modules/replication.xml
(In the diff below, changes in quantity of whitespace are not shown.)
Index: replication.xml
===================================================================
RCS file: /cvsroot/jboss/JBossCache/docs/JBossCache-UserGuide/en/modules/replication.xml,v
retrieving revision 1.1
retrieving revision 1.2
diff -u -b -r1.1 -r1.2
--- replication.xml 19 Jan 2007 17:01:00 -0000 1.1
+++ replication.xml 23 Jan 2007 23:40:15 -0000 1.2
@@ -1,83 +1,72 @@
-<chapter id="replication">
- <title>Clustered Caches</title>
-
- <para>The
- <literal>TreeCache</literal>
- can be configured to be either local (standalone) or clustered. If
+<chapter id="clustering">
+ <title>Clustering</title>
+ <para>This chapter covers the clustering aspects of JBoss Cache.</para>
+ <section>
+ <title>Cache Replication Modes</title>
+ <para>
+ JBoss Cache can be configured to be either local (standalone) or clustered. If
in a cluster, the cache can be configured to replicate changes, or to
invalidate changes. A detailed discussion on this follows.
</para>
-
<section>
- <title>Local Cache</title>
-
+ <title>LOCAL</title>
<para>Local caches don't join a cluster and don't communicate with other
- nodes in a cluster. Therefore their elements don't need to be
+ caches in a cluster. Therefore their elements don't need to be
serializable - however, we recommend making them serializable, enabling a
- user to change the cache mode at any time.
- </para>
+ user to change the cache mode at any time. The dependency on the JGroups
+ library is still there, although a JGroups channel is not started.</para>
</section>
-
<section>
- <title>Clustered Cache - Using Replication</title>
-
+ <title>Replicated Caches</title>
<para>Replicated caches replicate all changes to the
- other
- <literal>TreeCache</literal>
- instances in the cluster. Replication can either happen
+ other cache instances in the cluster. Replication can either happen
after each modification (no transactions), or at the end of a
transaction (commit time).
</para>
<para>Replication can be synchronous or asynchronous. Use of either one
of the options is application dependent. Synchronous replication blocks
- the caller (e.g. on a put()) until the modifications have been
+ the caller (e.g. on a <literal>put()</literal>) until the modifications have been
replicated successfully to all nodes in a cluster. Asynchronous
- replication performs replication in the background (the put() returns
+ replication performs replication in the background (the <literal>put()</literal> returns
immediately).
- <literal>TreeCache</literal>
+ JBoss Cache
also offers a replication queue, where
modifications are replicated periodically (i.e. interval-based), or when
the queue size exceeds a number of elements, or a combination
thereof.
</para>
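As a sketch, an asynchronous cache with a replication queue flushed every 100 ms, or whenever 500 modifications have queued up, could be configured along these lines (attribute names assumed from the standard cache configuration; the surrounding service descriptor is elided):

```xml
<attribute name="CacheMode">REPL_ASYNC</attribute>
<attribute name="UseReplQueue">true</attribute>
<!-- flush interval in milliseconds -->
<attribute name="ReplQueueInterval">100</attribute>
<!-- flush early once this many modifications are queued -->
<attribute name="ReplQueueMaxElements">500</attribute>
```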
- <para>Asynchronous replication is faster (no caller blocking), because
+ <para>
+ Asynchronous replication is faster (no caller blocking), because
synchronous replication requires acknowledgments from all nodes in a
cluster that they received and applied the modification successfully
(round-trip time). However, when a synchronous replication returns
successfully, the caller knows for sure that all modifications have been
- applied at all nodes, whereas this may or may not be the case with
applied at all nodes, whereas this is not guaranteed with
asynchronous replication. With asynchronous replication, errors are
- simply written to a log. Even when using transactions, a transaction may succeed but replication may not
- succeed on all
- <literal>TreeCache</literal>
- instances.
+ simply written to a log. Even when using transactions, a transaction may
+ succeed but replication may not succeed on all cache instances.
</para>
<section>
<title>Replicated Caches and Transactions</title>
<para>
When using transactions, replication only occurs at the transaction boundary - i.e., when a transaction
- commits.
- This results in minimising replication traffic since a single modification os broadcast rather than a series
- of
- individual modifications, and can be a lot more efficient than not using transactions. Another effect of
- this
- is that if a transaction were to roll back, nothing is broadcast across a cluster.
+ commits. This results in minimising replication traffic, since a single modification is broadcast rather
+ than a series of individual modifications, and can be a lot more efficient than not using transactions.
+ Another effect of this is that if a transaction were to roll back, nothing is broadcast across a cluster.
</para>
<para>
Depending on whether you are running your cluster in asynchronous or synchronous mode, JBoss Cache will use
- either a single phase or
- <ulink url="http://en.wikipedia.org/wiki/Two-phase_commit_protocol">two phase commit</ulink>
+ either a single phase or <ulink url="http://en.wikipedia.org/wiki/Two-phase_commit_protocol">two phase commit</ulink>
protocol, respectively.
</para>
<section>
<title>One Phase Commits</title>
<para>
Used when your cache mode is REPL_ASYNC. All modifications are replicated in a single call, which
- instructs
- remote caches to apply the changes to their local in-memory state and commit locally. Remote
+ instructs remote caches to apply the changes to their local in-memory state and commit locally. Remote
errors/rollbacks are never fed back to the originator of the transaction since the communication is
asynchronous.
</para>
@@ -86,32 +75,24 @@
<title>Two Phase Commits</title>
<para>
Used when your cache mode is REPL_SYNC. Upon committing your transaction, JBoss Cache broadcasts a
- prepare call,
- which carries all modifications relevant to the transaction. Remote caches then acquire local locks on
- their
- im-memory state and apply the modifications. Once all remote caches respond to the prepare call, the
- originator of the transaction broadcasts a commit. This instructs all remote caches to commit their data.
- If any of the caches fail to respond to the prepare phase, the originator broadcasts a rollback.
+ prepare call, which carries all modifications relevant to the transaction. Remote caches then acquire
+ local locks on their in-memory state and apply the modifications. Once all remote caches respond to the
+ prepare call, the originator of the transaction broadcasts a commit. This instructs all remote caches to
+ commit their data. If any of the caches fail to respond to the prepare phase, the originator broadcasts
+ a rollback.
</para>
<para>
Note that although the prepare phase is synchronous, the commit and rollback phases are asynchronous.
- This
- is because
+ This is because
<ulink url="http://java.sun.com/products/jta/">Sun's JTA specification</ulink>
does not specify how transactional resources should deal with failures
at this stage of a transaction; and other resources participating in the transaction may have
- indeterminate
- state anyway. As such, we do away with the overhead of synchronous communication for this phase of the
- transaction. That said, they can be forced to be synchronous using the
- <literal>SyncCommitPhase</literal>
- and
- <literal>SyncRollbackPhase</literal>
- configuration options.
+ indeterminate state anyway. As such, we do away with the overhead of synchronous communication
+ for this phase of the transaction. That said, they can be forced to be synchronous using the
+ <literal>SyncCommitPhase</literal> and <literal>SyncRollbackPhase</literal> configuration attributes.
</para>
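For example, a synchronous cache that also forces the commit and rollback phases to be synchronous would carry the following attributes (fragment only; other configuration is elided):

```xml
<attribute name="CacheMode">REPL_SYNC</attribute>
<attribute name="SyncCommitPhase">true</attribute>
<attribute name="SyncRollbackPhase">true</attribute>
```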
</section>
-
</section>
-
<section>
<title>Buddy Replication</title>
<para>
@@ -123,31 +104,32 @@
<para>
One of the most common use cases of Buddy Replication is when a replicated cache is used by a servlet
container to store HTTP session data. One of the pre-requisites to buddy replication working well and being
- a real benefit is the use of
- <emphasis>session affinity</emphasis>
- , also known as
- <emphasis>sticky sessions</emphasis>
- in HTTP session replication speak. What this means is that if certain data is frequently accessed, it is
- desirable that this is always accessed on one instance rather than in a round-robin fashion as this helps
- the cache cluster optimise how it chooses buddies, where it stores data, and minimises replication traffic.
+ a real benefit is the use of <emphasis>session affinity</emphasis>, more casually known as
+ <emphasis>sticky sessions</emphasis> in HTTP session replication speak. What this means is that if
+ certain data is frequently accessed, it is desirable that this is always accessed on one instance rather
+ than in a round-robin fashion as this helps the cache cluster optimise how it chooses buddies, where it
+ stores data, and minimises replication traffic.
</para>
<para>
If this is not possible, Buddy Replication may prove to be more of an overhead than a benefit.
</para>
<section>
<title>Selecting Buddies</title>
+ <figure>
+ <title>BuddyLocator</title>
+
+ <mediaobject>
+ <imageobject>
+ <imagedata fileref="images/BuddyReplication.png"/>
+ </imageobject>
+ </mediaobject>
+ </figure>
<para>
- Buddy Replication uses an instance of a
- <literal>org.jboss.cache.buddyreplication.BuddyLocator</literal>
+ Buddy Replication uses an instance of a <literal>BuddyLocator</literal>
which contains the logic used to select buddies in a network. JBoss Cache currently ships with a single
- implementation,
- <literal>org.jboss.cache.buddyreplication.NextMemberBuddyLocator</literal>
- , which is used
- as a default if no implementation is provided. The
- <literal>NextMemberBuddyLocator</literal>
- selects the
- next member in the cluster, as the name suggests, and guarantees an even spread of buddies for each
- instance.
+ implementation, <literal>NextMemberBuddyLocator</literal>, which is used as a default if no implementation
+ is provided. The <literal>NextMemberBuddyLocator</literal> selects the next member in the cluster, as
+ the name suggests, and guarantees an even spread of buddies for each instance.
</para>
<para>
The
@@ -174,17 +156,15 @@
<section>
<title>BuddyPools</title>
<para>
- Also known as replication groups, a buddy pool is an optional construct where each instance in a cluster
+ Also known as <emphasis>replication groups</emphasis>, a buddy pool is an optional construct where each instance in a cluster
may be configured with a buddy pool name. Think of this as an 'exclusive club membership': when
selecting buddies,
- <literal>BuddyLocator</literal>
- s would try and select buddies sharing the same
+ <literal>BuddyLocator</literal>s that support buddy pools would try and select buddies sharing the same
buddy pool name. This allows system administrators a degree of flexibility and control over how buddies
are selected. For example, a sysadmin may put two instances on two separate physical servers that
may be on two separate physical racks in the same buddy pool. So rather than picking an
instance on a different host on the same rack,
- <literal>BuddyLocator</literal>
- s would rather pick
+ <literal>BuddyLocator</literal>s would rather pick
the instance in the same buddy pool, on a separate rack which may add a degree of redundancy.
</para>
</section>
@@ -253,23 +233,6 @@
</para>
</section>
<section>
- <title>Implementation</title>
- <para>
- <figure>
- <title>Class diagram of the classes involved in buddy replication and how they are related to each
- other
- </title>
-
- <mediaobject>
- <imageobject>
- <imagedata fileref="images/PackageOverview-BuddyReplication.png"/>
- </imageobject>
- </mediaobject>
- </figure>
- </para>
-
- </section>
- <section>
<title>Configuration</title>
<para>
<programlisting><![CDATA[
@@ -319,11 +282,13 @@
</para>
</section>
</section>
- </section>
- <section>
- <title>Clustered Cache - Using Invalidation</title>
+
+ </section>
+ </section>
+ <section>
+ <title>Invalidation</title>
<para>If a cache is configured for invalidation rather than replication,
every time data is changed in a cache other caches in the cluster
receive a message informing them that their data is now stale and should
@@ -351,4 +316,230 @@
doesn't block and wait for responses.
</para>
</section>
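A cache is switched to invalidation simply via its cache mode; for instance (fragment only):

```xml
<!-- INVALIDATION_ASYNC sends non-blocking invalidation messages;
     use INVALIDATION_SYNC to wait for all caches to acknowledge -->
<attribute name="CacheMode">INVALIDATION_ASYNC</attribute>
```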
+
+
+
+ <section>
+ <title>State Transfer</title>
+
+ <para><emphasis>State Transfer</emphasis> refers to the process by which a JBoss Cache instance
+ prepares itself to begin providing a service by acquiring the current
+ state from another cache instance and integrating that state into its
+ own state.
+ </para>
+
+ <section>
+ <title>Types of State Transfer</title>
+
+ <para>The state that is acquired and integrated can consist of two basic
+ types:
+ </para>
+
+ <orderedlist>
+ <listitem>
+ <para>"Transient" or "in-memory" state. This consists of the actual
+ in-memory state of another cache instance - the contents of the
+ various in-memory nodes in the cache that is providing state are
+ serialized and transferred; the recipient deserializes the data,
+ creates corresponding nodes in its own in-memory tree, and populates
+ them with the transferred data.
+ </para>
+
+ <para>"In-memory" state transfer is enabled by setting
+ the cache's
+ <literal>FetchInMemoryState</literal>
+ configuration attribute to
+ <literal>true</literal>
+ .
+ </para>
+ </listitem>
+ <listitem>
+ <para>"Persistent" state. Only applicable if a non-shared
+ cache loader is used. The state stored in the state-provider
+ cache's persistent store is deserialized and transferred; the
+ recipient passes the data to its own cache loader, which persists
+ it to the recipient's persistent store.
+ </para>
+
+ <para>"Persistent" state transfer is enabled by setting
+ a cache loader's
+ <literal>fetchPersistentState</literal>
+ attribute to
+ <literal>true</literal>
+ . If multiple cache loaders
+ are configured in a chain, only one can have this property
+ set to true; otherwise you will get an exception at startup.
+ </para>
+
+ <para>Persistent state transfer with a shared cache loader does
+ not make sense, as the same persistent store that provides the
+ data will just end up receiving it. Therefore, if a shared cache
+ loader is used, the cache will not allow a persistent state
+ transfer even if a cache loader has
+ <literal>fetchPersistentState</literal>
+ set to
+ <literal>true</literal>
+ .
+ </para>
+ </listitem>
+ </orderedlist>
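Putting the two flags together, a cache that transfers both in-memory and persistent state (with a single, non-shared cache loader) might be configured as follows; the cache loader class, its properties, and the exact shape of the loader configuration are illustrative:

```xml
<attribute name="FetchInMemoryState">true</attribute>
<attribute name="CacheLoaderConfiguration">
  <config>
    <passivation>false</passivation>
    <shared>false</shared>
    <cacheloader>
      <class>org.jboss.cache.loader.FileCacheLoader</class>
      <properties>location=/tmp/state</properties>
      <!-- only one loader in a chain may fetch persistent state -->
      <fetchPersistentState>true</fetchPersistentState>
    </cacheloader>
  </config>
</attribute>
```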
+
+ <para>Which of these types of state transfer is appropriate depends on the usage
+ of the cache.
+ </para>
+
+ <orderedlist>
+ <listitem>
+ <para>If a write-through cache loader is used, the current cache
+ state is fully represented by the persistent state. Data may
+ have been evicted from the in-memory state, but it will still be
+ in the persistent store. In this case, if the cache loader is not
+ shared, persistent state transfer is used to ensure the new cache
+ has the correct state. In-memory state can be transferred as well
+ if the desire is to have a "hot" cache -- one that has all
+ relevant data in memory when the cache begins providing service.
+ (Note that the "CacheLoaderPreload" configuration parameter can
+ be used as well to provide a "warm" or "hot" cache without
+ requiring an in-memory state transfer. This approach somewhat
+ reduces the burden on the cache instance providing state, but
+ increases the load on the persistent store on the recipient
+ side.)
+ </para>
+ </listitem>
+ <listitem>
+ <para>If a cache loader is used with passivation, the full
+ representation of the state can only be obtained by combining
+ the in-memory (i.e. non-passivated) and persistent (i.e. passivated)
+ states. Therefore an in-memory state transfer is necessary. A
+ persistent state transfer is necessary if the cache loader is
+ not shared.
+ </para>
+ </listitem>
+ <listitem>
+ <para>If no cache loader is used and the cache is solely a
+ write-aside cache (i.e. one that is used to cache data that can
+ also be found in a persistent store, e.g. a database), whether
+ or not in-memory state should be transferred depends on whether
+ or not a "hot" cache is desired.
+ </para>
+ </listitem>
+ </orderedlist>
+ </section>
+ <section>
+ <title>When State Transfer Occurs</title>
+
+ <para>If either in-memory or persistent state transfer is enabled, a full or
+ partial state transfer will be done at various times, depending on how the
+ cache is used. "Full" state transfer refers to the transfer of the state
+ related to the entire tree -- i.e. the root node and all nodes below it.
+ A "partial" state transfer is one where just a portion of the tree is
+ transferred -- i.e. a node at a given Fqn and all nodes below it.
+ </para>
+
+ <para>If either in-memory or persistent state transfer is enabled, state
+ transfer will occur at the following times:
+ </para>
+
+ <orderedlist>
+ <listitem>
+ <para>Initial state transfer. This occurs when the cache is first
+ started (as part of the processing of the
+ <literal>start()</literal>
+ method). This is a full state transfer. The state is retrieved
+ from the cache instance that has been operational the longest. If
+ there is any problem receiving or integrating the state, the cache
+ will not start.
+ </para>
+
+ <para>Initial state transfer will occur unless:</para>
+
+ <orderedlist>
+ <listitem>
+ <para>The cache's
+ <literal>InactiveOnStartup</literal>
+ property
+ is
+ <literal>true</literal>
+ . This property is used in conjunction
+ with region-based marshalling.
+ </para>
+ </listitem>
+ <listitem>
+ <para>Buddy replication is used. See below for more on
+ state transfer with buddy replication.
+ </para>
+ </listitem>
+ </orderedlist>
+ </listitem>
+
+ <listitem>
+ <para>Partial state transfer following region activation. Only
+ relevant when region-based marshalling is used. Here a special
+ classloader is needed to unmarshal the state for a portion of
+ the tree. State transfer cannot succeed until the application
+ registers this classloader with the cache. Once the application
+ registers its classloader, it calls
+ <literal>cache.getRegion(fqn, true).activate()</literal>
+ .
+ As part of the region activation process, a partial state transfer
+ of the relevant subtree's state is performed. The state is
+ requested from the oldest cache instance in the cluster; if that
+ instance responds with no state, state is requested from each
+ instance one by one until one provides state or all instances have
+ been queried.
+ </para>
+
+ <para>Typically when region-based marshalling is used, the cache's
+ <literal>InactiveOnStartup</literal>
+ property is set to
+ <literal>true</literal>
+ . This suppresses initial state transfer,
+ which would fail due to the inability to deserialize the
+ transferred state.
+ </para>
+ </listitem>
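The activation sequence described above can be written as a pseudocode-style Java sketch; it is not self-contained (it requires the JBoss Cache library and a running cache), and <literal>registerContextClassLoader</literal> is an assumed name for "registering the class loader with the region", so check the <literal>Region</literal> interface for the exact method:

```java
// Sketch against the API described in this chapter (not runnable as-is).
Fqn fqn = Fqn.fromString("/myapp");
Region region = cache.getRegion(fqn, true);       // create the region if absent
region.registerContextClassLoader(deploymentCl);  // assumed method name
region.activate();                                // triggers the partial state transfer
```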
+
+ <listitem>
+ <para>Buddy replication. When buddy replication is used, initial
+ state transfer is disabled. Instead, when a cache instance joins the
+ cluster, it becomes the buddy of one or more other instances, and
+ one or more other instances become its buddy. Each time an instance
+ determines it has a new buddy providing backup for it, it pushes
+ it's current state to the new buddy. This "pushing" of state to
+ the new buddy is slightly different from other forms of state
+ transfer, which are based on a "pull" approach (i.e. recipient
+ asks for and receives state). However, the process of preparing
+ and integrating the state is the same.
+ </para>
+
+ <para>This "push" of state upon buddy group formation only occurs
+ if the
+ <literal>InactiveOnStartup</literal>
+ property is set to
+ <literal>false</literal>
+ . If it is
+ <literal>true</literal>
+ ,
+ state transfer amongst the buddies only occurs when the application
+ activates the region
+ on the various
+ members of the group.
+ </para>
+
+ <para>Partial state transfer following a region activation
+ call is slightly different in the buddy replication case as well.
+ Instead of requesting the partial state from one cache instance,
+ and trying all instances until one responds, with buddy replication
+ the instance that is activating a region will request partial
+ state from each instance for which it is serving as a backup.
+ </para>
+ </listitem>
+ </orderedlist>
+
+
+ </section>
+
+ </section>
+
</chapter>
\ No newline at end of file
+
\ No newline at end of file