[exo-jcr-commits] exo-jcr SVN: r2858 - in jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules: jcr and 1 other directories.

do-not-reply at jboss.org do-not-reply at jboss.org
Tue Aug 3 04:22:27 EDT 2010


Author: sergiykarpenko
Date: 2010-08-03 04:22:26 -0400 (Tue, 03 Aug 2010)
New Revision: 2858

Added:
   jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/backup/
   jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/backup/exojcr-backup-service.xml
Modified:
   jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr.xml
Log:
EXOJCR-869: backup documents ported

Added: jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/backup/exojcr-backup-service.xml
===================================================================
--- jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/backup/exojcr-backup-service.xml	                        (rev 0)
+++ jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/backup/exojcr-backup-service.xml	2010-08-03 08:22:26 UTC (rev 2858)
@@ -0,0 +1,480 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
+"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
+<chapter id="JCR.eXoJCRBackupService">
+  <?dbhtml filename="ch-exojcr-backup-service.html"?>
+
+  <title>eXo JCR Backup Service</title>
+
+  <section id="Concept">
+    <title>Concept</title>
+
+    <para>The main purpose of that feature is to restore data in case of
+    system faults and repository crashes. Also the backup results may be used
+    as a content history.</para>
+
+    <para>The eXo JCR backup service was developed from the JCR 1.8
+    implementation. It's an independent service available as an eXo JCR
+    Extensions project.</para>
+
+    <para>The concept is based on the export of a workspace unit in the Full,
+    or Full + Incrementals model. A repository workspace can be backup and
+    restored using a combination of these modes. In all cases, at least one
+    Full (initial) backup must be executed to mark a starting point of the
+    backup history. An Incremental backup is not a complete image of the
+    workspace. It contains only changes for some period. So it is not possible
+    to perform an Incremental backup without an initial Full backup.</para>
+
+    <para>The Backup service may operate as a hot-backup process at runtime on
+    an in-use workspace. It's a case when the Full + Incrementals model should
+    be used to have a guaranty of data consistency during restoration. An
+    Incremental will be run starting from the start point of the Full backup
+    and will contain changes that have occured during the Full backup
+    too.</para>
+
+    <para>A <emphasis role="bold">restore</emphasis> operation is a mirror of
+    a backup one. At least one Full backup should be restored to obtain a
+    workspace corresponding to some point in time. On the other hand,
+    Incrementals may be restored in the order of creation to reach a required
+    state of a content. If the Incremental contains the same data as the Full
+    backup (hot-backup), the changes will be applied again as if they were
+    made in a normal way via API calls.</para>
+
+    <para>According to the model there are several modes for backup
+    logic:</para>
+
+    <itemizedlist>
+      <listitem>
+        <para><emphasis role="bold">Full backup only</emphasis> : single
+        operation, runs once</para>
+      </listitem>
+
+      <listitem>
+        <para><emphasis role="bold">Full + Incrementals</emphasis> : Start
+        with an initial Full backup and then keep incrementals changes in one
+        file. Runs until it is stopped.</para>
+      </listitem>
+
+      <listitem>
+        <para><emphasis role="bold">Full + Incrementals(periodic)</emphasis> :
+        Start with an initial Full backup and then keep incrementals with
+        periodic result file rotation. Runs until it is stopped.</para>
+      </listitem>
+    </itemizedlist>
+  </section>
+
+  <section>
+    <title>How it works</title>
+
+    <section>
+      <title>Implementation details</title>
+
+      <para>Full backup/restore is implemented using the JCR SysView
+      Export/Import. Workspace data will be exported into Sysview XML data
+      from root node.</para>
+
+      <para>Restore is implemented using the special eXo JCR API feature: a
+      dynamic workspace creation. Restoring of the workspace Full backup will
+      create one new workspace in the repository. Then the SysView XML data
+      will be imported as the root node.</para>
+
+      <para>Incremental backup is implemented using the eXo JCR ChangesLog
+      API. This API allows to record each JCR API call as atomic entries in a
+      changelog. Hence, the Incremental backup uses a listener that collects
+      these logs and stores them in a file.</para>
+
+      <para>Restoring an incremental backup consists in applying the collected
+      set of ChangesLogs to a workspace in the correct order.</para>
+    </section>
+
+    <section>
+      <title>Work basics</title>
+
+      <para>The work of Backup is based on the BackupConfig configuration and
+      the BackupChain logical unit.</para>
+
+      <para>BackupConfig describes the backup operation chain that will be
+      performed by the service. When you intend to work with it, the
+      configuration should be prepared before the backup is started.</para>
+
+      <para>The configuration contains such values as:</para>
+
+      <itemizedlist>
+        <listitem>
+          <para><emphasis role="bold">types of full and incremental
+          backup</emphasis> ? (fullBackupType, incrementalBackupType) Strings
+          with full names of classes which will cover the type
+          functional.</para>
+        </listitem>
+
+        <listitem>
+          <para><emphasis role="bold">incremental period</emphasis> - a period
+          after that a current backup will be stopped and a new one will be
+          started, in seconds (long).</para>
+        </listitem>
+
+        <listitem>
+          <para><emphasis role="bold">target repository and workspace
+          names</emphasis> ? Strings with described names</para>
+        </listitem>
+
+        <listitem>
+          <para><emphasis role="bold">destination directory</emphasis> for
+          result files ? String with a path to a folder where operation result
+          files will be stored.</para>
+        </listitem>
+      </itemizedlist>
+
+      <para>BackupChain is a unit performing the backup process and it covers
+      the principle of initial Full backup execution and manages Incrementals
+      operations. BackupChain is used as a key object for accessing current
+      backups during runtime via BackupManager. Each BackupJob performs a
+      single atomic operation ? a Full or Incremental process. The result of
+      that operation is data for a Restore. BackupChain can contain one or
+      more BackupJobs. But at least the initial Full job is always there. Each
+      BackupJobs has its own unique number which means its Job order in the
+      chain, the initial Full job always has the number 0.</para>
+
+      <para><emphasis role="bold">Backup process, result data and file
+      location</emphasis></para>
+
+      <para>To start the backup process it's necessary to create the
+      BackupConfig and call the BackupManager.startBackup(BackupConfig)
+      method. This method will return BackupChain created according to the
+      configuration. At the same time the chain creates a BackupChainLog which
+      persists BackupConfig content and BackupChain operation states to the
+      file in the service working directory (see Configuration).</para>
+
+      <para>When the chain starts the work and the initial BackupJob starts,
+      the job will create a result data file using the destination directory
+      path from BackupConfig. The destination directory will contain a
+      directory with an automatically created name using the pattern
+      repository_workspace-timestamp where timestamp is current time in the
+      format of yyyyMMdd_hhmmss (E.g. db1_ws1-20080306_055404). The directory
+      will contain the results of all Jobs configured for execution. Each Job
+      stores the backup result in its own file with the name
+      repository_workspace-timestamp.jobNumber. BackupChain saves each state
+      (STARTING, WAITING, WORKING, FINISHED) of its Jobs in the
+      BackupChainLog, which has a current result full file path.</para>
+
+      <para>BackupChain log file and job result files are a whole and
+      consistent unit, that is a source for a Restore.</para>
+
+      <note>
+        <para>BackupChain log contains absolute paths to job result files.
+        Don't move these files to another location.</para>
+      </note>
+
+      <para><emphasis role="bold">Restore requirements</emphasis></para>
+
+      <para>As mentioned before a Restore operation is a mirror of a Backup.
+      The process is a Full restore of a root node with restoring an
+      additional Incremental backup to reach a desired workspace state.
+      Restoring of the workspace Full backup will create a new workspace in
+      the repository using given RepositoyEntry of existing repository and
+      given (preconfigured) WorkspaceEntry for a new target workspace. A
+      Restore process will restore a root node there from the SysView XML
+      data.</para>
+
+      <note>
+        <para>The target workspace should not be in the repository. Otherwise
+        a BackupConfigurationException exception will be thrown.</para>
+      </note>
+
+      <para>For creation and manipulation with Workspaces check the article
+      <link linkend="TODO">Repository and Workspace management</link>.</para>
+
+      <para>Finally we may say that a Restore is a process of a new Workspace
+      creation and filling it with a Backup content. In case you already have
+      a target Workspace (with the same name) in a Repository, you have to
+      configure a new name for it. If no target workspace exists in the
+      Repository you may use the same name as the Backup one.</para>
+    </section>
+  </section>
+
+  <section>
+    <title>Configuration</title>
+
+    <para>As an optional extension, the Backup service is not enabled by
+    default. <emphasis role="bold">You need to enable it via
+    configuration</emphasis>.</para>
+
+    <para>Below is an example configuration compatible with JCR 1.9.3 and
+    later :</para>
+
+    <programlisting>&lt;component&gt;
+  &lt;key&gt;org.exoplatform.services.jcr.ext.backup.BackupManager&lt;/key&gt;
+  &lt;type&gt;org.exoplatform.services.jcr.ext.backup.impl.BackupManagerImpl&lt;/type&gt;
+  &lt;init-params&gt;
+    &lt;properties-param&gt;
+      &lt;name&gt;backup-properties&lt;/name&gt;
+      &lt;property name="default-incremental-job-period" value="3600" /&gt; &lt;!-- set default incremental period = 60 minutes --&gt;
+      &lt;property name="full-backup-type" value="org.exoplatform.services.jcr.ext.backup.impl.fs.FullBackupJob" /&gt;
+      &lt;property name="incremental-backup-type" value="org.exoplatform.services.jcr.ext.backup.impl.fs.IncrementalBackupJob" /&gt;
+      &lt;property name="backup-dir" value="target/backup" /&gt;
+    &lt;/properties-param&gt;
+  &lt;/init-params&gt;
+&lt;/component&gt;</programlisting>
+
+    <para>Where:<itemizedlist>
+        <listitem>
+          <para><emphasis role="bold">incremental-backup-type</emphasis>
+          (since 1.9.3) : the FQN of incremental job class. Must implement
+          org.exoplatform.services.jcr.ext.backup.BackupJob</para>
+        </listitem>
+
+        <listitem>
+          <para><emphasis role="bold">full-backup-type</emphasis> (since
+          1.9.3) : the FQN of the full backup job class; Must implement
+          org.exoplatform.services.jcr.ext.backup.BackupJob</para>
+        </listitem>
+
+        <listitem>
+          <para><emphasis
+          role="bold">default-incremental-job-period</emphasis> (since 1.9.3)
+          :the period between incremetal flushes (in seconds)</para>
+        </listitem>
+
+        <listitem>
+          <para><emphasis role="bold">backup-dir</emphasis> : the path to a
+          working directory where the service will store internal files and
+          chain logs.</para>
+        </listitem>
+      </itemizedlist></para>
+  </section>
+
+  <section>
+    <title>Usage</title>
+
+    <section>
+      <title>Perform a Backup</title>
+
+      <para>In following example we create a BackupConfig bean for the Full +
+      Incrementals mode, then we ask the BackupManager to start the backup
+      process.</para>
+
+      <programlisting>// Obtaining the backup service from the eXo container.
+BackupManager backup = (BackupManager) container.getComponentInstanceOfType(BackupManager.class);
+
+// And prepare the BackupConfig instance with custom parameters. 
+// full backup &amp; incremental
+File backDir = new File("/backup/ws1"); // the destination path for result files
+backDir.mkdirs();
+
+BackupConfig config = new BackupConfig();
+config.setRepository(repository.getName());
+config.setWorkspace("ws1");
+config.setBackupDir(backDir);
+
+// Before 1.9.3, you also need to indicate the backupjobs class FDNs
+// config.setFullBackupType("org.exoplatform.services.jcr.ext.backup.impl.fs.FullBackupJob");
+// config.setIncrementalBackupType("org.exoplatform.services.jcr.ext.backup.impl.fs.IncrementalBackupJob");
+
+// start backup using the service manager
+BackupChain chain = backup.startBackup(config);</programlisting>
+
+      <para>To stop the backup operation you have to use the BackupChain
+      instance.</para>
+
+      <programlisting>// stop backup
+backup.stopBackup(chain);</programlisting>
+    </section>
+
+    <section>
+      <title>Perform a Restore</title>
+
+      <para>Restoration involves the reloading the backup file into a
+      BackupChainLog and applying appropriate workspace initialization. The
+      following snippet shows the typical sequence for restoring a workspace
+      :</para>
+
+      <programlisting>// find BackupChain using the repository and workspace names (return null if not found)
+BackupChain chain = backup.findBackup("db1", "ws1");
+
+// Get the RepositoryEntry and WorkspaceEntry
+ManageableRepository repo = repositoryService.getRepository(repository);
+RepositoryEntry repoconf = repo.getConfiguration();
+List&lt;WorkspaceEntry&gt; entries = repoconf.getWorkspaceEntries();
+WorkspaceEntry = getNewEntry(entries, workspace); // create a copy entry from an existing one
+
+// restore backup log using ready RepositoryEntry and WorkspaceEntry
+File backLog = new File(chain.getLogFilePath());
+BackupChainLog bchLog = new BackupChainLog(backLog);
+
+// initialize the workspace
+repository.configWorkspace(workspaceEntry);
+
+// run restoration
+backup.restore(bchLog, repositoryEntry, workspaceEntry);</programlisting>
+
+      <section>
+        <title>Restoring into an existing workspace</title>
+
+        <note>
+          <para>These instructions only applies to regular workspace. Special
+          instructions are provided for System workspace below.</para>
+        </note>
+
+        <para>To restore a backup over an existing workspace, you are required
+        to clear its data. Your backup process should follow these steps :
+        <itemizedlist>
+            <listitem>
+              <para>remove workspace<programlisting>ManageableRepository repo = repositoryService.getRepository(repository);
+repo.removeWorkspace(workspace);</programlisting></para>
+            </listitem>
+
+            <listitem>
+              <para>clean database, value storage, index</para>
+            </listitem>
+
+            <listitem>
+              <para>restore (see snippet above)</para>
+            </listitem>
+          </itemizedlist></para>
+      </section>
+
+      <section>
+        <title>System workspace</title>
+
+        <note>
+          <para>The BackupWorkspaceInitializer is available in JCR 1.9 and
+          later.</para>
+        </note>
+
+        <para>Restoring the JCR System workspace requires to shutdown the
+        system and use of a special initializer.</para>
+
+        <para>Follow these steps (this will also work for normal workspaces) :
+        <itemizedlist>
+            <listitem>
+              <para>Stop repository (or portal)</para>
+            </listitem>
+
+            <listitem>
+              <para>clean database, value storage, index; </para>
+            </listitem>
+
+            <listitem>
+              <para>In configuration the workspace set
+              BackupWorkspaceInitializer to reference your backup.</para>
+
+              <para>For example :<programlisting>&lt;workspaces&gt;
+  &lt;workspace name="production" ... &gt;
+    &lt;container class="org.exoplatform.services.jcr.impl.storage.jdbc.JDBCWorkspaceDataContainer"&gt;
+      ...
+    &lt;/container&gt;
+    &lt;initializer class="org.exoplatform.services.jcr.impl.core.BackupWorkspaceInitializer"&gt;
+      &lt;properties&gt;
+         &lt;property name="restore-path" value="D:\java\exo-working\backup\repository_production-20090527_030434"/&gt;
+      &lt;/properties&gt;
+   &lt;/initializer&gt;
+    ...
+&lt;/workspace&gt;</programlisting></para>
+            </listitem>
+
+            <listitem>
+              <para>Start repository (or portal).</para>
+            </listitem>
+          </itemizedlist></para>
+      </section>
+    </section>
+  </section>
+
+  <section>
+    <title>Scheduling (experimental)</title>
+
+    <para>The Backup service has an additional feature that can be useful for
+    a production level backup implementation. When you need to organize a
+    backup of a repository it's necessary to have a tool which will be able to
+    create and manage a cycle of Full and Incremental backups in periodic
+    manner.</para>
+
+    <para>The service has internal BackupScheduler which can run a
+    configurable cycle of BackupChains as if they have been executed by a user
+    during some period of time. I.e. BackupScheduler is a user-like daemon
+    which asks the BackupManager to start or stop backup operations.</para>
+
+    <para>For that purpose BackupScheduler has the method</para>
+
+    <para>BackupScheduler.schedule(backupConfig, startDate, stopDate,
+    chainPeriod, incrementalPeriod)</para>
+
+    <para>where</para>
+
+    <itemizedlist>
+      <listitem>
+        <para>backupConfig - a ready configuration which will be given to the
+        BackupManager.startBackup() method</para>
+      </listitem>
+
+      <listitem>
+        <para>startDate - a date and time of the backup start</para>
+      </listitem>
+
+      <listitem>
+        <para>stopDate - a date and time of the backup stop</para>
+      </listitem>
+
+      <listitem>
+        <para>chainPeriod - a period after which a current BackupChain will be
+        stopped and a new one will be started, in seconds</para>
+      </listitem>
+
+      <listitem>
+        <para>incrementalPeriod - if it is greater than 0 it will be used to
+        override the same value in backupConfig.</para>
+      </listitem>
+    </itemizedlist>
+
+    <programlisting>// geting the scheduler from the BackupManager
+   BackupScheduler scheduler = backup.getScheduler();
+
+// schedule backup using a ready configuration (Full + Incrementals) to run from startTime
+// to stopTime. Full backuop will be performed every 24 hours (BackupChain lifecycle),
+// incremental will rotate result files every 3 hours.
+   scheduler.schedule(config, startTime, stopTime, 3600  * 24, 3600 * 3);
+
+// it's possible to run the scheduler for an uncertain period of time (i.e. without stop time).
+// schedule backup to run from startTime till it will be stopped manually
+// also there, the incremental will rotate result files as it configured in BackupConfig
+   scheduler.schedule(config, startTime, null, 3600 * 24, 0);
+
+// to unschedule backup simply call the scheduler with the configuration describing the 
+// already planned backup cycle.
+// the scheduler will search in internal tasks list for task with repository and
+// workspace name from the configuration and will stop that task.
+   scheduler.unschedule(config);</programlisting>
+
+    <para>When the BackupScheduler starts the scheduling, it uses the internal
+    Timer with startDate for the first (or just once) execution. If
+    chainPeriod is greater than 0 then the task is repeated with this value
+    used as a period starting from startDate. Otherwise the task will be
+    executed once at startDate time. If the scheduler has stopDate it will
+    stop the task ( the chain cycle) after stopDate. And the last parameter
+    incrementalPeriod will be used instead of the same from BackupConfig if
+    its values are greater than 0.</para>
+
+    <para>Starting each task (BackupScheduler.schedule(...)), the scheduler
+    creates a task file in the service working directory (see <emphasis
+    role="bold">Configuration</emphasis>, backup-dir) which describes the task
+    backup configuration and periodic values. These files will be used at the
+    backup service start (JVM start) to reinitialize BackupScheduler for
+    continuous task scheduling. Only tasks that don't have a stopDate or a
+    stopDate not expired will be reinitialized.</para>
+
+    <para>There is one notice about BackupScheduler task reinitialization in
+    the current implementation. It comes from the BackupScheduler nature and
+    its implemented behaviour. As the scheduler is just a virtual user which
+    asks the BackupManager to start or stop backup operations, it isn't able
+    to reinitialize each existing BackupChain before the service (JVM) is
+    stopped. But it's possible to start a new operation with the same
+    configuration via BackupManager (that was configured before and stored in
+    a task file).</para>
+
+    <para>This is a main detail of the BackupScheduler which should be taken
+    into suggestion of a backup operation design now. In case of
+    reinitialization the task will have new time values for the backup
+    operation cycle as the chainPeriod and incrementalPeriod will be applied
+    again. That behaviour may be changed in the future.</para>
+  </section>
+</chapter>

Modified: jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr.xml
===================================================================
--- jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr.xml	2010-08-03 08:16:25 UTC (rev 2857)
+++ jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr.xml	2010-08-03 08:22:26 UTC (rev 2858)
@@ -96,16 +96,25 @@
               xmlns:xi="http://www.w3.org/2001/XInclude" />    
 
   <xi:include href="jcr/searching/searching-repository-content.xml"
-              xmlns:xi="http://www.w3.org/2001/XInclude" />    
-              
-  <!-- protocols -->
-  
+              xmlns:xi="http://www.w3.org/2001/XInclude" />    
+              
+  <!-- protocols -->
+  
   <xi:include href="jcr/protocols/webdav.xml"
-              xmlns:xi="http://www.w3.org/2001/XInclude" />
-
+              xmlns:xi="http://www.w3.org/2001/XInclude" />
+
   <xi:include href="jcr/protocols/ftp.xml"
-              xmlns:xi="http://www.w3.org/2001/XInclude" />              
+              xmlns:xi="http://www.w3.org/2001/XInclude" />              
 
+
+  <!-- backup -->
+  <xi:include href="jcr/backup/exojcr-backup-service.xml"
+              xmlns:xi="http://www.w3.org/2001/XInclude" />    
+  <!--xi:include href="jcr/backup/backup-client.xml"
+              xmlns:xi="http://www.w3.org/2001/XInclude" /-->    
+
+
+
   <!-- other  -->
 
   <xi:include href="jcr/statistics.xml"



More information about the exo-jcr-commits mailing list