[exo-jcr-commits] exo-jcr SVN: r1888 - jcr/trunk/docs/reference/en/src/main/docbook/en-US/modules.

do-not-reply at jboss.org do-not-reply at jboss.org
Thu Feb 18 06:08:34 EST 2010


Author: sergiykarpenko
Date: 2010-02-18 06:08:34 -0500 (Thu, 18 Feb 2010)
New Revision: 1888

Added:
   jcr/trunk/docs/reference/en/src/main/docbook/en-US/modules/external-value-storages.xml
Log:
EXOJCR-490: external-value-storage updated

Added: jcr/trunk/docs/reference/en/src/main/docbook/en-US/modules/external-value-storages.xml
===================================================================
--- jcr/trunk/docs/reference/en/src/main/docbook/en-US/modules/external-value-storages.xml	                        (rev 0)
+++ jcr/trunk/docs/reference/en/src/main/docbook/en-US/modules/external-value-storages.xml	2010-02-18 11:08:34 UTC (rev 1888)
@@ -0,0 +1,247 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
+"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
+<article>
+  <articleinfo>
+    <title>External Value Storages</title>
+  </articleinfo>
+
+  <section>
+    <title>Introduction</title>
+
+    <para>By default JCR Values are stored in the Workspace Data container
+    along with the JCR structure (i.e. Nodes and Properties). eXo JCR also
+    offers the option of storing JCR Values separately from the Workspace
+    Data container, which is especially useful for Binary Large Objects
+    (BLOBs) (see [TODO Binary values processing link]).</para>
+
+    <para>Value storage configuration is a part of the Repository
+    configuration; see [TODO link to repository configuration] for details.</para>
+
+    <para>Tree-based storage is recommended for most cases. If you run an
+    application on Amazon EC2, the S3 option may suit your architecture.
+    Simple 'flat' storage creates and deletes values quickly, so it can be a
+    reasonable compromise for small storages.</para>
+  </section>
+
+  <section>
+    <title>Tree File Value Storage</title>
+
+    <para>Holds Values in a tree-like structure of file system files. The
+    <property>path</property> property points to the root directory where
+    the files are stored.</para>
+
+    <para>This is the recommended type of external storage. It can contain a
+    large number of files, limited only by the free space on the disk or volume.</para>
+
+    <para>Its disadvantage is slower Value deletion, because unused tree
+    nodes have to be removed as well.</para>
+
+    <programlisting>&lt;value-storage id="Storage #1" class="org.exoplatform.services.jcr.impl.storage.value.fs.TreeFileValueStorage"&gt;
+     &lt;properties&gt;
+       &lt;property name="path" value="data/values"/&gt;
+     &lt;/properties&gt;
+     &lt;filters&gt;
+       &lt;filter property-type="Binary" min-value-size="1M"/&gt;
+     &lt;/filters&gt;
+&lt;/value-storage&gt;</programlisting>
+
+    <para>Where:<simplelist>
+        <member><parameter>id</parameter> - the value storage unique
+        identifier, used for linking with properties stored in workspace
+        container</member>
+
+        <member><parameter>path</parameter> - a location where value files
+        will be stored</member>
+      </simplelist></para>
+
+    <para>Each file value storage can have <function>filter(s)</function>
+    for incoming values. A filter can match values by property type
+    (<property>property-type</property>), property name
+    (<property>property-name</property>), ancestor path
+    (<property>ancestor-path</property>) and/or size of the stored values
+    (<property>min-value-size</property>, in bytes). The code sample uses a
+    filter with property-type and min-value-size only, i.e. a storage for
+    binary values larger than 1 MB. It is recommended to use file value
+    storage only for properties with large values.</para>
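+
+    <para>As a minimal sketch of the other filter attributes named above,
+    the fragment below selects values by property name and by ancestor path.
+    The property name <literal>jcr:data</literal> and the path
+    <literal>/documents</literal> are hypothetical values chosen for
+    illustration, and the exact path syntax accepted by
+    <property>ancestor-path</property> is an assumption; only the attribute
+    names come from the description above.</para>
+
+    <programlisting>&lt;filters&gt;
+  &lt;!-- hypothetical: match values of properties named jcr:data --&gt;
+  &lt;filter property-name="jcr:data"/&gt;
+  &lt;!-- hypothetical: match Binary values located below /documents --&gt;
+  &lt;filter property-type="Binary" ancestor-path="/documents"/&gt;
+&lt;/filters&gt;</programlisting>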
+
+    <para>Another example shows value storages with a separate location for
+    large files (a filter with <property>min-value-size</property> of 20 MB).
+    Filter selection uses ORed logic: the first filter in the list is checked
+    first, and if it does not match, the next one is checked, and so on.
+    Here a value that matches the 20 MB <property>min-value-size</property>
+    filter is stored in the path "data/20Mvalues", and other values that
+    match the 1 MB filter are stored in "data/values".</para>
+
+    <programlisting>&lt;value-storages&gt;
+  &lt;value-storage id="Storage #1" class="org.exoplatform.services.jcr.impl.storage.value.fs.TreeFileValueStorage"&gt;
+    &lt;properties&gt;
+      &lt;property name="path" value="data/20Mvalues"/&gt;
+    &lt;/properties&gt;
+    &lt;filters&gt;
+      &lt;filter property-type="Binary" min-value-size="20M"/&gt;
+    &lt;/filters&gt;
+  &lt;/value-storage&gt;
+  &lt;value-storage id="Storage #2" class="org.exoplatform.services.jcr.impl.storage.value.fs.TreeFileValueStorage"&gt;
+    &lt;properties&gt;
+      &lt;property name="path" value="data/values"/&gt;
+    &lt;/properties&gt;
+    &lt;filters&gt;
+      &lt;filter property-type="Binary" min-value-size="1M"/&gt;
+    &lt;/filters&gt;
+  &lt;/value-storage&gt;
+&lt;/value-storages&gt;</programlisting>
+  </section>
+
+  <section>
+    <title>S3 File Value Storage</title>
+
+    <para>Holds Values in Amazon S3 storage. For more about S3, see <ulink
+    url="http://www.amazon.com/S3-AWS-home-page-Money/b/ref=sc_fe_l_2/103-7720231-3235021?ie=UTF8&amp;node=16427261&amp;no=3435361&amp;me=A36L942TSJ2AJA">http://www.amazon.com/S3-AWS-home-page-Money/b/ref=sc_fe_l_2/103-7720231-3235021?ie=UTF8&amp;node=16427261&amp;no=3435361&amp;me=A36L942TSJ2AJA</ulink>.</para>
+
+    <para>This type of storage saves all matching Values on the Amazon S3
+    service. It is very useful for <phrase>cloud computing</phrase> (such as
+    Amazon EC2 hosted repositories), but it can be used in any environment and
+    in any combination of storages. It is often combined with Workspace
+    Simple DB storage [TODO Workspace Simple DB storage].</para>
+
+    <para>It is a networked storage with REST-based access (via HTTP), which
+    has an impact on the performance of the Repository.</para>
+
+    <programlisting>&lt;value-storage id="Storage #1" class="org.exoplatform.services.jcr.impl.storage.value.s3.SimpleS3ValueStorage"&gt;
+    &lt;properties&gt;
+      &lt;property name="bucket" value="BUCKET NAME HERE"/&gt;
+      &lt;property name="aws-access-key" value="INSERT YOUR AWS ACCESS KEY ID HERE"/&gt;
+      &lt;property name="aws-secret-access-key" value="INSERT YOUR AWS SECRET ACCESS KEY HERE"/&gt;
+      &lt;property name="s3-swap-directory" value="s3swap_directory_name"/&gt;
+    &lt;/properties&gt;
+    &lt;filters&gt;
+      &lt;filter property-type="Binary"/&gt;
+    &lt;/filters&gt;
+&lt;/value-storage&gt;</programlisting>
+  </section>
+
+  <section>
+    <title>Simple File Value Storage</title>
+
+    <note>
+      <para>Not recommended for use in production due to its low capacity
+      on most file systems.</para>
+
+      <para>However, if you are confident in your file system or the amount
+      of data is small, it may be useful because Value removal is faster.</para>
+    </note>
+
+    <para>Holds Values in flat file system files. The
+    <property>path</property> property points to the root directory where
+    the files are stored.</para>
+
+    <programlisting>&lt;value-storage id="Storage #1" class="org.exoplatform.services.jcr.impl.storage.value.fs.SimpleFileValueStorage"&gt;
+     &lt;properties&gt;
+       &lt;property name="path" value="data/values"/&gt;
+     &lt;/properties&gt;
+     &lt;filters&gt;
+       &lt;filter property-type="Binary" min-value-size="1M"/&gt;
+     &lt;/filters&gt;
+&lt;/value-storage&gt;</programlisting>
+  </section>
+
+  <section>
+    <title>Content Addressable Value storage (CAS) support</title>
+
+    <remark>Available from version 1.9.3</remark>
+
+    <para>eXo JCR supports the <phrase>Content-addressable storage</phrase>
+    feature for storing <phrase>Values</phrase>.</para>
+
+    <note>
+      <para>Content-addressable storage, also referred to as associative
+      storage and abbreviated CAS, is a mechanism for storing information that
+      can be retrieved based on its content, not its storage location. It is
+      typically used for high-speed storage and retrieval of fixed content,
+      such as documents stored for compliance with government
+      regulations.</para>
+    </note>
+
+    <para>Content Addressable Value storage stores unique content once.
+    Different properties (values) with the same content are stored as one
+    data file shared between those values. In other words, the Value content
+    is shared across several Values in the storage and kept in one physical
+    file.</para>
+
+    <para>Storage size decreases for applications that manage potentially
+    identical data.</para>
+
+    <note>
+      <para>For example: if you have 100 different properties containing the
+      same data (e.g. a mail attachment), the storage keeps only a single
+      file. The file is shared by all referencing properties.</para>
+    </note>
+
+    <para>If a property Value changes, it is stored in an additional file,
+    or, if a file with the same content already exists, that file is shared
+    with the other values pointing to the same content.</para>
+
+    <para>The storage calculates the Value content address each time the
+    property is changed, so CAS write operations are much more expensive
+    than in non-CAS storages.</para>
+
+    <para>Content address calculation is based on
+    java.security.MessageDigest hash computation and has been tested with
+    the <abbrev>MD5</abbrev> and <abbrev>SHA1</abbrev> algorithms.</para>
+
+    <note>
+      <para>CAS storage works most efficiently on data that does not change
+      often. For data that changes frequently, CAS is not as efficient as
+      location-based addressing.</para>
+    </note>
+
+    <para>CAS support can be enabled for <phrase>Tree</phrase> and
+    <phrase>Simple File Value Storage</phrase> types.</para>
+
+    <para>To enable CAS support, configure it in the JCR Repositories
+    configuration in the same way as other Value Storages.</para>
+
+    <programlisting>&lt;workspaces&gt;
+        &lt;workspace name="ws"&gt;
+          &lt;container class="org.exoplatform.services.jcr.impl.storage.jdbc.JDBCWorkspaceDataContainer"&gt;
+            &lt;properties&gt;
+              &lt;property name="source-name" value="jdbcjcr"/&gt;
+              &lt;property name="dialect" value="oracle"/&gt;
+              &lt;property name="multi-db" value="false"/&gt;
+              &lt;property name="update-storage" value="false"/&gt;
+              &lt;property name="max-buffer-size" value="200k"/&gt;
+              &lt;property name="swap-directory" value="target/temp/swap/ws"/&gt;
+            &lt;/properties&gt;
+            &lt;value-storages&gt;
+              &lt;!-- the CAS-enabled value storage is configured here --&gt;
+              &lt;value-storage id="ws" class="org.exoplatform.services.jcr.impl.storage.value.fs.CASableTreeFileValueStorage"&gt;
+                &lt;properties&gt;
+                  &lt;property name="path" value="target/temp/values/ws"/&gt;
+                  &lt;property name="digest-algo" value="MD5"/&gt;
+                  &lt;property name="vcas-type" value="org.exoplatform.services.jcr.impl.storage.value.cas.JDBCValueContentAddressStorageImpl"/&gt;
+                  &lt;property name="jdbc-source-name" value="jdbcjcr"/&gt;
+                  &lt;property name="jdbc-dialect" value="oracle"/&gt;
+                &lt;/properties&gt;
+                &lt;filters&gt;
+                  &lt;filter property-type="Binary"/&gt;
+                &lt;/filters&gt;
+              &lt;/value-storage&gt;
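+            &lt;/value-storages&gt;
+          &lt;/container&gt;
+          &lt;!-- remaining workspace configuration omitted --&gt;
+        &lt;/workspace&gt;
+&lt;/workspaces&gt;</programlisting>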
+
+    <para>Properties:<simplelist>
+        <member><parameter>digest-algo</parameter> - digest hash algorithm
+        (MD5 and SHA1 were tested);</member>
+
+        <member><parameter>vcas-type</parameter> - Value CAS internal data
+        type; a JDBC-backed implementation is currently available:
+        org.exoplatform.services.jcr.impl.storage.value.cas.JDBCValueContentAddressStorageImpl;</member>
+
+        <member><parameter>jdbc-source-name</parameter> - a parameter
+        specific to JDBCValueContentAddressStorageImpl: the data source of the
+        database used to save CAS metadata. The simplest choice is to use
+        the same one as in the workspace container;</member>
+
+        <member><parameter>jdbc-dialect</parameter> - a parameter specific to
+        JDBCValueContentAddressStorageImpl: the database dialect. The simplest
+        choice is to use the same one as in the workspace container.</member>
+      </simplelist></para>
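+
+    <para>As a sketch of the <parameter>digest-algo</parameter> parameter,
+    the value storage from the example above could be switched to SHA1, the
+    other tested algorithm. The spelling <literal>SHA1</literal> of the value
+    is an assumption taken from the algorithm name used in this article; all
+    other values are copied from the example above.</para>
+
+    <programlisting>&lt;value-storage id="ws" class="org.exoplatform.services.jcr.impl.storage.value.fs.CASableTreeFileValueStorage"&gt;
+  &lt;properties&gt;
+    &lt;property name="path" value="target/temp/values/ws"/&gt;
+    &lt;!-- assumed spelling of the algorithm name --&gt;
+    &lt;property name="digest-algo" value="SHA1"/&gt;
+    &lt;property name="vcas-type" value="org.exoplatform.services.jcr.impl.storage.value.cas.JDBCValueContentAddressStorageImpl"/&gt;
+    &lt;property name="jdbc-source-name" value="jdbcjcr"/&gt;
+    &lt;property name="jdbc-dialect" value="oracle"/&gt;
+  &lt;/properties&gt;
+  &lt;filters&gt;
+    &lt;filter property-type="Binary"/&gt;
+  &lt;/filters&gt;
+&lt;/value-storage&gt;</programlisting>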
+  </section>
+</article>


