[exo-jcr-commits] exo-jcr SVN: r1888 - jcr/trunk/docs/reference/en/src/main/docbook/en-US/modules.
do-not-reply at jboss.org
do-not-reply at jboss.org
Thu Feb 18 06:08:34 EST 2010
Author: sergiykarpenko
Date: 2010-02-18 06:08:34 -0500 (Thu, 18 Feb 2010)
New Revision: 1888
Added:
jcr/trunk/docs/reference/en/src/main/docbook/en-US/modules/external-value-storages.xml
Log:
EXOJCR-490: external-value-storage updated
Added: jcr/trunk/docs/reference/en/src/main/docbook/en-US/modules/external-value-storages.xml
===================================================================
--- jcr/trunk/docs/reference/en/src/main/docbook/en-US/modules/external-value-storages.xml (rev 0)
+++ jcr/trunk/docs/reference/en/src/main/docbook/en-US/modules/external-value-storages.xml 2010-02-18 11:08:34 UTC (rev 1888)
@@ -0,0 +1,247 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
+"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
+<article>
+ <articleinfo>
+ <title>External Value Storages</title>
+ </articleinfo>
+
+ <section>
+ <title>Introduction</title>
+
+ <para>By default JCR Values are stored in the Workspace Data container
+ along with the JCR structure (i.e. Nodes and Properties). eXo JCR offers
+ an additional option of storing JCR Values separately from Workspace Data
+ container, which can be extremely helpful to keep Binary Large Objects
+ (BLOBs) for example (see [TODOBinary values processing link]).</para>
+
+ <para>Value storage configuration is a part of Repository configuration,
+ find more details there [TODO link to repository configuration].</para>
+
+ <para>Tree-based storage is recommended for most of cases. If you run an
+ application on Amazon EC2 - the S3 option may be interesting for
+ architecture. Simple 'flat' storage is good in speed of creation/deletion
+ of values, it might be a compromise for a small storages.</para>
+ </section>
+
+ <section>
+ <title>Tree File Value Storage</title>
+
+ <para>Holds Values in tree-like FileSystem files.
+ <property>path</property> property points to the root directory to store
+ the files.</para>
+
+ <para>This is a recommended type of external storage, it can contain large
+ amount of files limited only by disk/volume free space.</para>
+
+ <para>A disadvantage it's a higher time on Value deletion due to unused
+ tree-nodes remove.</para>
+
+ <programlisting><value-storage id="Storage #1" class="org.exoplatform.services.jcr.impl.storage.value.fs.TreeFileValueStorage">
+ <properties>
+ <property name="path" value="data/values"/>
+ </properties>
+ <filters>
+ <filter property-type="Binary" min-value-size="1M"/>
+ </filters></programlisting>
+
+ <para>Where :<simplelist>
+ <member><parameter>id</parameter> - the value storage unique
+ identifier, used for linking with properties stored in workspace
+ container</member>
+
+ <member><parameter>path</parameter> - a location where value files
+ will be stored</member>
+ </simplelist></para>
+
+ <para>Each file value storage can have the <function>filter(s)</function>
+ for incoming values. A filter can match values by property type
+ (<property>property-type</property>), property name
+ (<property>property-name</property>), ancestor path
+ (<property>ancestor-path</property>) and/or size of values stored
+ (<property>min-value-size</property>, in bytes). In code sample we use a
+ filter with property-type and min-value-size only. I.e. storage for binary
+ values with size greater of 1MB. It's recommended to store properties with
+ large values in file value storage only.</para>
+
+ <para>Another example shows a value storage with different locations for
+ large files (<property>min-value-size</property> a 20Mb-sized filter). A
+ value storage uses ORed logic in the process of filter selection. That
+ means the first filter in the list will be asked first and if not matched
+ the next will be called etc. Here a value matches the 20 MB-sized filter
+ <property>min-value-size</property> and will be stored in the path
+ "data/20Mvalues", all other in "data/values".</para>
+
+ <programlisting><value-storages>
+ <value-storage id="Storage #1" class="org.exoplatform.services.jcr.impl.storage.value.fs.TreeFileValueStorage">
+ <properties>
+ <property name="path" value="data/20Mvalues"/>
+ </properties>
+ <filters>
+ <filter property-type="Binary" min-value-size="20M"/>
+ </filters>
+ <value-storage>
+ <value-storage id="Storage #2" class="org.exoplatform.services.jcr.impl.storage.value.fs.TreeFileValueStorage">
+ <properties>
+ <property name="path" value="data/values"/>
+ </properties>
+ <filters>
+ <filter property-type="Binary" min-value-size="1M"/>
+ </filters>
+ <value-storage>
+<value-storages></programlisting>
+ </section>
+
+ <section>
+ <title>S3 File Value Storage</title>
+
+ <para>Holds Values at Amazon S3 storage. For more about S3 see <ulink
+ url="http://www.amazon.com/S3-AWS-home-page-Money/b/ref=sc_fe_l_2/103-7720231-3235021?ie=UTF8&node=16427261&no=3435361&me=A36L942TSJ2AJA">http://www.amazon.com/S3-AWS-home-page-Money/b/ref=sc_fe_l_2/103-7720231-3235021?ie=UTF8&node=16427261&no=3435361&me=A36L942TSJ2AJA</ulink>.</para>
+
+ <para>This type of storage saves all matching Values on Amazon S3 service.
+ That is very useful for <phrase>cloud computing</phrase> (like Amazon EC2
+ hosted repositories). But can be used in any environment and storages
+ combinations. It's often used in combined with Workspace Simple DB storage
+ [TODO Workspace Simple DB storage].</para>
+
+ <para>It's networked storage with RESTbased access (via HTTP) that makes a
+ footprint on performance of the Repository.</para>
+
+ <programlisting><value-storage id="Storage #1" class="org.exoplatform.services.jcr.impl.storage.value.s3.SimpleS3ValueStorage">
+ <properties>
+ <property name="bucket" value="BUCKET NAME HERE"/>
+ <property name="aws-access-key" value="INSERT YOUR AWS ACCESS KEY ID HERE"/>
+ <property name="aws-secret-access-key" value="INSERT YOUR AWS SECRET ACCESS KEY HERE"/>
+ <property name="s3-swap-directory" value="s3swap_directory_name"/>
+ </properties>
+ <filters>
+ <filter property-type="Binary"/>
+ </filters>
+</value-storage></programlisting>
+ </section>
+
+ <section>
+ <title>Simple File Value Storage</title>
+
+ <note>
+ <para>Not recommended to use in production due to low capacity
+ capabilities on most file systems.</para>
+
+ <para>But if you're sure in your file-system or data amount is small it
+ may be useful for you as haves a faster speed of Value removal.</para>
+ </note>
+
+ <para>Holds Values in flat FileSystem files. <property>path</property>
+ property points to root directory in order to store files</para>
+
+ <programlisting><value-storage id="Storage #1" class="org.exoplatform.services.jcr.impl.storage.value.fs.SimpleFileValueStorage">
+ <properties>
+ <property name="path" value="data/values"/>
+ </properties>
+ <filters>
+ <filter property-type="Binary" min-value-size="1M"/>
+ </filters></programlisting>
+ </section>
+
+ <section>
+ <title>Content Addressable Value storage (CAS) support</title>
+
+ <remark>Available from version 1.9.3</remark>
+
+ <para>eXo JCR supports <phrase>Content-addressable storage</phrase>
+ feature for <phrase>Values</phrase> storing.</para>
+
+ <note>
+ <para>Content-addressable storage, also referred to as associative
+ storage and abbreviated CAS, is a mechanism for storing information that
+ can be retrieved based on its content, not its storage location. It is
+ typically used for high-speed storage and retrieval of fixed content,
+ such as documents stored for compliance with government
+ regulations.</para>
+ </note>
+
+ <para>Content Addressable Value storage stores unique content once.
+ Different properties (values) with same content will be stored as one data
+ file shared between those values. We can tell the Value content will be
+ shared across some Values in storage and will be stored on one physical
+ file.</para>
+
+ <para>Storage size will be decreased for application which governs
+ potentially same data in the content.</para>
+
+ <note>
+ <para>For example: if you have 100 different properties containing the
+ same data (e.g. mail attachment) the storage stores only one single
+ file. The file will be shared with all referencing properties.</para>
+ </note>
+
+ <para>If property Value changes it is stored in an additional file.
+ Alternatively the file is shared with other values, pointing to the same
+ content.</para>
+
+ <para>The storage calculates Value content address each time the property
+ was changed. CAS write operations are much more expensive compared to the
+ non-CAS storages.</para>
+
+ <para>Content address calculation based on java.security.MessageDigest
+ hash computation and tested with <abbrev>MD5</abbrev> and
+ <abbrev>SHA1</abbrev> algorithms.</para>
+
+ <note>
+ <para>CAS storage works most efficiently on data that does not change
+ often. For data that changes frequently, CAS is not as efficient as
+ location-based addressing.</para>
+ </note>
+
+ <para>CAS support can be enabled for <phrase>Tree</phrase> and
+ <phrase>Simple File Value Storage</phrase> types.</para>
+
+ <para>To enable CAS support just configure it in JCR Repositories
+ configuration like we do for other Value Storages.</para>
+
+ <programlisting><workspaces>
+ <workspace name="ws">
+ <container class="org.exoplatform.services.jcr.impl.storage.jdbc.JDBCWorkspaceDataContainer">
+ <properties>
+ <property name="source-name" value="jdbcjcr"/>
+ <property name="dialect" value="oracle"/>
+ <property name="multi-db" value="false"/>
+ <property name="update-storage" value="false"/>
+ <property name="max-buffer-size" value="200k"/>
+ <property name="swap-directory" value="target/temp/swap/ws"/>
+ </properties>
+ <value-storages>
+<!------------------- here ----------------------->
+ <value-storage id="ws" class="org.exoplatform.services.jcr.impl.storage.value.fs.CASableTreeFileValueStorage">
+ <properties>
+ <property name="path" value="target/temp/values/ws"/>
+ <property name="digest-algo" value="MD5"/>
+ <property name="vcas-type" value="org.exoplatform.services.jcr.impl.storage.value.cas.JDBCValueContentAddressStorageImpl"/>
+ <property name="jdbc-source-name" value="jdbcjcr"/>
+ <property name="jdbc-dialect" value="oracle"/>
+ </properties>
+ <filters>
+ <filter property-type="Binary"/>
+ </filters>
+ </value-storage>
+ </value-storages></programlisting>
+
+ <para>Properties:<simplelist>
+ <member><parameter>digest-algo</parameter> - digest hash algorithm
+ (MD5 and SHA1 were tested);</member>
+
+ <member><parameter>vcas-type</parameter> - Value CAS internal data
+ type, JDBC backed is currently implemented
+ org.exoplatform.services.jcr.impl.storage.value.cas.JDBCValueContentAddressStorageImp;l</member>
+
+ <member><parameter>jdbc-source-name</parameter> -
+ JDBCValueContentAddressStorageImpl specific parameter, database will
+ be used to save CAS metadata. It's simple to use same as in workspace
+ container;</member>
+
+ <member><parameter>jdbc-dialect</parameter> -
+ JDBCValueContentAddressStorageImpl specific parameter, database
+ dialect. It's simple to use same as in workspace container;</member>
+ </simplelist></para>
+ </section>
+</article>
More information about the exo-jcr-commits
mailing list