Author: jaredmorgs
Date: 2013-05-08 03:02:26 -0400 (Wed, 08 May 2013)
New Revision: 9275
Added:
epp/docs/JPP/trunk/Development_Guide/en-US/eXo_JCR.xml
Modified:
epp/docs/JPP/trunk/Development_Guide/en-US/Development_Guide.ent
epp/docs/JPP/trunk/Development_Guide/en-US/Development_Guide.xml
epp/docs/JPP/trunk/Development_Guide/en-US/Load_Groups.xml
epp/docs/JPP/trunk/Development_Guide/en-US/Localization.xml
epp/docs/JPP/trunk/Development_Guide/en-US/Revision_History.xml
epp/docs/JPP/trunk/Development_Guide/en-US/The_eXo_Kernel.xml
Log:
Minor changes to ensure the book would build in Brew
Modified: epp/docs/JPP/trunk/Development_Guide/en-US/Development_Guide.ent
===================================================================
--- epp/docs/JPP/trunk/Development_Guide/en-US/Development_Guide.ent 2013-05-08 05:35:47
UTC (rev 9274)
+++ epp/docs/JPP/trunk/Development_Guide/en-US/Development_Guide.ent 2013-05-08 07:02:26
UTC (rev 9275)
@@ -17,3 +17,4 @@
<!ENTITY VX "6">
<!ENTITY VY "6.1">
<!ENTITY VZ "6.1.0">
+<!ENTITY JCR_VERSION "1.15.3">
Modified: epp/docs/JPP/trunk/Development_Guide/en-US/Development_Guide.xml
===================================================================
--- epp/docs/JPP/trunk/Development_Guide/en-US/Development_Guide.xml 2013-05-08 05:35:47
UTC (rev 9274)
+++ epp/docs/JPP/trunk/Development_Guide/en-US/Development_Guide.xml 2013-05-08 07:02:26
UTC (rev 9275)
@@ -62,5638 +62,9 @@
<part>
<title>Advanced Development Concepts</title>
<xi:include
xmlns:xi="http://www.w3.org/2001/XInclude"
href="The_eXo_Kernel.xml"/>
+ <xi:include
xmlns:xi="http://www.w3.org/2001/XInclude"
href="eXo_JCR.xml"/>
+ <xi:include
xmlns:xi="http://www.w3.org/2001/XInclude"
href="appendix-Quickstarts.xml"/>
</part>
- <appendix>
- <title>eXo JCR</title>
- <section xmlns=""
id="chap-Reference_Guide-eXoJCR-Introduction">
- <title>Introduction and Scope</title>
- <warning>
- <title>eXo JCR usage</title>
- <para>
- JBoss Portal Platform uses a JCR API to store its information for
internal use. Using the JCR to store application information is not supported.
- </para>
- <para>
- The information below is intended to help users understand particular
low-level details of how JBoss Portal Platform works and how it can be fine-tuned.
- </para>
- </warning>
- <para>
- The term <emphasis role="bold">JCR</emphasis> refers to the
Java Content Repository. The JCR is the data store of JBoss Portal Platform. All content
is stored and managed via the JCR.
- </para>
- <para>
- The eXo JCR included with JBoss Portal Platform &VY; is a (<ulink
url="http://www.jcp.org/en/jsr/detail?id=170"
type="http">JSR-170</ulink>) compliant implementation of the JCR 1.0
specification. The JCR provides versioning, textual search, access control, content event
monitoring, and is used to store text and binary data for the portal's internal use. The
back-end storage of the JCR is configurable and can be a file system or a database.
- </para>
- <section id="sect-Reference_Guide-Introduction-Concepts">
- <title>Concepts</title>
- <variablelist>
- <varlistentry>
- <term>Repository</term>
- <listitem>
- <para>
- A repository is a form of data storage device. A
'repository' differs from a 'database' in the nature
of the information contained. While a database holds hard data in rigid tables, a
repository may access the data on a database by using less rigid
<emphasis>meta</emphasis>-data. In this sense a repository operates as an
'interpreter' between the database(s) and the user.
- </para>
- <note>
- <para>
- The data model for the interface (the repository) is rarely
the same as the data model used by the repository's underlying storage subsystems
(such as a database), however the repository is able to make persistent data changes in
the storage subsystem.
- </para>
- </note>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>Workspace</term>
- <listitem>
- <para>
- The eXo JCR uses 'workspaces' as the main data
abstraction in its data model. The content is stored in a workspace as a hierarchy of
<emphasis>items</emphasis> and each workspace has its own hierarchy of items.
- </para>
- <para>
- Repositories access one or more workspaces. Persistent JCR
workspaces consist of a directed acyclic graph of <emphasis>items</emphasis>
where the edges represent the parent-child relation.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>Items</term>
- <listitem>
- <para>
- An <emphasis>item</emphasis> is either a
<emphasis>node</emphasis> or a <emphasis>property</emphasis>.
Properties contain the data (either simple values or binary data). The nodes of a
workspace give it its structure while the properties hold the data itself.
- </para>
- <variablelist>
- <varlistentry>
- <term>Nodes</term>
- <listitem>
- <para>
- Nodes are identified using accepted
<emphasis>namespacing</emphasis> conventions. Changed nodes may be versioned
through an associated version graph to preserve data integrity.
- </para>
- <para>
- Nodes can have various properties or child nodes
associated to them.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>Properties</term>
- <listitem>
- <para>
- Properties hold data as values of predefined types,
such as: <emphasis role="bold">String</emphasis>, <emphasis
role="bold">Binary</emphasis>, <emphasis
role="bold">Long</emphasis>, <emphasis
role="bold">Boolean</emphasis>, <emphasis
role="bold">Double</emphasis>, <emphasis
role="bold">Date</emphasis>, <emphasis
role="bold">Reference</emphasis> and <emphasis
role="bold">Path</emphasis>.
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>The Data Model</term>
- <listitem>
- <para>
- The core of any Content Repository is the data model. The data
model defines the 'data elements' (fields, columns, attributes, etc.)
that are stored in the CR and the relationships between these elements.
- </para>
- <para>
- Data elements can be singular pieces of information (the value
3.14, for example), or compound values
('<emphasis>pi</emphasis>' = 3.14). A data model uses
concepts like 'nodes', 'arrays' and
'links' to define relationships between data elements.
- </para>
- <para>
- The use and structure of these elements forms the content
repository's 'data model'.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>Data Abstraction</term>
- <listitem>
- <para>
- Data abstraction describes the separation between
<emphasis>abstract</emphasis> and <emphasis>concrete</emphasis>
properties of data stored in a repository. The <emphasis>concrete</emphasis>
properties of the data refer to its implementation details.
- </para>
- <para>
- The <emphasis>concrete</emphasis> properties of the
data implementation may be changed without affecting the
<emphasis>abstract</emphasis> properties of the data itself, which are read by
the data client.
- </para>
- <para>
- Consider the presentation of data in a list, graph or table.
While the information <emphasis>implementation</emphasis> may change, the data
itself is unaffected, and readers to whom the data is presented can perform a mental
abstraction to interpret it correctly, regardless of the implementation.
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
-<!-- Commented until image can be redrawn for RedHat.
-<mediaobject>
-<imageobject role="html">
-<imagedata fileref="images/Advanced/JCR/repository_diagram.png"
format="PNG" align="center" scale="90" />
-</imageobject>
-<imageobject role="fo">
-<imagedata fileref="images/Advanced/JCR/repository_diagram.png"
format="PNG" align="center" contentwidth="150mm" />
-</imageobject>
-</mediaobject>
-<para>
-The above diagram depicts a repository R with workspaces W0, W1 and W2. The item graph
of W0 contains a root node with child nodes A, B and C. A has a property D of type
STRING and a child node E, which in turn has a property I of type BINARY. B has the
properties F (a LONG) and G (a BOOLEAN). C has a property H of type DOUBLE.
-</para>
-<warning>
-<title>DOC TODO: Placeholders</title>
-<para>
-The above diagram is being redrawn for RedHat. The explanatory note needs to be rewritten
to avoid possible licensing issues.
-</para>
-</warning>
-Metadata:
-Source URL:
http://www.day.com/specs/jcr/2.0/3_Repository_Model.html
-Source Author: Day Management AG
-Source Author email:
-Source License:
http://www.day.com/specs/jcr/2.0/license.html] -->
</section>
- </section>
- <section xmlns=""
id="chap-Reference_Guide-Implementation">
- <title>Implementation</title>
- <para>
- The relationships between the eXo Repository Service components are illustrated below:
- </para>
- <figure>
- <title id="exojcr">Exo JCR</title>
- <mediaobject>
- <imageobject>
- <imagedata fileref="images/eXoJCR/concepts/exojcr.gif"/>
- </imageobject>
- </mediaobject>
- </figure>
- <variablelist
id="vari-Reference_Guide-Implementation-Definitions">
- <title>Definitions</title>
- <varlistentry>
- <term>eXo Container:</term>
- <listitem>
- <para>
- A subclass of
<parameter>org.exoplatform.container.ExoContainer</parameter>
(<parameter>org.exoplatform.container.PortalContainer</parameter>) holds a
reference to the Repository Service.
- </para>
- <variablelist>
- <title/>
- <varlistentry>
- <term>Repository Service</term>
- <listitem>
- <para>
- This contains information about repositories. eXo JCR is able to manage many
Repositories.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>Repository</term>
- <listitem>
- <para>
- An implementation of <literal>javax.jcr.Repository</literal>. It
holds references to one or more Workspace(s).
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>Workspace</term>
- <listitem>
- <para>
- Container of a single rooted tree of Items.
- </para>
- <note>
- <title>Note:</title>
- <para>
- This is not exactly the same as
<literal>javax.jcr.Workspace</literal>, because it is not a per-session object.
- </para>
- </note>
- </listitem>
- </varlistentry>
- </variablelist>
- </listitem>
- </varlistentry>
- </variablelist>
- <para>
- The usual JCR application use case includes two initial steps:
- </para>
- <procedure>
- <step>
- <para>
- Obtain the Repository object by getting the <emphasis
role="bold">Repository Service</emphasis> via a JNDI lookup, if the eXo
repository is bound to the naming context (see <xref
linkend="chap-Reference_Guide-JCR_configuration"/> for details).
- </para>
- </step>
- <step>
- <para>
- Create a <parameter>javax.jcr.Session</parameter> object by calling
<parameter>Repository.login(..)</parameter>.
- </para>
- </step>
- </procedure>
- <section
id="sect-Reference_Guide-Implementation-Workspace_Data_Model">
- <title>Workspace Data Model</title>
- <para>
- The following diagram explains which components of an eXo JCR implementation are used
in the data flow to perform operations specified in the JCR API.
- </para>
- <figure>
- <title id="wsdatamodel">Workspace Data Model</title>
- <mediaobject>
- <imageobject>
- <imagedata width="444"
fileref="images/eXoJCR/concepts/wsdatamodel.gif"/>
- </imageobject>
- </mediaobject>
- </figure>
- <para>
- The Workspace Data Model can be split into four levels by data isolation and value
from the JCR model point of view.
- </para>
- <itemizedlist>
- <listitem>
- <para>
- The eXo JCR core implements <emphasis role="bold">JCR
API</emphasis> interfaces, such as Item, Node, and Property. It contains the JCR
"<emphasis>logical</emphasis>" view of the stored data.
- </para>
- </listitem>
- <listitem>
- <para>
- <emphasis role="bold">Session Level</emphasis>: isolates
transient data viewable inside one JCR Session and interacts with API level using eXo JCR
internal API.
- </para>
- </listitem>
- <listitem>
- <para>
- <emphasis role="bold">Session Data Manager</emphasis>:
maintains transient session data. Along with data access, modification, and validation logic, it
contains the Modified Items Storage, which holds the data changed between subsequent save() calls,
and the Session Items Cache.
- </para>
- </listitem>
- <listitem>
- <para>
- <emphasis role="bold">Transaction Data Manager</emphasis>:
maintains session data between save() and transaction commit/ rollback if the current
session is part of a transaction.
- </para>
- </listitem>
- <listitem>
- <para>
- <emphasis role="bold">Workspace Level</emphasis>: operates on
data shared within a particular workspace. It contains per-workspace objects.
- </para>
- </listitem>
- <listitem>
- <para>
- <emphasis role="bold">Workspace Storage Data
Manager:</emphasis> maintains workspace data, including final validation, event
firing, and caching.
- </para>
- </listitem>
- <listitem>
- <para>
- <emphasis role="bold">Workspace Data Container</emphasis>:
implements physical data storage. It allows different types of back end (such as a relational database or
file system) to be used as storage for JCR data. In addition to the main Data Container, other
storages for persisted Property Values can be configured and used.
- </para>
- </listitem>
- <listitem>
- <para>
- <emphasis role="bold">Indexer:</emphasis> maintains workspace
data indexing for further queries.
- </para>
- </listitem>
- <listitem>
- <para>
- <emphasis role="bold">Storage Level</emphasis>: Persistent
storages for:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- JCR Data
- </para>
- </listitem>
- <listitem>
- <para>
- Indexes (Apache Lucene)
- </para>
- </listitem>
- <listitem>
- <para>
- Values (e.g., for BLOBs) if different from the main Data Container
- </para>
- </listitem>
- </itemizedlist>
- </listitem>
- </itemizedlist>
- </section>
- </section>
- <section xmlns=""
id="chap-Reference_Guide-JCR_configuration">
- <title>JCR configuration</title>
- <para>
- The JCR configuration is defined in an XML file which is constructed as per the
DTD below:
- </para>
- <programlisting language="XML"
role="XML"><!ELEMENT repository-service (repositories)>
-<!ATTLIST repository-service default-repository NMTOKEN #REQUIRED>
-<!ELEMENT repositories (repository)>
-<!ELEMENT repository
(security-domain,access-control,session-max-age,authentication-policy,workspaces)>
-<!ATTLIST repository
- default-workspace NMTOKEN #REQUIRED
- name NMTOKEN #REQUIRED
- system-workspace NMTOKEN #REQUIRED
->
-<!ELEMENT security-domain (#PCDATA)>
-<!ELEMENT access-control (#PCDATA)>
-<!ELEMENT session-max-age (#PCDATA)>
-<!ELEMENT authentication-policy (#PCDATA)>
-<!ELEMENT workspaces (workspace+)>
-<!ELEMENT workspace (container,initializer,cache,query-handler)>
-<!ATTLIST workspace name NMTOKEN #REQUIRED>
-<!ELEMENT container (properties,value-storages)>
-<!ATTLIST container class NMTOKEN #REQUIRED>
-<!ELEMENT value-storages (value-storage+)>
-<!ELEMENT value-storage (properties,filters)>
-<!ATTLIST value-storage class NMTOKEN #REQUIRED>
-<!ELEMENT filters (filter+)>
-<!ELEMENT filter EMPTY>
-<!ATTLIST filter property-type NMTOKEN #REQUIRED>
-<!ELEMENT initializer (properties)>
-<!ATTLIST initializer class NMTOKEN #REQUIRED>
-<!ELEMENT cache (properties)>
-<!ATTLIST cache
- enabled NMTOKEN #REQUIRED
- class NMTOKEN #REQUIRED
->
-<!ELEMENT query-handler (properties)>
-<!ATTLIST query-handler class NMTOKEN #REQUIRED>
-<!ELEMENT access-manager (properties)>
-<!ATTLIST access-manager class NMTOKEN #REQUIRED>
-<!ELEMENT lock-manager (time-out,persister)>
-<!ELEMENT time-out (#PCDATA)>
-<!ELEMENT persister (properties)>
-<!ELEMENT properties (property+)>
-<!ELEMENT property EMPTY></programlisting>
- <section
id="sect-Reference_Guide-JCR_configuration-Portal_configuration">
- <title><remark>BZ#812412 </remark>Portal
configuration</title>
- <para>
- JCR services are registered in the Portal container.
- </para>
- <remark>NEEDINFO - FILE PATHS - The path needs to be updated with the
equivalent path for JBoss Portal Platform instead of gatein, please see below para. New
info required?</remark>
- <para>
- Below is an example configuration from the
<filename><replaceable>JPP_DIST</replaceable>/gatein/gatein.ear/portal.war/WEB-INF/conf/jcr/jcr-configuration.xml</filename>
file.
- </para>
- <programlisting language="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude"
href="extras/jcr-configuration.xml"
parse="text"/></programlisting>
- <section
id="sect-Reference_Guide-Portal_configuration-JCR_Configuration">
- <title>JCR Configuration</title>
- <para>
- The JCR Service can use multiple
<emphasis>Repositories</emphasis> and each repository can have multiple
<emphasis>Workspaces</emphasis>.
- <remark>NEEDINFO - FILE PATHS - The path needs to be updated with the
equivalent path for JBoss Portal Platform instead of gatein, please see below para. New
info required?</remark> </para>
- <para>
- Configure the workspaces by locating the workspace you need to modify in
<filename><replaceable>JPP_DIST</replaceable>/gatein/gatein.ear/portal.war/WEB-INF/conf/jcr/repository-configuration.xml</filename>.
- </para>
- <para>
- The repository configuration supports human-readable values. They are not
case-sensitive.
- </para>
- <para>
- Complete the appropriate element fields using the following value
formats:
- </para>
- <variablelist>
- <varlistentry>
- <term>Number formats:</term>
- <listitem>
- <para>
- <itemizedlist>
- <listitem>
- <para>
- <emphasis
role="bold">K</emphasis> or <emphasis
role="bold">KB</emphasis> for kilobytes.
- </para>
- </listitem>
- <listitem>
- <para>
- <emphasis
role="bold">M</emphasis> or <emphasis
role="bold">MB</emphasis> for megabytes.
- </para>
- </listitem>
- <listitem>
- <para>
- <emphasis
role="bold">G</emphasis> or <emphasis
role="bold">GB</emphasis> for gigabytes.
- </para>
- </listitem>
- <listitem>
- <para>
- <emphasis
role="bold">T</emphasis> or <emphasis
role="bold">TB</emphasis> for terabytes.
- </para>
- </listitem>
- <listitem>
- <para>
- Examples: 200K or 200KB; 4M or 4MB; 1.4G or
1.4GB; 10T or 10TB.
- </para>
- </listitem>
- </itemizedlist>
-
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>Time formats:</term>
- <listitem>
- <para>
- <itemizedlist>
- <listitem>
- <para>
- <emphasis
role="bold">ms</emphasis> for milliseconds.
- </para>
- </listitem>
- <listitem>
- <para>
- <emphasis
role="bold">s</emphasis> for seconds.
- </para>
- </listitem>
- <listitem>
- <para>
- <emphasis
role="bold">m</emphasis> for minutes.
- </para>
- </listitem>
- <listitem>
- <para>
- <emphasis
role="bold">h</emphasis> for hours.
- </para>
- </listitem>
- <listitem>
- <para>
- <emphasis
role="bold">d</emphasis> for days.
- </para>
- </listitem>
- <listitem>
- <para>
- <emphasis
role="bold">w</emphasis> for weeks.
- </para>
- </listitem>
- <listitem>
- <para>
- The default time format is seconds if no other
format is specified.
- </para>
- </listitem>
- <listitem>
- <para>
- Examples: 500ms or 500 milliseconds; 20, 20s or
20 seconds; 30m or 30 minutes; 12h or 12 hours; 5d or 5 days; 4w or 4 weeks.
- </para>
- </listitem>
- </itemizedlist>
-
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
- </section>
- <section
id="sect-Reference_Guide-Portal_configuration-Repository_service_configuration_JCR_repositories_configuration">
- <title>Repository service configuration (JCR repositories
configuration)</title>
- <programlisting language="XML"
role="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude"
href="../../../../extras/Advanced_Development_JCR_Configuration/orig.xml"
parse="text"/></programlisting>
- <note>
- <title>
- <parameter> session-max-age </parameter>
- </title>
- <para>
- <emphasis>session-max-age</emphasis>: This parameter is
not shown in the example file above because it is not a required setting. It sets the time
after which an idle session is removed (logged out). If it is not set, an idle
session is never removed.
- </para>
- </note>
- <para>
- <emphasis
role="bold">lock-remover-max-threads</emphasis>: Number of threads that
can serve LockRemover tasks. The default value is 1. A repository may have many workspaces, and each
workspace has its own LockManager. JCR supports locks with a defined lifetime. Such a lock must
be removed once it has expired; this is what a LockRemover does. A LockRemover is not
an independent timer thread; it is a task that is executed every 30 seconds. The task is
served by a ThreadPoolExecutor, which may use a different number of threads.
- </para>
- </section>
- <section
id="sect-Reference_Guide-Portal_configuration-Workspace_configuration">
- <title>Workspace configuration:</title>
- <variablelist>
- <title/>
- <varlistentry>
- <term>name</term>
- <listitem>
- <para>
- The name of a workspace.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>auto-init-root-nodetype</term>
- <listitem>
- <para>
- DEPRECATED in JCR 1.9 (use initializer). The node
type for root node initialization.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>container</term>
- <listitem>
- <para>
- Workspace data container (physical storage)
configuration.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>initializer</term>
- <listitem>
- <para>
- Workspace initializer configuration.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>cache</term>
- <listitem>
- <para>
- Workspace storage cache configuration.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>query-handler</term>
- <listitem>
- <para>
- Query handler configuration.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>auto-init-permissions</term>
- <listitem>
- <para>
- DEPRECATED in JCR 1.9 (use initializer). Default
permissions of the root node. It is defined as a set of semicolon-delimited permissions
containing a group of space-delimited identities and the type of permission.
- </para>
- <para>
- For example, <literal>any read; *:/admin read; *:/admin add_node;
*:/admin set_property; *:/admin remove</literal> means that users from group
<literal>admin</literal> have all permissions and other users have only a
'read' permission.
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
- </section>
- <section
id="sect-Reference_Guide-Portal_configuration-Workspace_data_container_configuration">
- <title>Workspace data container configuration:</title>
- <variablelist>
- <title/>
- <varlistentry>
- <term>class</term>
- <listitem>
- <para>
- A workspace data container class name.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>properties</term>
- <listitem>
- <para>
- The list of properties (name-value pairs) for the
concrete Workspace data container.
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
- <table
id="tabl-Reference_Guide-Workspace_data_container_configuration-Parameter_Descriptions">
- <title>Parameter Descriptions</title>
- <tgroup cols="2">
- <thead>
- <row>
- <entry> Parameter </entry>
- <entry> Description </entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry> trigger_events_for_descendents_on_rename </entry>
- <entry> Indicates whether events are triggered for descendants on
rename. Not triggering them can improve the performance of rename operations, but Observation is
then not notified. The default value is true. </entry>
- </row>
- <row>
- <entry> lazy-node-iterator-page-size </entry>
- <entry> The page size for the lazy iterator. Indicates how many nodes
can be retrieved from storage per request. The default value is 100. </entry>
- </row>
- <row>
- <entry> acl-bloomfilter-false-positive-probability
</entry>
- <entry> ACL Bloom-filter desired false positive probability.
Range [0..1]. Default value 0.1d. (See the note below) </entry>
- </row>
- <row>
- <entry> acl-bloomfilter-elements-number </entry>
- <entry> Expected number of ACL-elements in the Bloom-filter.
Default value 1000000. (See the note below) </entry>
- </row>
- </tbody>
- </tgroup>
- </table>
- <note>
- <para>
- Bloom filters are not supported by all cache implementations. So
far, only the Infinispan implementation supports them.
- </para>
- </note>
- <variablelist>
- <title/>
- <varlistentry>
- <term>value-storage</term>
- <listitem>
- <para>
- The list of value storage plug-ins.
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
- </section>
- <section
id="sect-Reference_Guide-Portal_configuration-Value_Storage_plugin_configuration_for_data_container">
- <title>Value Storage plug-in configuration (for data
container):</title>
- <note>
- <para>
- The value-storage element is optional. If you do not include it, the
values will be stored as BLOBs inside the database.
- </para>
- </note>
- <variablelist>
- <title/>
- <varlistentry>
- <term>value-storage</term>
- <listitem>
- <para>
- Optional value Storage plug-in definition.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>class</term>
- <listitem>
- <para>
- A value storage plug-in class name (attribute).
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>properties</term>
- <listitem>
- <para>
- The list of properties (name-value pairs) for a concrete
Value Storage plug-in.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>filters</term>
- <listitem>
- <para>
- The list of filters defining conditions when this plug-in
is applicable.
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
- </section>
- <section
id="sect-Reference_Guide-Portal_configuration-Initializer_configuration_optional">
- <title>Initializer configuration (optional):</title>
- <variablelist>
- <title/>
- <varlistentry>
- <term>class</term>
- <listitem>
- <para>
- Initializer implementation class.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>properties</term>
- <listitem>
- <para>
- The list of properties (name-value pairs). Properties
are supported.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>root-nodetype</term>
- <listitem>
- <para>
- The node type for root node initialization.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>root-permissions</term>
- <listitem>
- <para>
- Default permissions of the root node. It is defined
as a set of semicolon-delimited permissions containing a group of space-delimited
identities (user, group, and so on; see the Organization service documentation for details) and the
type of permission. For example, <literal>any read; *:/admin read; *:/admin add_node;
*:/admin set_property; *:/admin remove</literal> means that users from group
<emphasis>admin</emphasis> have all permissions and other users have only a
'read' permission.
- </para>
- <para>
- The configurable initializer adds the capability to
override the initial workspace startup procedure (used for clustering). It also replaces the
workspace element parameters auto-init-root-nodetype and auto-init-permissions with
root-nodetype and root-permissions.
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
- </section>
- <section
id="sect-Reference_Guide-Portal_configuration-Cache_configuration">
- <title>Cache configuration:</title>
- <variablelist>
- <title/>
- <varlistentry>
- <term>enabled</term>
- <listitem>
- <para>
- Whether the workspace cache is enabled.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>class</term>
- <listitem>
- <para>
- Cache implementation class, optional from 1.9.
The default value is
<literal>org.exoplatform.services.jcr.impl.dataflow.persistent.LinkedWorkspaceStorageCacheImpl</literal>.
- </para>
- <para>
- The cache can be configured to use a concrete
implementation of the WorkspaceStorageCache interface. The JCR core provides two implementations:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- LinkedWorkspaceStorageCacheImpl - default,
with configurable read behavior and statistics.
- </para>
- </listitem>
- <listitem>
- <para>
- WorkspaceStorageCacheImpl - pre 1.9, still
can be used.
- </para>
- </listitem>
- </itemizedlist>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>properties</term>
- <listitem>
- <para>
- The list of properties (name-value pairs) for
Workspace cache.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>max-size</term>
- <listitem>
- <para>
- Cache maximum size (maxSize prior to v.1.9).
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>live-time</term>
- <listitem>
- <para>
- Cached item live time (liveTime prior to v.1.9).
- </para>
- <para>
- From 1.9
<literal>LinkedWorkspaceStorageCacheImpl</literal> supports additional
optional parameters.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>statistic-period</term>
- <listitem>
- <para>
- Period (time format) of cache statistic thread
execution, 5 minutes by default.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>statistic-log</term>
- <listitem>
- <para>
- If true, cache statistics are printed to the default
logger (log.info). The default is false.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>statistic-clean</term>
- <listitem>
- <para>
- If true, cache statistics are cleared after they have been
gathered. The default is false.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>cleaner-period</term>
- <listitem>
- <para>
- Period of the eldest items remover execution, 20
minutes by default.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>blocking-users-count</term>
- <listitem>
- <para>
- Number of concurrent users allowed to read cache
storage. The default is 0 (unlimited).
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
- </section>
- <section
id="sect-Reference_Guide-Portal_configuration-Query_Handler_configuration">
- <title>Query Handler configuration:</title>
- <variablelist>
- <title/>
- <varlistentry>
- <term>class</term>
- <listitem>
- <para>
- A Query Handler class name.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>properties</term>
- <listitem>
- <para>
- The list of properties (name-value pairs) for a Query
Handler (indexDir).
- </para>
- <para>
- Properties and advanced features are described in Search
Configuration.
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
- </section>
- <section
id="sect-Reference_Guide-Portal_configuration-Lock_Manager_configuration">
- <title>Lock Manager configuration:</title>
- <variablelist>
- <title/>
- <varlistentry>
- <term>time-out</term>
- <listitem>
- <para>
- Time after which the unused global lock will be
removed.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>persister</term>
- <listitem>
- <para>
- A class for storing lock information for future use,
for example, to remove locks after a JCR restart.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>path</term>
- <listitem>
- <para>
- A lock folder. Each workspace has its own one.
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
- </section>
-<!-- <section
id="sect-Reference_Guide-Portal_configuration-Help_application_to_prohibit_the_use_of_closed_sessions">
- <title>Help application to prohibit the use of closed
sessions</title>
- <para>
- Products that use eXo JCR, sometimes misuse it since they continue to use
a session that has been closed through a method call on a node, a property or even the
session itself. To prevent bad practices we propose three modes which are the following:
- </para>
- <orderedlist>
- <listitem>
- <para>
- If the system property
<emphasis>exo.jcr.prohibit.closed.session.usage</emphasis> has been set to
<emphasis>true</emphasis>, then a RepositoryException will be thrown any time
an application will try to access to a closed session. In the stack trace, you will be
able to know the call stack that closes the session.
- </para>
-
- </listitem>
- <listitem>
- <para>
- If the system property
<emphasis>exo.jcr.prohibit.closed.session.usage</emphasis> has not been set
and the system property <emphasis>exo.product.developing</emphasis> has been
set to <emphasis>true</emphasis>, then a warning will be logged in the log
file with the full stack trace in order to help identifying the root cause of the issue.
In the stack trace, you will be able to know the call stack that closes the session.
- </para>
-
- </listitem>
- <listitem>
- <para>
- If none of the previous system properties have been set, then the
issue is ignored and the application can continue to use the closed session as
before, which allows applications to migrate step by
step.
- </para>
-
- </listitem>
-
- </orderedlist>
-
- </section> --> </section>
- </section>
- <section xmlns=""
id="chap-Reference_Guide-Multi_language_Support_the_JCR_RDB">
- <title>Multi-language Support</title>
- <para>
- Whenever a relational database is used to store multilingual text data in the eXo
Java Content Repository, the configuration must be adapted to support UTF-8 encoding.
The dialect is detected automatically for certified databases. If detection fails, you can
set the dialect explicitly, as described below.
- </para>
- <para>
- The following sections describe enabling UTF-8 support with various databases.
- </para>
- <itemizedlist>
- <listitem>
- <para>
- <xref
linkend="sect-Reference_Guide-Multi_language_Support_the_JCR_RDB-Oracle"/>
- </para>
- </listitem>
- <listitem>
- <para>
- <xref
linkend="sect-Reference_Guide-Multi_language_Support_the_JCR_RDB-DB2"/>
- </para>
- </listitem>
- <listitem>
- <para>
- <xref
linkend="sect-Reference_Guide-Multi_language_Support_the_JCR_RDB-MySQL"/>
- </para>
- </listitem>
- <listitem>
- <para>
- <xref
linkend="sect-Reference_Guide-Multi_language_Support_the_JCR_RDB-PostgreSQL"/>
- </para>
- </listitem>
- </itemizedlist>
- <note>
- <itemizedlist>
- <listitem>
- <remark>NEEDINFO - FILE PATHS - The path needs to be updated with the
equivalent path for JBoss Portal Platform instead of gatein, please see below para. New
info required?</remark>
- <para>
- The configuration file to be modified for these changes is
<filename><replaceable>JPP_HOME</replaceable>/gatein/gatein.ear/portal.war/WEB-INF/conf/jcr/repository-configuration.xml</filename>.
- </para>
- </listitem>
- <listitem>
- <para>
- The datasource <parameter>jdbcjcr</parameter> used in the
following examples can be configured via the
<literal>InitialContextInitializer</literal> component.
- </para>
- </listitem>
- </itemizedlist>
- </note>
- <section
id="sect-Reference_Guide-Multi_language_Support_the_JCR_RDB-Oracle">
- <title>Oracle</title>
- <para>
- To run a multilingual JCR on an Oracle back end, the database should use a Unicode
character set. Other Oracle globalization parameters do
not have any effect. The property to modify is
<literal>NLS_CHARACTERSET</literal>.
- </para>
- <para>
- The <literal>NLS_CHARACTERSET = AL32UTF8</literal> entry has been
successfully tested with many European and Asian languages.
- </para>
- <para>
- Example of database configuration:
- </para>
- <programlisting>NLS_LANGUAGE AMERICAN
-NLS_TERRITORY AMERICA
-NLS_CURRENCY $
-NLS_ISO_CURRENCY AMERICA
-NLS_NUMERIC_CHARACTERS .,
-NLS_CHARACTERSET AL32UTF8
-NLS_CALENDAR GREGORIAN
-NLS_DATE_FORMAT DD-MON-RR
-NLS_DATE_LANGUAGE AMERICAN
-NLS_SORT BINARY
-NLS_TIME_FORMAT HH.MI.SSXFF AM
-NLS_TIMESTAMP_FORMAT DD-MON-RR HH.MI.SSXFF AM
-NLS_TIME_TZ_FORMAT HH.MI.SSXFF AM TZR
-NLS_TIMESTAMP_TZ_FORMAT DD-MON-RR HH.MI.SSXFF AM TZR
-NLS_DUAL_CURRENCY $
-NLS_COMP BINARY
-NLS_LENGTH_SEMANTICS BYTE
-NLS_NCHAR_CONV_EXCP FALSE
-NLS_NCHAR_CHARACTERSET AL16UTF16</programlisting>
-<!-- <warning>
- <para>
- JCR 1.12.x doesn't use NVARCHAR columns, so that the value of the
parameter NLS_NCHAR_CHARACTERSET does not matter for JCR.
- </para>
-
- </warning> --> <para>
- Create the database with Unicode encoding and use the Oracle dialect for the
Workspace Container:
- </para>
- <programlisting language="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude" href="extras/default54.xml"
parse="text"/></programlisting>
- </section>
- <section
id="sect-Reference_Guide-Multi_language_Support_the_JCR_RDB-DB2">
- <title>DB2</title>
- <para>
- DB2 Universal Database (DB2 UDB) supports <ulink
url="http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp?to...
and UTF-16/UCS-2</ulink>. When a Unicode database is created,
<parameter>CHAR</parameter>, <parameter>VARCHAR</parameter> and
<parameter>LONG VARCHAR</parameter> data are stored in UTF-8 form.
- </para>
- <para>
- This enables JCR multi-lingual support.
- </para>
- <para>
- Below is an example of creating a UTF-8 database using the
<parameter>db2</parameter> dialect for a workspace container with DB2 version
9 and higher:
- </para>
- <programlisting>DB2 CREATE DATABASE dbname USING CODESET UTF-8 TERRITORY
US
-</programlisting>
- <programlisting language="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude" href="extras/default56.xml"
parse="text"/></programlisting>
- <note>
- <para>
- To support DB2 version 8.<replaceable>x</replaceable>, change
the "dialect" property to db2v8.
- </para>
- </note>
- </section>
- <section
id="sect-Reference_Guide-Multi_language_Support_the_JCR_RDB-MySQL">
- <title>MySQL</title>
- <para>
- Using JCR with a MySQL back end requires a special dialect <ulink
url="http://jira.exoplatform.org/browse/JCR-375">MySQL-UTF8&... to
be used for internationalization support.
- </para>
- <para>
- The database default charset should be
<parameter>latin1</parameter> so that the limited index space
(767 bytes for <literal>InnoDB</literal>) is used effectively.
- </para>
- <para>
- If the database default charset is multibyte, JCR database initialization
fails with an index creation error.
- </para>
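- <para>
- The arithmetic behind this limit can be sketched as follows. This is an
illustration only: it assumes that MySQL reserves the maximum number of bytes per
character of the column charset for every indexed character, and that the MySQL
<parameter>utf8</parameter> charset needs up to 3 bytes per character. The class and
method names are hypothetical and are not part of eXo JCR.
- </para>

```java
public class IndexPrefixSketch {
    /** InnoDB index key limit in bytes, as cited in the text above. */
    static final int INNODB_MAX_KEY_BYTES = 767;

    /**
     * Longest indexable column prefix, in characters, for a charset that
     * needs at most maxBytesPerChar bytes per character (assumption: the
     * server reserves the maximum width for every indexed character).
     */
    static int maxIndexableChars(int maxBytesPerChar) {
        return INNODB_MAX_KEY_BYTES / maxBytesPerChar;
    }

    public static void main(String[] args) {
        System.out.println("latin1, 1 byte per char:  " + maxIndexableChars(1));
        System.out.println("utf8,   3 bytes per char: " + maxIndexableChars(3));
    }
}
```

- <para>
- Under these assumptions a single-byte default charset leaves roughly three times
as many indexable characters per key as a 3-byte charset, which is why a multibyte
default can make index creation fail.
- </para>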
- <para>
- JCR can work with any single-byte default database charset, provided that UTF8
is supported by the MySQL server. However, it has only been tested using the
<parameter>latin1</parameter> charset.
- </para>
- <para>
- An example entry:
- </para>
- <programlisting language="XML"
role="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude" href="extras/default57.xml"
parse="text"/></programlisting>
- </section>
- <section
id="sect-Reference_Guide-Multi_language_Support_the_JCR_RDB-PostgreSQL">
- <title>PostgreSQL</title>
- <para>
- Multilingual support can be enabled with a PostgreSQL back end in <ulink
url="http://www.postgresql.org/docs/8.3/interactive/charset.html&quo...
ways</ulink>:
- </para>
- <orderedlist>
- <listitem>
- <para>
- Using the locale features of the operating system to provide
locale-specific collation order, number formatting, translated messages, and other
aspects.
- </para>
- <para>
- UTF-8 is widely used on Linux distributions by default, so it can be
useful in such cases.
- </para>
- </listitem>
- <listitem>
- <para>
- Providing a number of different character sets defined in the
PostgreSQL server, including multiple-byte character sets, to support storing text in any
language, and providing character set translation between client and server.
- </para>
- <para>
- Using a UTF-8 database charset is recommended as it allows
any-to-any conversions and makes this issue transparent for the JCR.
- </para>
- </listitem>
- </orderedlist>
- <para>
- Example of a database with UTF-8 encoding using PgSQL dialect for the
Workspace Container:
- </para>
- <programlisting language="XML"
role="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude" href="extras/default58.xml"
parse="text"/></programlisting>
- </section>
- </section>
- <section xmlns=""
id="chap-Reference_Guide-Search_Configuration">
- <title>Configuring Search</title>
- <para>
- The JCR search function is configurable. This
section describes how to configure it to improve search performance and
results.
- </para>
- <para>
- Below is an example of the configuration file that governs search behaviors.
Refer to <xref
linkend="sect-Reference_Guide-Search_Configuration-Global_Search_Index"/> for
how searching operates in JCR and information about customized searches.
- </para>
- <para>
- The JCR index configuration file is located at
<filename><replaceable>JPP_HOME</replaceable>/gatein/gatein.ear/portal.war/WEB-INF/conf/jcr/repository-configuration.xml</filename>.
- </para>
- <para>
- The following example shows the configuration, followed by a list of the
configuration parameters.
- </para>
- <programlisting language="XML" role="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude" href="extras/default61.xml"
parse="text"/></programlisting>
- <para>
- The table below outlines <emphasis
role="bold">some</emphasis> of the Configuration Parameters available,
their default setting, which version of eXo JCR they were implemented in and other useful
information (further parameters are explained in <xref
linkend="sect-Reference_Guide-JBoss_Cache_configuration-Indexer_lock_manager_and_data_container_configuration"/>):
- </para>
- <table
id="tabl-Reference_Guide-Search_Configuration-Configuration_parameters">
-<!-- align="left" pgwide="1" -->
<title>Configuration parameters</title>
- <tgroup cols="4">
- <colspec colname="1" colwidth="90pt"/>
- <colspec colname="2" colwidth="90pt"/>
- <colspec colname="3" colwidth="150pt"/>
- <colspec colname="4" colwidth="50pt"/>
- <thead>
- <row>
- <entry>
- <para>
- Parameter
- </para>
- </entry>
- <entry>
- <para>
- Default
- </para>
- </entry>
- <entry>
- <para>
- Description
- </para>
- </entry>
- <entry> Implemented in Version </entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry>
- <para>
- index-dir
- </para>
- </entry>
- <entry>
- <para>
- none
- </para>
- </entry>
- <entry>
- <para>
- The location of the index directory. This parameter is
mandatory. It is called "<literal>indexDir</literal>" in
versions prior to eXo JCR version 1.9.
- </para>
- </entry>
- <entry> 1.0 </entry>
- </row>
- <row>
- <entry>
- <para>
- use-compoundfile
- </para>
- </entry>
- <entry>
- <para>
- true
- </para>
- </entry>
- <entry>
- <para>
- Advises Lucene to use compound files for the index files.
- </para>
- </entry>
- <entry> 1.9 </entry>
- </row>
- <row>
- <entry>
- <para>
- min-merge-docs
- </para>
- </entry>
- <entry>
- <para>
- 100
- </para>
- </entry>
- <entry>
- <para>
- The minimum number of nodes in an index until segments are
merged.
- </para>
- </entry>
- <entry> 1.9 </entry>
- </row>
- <row>
- <entry>
- <para>
- volatile-idle-time
- </para>
- </entry>
- <entry> 3 </entry>
- <entry>
- <para>
- Idle time in seconds until the volatile index part is moved
to a persistent index even if <literal>min-merge-docs</literal> has not been
reached.
- </para>
- </entry>
- <entry> 1.9 </entry>
- </row>
- <row>
- <entry>
- <para>
- max-merge-docs
- </para>
- </entry>
- <entry>
- <para>
- Integer.MAX_VALUE
- </para>
- </entry>
- <entry>
- <para>
- The maximum number of nodes in segments that will be merged.
The default value changed to <literal>Integer.MAX_VALUE</literal> in eXo JCR
version 1.9.
- </para>
- </entry>
- <entry> 1.9 </entry>
- </row>
- <row>
- <entry>
- <para>
- merge-factor
- </para>
- </entry>
- <entry>
- <para>
- 10
- </para>
- </entry>
- <entry>
- <para>
- Determines how often segment indices are merged.
- </para>
- </entry>
- <entry> 1.9 </entry>
- </row>
- <row>
- <entry>
- <para>
- max-field-length
- </para>
- </entry>
- <entry>
- <para>
- 10000
- </para>
- </entry>
- <entry>
- <para>
- The maximum number of words that are full-text indexed per
property.
- </para>
- </entry>
- <entry> 1.9 </entry>
- </row>
- <row>
- <entry>
- <para>
- cache-size
- </para>
- </entry>
- <entry>
- <para>
- 1000
- </para>
- </entry>
- <entry>
- <para>
- Size of the document number cache. This cache maps UUIDs to
Lucene document numbers.
- </para>
- </entry>
- <entry> 1.9 </entry>
- </row>
- <row>
- <entry>
- <para>
- force-consistencycheck
- </para>
- </entry>
- <entry>
- <para>
- false
- </para>
- </entry>
- <entry>
- <para>
- Runs a consistency check on every start up. If false, a
consistency check is only performed when the search index detects a prior forced
shutdown.
- </para>
- </entry>
- <entry> 1.9 </entry>
- </row>
- <row>
- <entry>
- <para>
- auto-repair
- </para>
- </entry>
- <entry>
- <para>
- true
- </para>
- </entry>
- <entry>
- <para>
- Errors detected by a consistency check are automatically
repaired. If false, errors are only written to the log.
- </para>
- </entry>
- <entry> 1.9 </entry>
- </row>
- <row>
- <entry> query-class </entry>
- <entry> QueryImpl </entry>
- <entry>
- <para>
- Name of the class that implements the javax.jcr.query.Query
interface.
- </para>
- <para>
- This class must also extend the class:
<literal>org.exoplatform.services.jcr.impl.core.
query.AbstractQueryImpl</literal>.
- </para>
- </entry>
- <entry> 1.9 </entry>
- </row>
- <row>
- <entry>
- <para>
- document-order
- </para>
- </entry>
- <entry>
- <para>
- true
- </para>
- </entry>
- <entry>
- <para>
- If true and the query does not contain an 'order
by' clause, result nodes will be in document order. For better performance set to
'false' when queries return many nodes.
- </para>
- </entry>
- <entry> 1.9 </entry>
- </row>
- <row>
- <entry>
- <para>
- result-fetch-size
- </para>
- </entry>
- <entry>
- <para>
- Integer.MAX_VALUE
- </para>
- </entry>
- <entry>
- <para>
- The number of results initially fetched when a query is executed. Default
value: <literal>Integer.MAX_VALUE</literal>.
- </para>
- </entry>
- <entry> 1.9 </entry>
- </row>
- <row>
- <entry>
- <para>
- excerptprovider-class
- </para>
- </entry>
- <entry>
- <para>
- DefaultXMLExcerpt
- </para>
- </entry>
- <entry>
- <para>
- The name of the class that implements
<literal>org.exoplatform.services.jcr.impl.core.
query.lucene.ExcerptProvider</literal>.
- </para>
- <para>
- This should be used for the
<literal>rep:excerpt()</literal> function in a query.
- </para>
- </entry>
- <entry> 1.9 </entry>
- </row>
- <row>
- <entry>
- <para>
- support-highlighting
- </para>
- </entry>
- <entry>
- <para>
- false
- </para>
- </entry>
- <entry>
- <para>
- If set to true additional information is stored in the index
to support highlighting using the <literal>rep:excerpt()</literal> function.
- </para>
- </entry>
- <entry> 1.9 </entry>
- </row>
- <row>
- <entry>
- <para>
- synonymprovider-class
- </para>
- </entry>
- <entry>
- <para>
- none
- </para>
- </entry>
- <entry>
- <para>
- The name of a class that implements
<literal>org.exoplatform.services.jcr.impl.core.
query.lucene.SynonymProvider</literal>.
- </para>
- <para>
- The default value is null.
- </para>
- </entry>
- <entry> 1.9 </entry>
- </row>
- <row>
- <entry>
- <para>
- synonymprovider-config-path
- </para>
- </entry>
- <entry>
- <para>
- none
- </para>
- </entry>
- <entry>
- <para>
- The path to the synonym provider configuration file. This
path is interpreted relative to the path parameter. If there is a path element inside the
<literal>SearchIndex</literal> element, then this path is interpreted relative
to the root path of the path. Whether this parameter is mandatory depends on the synonym
provider implementation. The default value is null.
- </para>
- </entry>
- <entry> 1.9 </entry>
- </row>
- <row>
- <entry>
- <para>
- indexing-configuration-path
- </para>
- </entry>
- <entry>
- <para>
- none
- </para>
- </entry>
- <entry>
- <para>
- The path to the indexing configuration file.
- </para>
- </entry>
- <entry> 1.9 </entry>
- </row>
- <row>
- <entry>
- <para>
- indexing-configuration-class
- </para>
- </entry>
- <entry>
- <para>
- IndexingConfigurationImpl
- </para>
- </entry>
- <entry>
- <para>
- The name of the class that implements
<literal>org.exoplatform.services.jcr.impl.core.
query.lucene.IndexingConfiguration</literal>.
- </para>
- </entry>
- <entry> 1.9 </entry>
- </row>
- <row>
- <entry>
- <para>
- force-consistencycheck
- </para>
- </entry>
- <entry>
- <para>
- false
- </para>
- </entry>
- <entry>
- <para>
- If set to true a consistency check is performed depending on
the parameter <literal>forceConsistencyCheck</literal>. If set to false no
consistency check is performed on start up, even if a redo log had been applied.
- </para>
- </entry>
- <entry> 1.9 </entry>
- </row>
- <row>
- <entry>
- <para>
- spellchecker-class
- </para>
- </entry>
- <entry>
- <para>
- none
- </para>
- </entry>
- <entry>
- <para>
- The name of a class that implements
<literal>org.exoplatform.services.jcr.impl.core.
query.lucene.SpellChecker</literal>.
- </para>
- </entry>
- <entry> 1.9 </entry>
- </row>
- <row>
- <entry>
- <para>
- errorlog-size
- </para>
- </entry>
- <entry>
- <para>
- 50(KB)
- </para>
- </entry>
- <entry>
- <para>
- The default size of the error log file in KB.
- </para>
- </entry>
- <entry> 1.9 </entry>
- </row>
- <row>
- <entry>
- <para>
- upgrade-index
- </para>
- </entry>
- <entry>
- <para>
- false
- </para>
- </entry>
- <entry>
- <para>
- Allows JCR to convert an existing index into the new format.
This property can also be set as a system property.
- </para>
- <para>
- Indexes prior to eXo JCR 1.12 will not run with eXo JCR 1.12.
You must run an automatic migration.
- </para>
- <para>
- Start eXo JCR with:
- </para>
- <programlisting><command>
-Dupgrade-index=true</command></programlisting>
- <para>
- The old index format is then converted into the new index
format. After the conversion the new format is used.
- </para>
- <para>
- On subsequent starts this option is no longer needed. The old
index is replaced and a back conversion is not possible.
- </para>
- <para>
- It is recommended that a backup of the index be made before
conversion. (Only for migrations from JCR 1.9 and later.)
- </para>
- </entry>
- <entry> 1.12 </entry>
- </row>
- <row>
- <entry>
- <para>
- analyzer
- </para>
- </entry>
- <entry>
- <para>
- org.apache.lucene.analysis. standard.StandardAnalyzer
- </para>
- </entry>
- <entry>
- <para>
- Class name of a Lucene analyzer to use for full-text indexing
of text.
- </para>
- </entry>
- <entry> 1.12 </entry>
- </row>
- </tbody>
- </tgroup>
- </table>
- <section
id="sect-Reference_Guide-Search_Configuration-Global_Search_Index">
- <title>Global Search Index</title>
- <para>
- By default, eXo JCR uses the Lucene StandardAnalyzer to index content. This
analyzer applies some standard filters in the method that analyzes the content:
- <example>
- <title>Standard Analyzed Filters</title>
- <programlisting language="Java"
role="Java"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude"
href="extras/default62.java" parse="text"/></programlisting>
- <para>
- Comment #1: The first filter (StandardFilter) removes possessive
apostrophes (<emphasis role="bold">'s</emphasis>) from the
end of words and removes periods (<emphasis
role="bold">.</emphasis>) from acronyms.
- </para>
- <para>
- Comment #2: The second filter (LowerCaseFilter) normalizes token
text to lower case.
- </para>
- <para>
- Comment #3: The last filter (StopFilter) removes stop words from
a token stream. The stop set is defined in the analyzer.
- </para>
- </example>
- <para>
- The global search index is configured in the
<filename><replaceable>JPP_HOME</replaceable>/gatein/gatein.ear/portal.war/WEB-INF/conf/jcr/repository-configuration.xml</filename>
configuration file within the "query-handler" tag.
- </para>
- <programlisting language="XML"
role="XML"><query-handler
class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
-</programlisting>
- <para>
- The same analyzer should always be used for indexing and for querying in
Lucene; otherwise, results may be unpredictable. eXo JCR does this automatically. The
StandardAnalyzer (configured by default) can, however, be replaced with another.
- </para>
- <para>
- A customized QueryHandler can also be easily created.
- </para>
- <formalpara
id="form-Reference_Guide-Global_Search_Index-Customized_Search_Indexes_and_Analyzers">
- <title>Customized Search Indexes and Analyzers</title>
- <para>
- By default, eXo JCR uses the Lucene StandardAnalyzer to index content.
This analyzer applies some standard filters in the method that analyzes the content:
- </para>
- </formalpara>
- <programlisting language="Java" role="Java">public
TokenStream tokenStream(String fieldName, Reader reader) {
- StandardTokenizer tokenStream = new StandardTokenizer(reader,
replaceInvalidAcronym);
- tokenStream.setMaxTokenLength(maxTokenLength);
- TokenStream result = new StandardFilter(tokenStream);
- result = new LowerCaseFilter(result);
- result = new StopFilter(result, stopSet);
- return result;
- }</programlisting>
- <itemizedlist>
- <listitem>
- <para>
- The first one (StandardFilter) removes 's (as 's in
"Peter's") from the end of words and removes dots from
acronyms.
- </para>
- </listitem>
- <listitem>
- <para>
- The second one (LowerCaseFilter) normalizes token text to lower
case.
- </para>
- </listitem>
- <listitem>
- <para>
- The last one (StopFilter) removes stop words from a token stream. The
stop set is defined in the analyzer.
- </para>
- </listitem>
- </itemizedlist>
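- <para>
- The three filter stages listed above can be approximated in plain Java to show
their combined effect on a piece of text. This sketch deliberately uses no Lucene
classes. The class name, the whitespace tokenization and the stop set are assumptions
made for illustration; the real stop set is defined in the analyzer.
- </para>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class FilterChainSketch {
    // Hypothetical stop set, for illustration only.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "and", "of", "the"));

    /** Applies the three stages in order to whitespace-separated tokens. */
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            String token = raw;
            // Stage 1 (like StandardFilter): drop a trailing 's and the dots
            // of acronyms. Simplification: every dot in the token is removed.
            if (token.endsWith("'s") || token.endsWith("'S")) {
                token = token.substring(0, token.length() - 2);
            }
            token = token.replace(".", "");
            // Stage 2 (like LowerCaseFilter): normalize to lower case.
            token = token.toLowerCase(Locale.ROOT);
            // Stage 3 (like StopFilter): discard stop words and empty tokens.
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(analyze("Peter's I.B.M. and THE Dog"));
    }
}
```

- <para>
- Because the same chain runs at indexing time and at query time, a query for
"peter" can match content that contains "Peter's".
- </para>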
- <para>
- Additional filters can be used in specific cases. For example, the
<phrase>ISOLatin1AccentFilter</phrase> filter replaces
accented characters in the ISO Latin 1 character set (ISO-8859-1) with their unaccented
equivalents.
- </para>
- <para>
- The <phrase>ISOLatin1AccentFilter</phrase> is not present in the
current Lucene version used by eXo.
- </para>
- <para>
- To use a different filter, a new analyzer must be created, as well
as a new search index that uses the analyzer. These are packaged into a jar file, which is then
deployed with the application.
- </para>
- <procedure
id="proc-Reference_Guide-Global_Search_Index-Create_a_new_filter_analyzer_and_search_index">
- <title>Create a new filter, analyzer and search index</title>
- <step>
- <para>
- Create a new filter that implements the following method:
- </para>
- <programlisting language="Java" role="JAVA">public
final Token next(final Token reusableToken) throws java.io.IOException
-</programlisting>
- <para>
- This defines how characters are read and used by the filter.
- </para>
- </step>
- <step>
- <para>
- Create the analyzer.
- </para>
- <para>
- The analyzer must extend
<literal>org.apache.lucene.analysis.standard.StandardAnalyzer</literal> and
override the method below.
- </para>
- <para>
- Apply the new filters in this method:
- </para>
- <programlisting language="Java" role="JAVA">public
TokenStream tokenStream(String fieldName, Reader reader)
-</programlisting>
- </step>
- <step>
- <para>
- To create the new search index, extend
<literal>org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex</literal>
and write the constructor to set the correct analyzer.
- </para>
- <para>
- Use the method below to return your analyzer:
- </para>
- <programlisting language="Java" role="JAVA">public
Analyzer getAnalyzer() {
-  return new MyAnalyzer(); // MyAnalyzer is the analyzer class created in the previous step
-}
-</programlisting>
- </step>
- </procedure>
- <note>
- <para>
- In eXo JCR version 1.12 (and later) the analyzer can be directly set in
the configuration. For users with this version the creation of a new SearchIndex for new
analyzers is redundant.
- </para>
- </note>
- <para>
- To configure an application to use a new
<literal>SearchIndex</literal>, replace the following code:
- </para>
- <programlisting language="XML"
role="XML"><query-handler
class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
-
-</programlisting>
- <para>
- in
<filename><replaceable>JPP_HOME</replaceable>/gatein/gatein.ear/portal.war/WEB-INF/conf/jcr/repository-configuration.xml</filename>
with the new class:
- </para>
- <programlisting language="XML"
role="XML"><query-handler
class="mypackage.indexation.MySearchIndex">
-
-</programlisting>
- <para>
- To configure an application to use a new analyzer, add the
<parameter>analyzer</parameter> parameter to each query-handler configuration
in
<filename><replaceable>JPP_HOME</replaceable>/gatein/gatein.ear/portal.war/WEB-INF/conf/jcr/repository-configuration.xml</filename>:
- </para>
- <programlisting language="XML"
role="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude"
href="../../../../extras/Advanced_Development_JCR_search-configuration/default69.xml"
parse="text"/></programlisting>
- <para>
- The new <literal>SearchIndex</literal> will start to index
contents with the specified filters when the JCR is next started.
- </para>
- </section>
- <section
id="sect-Reference_Guide-Search_Configuration-IndexingConfiguration">
- <title>IndexingConfiguration</title>
- <para>
- From version 1.9, the default search index implementation in JCR allows user
control over which properties of a node are indexed. Different analyzers can also be set
for different nodes.
- </para>
- <para>
- The configuration parameter is called
<literal>indexing-configuration-path</literal> and is not set by default. When it is not
set, all properties of a node are indexed.
- </para>
- <para>
- To configure the indexing behavior add a parameter to the query-handler
element in your configuration file.
- </para>
- <programlisting language="XML" role="XML"><param
name="indexing-configuration-path"
value="/indexing_configuration.xml"/>
-
-</programlisting>
- <formalpara
id="form-Reference_Guide-IndexingConfiguration-Node_Scope_Limit">
- <title>Node Scope Limit</title>
- <para>
- The node scope can be limited so that only certain properties of a node
type are indexed. This can optimize the index size.
- </para>
- </formalpara>
- <para>
- With the configuration below only properties named
<parameter>Text</parameter> are indexed for
<parameter>nt:unstructured</parameter> node types. This configuration also
applies to all nodes whose type extends from
<parameter>nt:unstructured</parameter>.
- </para>
- <programlisting language="XML"
role="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude" href="extras/default71.xml"
parse="text"/></programlisting>
- <note>
- <title>Namespace Prefixes</title>
- <para>
- The <phrase>namespace prefixes</phrase> used anywhere in the XML
file must be declared in the configuration element.
- </para>
- </note>
- <formalpara
id="form-Reference_Guide-IndexingConfiguration-Indexing_Boost_Value">
- <title>Indexing Boost Value</title>
- <para>
- It is also possible to configure a <phrase>boost
value</phrase> for the nodes that match the index rule. The default boost value is
1.0. Higher boost values (a reasonable range is 1.0 - 5.0) yield a higher score, so
matching nodes appear as more relevant.
- </para>
- </formalpara>
- <programlisting language="XML"
role="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude" href="extras/default72.xml"
parse="text"/></programlisting>
- <para>
- If you do not wish to boost the complete node, but only certain properties,
you can also provide a boost value for the listed properties:
- </para>
- <programlisting language="XML"
role="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude" href="extras/default73.xml"
parse="text"/></programlisting>
- <formalpara
id="form-Reference_Guide-IndexingConfiguration-Conditional_Index_Rules">
- <title>Conditional Index Rules</title>
- <para>
- You may also add a <phrase>condition</phrase> to the index
rule and have multiple rules with the same nodeType. The first index rule that matches
will apply and all remaining ones are ignored:
- </para>
- </formalpara>
- <programlisting language="XML"
role="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude" href="extras/default74.xml"
parse="text"/></programlisting>
- <para>
- In the above example the first rule only applies if the
<parameter>nt:unstructured</parameter> node has a priority property with a
value <parameter>high</parameter>. The condition syntax only supports the
equals operator and a string literal.
- </para>
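- <para>
- The first-match behavior described above can be illustrated with a small model.
This is a sketch only, not the eXo JCR implementation: the
<literal>Rule</literal> class, its fields and the rule names are hypothetical, and the
node type comparison is simplified to an exact string match (it ignores node type
inheritance).
- </para>

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IndexRuleSketch {
    /** Hypothetical index rule: a node type plus an optional "property = value" condition. */
    static final class Rule {
        final String name;
        final String nodeType;
        final String condProperty; // null means the rule has no condition
        final String condValue;

        Rule(String name, String nodeType, String condProperty, String condValue) {
            this.name = name;
            this.nodeType = nodeType;
            this.condProperty = condProperty;
            this.condValue = condValue;
        }

        boolean matches(String type, Map<String, String> properties) {
            if (!nodeType.equals(type)) {
                return false;
            }
            // Only the equals operator with a string literal is supported.
            return condProperty == null || condValue.equals(properties.get(condProperty));
        }
    }

    /** Returns the name of the first matching rule; later rules are ignored. */
    static String select(List<Rule> rules, String type, Map<String, String> properties) {
        for (Rule rule : rules) {
            if (rule.matches(type, properties)) {
                return rule.name;
            }
        }
        return null; // no rule applies to this node
    }

    public static void main(String[] args) {
        List<Rule> rules = Arrays.asList(
                new Rule("high-priority", "nt:unstructured", "priority", "high"),
                new Rule("default", "nt:unstructured", null, null));
        Map<String, String> node = new HashMap<>();
        node.put("priority", "high");
        System.out.println(select(rules, "nt:unstructured", node));
    }
}
```

- <para>
- Rule order therefore matters: a rule without a condition placed first would
shadow every conditional rule for the same node type that follows it.
- </para>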
- <para>
- The condition may also reference properties that are not on the
current node:
- </para>
- <programlisting language="XML"
role="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude" href="extras/default75.xml"
parse="text"/></programlisting>
- <para>
- The indexing configuration allows the type of a node in the condition to be
specified. Please note however that the type match must be exact. It does not consider sub
types of the specified node type.
- </para>
- <programlisting language="XML"
role="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude" href="extras/default76.xml"
parse="text"/></programlisting>
- <formalpara
id="form-Reference_Guide-IndexingConfiguration-Exclusion_from_the_Node_Scope_Index">
- <title>Exclusion from the Node Scope Index</title>
- <para>
- All configured properties are full-text indexed by default (if they are
of type STRING and included in the node scope index).
- </para>
- </formalpara>
- <para>
- A node scope search normally finds all nodes of an index. That is,
<literal>jcr:contains(., 'foo')</literal> returns all nodes
that have a string property containing the word
'<replaceable>foo</replaceable>'.
- </para>
- <para>
- Properties can be explicitly excluded from the node scope index with:
- </para>
- <programlisting language="XML"
role="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude" href="extras/default77.xml"
parse="text"/></programlisting>
- <formalpara
id="form-Reference_Guide-IndexingConfiguration-Index_Aggregates">
- <title>Index Aggregates</title>
- <para>
- Sometimes it is useful to aggregate the content of descendant nodes into a
single node, which makes it easier to search content that is scattered across multiple nodes.
- </para>
- </formalpara>
- <para>
- JCR allows the definition of index aggregates based on relative path patterns
and primary node types.
- </para>
- <para>
- The following example creates an index aggregate on
<literal>nt:file</literal> that includes the content of the jcr:content node:
- </para>
- <programlisting language="XML"
role="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude" href="extras/default78.xml"
parse="text"/></programlisting>
- <para>
- Included nodes can also be restricted to a certain type:
- </para>
- <programlisting language="XML"
role="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude" href="extras/default79.xml"
parse="text"/></programlisting>
- <para>
- The <emphasis role="bold">*</emphasis> wild-card can be
used to match all child nodes:
- </para>
- <programlisting language="XML"
role="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude" href="extras/default80.xml"
parse="text"/></programlisting>
- <para>
- Nodes to a certain depth below the current node can be included by adding
multiple include elements. The <parameter>nt:file</parameter> node may contain
a complete XML document under jcr:content for example:
- </para>
- <programlisting language="XML"
role="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude" href="extras/default81.xml"
parse="text"/></programlisting>
- <formalpara
id="form-Reference_Guide-IndexingConfiguration-Property_Level_Analyzers">
- <title>Property-Level Analyzers</title>
- <para>
- How a property has to be analyzed can be defined in the following
configuration section. If there is an analyzer configuration for a property, this analyzer
is used for indexing and searching of this property. For example:
- </para>
- </formalpara>
- <programlisting language="XML"
role="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude" href="extras/default82.xml"
parse="text"/></programlisting>
- <para>
- The configuration above sets the Lucene <emphasis
role="bold">KeywordAnalyzer</emphasis> to index and search the property
"<replaceable>mytext</replaceable>" across the entire
workspace while the "<replaceable>mytext2</replaceable>"
property is searched with the <emphasis
role="bold">WhitespaceAnalyzer</emphasis>.
- </para>
- <para>
- The <emphasis role="bold">WhitespaceAnalyzer</emphasis>
tokenizes a property, whereas the <emphasis
role="bold">KeywordAnalyzer</emphasis> takes the property as a whole.
- </para>
- <para>
- Using different analyzers for different languages can be particularly
useful.
- </para>
- <formalpara
id="form-Reference_Guide-IndexingConfiguration-Characteristics_of_Node_Scope_Searches">
- <title>Characteristics of Node Scope Searches</title>
- <para>
- Unexpected behavior may be encountered when using analyzers to search
within a <emphasis>property</emphasis> compared to searching within a
<emphasis>node scope</emphasis>. This is because the node scope always uses
the global analyzer.
- </para>
- </formalpara>
- <para>
- For example: the property
"<parameter>mytext</parameter>" contains the text
"<emphasis>testing my analyzers</emphasis>" but no analyzers
have been configured for this property (and the default analyzer in SearchIndex has not
been changed).
- </para>
- <para>
- If the query is:
- </para>
- <programlisting language="Java" role="JAVA">xpath =
"//*[jcr:contains(mytext,'analyzer')]"
-</programlisting>
- <para>
- This <literal>xpath</literal> query does not match the node with the
property above when the default analyzers are in use.
- </para>
- <para>
- Also, if a search is done on the node scope as follows:
- </para>
- <programlisting language="Java" role="JAVA">xpath =
"//*[jcr:contains(.,'analyzer')]"
-</programlisting>
- <para>
- No result will be returned.
- </para>
- <para>
- Only specific analyzers can be set on a node property, and the node scope
indexing and analyzing is always done with the globally defined analyzer in the
SearchIndex element.
- </para>
- <para>
- If the analyzer used to index the "mytext" property above
is changed to:
- </para>
- <programlisting language="XML"
role="XML"><analyzer
class="org.apache.lucene.analysis.de.GermanAnalyzer">
-<property>mytext</property>
-</analyzer>
-</programlisting>
- <para>
- The search below would return a result because of the word stemming
(analyzers - analyzer).
- </para>
- <programlisting language="Java" role="JAVA">xpath =
"//*[jcr:contains(mytext,'analyzer')]"
-</programlisting>
- <para>
- The second search in the example:
- </para>
- <programlisting language="Java" role="JAVA">xpath =
"//*[jcr:contains(.,'analyzer')]"
-</programlisting>
- <para>
- Would still not give a result, since the node scope is indexed with the
global analyzer, which in this case does not take into account any word stemming.
- </para>
- <para>
- Be aware that when using analyzers for specific properties, a result may be
found in a property for certain search text, but the same search text in the node scope of
the property may not find a result.
- </para>
- <note>
- <para>
- Both index rules and index aggregates influence how content is indexed in
JCR. If the configuration is changed, the existing content is not automatically re-indexed
according to the new rules.
- </para>
- <para>
- Content must be manually re-indexed when the configuration is changed.
- </para>
- </note>
- </section>
- <section
id="sect-Reference_Guide-Search_Configuration-Advanced_features">
- <title>Advanced features</title>
- <para>
- eXo JCR supports some advanced features, which are not specified in JSR 170:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- Get a text excerpt with <emphasis
role="bold">highlighted words</emphasis> that match the query:
<xref
linkend="sect-Reference_Guide-Highlighting-DefaultXMLExcerpt"/>.
- </para>
- </listitem>
- <listitem>
- <para>
- Search a term and its <emphasis
role="bold">synonyms</emphasis>: <xref
linkend="sect-Reference_Guide-Searching_Repository_Content-SynonymSearch"/>.
- </para>
- </listitem>
- <listitem>
- <para>
- Search <emphasis role="bold">similar</emphasis>
nodes: <xref
linkend="sect-Reference_Guide-Searching_Repository_Content-Similarity"/>.
- </para>
- </listitem>
- <listitem>
- <para>
- Check <emphasis role="bold">spelling</emphasis>
of a full text query statement: <xref
linkend="sect-Reference_Guide-Searching_Repository_Content-SpellChecker"/>.
- </para>
- </listitem>
- <listitem>
- <para>
- Define index <emphasis role="bold">aggregates and
rules</emphasis>: IndexingConfiguration.
- </para>
- </listitem>
- </itemizedlist>
- </section>
- </section>
- <section xmlns=""
id="chap-Reference_Guide-JDBC_Data_Container_Config">
- <title>Configuring the JDBC Data Container</title>
- <section
id="sect-Reference_Guide-JDBC_Data_Container_Config-Introduction">
- <title>Introduction</title>
- <para>
- The eXo JCR persistent data container can work in two configuration modes:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- <phrase>Multi-database</phrase>: One database for each
workspace (used in standalone eXo JCR service mode)
- </para>
- </listitem>
- <listitem>
- <para>
- <phrase>Single-database</phrase>: All workspaces
persisted in one database (used in embedded eXo JCR service mode, e.g. in eXo portal)
- </para>
- </listitem>
- </itemizedlist>
- <para>
- The data container uses a JDBC driver to communicate with the actual
database software, so any JDBC-enabled data storage can be used with the eXo JCR
implementation.
- </para>
- <para>
- Currently the data container is tested with the following RDBMS:
- </para>
-<!-- Source Metadata
-URL: NA (email from Nicholas Filetto to jbossexoD(a)googlegroups.com on 10/18/2011
-Author [w/email]: Nicholas Filetto: nicolas.filotto(a)exoplatform.com
-License: NA
- --> <table
id="tabl-Reference_Guide-Introduction-Supported-databases">
- <title>Supported databases</title>
- <tgroup cols="2">
- <thead>
- <row>
- <entry> Database </entry>
- <entry> Driver Version </entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry> IBM DB2 9.7 (FP5) </entry>
- <entry> IBM DB2 JDBC Universal Driver Architecture 4.13.80
</entry>
- </row>
- <row>
- <entry> Oracle 11g R1 (11.1.0.7.0) </entry>
- <entry> Oracle JDBC Driver 11.1.0.7 </entry>
- </row>
- <row>
- <entry> Oracle 11g R1 RAC (11.1.0.7.0) </entry>
- <entry> Oracle JDBC Driver 11.1.0.7 </entry>
- </row>
- <row>
- <entry> Oracle 11g R2 (11.2.0.3.0) </entry>
- <entry> Oracle JDBC Driver v11.2.0.3.0 </entry>
- </row>
- <row>
- <entry> Oracle 11g R2 RAC (11.2.0.3.0) </entry>
- <entry> Oracle JDBC Driver v11.2.0.3.0 </entry>
- </row>
- <row>
- <entry> MySQL 5.1 </entry>
- <entry> MySQL Connector/J 5.1.21 </entry>
- </row>
- <row>
- <entry> MySQL 5.5 </entry>
- <entry> MySQL Connector/J 5.1.21 </entry>
- </row>
- <row>
- <entry> Microsoft SQL Server 2008 </entry>
- <entry> Microsoft SQL Server JDBC Driver 3.0.1301.101, Microsoft
SQL Server JDBC Driver 4.0.2206.100 </entry>
- </row>
- <row>
- <entry> Microsoft SQL Server 2008 R2 </entry>
- <entry> Microsoft SQL Server JDBC Driver 3.0.1301.101, Microsoft
SQL Server JDBC Driver 4.0.2206.100 </entry>
- </row>
- <row>
- <entry> PostgreSQL 8.4.8 </entry>
- <entry> JDBC4 Postgresql Driver, Version 8.4-703 </entry>
- </row>
- <row>
- <entry> PostgreSQL 9.1.0 </entry>
- <entry> JDBC4 Postgresql Driver, Version 9.1-903 </entry>
- </row>
- <row>
- <entry> Sybase ASE 15.7 </entry>
- <entry> Sybase jConnect JDBC driver v7 </entry>
- </row>
- </tbody>
- </tgroup>
- </table>
- <note>
- <title>Isolation Levels</title>
- <para>
- The JCR requires at least the
<parameter>READ_COMMITTED</parameter> isolation level. Other RDBMS
configurations can cause side effects and other issues, so make sure the proper
isolation level is configured on the database server side.
- </para>
- </note>
- <note>
- <para>
- The JCR also requires that the underlying database use a case-sensitive
collation. Microsoft SQL Server 2005 and 2008 users may configure
their server with whichever collation suits their needs and requirements, provided
it is case sensitive. For more information, refer to the Microsoft SQL Server
documentation page "Selecting a SQL Server Collation" <ulink
url="http://msdn.microsoft.com/en-us/library/ms144250.aspx">...
- </para>
- </note>
- <note>
- <para>
- Be aware that the JCR does not support the MyISAM storage engine for the MySQL
relational database management system.
- </para>
- </note>
- <para>
- Every supported database implements the ANSI SQL standards but also has its own
specifics. Each database therefore has its own configuration setting in the eXo JCR, the
database dialect parameter. More detailed database configuration can be done by
editing the metadata SQL-script files.
- </para>
- <remark>NEEDINFO - FILE PATHS - The path needs to be updated with the
equivalent path for JBoss Portal Platform instead of gatein, please see below para. New
info required?</remark>
- <para>
- You can find SQL-scripts in <filename>conf/storage/</filename>
directory of the
<filename><replaceable>JPP_HOME</replaceable>/modules/org/gatein/lib/main/exo.jcr.component.core-&JCR_VERSION;.jar</filename>
file.
- </para>
- <para>
- The following tables show the correspondence between the scripts and
databases:
- </para>
- <table id="tabl-Reference_Guide-Introduction-Single_database">
- <title>Single-database</title>
- <tgroup cols="2">
- <thead>
- <row>
- <entry> Database </entry>
- <entry> Script </entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry> MySQL DB </entry>
- <entry>
- <filename>jcr-sjdbc.mysql.sql</filename>
- </entry>
- </row>
- <row>
- <entry> MySQL DB with utf-8 </entry>
- <entry>
- <filename>jcr-sjdbc.mysql-utf8.sql</filename>
- </entry>
- </row>
- <row>
- <entry> PostgreSQL </entry>
- <entry>
- <filename>jcr-sjdbc.pgsql.sql</filename>
- </entry>
- </row>
- <row>
- <entry> Oracle DB </entry>
- <entry>
- <filename>jcr-sjdbc.ora.sql</filename>
- </entry>
- </row>
- <row>
- <entry> DB2 9.7 </entry>
- <entry>
- <filename>jcr-sjdbc.db2.sql</filename>
- </entry>
- </row>
- <row>
- <entry> MS SQL Server </entry>
- <entry>
- <filename>jcr-sjdbc.mssql.sql</filename>
- </entry>
- </row>
- <row>
- <entry> Sybase </entry>
- <entry>
- <filename>jcr-sjdbc.sybase.sql</filename>
- </entry>
- </row>
- <row>
- <entry> HSQLDB </entry>
- <entry>
- <filename>jcr-sjdbc.sql</filename>
- </entry>
- </row>
- </tbody>
- </tgroup>
- </table>
- <table id="tabl-Reference_Guide-Introduction-Multi_database">
- <title>Multi-database</title>
- <tgroup cols="2">
- <thead>
- <row>
- <entry> Database </entry>
- <entry> Script </entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry> MySQL DB </entry>
- <entry>
- <filename>jcr-mjdbc.mysql.sql</filename>
- </entry>
- </row>
- <row>
- <entry> MySQL DB with utf-8 </entry>
- <entry>
- <filename>jcr-mjdbc.mysql-utf8.sql</filename>
- </entry>
- </row>
- <row>
- <entry> PostgreSQL </entry>
- <entry>
- <filename>jcr-mjdbc.pgsql.sql</filename>
- </entry>
- </row>
- <row>
- <entry> Oracle DB </entry>
- <entry>
- <filename>jcr-mjdbc.ora.sql</filename>
- </entry>
- </row>
- <row>
- <entry> DB2 9.7 </entry>
- <entry>
- <filename>jcr-mjdbc.db2.sql</filename>
- </entry>
- </row>
- <row>
- <entry> MS SQL Server </entry>
- <entry>
- <filename>jcr-mjdbc.mssql.sql</filename>
- </entry>
- </row>
- <row>
- <entry> Sybase </entry>
- <entry>
- <filename>jcr-mjdbc.sybase.sql</filename>
- </entry>
- </row>
- <row>
- <entry> HSQLDB </entry>
- <entry>
- <filename>jcr-mjdbc.sql</filename>
- </entry>
- </row>
- </tbody>
- </tgroup>
- </table>
- <para>
- If node names contain non-ANSI characters, you must use a database with multi-language
support. Some JDBC drivers need additional parameters to establish a Unicode-friendly
connection. For example, with MySQL an additional parameter must be appended to the
JDBC URL:
- </para>
- <para>
- There are preconfigured configuration files for HSQLDB. Look for these files
in /conf/portal and /conf/standalone folders of the jar-file
<package>exo.jcr.component.core-&JCR_VERSION;.jar</package> or
source-distribution of eXo JCR implementation.
- </para>
- <example
id="exam-Reference_Guide-Introduction-Example_Parameter">
- <title>Example Parameter</title>
-
<
programlisting><code>jdbc:mysql://exoua.dnsalias.net/portal?char...
- </example>
- <para>
- The configuration files are located in service jars
<filename>/conf/portal/configuration.xml</filename> (eXo services including
JCR Repository Service) and <filename>exo-jcr-config.xml</filename>
(repositories configuration) by default. In JBoss Portal Platform, the JCR is configured
in portal web application
<filename>portal/WEB-INF/conf/jcr/jcr-configuration.xml</filename> (JCR
Repository Service and related services) and
<filename>repository-configuration.xml</filename> (repositories
configuration).
- </para>
- <para>
- Read more about <xref
linkend="chap-Reference_Guide-JCR_configuration"/>.
- </para>
- </section>
- <section
id="sect-Reference_Guide-JDBC_Data_Container_Config-Multi_database_Configuration">
- <title>Multi-database Configuration</title>
- <para>
- You need to configure each workspace in a repository as part of
multi-database configuration. Databases may reside on remote servers as required.
- </para>
- <procedure>
- <title/>
- <step>
- <para>
- Configure the data containers in the
<literal>org.exoplatform.services.naming.InitialContextInitializer</literal>
service. This is the JNDI context initializer, which registers (binds) naming
resources (DataSources) for data containers.
- </para>
- <para>
- For example (two data containers
<parameter>jdbcjcr</parameter> - local HSQLDB,
<parameter>jdbcjcr1</parameter> - remote MySQL):
- </para>
- <programlisting language="XML" role="XML">
-<xi:include
xmlns:xi="http://www.w3.org/2001/XInclude"
href="extras/example-1.xml" parse="text"/></programlisting>
- <substeps>
- <step>
- <para>
- Configure the database connection parameters:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- <parameter>driverClassName</parameter>,
e.g. "org.hsqldb.jdbcDriver", "com.mysql.jdbc.Driver",
"org.postgresql.Driver"
- </para>
- </listitem>
- <listitem>
- <para>
- <parameter>url</parameter>, e.g.
"jdbc:hsqldb:file:target/temp/data/portal",
"jdbc:mysql://exoua.dnsalias.net/jcr"
- </para>
- </listitem>
- <listitem>
- <para>
- <parameter>username</parameter>, e.g.
"sa", "exoadmin"
- </para>
- </listitem>
- <listitem>
- <para>
- <parameter>password</parameter>, e.g.
"", "exo12321"
- </para>
- </listitem>
- </itemizedlist>
- </step>
- </substeps>
- <para>
- There can be connection pool configuration parameters
(org.apache.commons.dbcp.BasicDataSourceFactory):
- </para>
- <itemizedlist>
- <listitem>
- <para>
- <parameter>maxActive</parameter>, e.g. 50
- </para>
- </listitem>
- <listitem>
- <para>
- <parameter>maxIdle</parameter>, e.g. 5
- </para>
- </listitem>
- <listitem>
- <para>
- <parameter>initialSize</parameter>, e.g. 5
- </para>
- </listitem>
- <listitem>
- <para>
- and others, according to the <ulink
url="http://jakarta.apache.org/commons/dbcp/configuration.html"... DBCP
configuration</ulink>
- </para>
- </listitem>
- </itemizedlist>
- </step>
- <step>
- <para>
- Configure the repository service. Each workspace will be configured
for its own data container.
- </para>
- <para>
- For example (two workspaces <parameter>ws</parameter> -
jdbcjcr, <parameter>ws1</parameter> - jdbcjcr1):
- </para>
- <programlisting language="XML" role="XML">
-<xi:include
xmlns:xi="http://www.w3.org/2001/XInclude"
href="extras/example-2.xml" parse="text"/></programlisting>
- <itemizedlist>
- <listitem>
- <para>
- <parameter>source-name</parameter>: A
javax.sql.DataSource name configured in the InitialContextInitializer component (was
<parameter>sourceName</parameter> prior to JCR 1.9);
- </para>
- </listitem>
- <listitem>
- <para>
- <parameter>dialect</parameter>: A database
dialect, one of <literal>hsqldb</literal>,
<literal>mysql</literal>, <literal>mysql-utf8</literal>,
<literal>pgsql</literal>, <literal>oracle</literal>,
<literal>oracle-oci</literal>, <literal>mssql</literal>,
<literal>sybase</literal>, <literal>derby</literal>,
<literal>db2</literal>, <literal>db2v8</literal> or
<literal>auto</literal> for dialect autodetection;
- </para>
- </listitem>
- <listitem>
- <para>
- <parameter>multi-db</parameter>: Enable
multi-database container with this parameter (set value "true");
- </para>
- </listitem>
- <listitem>
- <para>
- <parameter>max-buffer-size</parameter>: A
threshold (in bytes) after which the content of a <literal>javax.jcr.Value</literal>
is swapped to a file in temporary storage, for example when swapping pending
changes.
- </para>
- </listitem>
- <listitem>
- <para>
- <parameter>swap-directory</parameter>: A path in
the file system used to swap the pending changes.
- </para>
- </listitem>
- </itemizedlist>
- </step>
- </procedure>
- <para>
- This procedure configures two workspaces that are persisted in two
different databases (<emphasis>ws</emphasis> in HSQLDB and
<emphasis>ws1</emphasis> in MySQL).
- </para>
- </section>
- <section
id="sect-Reference_Guide-JDBC_Data_Container_Config-Single_database_Configuration">
- <title>Single-database Configuration</title>
- <para>
- Configuring a single-database data container is easier than configuring a
multi-database data container as only one naming resource must be configured.
- </para>
- <example
id="exam-Reference_Guide-Single_database_Configuration-jdbcjcr_Data_Container">
- <title>jdbcjcr Data Container</title>
- <programlisting language="XML">
-<xi:include
xmlns:xi="http://www.w3.org/2001/XInclude"
href="extras/example-3.xml" parse="text"/></programlisting>
- </example>
- <para>
- Configure repository workspaces with this one database. The
<parameter>multi-db</parameter> parameter must be set as
<literal>false</literal>.
- </para>
- <para>
- For example (two workspaces <parameter>ws</parameter> -
<literal>jdbcjcr</literal>, <parameter>ws1</parameter> -
<literal>jdbcjcr</literal>):
- </para>
- <example
id="exam-Reference_Guide-Single_database_Configuration-Example">
- <title>Example</title>
- <programlisting language="XML" role="XML">
-<xi:include
xmlns:xi="http://www.w3.org/2001/XInclude"
href="extras/example-4.xml" parse="text"/></programlisting>
- </example>
- <para>
- This configures two persistent workspaces in one database (PostgreSQL).
- </para>
- <section
id="sect-Reference_Guide-Single_database_Configuration-Configuration_without_DataSource">
- <title>Configuration without DataSource</title>
- <para>
- It is possible to configure the repository without binding
<literal>javax.sql.DataSource</literal> in the JNDI service if you have a
dedicated JDBC driver implementation with special features like XA transactions,
statements/connections pooling etc:
- </para>
- <procedure>
- <title/>
- <step>
- <para>
- Remove the configuration in
<literal>InitialContextInitializer</literal> for your database and configure a
new one directly in the workspace container.
- </para>
- </step>
- <step>
- <para>
- Remove the <parameter>source-name</parameter> parameter
and add the properties shown in the example below in its place, supplying your own values for the
JDBC driver, database URL, username and password.
- </para>
- </step>
- </procedure>
- <warning>
- <title>Connection Pooling</title>
- <para>
- Ensure the JDBC driver provides connection pooling. Connection
pooling is strongly recommended for use with the JCR to prevent a database overload.
- </para>
- </warning>
- <programlisting language="XML"
role="XML"><workspace name="ws"
auto-init-root-nodetype="nt:unstructured">
- <container
class="org.exoplatform.services.jcr.impl.storage.jdbc.JDBCWorkspaceDataContainer">
- <properties>
- <property name="dialect"
value="hsqldb"/>
- <property name="driverClassName"
value="org.hsqldb.jdbcDriver"/>
- <property name="url"
value="jdbc:hsqldb:file:target/temp/data/portal"/>
- <property name="username"
value="su"/>
- <property name="password"
value=""/>
- ......</programlisting>
- </section>
- <section
id="sect-Reference_Guide-Single_database_Configuration-Dynamic_Workspace_Creation">
- <title>Dynamic Workspace Creation</title>
- <para>
- Workspaces can be added dynamically during runtime.
- </para>
- <para>
- This can be performed in two steps:
- </para>
- <procedure>
- <title/>
- <step>
- <para>
-
<literal>ManageableRepository.configWorkspace(WorkspaceEntry
wsConfig)</literal>: Register a new configuration in RepositoryContainer and create
a WorkspaceContainer.
- </para>
- </step>
- <step>
- <para>
- <literal>ManageableRepository.createWorkspace(String
workspaceName)</literal>: Create the new workspace.
- </para>
- </step>
- </procedure>
- </section>
- </section>
- <section
id="sect-Reference_Guide-JDBC_Data_Container_Config-Simple_and_Complex_queries">
- <title>Simple and Complex queries</title>
- <para>
- The eXo JCR provides two ways to interact with the database:
- </para>
- <variablelist>
- <title/>
- <varlistentry>
- <term>
- <literal>JDBCStorageConnection</literal>
- </term>
- <listitem>
- <para>
- Uses simple queries. Simple queries do not use subqueries or
left or right joins. They are implemented in such a way as to support as many database
dialects as possible.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>
- <literal>CQJDBCStorageConnection</literal>
- </term>
- <listitem>
- <para>
- Uses complex queries. Complex queries are optimized to
reduce the number of database calls.
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
- <para>
- Simple queries will be used if you choose
<literal>org.exoplatform.services.jcr.impl.storage.jdbc.JDBCWorkspaceDataContainer</literal>:
- </para>
- <programlisting language="XML"
role="XML"><workspaces>
- <workspace name="ws"
auto-init-root-nodetype="nt:unstructured">
- <container
class="org.exoplatform.services.jcr.impl.storage.jdbc.JDBCWorkspaceDataContainer">
- ...
- </workspace>
-</workspaces>
-</programlisting>
- <para>
- Complex queries will be used if you choose
<literal>org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer</literal>:
- </para>
- <programlisting language="XML"
role="XML"><workspaces>
- <workspace name="ws"
auto-init-root-nodetype="nt:unstructured">
- <container
class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
- ...
- </workspace>
-</workspaces></programlisting>
- </section>
- <section
id="sect-Reference_Guide-JDBC_Data_Container_Config-Force_Query_Hints">
- <title>Force Query Hints</title>
- <para>
- Some databases, such as Oracle and MySQL, support hints to increase query
performance. The eXo JCR has separate Complex Query implementations for the Oracle
database dialect, which use query hints to increase performance for a few important
queries.
- </para>
- <para>
- To enable this option, use the following configuration property:
- </para>
- <programlisting language="XML"
role="XML"><workspace name="ws"
auto-init-root-nodetype="nt:unstructured">
- <container
class="org.exoplatform.services.jcr.impl.storage.jdbc.JDBCWorkspaceDataContainer">
- <properties>
- <property name="dialect"
value="oracle"/>
- <property name="force.query.hints"
value="true" />
- ......</programlisting>
- <para>
- Query hints are only used for Complex Queries with the Oracle dialect. For
all other dialects this parameter is ignored.
- </para>
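- <para>
- Because hints apply only to Complex Queries, the option would normally be paired with the complex-query container described in the previous section. The following is a minimal sketch under that assumption; the <literal>source-name</literal> value <literal>jdbcjcr</literal> is a placeholder and the remaining container properties are omitted:
- </para>
- <programlisting language="XML" role="XML"><workspace name="ws" auto-init-root-nodetype="nt:unstructured">
- <container class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
- <properties>
- <!-- Placeholder: use the data source name bound in InitialContextInitializer -->
- <property name="source-name" value="jdbcjcr"/>
- <property name="dialect" value="oracle"/>
- <!-- Ignored for every dialect other than oracle -->
- <property name="force.query.hints" value="true"/>
- </properties>
- </container>
-</workspace></programlisting>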
- </section>
- <section
id="sect-Reference_Guide-JDBC_Data_Container_Config-Notes_for_Microsoft_Windows_users">
- <title>Notes for Microsoft Windows users</title>
- <para>
- The current configuration of eXo JCR uses <ulink
url="http://commons.apache.org/dbcp/">Apache DBCP</ulink> connection
pool (<literal>org.apache.commons.dbcp.BasicDataSourceFactory</literal>).
- </para>
- <para>
- It is possible to set a high value for the
<parameter>maxActive</parameter> parameter in the
<filename>configuration.xml</filename> file. A high value means that many TCP/IP
ports are opened from the client machine by the pool (by the JDBC driver, for example). As a result,
the data container can throw exceptions like "<emphasis>Address already in
use</emphasis>".
- </para>
- <para>
- To solve this problem, you must configure the client machine's
networking software to use shorter timeouts for open TCP/IP ports.
- </para>
- <para>
- This is done by editing two registry keys within the
<parameter>HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters</parameter>
node. Both of these keys are unset by default. To set the keys as required:
- </para>
- <procedure>
- <title/>
- <step>
- <para>
- Set the <parameter>MaxUserPort</parameter> registry key
to <parameter>dword:00001b58</parameter>. This sets the maximum number of open ports
to 7000; a higher value may also be used (the default is 5000).
- </para>
- </step>
- <step>
- <para>
- Set <parameter>TcpTimedWaitDelay</parameter> to
<parameter>dword:0000001e</parameter>. This sets the
<parameter>TIME_WAIT</parameter> parameter to 30 seconds (the default is
240 seconds).
- </para>
- </step>
- </procedure>
- <example
id="exam-Reference_Guide-Notes_for_Microsoft_Windows_users-Sample_Registry_File">
- <title>Sample Registry File</title>
- <programlisting>Windows Registry Editor Version 5.00
-
-[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters]
-"MaxUserPort"=dword:00001b58
-"TcpTimedWaitDelay"=dword:0000001e</programlisting>
- </example>
- </section>
- </section>
- <section xmlns=""
id="chap-Reference_Guide-External_Value_Storages">
- <title>External Value Storages</title>
- <section
id="sect-Reference_Guide-External_Value_Storages-Introduction">
- <title>Introduction</title>
- <para>
- JCR values are stored in the Workspace Data container by default. The eXo JCR
offers an additional option of storing JCR values separately from the Workspace Data
container which can help keep Binary Large Objects (BLOBs) separate.
- </para>
-<!-- <para>
- Value storage configuration is a part of the repository configuration. Refer
to <xref
linkend="sect-Reference_Guide-JCR_configuration-Example_of_the_portal_system_workspace"
/> for more details.
- </para> --> <para>
- Tree-based storage is recommended in most cases.
- </para>
-<!-- Not sure this is necessary
-<para>
-If you run an application on Amazon EC2 - the S3 option may be interesting for
architecture. Simple 'flat' storage is good in speed of creation/deletion of
values, it might be a compromise for a small storages.
-</para> --> </section>
- <section
id="sect-Reference_Guide-External_Value_Storages-Tree_File_Value_Storage">
- <title>Tree File Value Storage</title>
- <para>
- Tree File Value Storage holds values as files in a tree-like directory structure on the file system.
The <property>path</property> property points to the root directory in which the
files are stored.
- </para>
- <para>
- This is the recommended type of external storage because it can contain a large
number of files, limited only by the free space on the disk or volume.
- </para>
- <para>
- However, value deletion can take longer with Tree File Value Storage,
because unused tree nodes must be removed.
- </para>
- <example>
- <title>Tree File Value Storage Configuration</title>
- <programlisting language="XML"
role="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude" href="extras/default25.xml"
parse="text"/></programlisting>
- <para>
- Comment #1: The <emphasis
role="bold">id</emphasis> is the value storage unique identifier, used
for linking with properties stored in a workspace container.
- </para>
- <para>
- Comment #2: the <emphasis
role="bold">path</emphasis> is a location where value files will be
stored.
- </para>
- </example>
- <para>
- Each file value storage can have the <function>filters</function>
for incoming values. A filter can match values by
<property>property-type</property>,
<property>property-name</property>,
<property>ancestor-path</property>. It can also match the size of values
stored (<property>min-value-size</property>) in bytes.
- </para>
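- <para>
- As an illustrative sketch only, a single filter combining several of these criteria might look like the following. The attribute names are the ones listed above; the values <literal>jcr:data</literal> and <literal>/documents</literal> are hypothetical and chosen purely for the example:
- </para>
- <programlisting language="XML" role="XML"><filters>
- <!-- Hypothetical values: binary jcr:data properties below /documents, with a min-value-size of 1M -->
- <filter property-type="Binary" property-name="jcr:data" ancestor-path="/documents" min-value-size="1M"/>
-</filters></programlisting>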
- <para>
- In the previous example a filter with
<property>property-type</property> and
<property>min-value-size</property> has been used. The storage therefore holds only
binary values larger than 1 MB.
- </para>
- <para>
- It is recommended that properties with large values are stored in file value
storage only.
- </para>
- <para>
- The example below shows a value storage with a separate location for large
files (a filter with a <property>min-value-size</property> of 20 MB).
- </para>
- <para>
- A value storage uses ORed logic in the process of filter selection. This
means the first filter in the list will be called first and if it is not matched the next
will be called, and so on.
- </para>
- <para>
- In this example a value that matches the 20 MB
<property>min-value-size</property> filter will be stored in the path
"<literal>data/20Mvalues</literal>". All other values will
be stored in "<literal>data/values</literal>".
- </para>
- <programlisting language="XML"
role="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude" href="extras/default26.xml"
parse="text"/></programlisting>
- </section>
- <section
id="sect-Reference_Guide-External_Value_Storages-Disabling_value_storage">
- <title>Disabling value storage</title>
- <para>
- The JCR allows you to disable value storage by adding the following property
into its configuration.
- </para>
- <programlisting language="XML"><property
name="enabled" value="false"
/></programlisting>
- <warning>
- <title>Warning</title>
- <para>
- It is recommended that this functionality be used for internal and
testing purposes only, and with caution, as all stored values will be inaccessible.
- </para>
- </warning>
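- <para>
- The sketch below shows one place the property could go. It assumes that <literal>enabled</literal> sits alongside the other properties of the value storage being disabled; the storage <literal>id</literal>, class and <literal>path</literal> are reused from the Tree File Value Storage examples in this chapter, and the placement should be verified against your own repository configuration:
- </para>
- <programlisting language="XML" role="XML"><value-storage id="Storage #1" class="org.exoplatform.services.jcr.impl.storage.value.fs.TreeFileValueStorage">
- <properties>
- <property name="path" value="data/values"/>
- <!-- Assumed placement: next to the storage's other properties -->
- <property name="enabled" value="false"/>
- </properties>
-</value-storage></programlisting>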
- </section>
- </section>
- <section xmlns=""
id="chap-Reference_Guide-Workspace_Data_Container">
- <title>Workspace Data Container</title>
- <para>
- Each Workspace of the JCR has its own persistent storage to hold that
workspace's items data. The eXo JCR can be configured so that it can use one or
more workspaces that are logical units of the repository content.
- </para>
- <para>
- The physical data storage mechanism is configured using the mandatory
<emphasis role="bold">container</emphasis> element. The type of container is
given by the attribute <parameter>class =
<replaceable>fully_qualified_name_of_org.exoplatform.services.jcr.storage.WorkspaceDataContainer_subclass</replaceable></parameter>.
- </para>
- <example>
- <title>Physical Data Storage Configuration</title>
- <programlisting language="XML"
role="XML"><container
class="org.exoplatform.services.jcr.impl.storage.jdbc.JDBCWorkspaceDataContainer">
- <properties>
- <property name="source-name"
value="jdbcjcr1"/>
- <property name="dialect"
value="hsqldb"/>
- <property name="multi-db"
value="true"/>
- <property name="max-buffer-size"
value="200K"/>
- <property name="swap-directory"
value="target/temp/swap/ws"/>
- <property name="lazy-node-iterator-page-size"
value="50"/>
- <property name="acl-bloomfilter-false-positive-probability"
value="0.1d"/>
- <property name="acl-bloomfilter-elements-number"
value="1000000"/>
- </properties></programlisting>
- <para>
- <literal>source-name</literal>: The JDBC data source name
which is registered in JNDI by InitialContextInitializer. This was known as
<literal>sourceName</literal> in versions prior to 1.9.
- </para>
- <para>
- <literal>dialect</literal>: The database dialect. Must be
one of the following: <literal>hsqldb</literal>,
<literal>mysql</literal>, <literal>mysql-utf8</literal>,
<literal>pgsql</literal>, <literal>oracle</literal>,
<literal>oracle-oci</literal>, <literal>mssql</literal>,
<literal>sybase</literal>, <literal>derby</literal>,
<literal>db2</literal> or <literal>db2v8</literal>.
- </para>
- <para>
- <literal>multi-db</literal>: This parameter, if
<literal>true</literal>, enables multi-database container.
- </para>
- <para>
- <literal>max-buffer-size</literal>: A threshold in bytes.
If a value size is greater than this setting, then it will be spooled to a temporary
file.
- </para>
- <para>
- <literal>swap-directory</literal>: A location where the
value will be spooled if no value storage is configured but a
<literal>max-buffer-size</literal> is exceeded.
- </para>
- <para>
- <literal>lazy-node-iterator-page-size</literal>:
The page size of the "lazy" child nodes iterator, that is, the number
of nodes that are retrieved from persistent storage at once.
- </para>
- <para>
-
<literal>acl-bloomfilter-false-positive-probability</literal>: ACL
Bloom-filter settings. ACL Bloom-filter desired false positive probability. Range [0..1].
Default value 0.1d.
- </para>
- <para>
- <literal>acl-bloomfilter-elements-number</literal>: ACL
Bloom-filter settings. Expected number of ACL-elements in the Bloom-filter. Default value
1000000.
- </para>
- </example>
- <note>
- <para>
- Bloom filters are not supported by all cache implementations. So far, only
the Infinispan implementation supports them.
- </para>
- <para>
- A Bloom filter is used to avoid reading nodes that definitely do not have an ACL. The
<emphasis
role="bold">acl-bloomfilter-false-positive-probability</emphasis> and
<emphasis role="bold">acl-bloomfilter-elements-number</emphasis>
parameters are used to configure such filters.
- </para>
- <para>
- For more information about Bloom filters, see <ulink
url="http://en.wikipedia.org/wiki/Bloom_filter">http://en.wi...;.
- </para>
- </note>
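- <para>
- As a minimal sketch, the optional parameters described above could be set together in
the container <literal>properties</literal> element as shown below. The
<literal>max-buffer-size</literal> and <literal>swap-directory</literal>
values are taken from the external configuration example in this guide, the Bloom filter
values are the documented defaults, and the page size is a purely illustrative assumption
rather than a recommended setting.
- </para>
- <programlisting language="XML" role="XML"><properties>
- <!-- values larger than this threshold are spooled to a temporary file -->
- <property name="max-buffer-size" value="200k" />
- <!-- used only if no value storage is configured -->
- <property name="swap-directory" value="../temp/swap/production" />
- <!-- hypothetical page size, chosen for illustration only -->
- <property name="lazy-node-iterator-page-size" value="100" />
- <!-- documented defaults; effective only with a cache implementation that supports Bloom filters -->
- <property name="acl-bloomfilter-false-positive-probability" value="0.1" />
- <property name="acl-bloomfilter-elements-number" value="1000000" />
-</properties></programlisting>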
- <para>
- The eXo JCR provides a production-ready, JDBC-based <emphasis
role="bold">Workspace Data Container</emphasis> that is backed by a relational database.
- </para>
- <para>
- Workspace Data Container <emphasis>may</emphasis> support external
storage for <literal>javax.jcr.Value</literal> (which can be the case for
BLOB values for example) using the optional element
<literal>value-storages</literal>.
- </para>
- <para>
- The Data Container will try to read or write a Value using the underlying value
storage plug-in if the filter criteria (see below) match the current property.
- </para>
- <example>
- <title>External Value Storage Configuration</title>
- <programlisting language="XML"
role="XML"><value-storages>
- <value-storage id="Storage #1"
class="org.exoplatform.services.jcr.impl.storage.value.fs.TreeFileValueStorage">
- <properties>
- <property name="path"
value="data/values"/>
- </properties>
- <filters>
- <filter property-type="Binary"
min-value-size="1M"/><!-- Values larger than 1 MB
-->
- </filters>
-.........
-</value-storages></programlisting>
- <para>
- Each <literal>value-storage</literal> is implemented by a subclass of
<literal>org.exoplatform.services.jcr.storage.value.ValueStoragePlugin</literal>,
and <literal>properties</literal> are optional plug-in specific parameters.
- </para>
- <para>
- <literal>filters</literal>: Each file value storage can
have one or more filters for incoming values. If a filter has several criteria, they all
have to match (AND condition).
- </para>
- </example>
- <para>
- A filter can match values by property type (property-type), property
name (property-name), ancestor path (ancestor-path) and/or the size of values stored
(min-value-size, e.g. 1M, 4.2G, 100 (bytes)).
- </para>
- <para>
- The code sample above uses a filter with property-type and
min-value-size only. This means that the storage is used only for binary values whose size
is greater than 1 MB.
- </para>
- <para>
- It is recommended that you use a file value storage only for properties
with large values.
- </para>
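- <para>
- The following sketch shows how several filter criteria might be combined in a single
filter. The storage id, the path, the property name and the ancestor path are hypothetical
values chosen for illustration. Only the attribute names and the storage class come from
the description above. Because all criteria of a filter must match, this storage would
accept only binary <literal>jcr:data</literal> properties larger than 1 MB that
are located under <literal>/documents</literal>.
- </para>
- <programlisting language="XML" role="XML"><value-storages>
- <value-storage id="documents"
class="org.exoplatform.services.jcr.impl.storage.value.fs.TreeFileValueStorage">
- <properties>
- <!-- hypothetical directory for the stored values -->
- <property name="path" value="data/values/documents"/>
- </properties>
- <filters>
- <!-- all four criteria must match (AND condition); the values are assumptions -->
- <filter property-type="Binary" property-name="jcr:data"
ancestor-path="/documents" min-value-size="1M"/>
- </filters>
- </value-storage>
-</value-storages></programlisting>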
- </section>
- <section xmlns=""
id="chap-Reference_Guide-Cluster_Configuration">
- <title>Configuring Cluster</title>
- <section
id="sect-Reference_Guide-Cluster_Configuration-Launching_Cluster">
- <title>Launching Cluster</title>
- <section
id="sect-Reference_Guide-Launching_Cluster-Configuring_JCR_to_use_external_configuration">
- <title>Configuring JCR to use external configuration</title>
- <itemizedlist>
- <listitem>
- <para>
- To manually configure a repository, create a new configuration
file (<filename>exo-jcr-configuration.xml</filename> for example). For
details, see <xref linkend="chap-Reference_Guide-JCR_configuration"/>.
- </para>
- <para>
- The configuration file must be formatted as follows:
- </para>
- <example>
- <title>External Configuration</title>
- <programlisting language="XML"
role="XML"><repository-service
default-repository="repository1">
- <repositories>
- <repository name="repository1"
system-workspace="ws1"
default-workspace="ws1">
- <security-domain>exo-domain</security-domain>
- <access-control>optional</access-control>
-
<authentication-policy>org.exoplatform.services.jcr.impl.core.access.JAASAuthenticator</authentication-policy>
- <workspaces>
- <workspace name="ws1">
- <container
class="org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
- <properties>
- <property name="source-name"
value="jdbcjcr" />
- <property name="dialect"
value="oracle" />
- <property name="multi-db"
value="false" />
- <property name="update-storage"
value="false" />
- <property name="max-buffer-size"
value="200k" />
- <property name="swap-directory"
value="../temp/swap/production" />
- </properties>
- <value-storages>
- <![CDATA[<!-- Comment #1 -->]]>
- </value-storages>
- </container>
- <initializer
class="org.exoplatform.services.jcr.impl.core.ScratchWorkspaceInitializer">
- <properties>
- <property name="root-nodetype"
value="nt:unstructured" />
- </properties>
- </initializer>
- <cache enabled="true"
class="org.exoplatform.services.jcr.impl.dataflow.persistent.jbosscache.JBossCacheWorkspaceStorageCache">
- <![CDATA[<!-- Comment #2 -->]]>
- </cache>
- <query-handler
class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
- <![CDATA[<!-- Comment #3 -->]]>
- </query-handler>
- <lock-manager
class="org.exoplatform.services.jcr.impl.core.lock.jbosscache.CacheableLockManagerImpl">
- <![CDATA[<!-- Comment #4 -->]]>
- </lock-manager>
- </workspace>
- <workspace name="ws2">
- ...
- </workspace>
- <workspace name="wsN">
- ...
- </workspace>
- </workspaces>
- </repository>
- </repositories>
-</repository-service></programlisting>
- <para>
- Comment #1: Refer to <xref
linkend="exam-Reference_Guide-Configuration_requirements-Value_Storage_configuration"/>.
- </para>
- <para>
- Comment #2: Refer to <xref
linkend="exam-Reference_Guide-Configuration_requirements-Cache_configuration"/>.
- </para>
- <para>
- Comment #3: Refer to <xref
linkend="exam-Reference_Guide-Configuration_requirements-Indexer_configuration"/>.
- </para>
- <para>
- Comment #4: Refer to <xref
linkend="exam-Reference_Guide-Configuration_requirements-Lock_Manager_configuration"/>.
- </para>
- </example>
- </listitem>
- <listitem>
- <para>
- Then, update the
<parameter>RepositoryServiceConfiguration</parameter> component in the
<filename>exo-configuration.xml</filename> file to reference your file:
- </para>
- <programlisting language="XML"
role="XML"><component>
-
<key>org.exoplatform.services.jcr.config.RepositoryServiceConfiguration</key>
-
<type>org.exoplatform.services.jcr.impl.config.RepositoryServiceConfigurationImpl</type>
- <init-params>
- <value-param>
- <name>conf-path</name>
- <description>JCR configuration file</description>
- <value>exo-jcr-configuration.xml</value>
- </value-param>
- </init-params>
-</component></programlisting>
- </listitem>
- </itemizedlist>
- </section>
- </section>
- <section
id="sect-Reference_Guide-Cluster_Configuration-Requirements">
- <title>Requirements</title>
- <section
id="sect-Reference_Guide-Requirements-Environment_requirements">
- <title>Environment requirements</title>
- <itemizedlist>
- <listitem>
- <para>
- Every node of the cluster <emphasis
role="bold">must</emphasis> have the same mounted Network File System
(<abbrev>NFS</abbrev>) with read and write permissions on it.
- </para>
- </listitem>
- <listitem>
- <para>
- Every node of the cluster <emphasis
role="bold">must</emphasis> use the same database.
- </para>
- </listitem>
- <listitem>
- <para>
- The same clusters on different nodes <emphasis
role="bold">must</emphasis> have the same names.
- </para>
- <example
id="exam-Reference_Guide-Environment_requirements-Example">
- <title>Example</title>
- <para>
- If the <emphasis>Indexer</emphasis> cluster in
the <emphasis>production</emphasis> workspace on the first node is named
<literal>production_indexer_cluster</literal>, then
<emphasis>indexer</emphasis> clusters in the
<emphasis>production</emphasis> workspace on all other nodes <emphasis
role="bold">must</emphasis> also be named
<literal>production_indexer_cluster</literal>.
- </para>
- </example>
- </listitem>
- </itemizedlist>
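- <para>
- To illustrate the naming requirement, the fragment below sketches the relevant
<literal>query-handler</literal> property as it would have to appear, unchanged,
in the <emphasis>production</emphasis> workspace configuration on every node.
The cluster name is the one used in the example above. The remaining properties are
elided.
- </para>
- <programlisting language="XML" role="XML"><query-handler
class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
- <properties>
- ...
- <!-- must be identical on all nodes of the cluster -->
- <property name="jbosscache-cluster-name"
value="production_indexer_cluster" />
- </properties>
-</query-handler></programlisting>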
- </section>
- <section
id="sect-Reference_Guide-Requirements-Configuration_requirements">
- <title>Configuration requirements</title>
- <para>
- The configuration of every workspace in the repository must contain the
following elements:
- </para>
- <example
id="exam-Reference_Guide-Configuration_requirements-Value_Storage_configuration">
- <title>Value Storage configuration</title>
- <programlisting language="XML"
role="XML"><value-storages>
- <value-storage id="system"
class="org.exoplatform.services.jcr.impl.storage.value.fs.TreeFileValueStorage">
- <properties>
- <property name="path"
value="/mnt/tornado/temp/values/production" /> <!--path
within NFS where ValueStorage will hold its data-->
- </properties>
- <filters>
- <filter property-type="Binary" />
- </filters>
- </value-storage>
-</value-storages></programlisting>
- </example>
- <example
id="exam-Reference_Guide-Configuration_requirements-Cache_configuration">
- <title>Cache configuration</title>
- <programlisting language="XML"
role="XML"><cache enabled="true"
class="org.exoplatform.services.jcr.impl.dataflow.persistent.jbosscache.JBossCacheWorkspaceStorageCache">
- <properties>
- <property name="jbosscache-configuration"
value="jar:/conf/portal/test-jbosscache-data.xml" />
<!-- path to JBoss Cache configuration for data storage -->
- <property name="jgroups-configuration"
value="jar:/conf/portal/udp-mux.xml" />
<!-- path to JGroups configuration -->
- <property name="jbosscache-cluster-name"
value="JCR_Cluster_cache_production" />
<!-- JBoss Cache data storage cluster name -->
- <property name="jgroups-multiplexer-stack"
value="true" />
- </properties>
-</cache></programlisting>
- </example>
- <example
id="exam-Reference_Guide-Configuration_requirements-Indexer_configuration">
- <title>Indexer configuration</title>
- <programlisting language="XML"
role="XML"><query-handler
class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
- <properties>
- <property name="changesfilter-class"
value="org.exoplatform.services.jcr.impl.core.query.jbosscache.JBossCacheIndexChangesFilter"
/>
- <property name="index-dir"
value="/mnt/tornado/temp/jcrlucenedb/production" />
<!-- path within NFS where the indexer will hold its data
-->
- <property name="jbosscache-configuration"
value="jar:/conf/portal/test-jbosscache-indexer.xml" />
<!-- path to JBoss Cache configuration for indexer -->
- <property name="jgroups-configuration"
value="jar:/conf/portal/udp-mux.xml" />
<!-- path to JGroups configuration -->
- <property name="jbosscache-cluster-name"
value="JCR_Cluster_indexer_production" />
<!-- JBoss Cache indexer cluster name -->
- <property name="jgroups-multiplexer-stack"
value="true" />
- </properties>
-</query-handler></programlisting>
- </example>
- <example
id="exam-Reference_Guide-Configuration_requirements-Lock_Manager_configuration">
- <title>Lock Manager configuration</title>
- <programlisting language="XML"
role="XML"><lock-manager
class="org.exoplatform.services.jcr.impl.core.lock.jbosscache.CacheableLockManagerImpl">
- <properties>
- <property name="time-out" value="15m"
/>
- <property name="jbosscache-configuration"
value="jar:/conf/portal/test-jbosscache-lock.xml" />
<!-- path to JBoss Cache configuration for lock manager -->
- <property name="jgroups-configuration"
value="jar:/conf/portal/udp-mux.xml" />
<!-- path to JGroups configuration -->
- <property name="jgroups-multiplexer-stack"
value="true" />
- <property name="jbosscache-cluster-name"
value="JCR_Cluster_lock_production" />
<!-- JBoss Cache locks cluster name -->
-
- <property name="jbosscache-cl-cache.jdbc.table.name"
value="jcrlocks_production"/> <!--
the name of the DB table where lock data will be stored -->
- <property name="jbosscache-cl-cache.jdbc.table.create"
value="true"/>
- <property name="jbosscache-cl-cache.jdbc.table.drop"
value="false"/>
- <property name="jbosscache-cl-cache.jdbc.table.primarykey"
value="jcrlocks_production_pk"/>
- <property name="jbosscache-cl-cache.jdbc.fqn.column"
value="fqn"/>
- <property name="jbosscache-cl-cache.jdbc.node.column"
value="node"/>
- <property name="jbosscache-cl-cache.jdbc.parent.column"
value="parent"/>
- <property name="jbosscache-cl-cache.jdbc.datasource"
value="jdbcjcr"/>
- </properties>
-</lock-manager></programlisting>
- </example>
- </section>
- </section>
- </section>
- <section xmlns=""
id="chap-Reference_Guide-JBoss_Cache_configuration">
- <title>Configuring JBoss Cache</title>
- <section
id="sect-Reference_Guide-JBoss_Cache_configuration-Indexer_lock_manager_and_data_container_configuration">
- <title>Indexer, lock manager and data container
configuration</title>
- <para>
- Each of these components uses its own instance of JBoss Cache for
caching in a clustered environment, so each one has its own transport and must be
configured correctly. Workspaces usually have similar configurations that differ only in
cluster names (and, possibly, some other parameters). The simplest way to configure them
is to define a separate configuration file for each component in each workspace:
- </para>
- <programlisting language="XML"
role="XML"><property name="jbosscache-configuration"
value="conf/standalone
- /test-jbosscache-lock-db1-ws1.xml" /></programlisting>
- <para>
- However, if there are many workspaces, configuring them in this way is
tedious and hard to manage. eXo JCR therefore offers a template-based configuration for
JBoss Cache instances. You can have one template for the Lock Manager, one for the Indexer
and one for the data container, and use them in all the workspaces by defining a map of
substitution parameters in the main configuration file. Define ${jbosscache-<parameter
name>} inside the XML template and list the correct value in the JCR configuration file
just below "jbosscache-configuration", as shown:
- </para>
- <para>
- Template:
- </para>
- <programlisting language="XML" role="XML">...
-<clustering mode="replication"
clusterName="${jbosscache-cluster-name}">
- <stateRetrieval timeout="20000"
fetchInMemoryState="false" />
-...</programlisting>
- <para>
- and JCR configuration file:
- </para>
- <programlisting language="XML" role="XML">...
-<property name="jbosscache-configuration"
value="jar:/conf/portal/jbosscache-lock.xml" />
-<property name="jbosscache-cluster-name"
value="JCR-cluster-locks-db1-ws" />
-...</programlisting>
- </section>
- <section
id="sect-Reference_Guide-JBoss_Cache_configuration-JGroups_configuration">
- <title>JGroups configuration</title>
- <para>
- JGroups is used by JBoss Cache for network communications and transport in a
clustered environment. If the <literal>jgroups-configuration</literal> property is
defined in the component configuration, it is injected into the JBoss Cache instance at
start up.
- </para>
- <programlisting language="XML"
role="XML"><property name="jgroups-configuration"
value="your/path/to/modified-udp.xml" /></programlisting>
- <para>
- As outlined above, each component (lock manager, data container and query
handler) for each workspace requires its own clustered environment. In other words, they
have their own clusters with unique names.
- </para>
- <para>
- By default, each cluster performs multicasts on a separate port. This
configuration causes a lot of unnecessary overhead in the cluster. For this reason, JGroups
offers a multiplexer feature, which makes it possible to use a single channel for a set of
clusters.
- </para>
- <para>
- The multiplexer reduces network overhead and increases the performance and
stability of the application. To enable the multiplexer stack, define an appropriate
configuration file (<filename>udp-mux.xml</filename> is shipped with
eXo JCR) and set "jgroups-multiplexer-stack" to
"true".
- </para>
- <programlisting language="XML"
role="XML"><property name="jgroups-configuration"
value="jar:/conf/portal/udp-mux.xml" />
-<property name="jgroups-multiplexer-stack"
value="true" /></programlisting>
- </section>
- <section
id="sect-Reference_Guide-JBoss_Cache_configuration-Sharing_JBoss_Cache_instances">
- <title>Sharing JBoss Cache instances</title>
- <para>
- As a single JBoss Cache instance can be demanding on resources, and the
default setup will have an instance each for the indexer, the lock manager and the data
container on each workspace, an environment that uses multiple workspaces may benefit from
sharing a JBoss Cache instance between several instances of the same type (the lock
manager instance, for example).
- </para>
- <para>
- This feature is disabled by default and can be enabled at the component
configuration level by setting the <parameter>jbosscache-shareable</parameter>
property to <literal>true</literal>:
- </para>
- <programlisting language="XML"
role="XML"><property name="jbosscache-shareable"
value="true" /></programlisting>
- <para>
- Once enabled, this feature will allow the JBoss Cache instance used by a
component to be re-used by other components of the same type with the same JBoss Cache
configuration (with the exception of the eviction configuration, which can differ).
- </para>
- <para>
- This means that all the parameters of type
<parameter>jbosscache-<replaceable><PARAM_NAME></replaceable></parameter>
must be identical between components of the same type in different workspaces.
- </para>
- <para>
- Therefore, if you can use the same values for the parameters in each
workspace, you only need three JBoss Cache instances (one instance each for the indexer,
lock manager and data container) running at once. This can relieve resource stress
significantly.
- </para>
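- <para>
- As a sketch of this set up, the fragment below shows the lock managers of two
workspaces sharing one JBoss Cache instance. The workspace names, the configuration path
and the class name are reused from examples elsewhere in this guide. The cluster name is a
hypothetical value. The point being illustrated is only that every
<parameter>jbosscache-<replaceable><PARAM_NAME></replaceable></parameter>
value is the same in both workspaces.
- </para>
- <programlisting language="XML" role="XML"><workspace name="ws1">
- ...
- <lock-manager
class="org.exoplatform.services.jcr.impl.core.lock.jbosscache.CacheableLockManagerImpl">
- <properties>
- <property name="jbosscache-configuration"
value="jar:/conf/portal/jbosscache-lock.xml" />
- <!-- hypothetical cluster name; identical in both workspaces -->
- <property name="jbosscache-cluster-name"
value="JCR-cluster-locks" />
- <property name="jbosscache-shareable" value="true" />
- </properties>
- </lock-manager>
-</workspace>
-<workspace name="ws2">
- ...
- <lock-manager
class="org.exoplatform.services.jcr.impl.core.lock.jbosscache.CacheableLockManagerImpl">
- <properties>
- <!-- same jbosscache-* values as in ws1, so the instance can be re-used -->
- <property name="jbosscache-configuration"
value="jar:/conf/portal/jbosscache-lock.xml" />
- <property name="jbosscache-cluster-name"
value="JCR-cluster-locks" />
- <property name="jbosscache-shareable" value="true" />
- </properties>
- </lock-manager>
-</workspace></programlisting>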
- </section>
- <section
id="sect-Reference_Guide-JBoss_Cache_configuration-Shipped_JBoss_Cache_configuration_templates">
- <title>Shipped JBoss Cache configuration templates</title>
- <para>
- The eXo JCR implementation is shipped with ready-to-use JBoss Cache
configuration templates for JCR's components. They are located in
<filename><replaceable>JPP_HOME</replaceable>/gatein/gatein.ear/portal.war/WEB-INF/conf/jcr/jbosscache</filename>
directory, inside either the <filename>cluster</filename> or
<filename>local</filename> directory.
- </para>
- <section
id="sect-Reference_Guide-Shipped_JBoss_Cache_configuration_templates-Data_container_template">
- <title>Data container template</title>
- <para>
- The data container template is
<filename>config.xml</filename>:
- </para>
- <programlisting language="XML"
role="XML"><?xml version="1.0"
encoding="UTF-8"?>
-<jbosscache
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="urn:jboss:jbosscache-core:config:3.1">
-
- <locking useLockStriping="false"
concurrencyLevel="50000"
lockParentForChildInsertRemove="false"
- lockAcquisitionTimeout="20000" />
-
- <clustering mode="replication"
clusterName="${jbosscache-cluster-name}">
- <stateRetrieval timeout="20000"
fetchInMemoryState="false" />
- <jgroupsConfig multiplexerStack="jcr.stack" />
- <sync />
- </clustering>
-
- <!-- Eviction configuration -->
- <eviction wakeUpInterval="5000">
- <default
algorithmClass="org.jboss.cache.eviction.LRUAlgorithm"
-
actionPolicyClass="org.exoplatform.services.jcr.impl.dataflow.persistent.jbosscache.ParentNodeEvictionActionPolicy"
- eventQueueSize="1000000">
- <property name="maxNodes"
value="1000000" />
- <property name="timeToLive"
value="120000" />
- </default>
- </eviction>
-</jbosscache></programlisting>
- </section>
- <section
id="sect-Reference_Guide-Shipped_JBoss_Cache_configuration_templates-Lock_manager_template">
- <title>Lock manager template</title>
- <para>
- The lock manager template is
<filename>lock-config.xml</filename>:
- </para>
- <programlisting language="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude"
href="extras/lock-config.xml_code"
parse="text"/></programlisting>
- </section>
- <section
id="sect-Reference_Guide-Shipped_JBoss_Cache_configuration_templates-Query_handler_indexer_template">
- <title>Query handler (indexer) template</title>
- <para>
- The query handler template is called
<filename>indexer-config.xml</filename>:
- </para>
- <programlisting language="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude"
href="extras/indexer-config.xml_code"
parse="text"/></programlisting>
- </section>
- </section>
- </section>
- <section xmlns="" id="chap-Reference_Guide-LockManager">
- <title>LockManager</title>
- <para>
- The LockManager stores lock objects. It can lock or release objects as required.
It is also responsible for removing stale locks.
- </para>
- <para>
- The LockManager in JBoss Portal Platform is implemented with
<classname>org.exoplatform.services.jcr.impl.core.lock.jbosscache.CacheableLockManagerImpl</classname>.
- </para>
- <para>
- It is enabled by adding <literal>lock-manager-configuration</literal>
to <literal>workspace-configuration</literal>.
- </para>
- <para>
- For example:
- </para>
- <programlisting language="XML" role="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude" href="extras/default47.xml"
parse="text"/></programlisting>
- <section
id="sect-Reference_Guide-LockManager-CacheableLockManagerImpl">
- <title>CacheableLockManagerImpl</title>
- <para>
- <classname>CacheableLockManagerImpl</classname> stores lock
objects in JBoss Cache (which uses JDBCCacheLoader to store locks in a database).
This means its locks are replicable and can affect an entire cluster rather than just a
single node.
- </para>
- <para>
- The length of time LockManager allows a lock to remain in place can be
configured with the "<literal>time-out</literal>" property.
- </para>
- <para>
- The LockRemover thread periodically polls LockManager for locks that have
passed the time-out limit and must be removed.
- </para>
- <para>
- The time-out for LockRemover is set as follows (the default value is 30m):
- </para>
- <programlisting language="XML"><properties>
- <property name="time-out" value="10m"
/>
- ...
-</properties></programlisting>
-<!-- Doesn't seem necessary
-<formalpara>
-<title>Configuration</title>
-<para>
-Replication requirements are same as for Cache
-</para>
-</formalpara>
-<para>
-Full JCR configuration example can be seen in <xref
linkend="sect-Reference_Guide-Clustering_with_JBoss_Application_Server_REMOVABLE"/>.
-</para>
-<title>Configuration Tips:</title>
-<listitem>
-<para>
-The <parameter>clusterName</parameter> ("jbosscache-cluster-name")
must be unique;
-</para>
-</listitem>
-<listitem>
-<para>
-The <parameter>cache.jdbc.table.name</parameter> must be unique per
datasource;
-</para>
-</listitem>
-<listitem>
-<para>
-The <parameter>cache.jdbc.fqn.type</parameter> and
<parameter>cache.jdbc.node.type</parameter> parameters must be configured
according to the database being used.
-</para>
-</listitem>
-</itemizedlist> --> <para>
- There are a number of ways to configure
<classname>CacheableLockManagerImpl</classname>. Each involves configuring
JBoss Cache and JDBCCacheLoader.
- </para>
- <itemizedlist>
- <listitem>
- <para>
- <xref
linkend="sect-Reference_Guide-CacheableLockManagerImpl-Simple_JBoss_Cache_Configuration"/>
- </para>
- </listitem>
- <listitem>
- <para>
- <xref
linkend="sect-Reference_Guide-CacheableLockManagerImpl-Template_JBoss_Cache_Configuration"/>
- </para>
- </listitem>
- </itemizedlist>
- <para>
- Refer to <ulink
url="http://community.jboss.org/wiki/JBossCacheJDBCCacheLoader"...
for more information about JBoss Cache and JDBCCacheLoader.
- </para>
- <section
id="sect-Reference_Guide-CacheableLockManagerImpl-Simple_JBoss_Cache_Configuration">
- <title>Simple JBoss Cache Configuration</title>
- <para>
- One method to configure the LockManager is to put a JBoss Cache
configuration file path into <classname>CacheableLockManagerImpl</classname>.
- </para>
- <note>
- <para>
- This is not the most efficient method for configuring the LockManager
as it requires a JBoss Cache configuration file for each LockManager configuration in each
workspace of each repository. The configuration set up can subsequently become quite
difficult to manage.
- </para>
- <para>
- This method is useful, however, if a single, specially configured
LockManager is required.
- </para>
- </note>
- <para>
- The required configuration is shown in the example below:
- </para>
- <programlisting language="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude" href="extras/default49.xml"
parse="text"/></programlisting>
- <para>
- Sample content of the
<replaceable>jbosscache-lock-config.xml</replaceable> file specified in the
<replaceable>jbosscache-configuration</replaceable> property is shown in the
code example below.
- </para>
- <example>
- <title>Sample Content of the jbosscache-lock-config.xml
File</title>
- <programlisting language="XML"
role="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude" href="extras/default50.xml"
parse="text"/></programlisting>
- <para>
- Comment #1: The cluster name at <parameter>clustering
mode="replication"
clusterName="JBoss-Cache-Lock-Cluster_Name"</parameter> must be
unique;
- </para>
- <para>
- Comment #2: The
<parameter>cache.jdbc.table.name</parameter> must be unique per datasource.
- </para>
- <para>
- Comment #3: The
<parameter>cache.jdbc.node.type</parameter> and
<parameter>cache.jdbc.fqn.type</parameter> parameters must be configured
according to the database in use. Refer to the table below for information about data
types.
- </para>
- </example>
- <table
id="tabl-Reference_Guide-Simple_JBoss_Cache_Configuration-Data_Types_in_Different_Databases">
- <title>Data Types in Different Databases</title>
- <tgroup cols="3">
- <thead>
- <row>
- <entry> Database name </entry>
- <entry> Node data type </entry>
- <entry> FQN data type </entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry> default </entry>
- <entry> BLOB </entry>
- <entry> VARCHAR(512) </entry>
- </row>
- <row>
- <entry> HSQLDB </entry>
- <entry> OBJECT </entry>
- <entry> VARCHAR(512) </entry>
- </row>
- <row>
- <entry> MySQL </entry>
- <entry> LONGBLOB </entry>
- <entry> VARCHAR(512) </entry>
- </row>
- <row>
- <entry> ORACLE </entry>
- <entry> BLOB </entry>
- <entry> VARCHAR2(512) </entry>
- </row>
- <row>
- <entry> PostgreSQL </entry>
- <entry> bytea </entry>
- <entry> VARCHAR(512) </entry>
- </row>
- <row>
- <entry> MSSQL </entry>
- <entry> VARBINARY(MAX) </entry>
- <entry> VARCHAR(512) </entry>
- </row>
- <row>
- <entry> DB2 </entry>
- <entry> BLOB </entry>
- <entry> VARCHAR(512) </entry>
- </row>
- <row>
- <entry> Sybase </entry>
- <entry> IMAGE </entry>
- <entry> VARCHAR(512) </entry>
- </row>
- </tbody>
- </tgroup>
- </table>
- </section>
- <section
id="sect-Reference_Guide-CacheableLockManagerImpl-Template_JBoss_Cache_Configuration">
- <title>Template JBoss Cache Configuration</title>
- <para>
- Another method to configure LockManager is to use a JBoss Cache
configuration template for all LockManagers.
- </para>
- <para>
- Below is an example
<filename>test-jbosscache-lock.xml</filename> template file:
- </para>
- <programlisting language="XML"
role="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude" href="extras/you.xml"
parse="text"/></programlisting>
- <para>
- The parameters that will populate the above file are shown below:
- </para>
- <example>
- <title>JBoss Cache Configuration Parameters</title>
- <programlisting language="XML"
role="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude" href="extras/default51.xml"
parse="text"/></programlisting>
- <para>
- Comment #1: The
<literal>jgroups-configuration</literal> has been moved to a separate
configuration file (<filename>udp-mux.xml</filename>, shown below). In this
case the <filename>udp-mux.xml</filename> is a common configuration for all
JGroups components (QueryHandler, cache, LockManager), but this is not a requirement of the
configuration method.
- </para>
- <para>
- Comment #2: The
<parameter>jbosscache-cl-cache.jdbc.fqn.column</parameter> and
<parameter>jbosscache-cl-cache.jdbc.node.type</parameter> parameters are not
explicitly defined because <parameter>cache.jdbc.fqn.type</parameter> and
<parameter>cache.jdbc.node.type</parameter> are defined in the JBoss Cache
configuration.
- </para>
- </example>
- <para>
- Refer to <xref
linkend="tabl-Reference_Guide-Simple_JBoss_Cache_Configuration-Data_Types_in_Different_Databases"/>
for information about setting these parameters, or set them to
<parameter>AUTO</parameter> and the data type will be detected automatically.
- </para>
- <para>
- <filename>udp-mux.xml</filename>:
- </para>
- <programlisting language="XML"
role="XML"><xi:include
xmlns:xi="http://www.w3.org/2001/XInclude" href="extras/default52.xml"
parse="text"/></programlisting>
- </section>
- <section
id="sect-Reference_Guide-CacheableLockManagerImpl-Lock_migration_from_1.12.x">
- <title>Lock Migration</title>
- <para>
- There are three options available:
- </para>
- <variablelist
id="vari-Reference_Guide-Lock_migration_from_1.12.x-Lock_Migration_Options">
- <title>Lock Migration Options</title>
- <varlistentry>
- <term>When the new Shareable Cache feature is not going to be used and
all locks should be kept after migration.</term>
- <listitem>
- <procedure>
- <title/>
- <step>
- <para>
- Ensure that the same lock tables are used in
configuration
- </para>
- </step>
- <step>
- <para>
- Start the server
- </para>
- </step>
- </procedure>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>When the new Shareable Cache feature is not going to be used and
all locks should be removed after migration.</term>
- <listitem>
- <procedure>
- <title/>
- <step>
- <para>
- Ensure that the same lock tables are used in
configuration
- </para>
- </step>
- <step>
- <para>
- Start the server WITH system property:
- </para>
- <programlisting>-Dorg.exoplatform.jcr.locks.force.remove=true
-</programlisting>
- </step>
- <step>
- <para>
- Stop the server
- </para>
- </step>
- <step>
- <para>
- Start the server WITHOUT system property:
- </para>
- <programlisting>-Dorg.exoplatform.jcr.locks.force.remove
-</programlisting>
- </step>
- </procedure>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>When the new Shareable Cache feature will be used (in this case all
locks are removed after migration).</term>
- <listitem>
- <procedure>
- <title/>
- <step>
- <para>
- Start the server WITH system property:
- </para>
- <programlisting>-Dorg.exoplatform.jcr.locks.force.remove=true
-</programlisting>
- </step>
- <step>
- <para>
- Stop the server.
- </para>
- </step>
- <step>
- <para>
- Start the server WITHOUT system property:
- </para>
- <programlisting>-Dorg.exoplatform.jcr.locks.force.remove
-</programlisting>
- </step>
- <step>
- <title>Optional:</title>
- <para>
- Manually remove the old lock tables.
- </para>
- </step>
- </procedure>
- </listitem>
- </varlistentry>
- </variablelist>
- </section>
- </section>
- </section>
- <section xmlns=""
id="chap-Reference_Guide-QueryHandler_configuration">
- <title>Configuring QueryHandler</title>
- <section
id="sect-Reference_Guide-QueryHandler_configuration-Indexing_in_clustered_environment">
- <title>Indexing in clustered environment</title>
- <para>
- JCR offers indexing strategies for clustered environments that either use the
advantages of running in a single JVM or make the best use of all resources available in
the cluster. JCR uses the Lucene library as its underlying search and indexing engine, but
Lucene has several limitations that restrict how far the advantages of a cluster can be
used. For this reason, eXo JCR offers two strategies that are suited to its own use cases:
clustered with a shared index and clustered with local indexes. Each one has its pros and
cons.
- </para>
- <para>
- The clustered implementation with local indexes combines an in-memory buffer
index directory with delayed file-system flushing. This index is called
"Volatile" and it is also consulted in searches. Under certain conditions,
the volatile index is flushed to the persistent storage (file system) as a new index
directory. This gives very good results for write operations.
- </para>
- <figure>
- <title id="diagramlocalindez">Local Index
Diagram</title>
- <mediaobject>
- <imageobject>
- <imagedata width="444" align="center"
fileref="images/diagram-local-index.png"/>
- </imageobject>
- </mediaobject>
- </figure>
- <para>
- As this implementation is designed for a clustered environment, it has additional
mechanisms for data delivery within the cluster. The actual text extraction jobs are done on
the same node that performs the content operations (that is, the write operation). Prepared
"documents" (a Lucene term that means a block of data ready for indexing)
are replicated to the cluster nodes and processed by their local indexes, so each cluster
instance has the same index content. When a new node joins the cluster, it has no initial
index, so one must be created. There are several supported ways of doing this. The
simplest is to copy the index manually, but this is not the intended approach. If no
initial index is found, JCR uses automated scenarios. They are controlled via configuration
(see the "index-recovery-mode" parameter) and offer either full re-indexing
from the database or copying the index from another cluster node.
- </para>
- <para>
- Keeping a full copy of the index on each instance can be
costly, so a shared index can be used instead (see the diagram below).
- </para>
- <figure>
- <title id="diagramsharedindex">Shared Index
Diagram</title>
- <mediaobject>
- <imageobject>
- <imagedata width="444" align="center"
fileref="images/diagram-shared-index.png"/>
- </imageobject>
- </mediaobject>
- </figure>
- <para>
- This indexing strategy combines the advantages of an in-memory index with a
shared persistent index, offering "near" real-time search capabilities.
This means that newly added content is accessible via search practically immediately. This
strategy allows nodes to index data in their own volatile (in-memory) indexes, but
persistent indexes are managed by a single "coordinator" node only. Each
cluster instance has read access to the shared index and performs queries by combining those
results with the ones found in its own in-memory index. Note that the shared folder must be
configured in your system environment (for example, a mounted NFS folder). In some
extremely rare cases the volatile indexes can differ slightly between cluster instances
for a short time. They become consistent again within a few seconds.
- </para>
- <para>
- See more about <xref
linkend="chap-Reference_Guide-Search_Configuration"/> .
- </para>
- </section>
- <section
id="sect-Reference_Guide-QueryHandler_configuration-Configuration">
- <title>Configuration</title>
- <section
id="sect-Reference_Guide-Configuration-Query_handler_configuration_overview">
- <title>Query-handler configuration overview</title>
- <para>
- Configuration example:
- </para>
- <programlisting language="XML"><workspace
name="ws">
- <query-handler
class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
- <properties>
- <property name="index-dir"
value="shareddir/index/db1/ws" />
- <property name="changesfilter-class"
-
value="org.exoplatform.services.jcr.impl.core.query.jbosscache.JBossCacheIndexChangesFilter"
/>
- <property name="jbosscache-configuration"
value="jbosscache-indexer.xml" />
- <property name="jgroups-configuration"
value="udp-mux.xml" />
- <property name="jgroups-multiplexer-stack"
value="true" />
- <property name="jbosscache-cluster-name"
value="JCR-cluster-indexer-ws" />
- <property name="max-volatile-time"
value="60" />
- <property name="rdbms-reindexing"
value="true" />
- <property name="reindexing-page-size"
value="1000" />
- <property name="index-recovery-mode"
value="from-coordinator" />
- <property name="index-recovery-filter"
value="org.exoplatform.services.jcr.impl.core.query.lucene.DocNumberRecoveryFilter"
/>
- </properties>
- </query-handler>
-</workspace>
-</programlisting>
- <table
id="tabl-Reference_Guide-Query_handler_configuration_overview-Configuration_properties">
- <title>Configuration properties</title>
- <tgroup cols="2">
- <thead>
- <row>
- <entry> Property name </entry>
- <entry> Description </entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry> index-dir </entry>
- <entry> path to index </entry>
- </row>
- <row>
- <entry> changesfilter-class </entry>
- <entry> class that filters index changes and selects the clustered
indexing strategy (shared index or local index) </entry>
- </row>
- <row>
- <entry> jbosscache-configuration </entry>
- <entry> template of JBoss-cache configuration for all
query-handlers in repository </entry>
- </row>
- <row>
- <entry> jgroups-configuration </entry>
- <entry> jgroups-configuration is template configuration for all
components (search, cache, locks) [Add link to document describing template
configurations] </entry>
- </row>
- <row>
- <entry> jgroups-multiplexer-stack </entry>
- <entry> [TODO about jgroups-multiplexer-stack - add link to JBoss
doc] </entry>
- </row>
- <row>
- <entry> jbosscache-cluster-name </entry>
- <entry> cluster name (must be unique) </entry>
- </row>
- <row>
- <entry> max-volatile-time </entry>
- <entry> max time to live for Volatile Index </entry>
- </row>
- <row>
- <entry> rdbms-reindexing </entry>
- <entry> indicates that the RDBMS re-indexing mechanism should be used if
possible. The default value is true </entry>
- </row>
- <row>
- <entry> reindexing-page-size </entry>
- <entry> maximum number of nodes that can be retrieved from
storage in one page for re-indexing. The default value is 100 </entry>
- </row>
- <row>
- <entry> index-recovery-mode </entry>
- <entry> If set to
<command>from-indexing</command>, a full re-indexing is automatically
launched (default behavior). If set to
<command>from-coordinator</command>, the index is retrieved from the
coordinator </entry>
- </row>
- <row>
- <entry> index-recovery-filter </entry>
- <entry> Defines implementation class or classes of
RecoveryFilters, the mechanism of index synchronization for Local Index strategy.
</entry>
- </row>
- <row>
- <entry> async-reindexing </entry>
- <entry> Controls the process of re-indexing on JCR's
startup. If this flag is set, indexing will be launched asynchronously, without blocking
the JCR. Default is "<literal>false</literal>".
</entry>
- </row>
- </tbody>
- </tgroup>
- </table>
- <formalpara
id="form-Reference_Guide-Query_handler_configuration_overview-Improving_Query_Performance_With_postgreSQL_and_rdbms_reindexing">
- <title>Improving Query Performance With
<literal>postgreSQL</literal> and
<parameter>rdbms-reindexing</parameter></title>
- <para>
- If you use <literal>postgreSQL</literal> and
<parameter>rdbms-reindexing</parameter> is set to
<literal>true</literal>, the performance of the queries used while indexing
can be improved by:
- </para>
- </formalpara>
- <procedure>
- <title/>
- <step>
- <para>
- Set the parameter
"<parameter>enable_seqscan</parameter>" to
"<literal>off</literal>"
- </para>
- <para>
- <emphasis role="bold">OR</emphasis>
- </para>
- <para>
- Set
"<parameter>default_statistics_target</parameter>" to at
least "<literal>50</literal>".
- </para>
- </step>
- <step>
- <para>
- Restart the database server and run an analyze of the JCR_SVALUE (or
JCR_MVALUE) table.
- </para>
- </step>
- </procedure>
- <formalpara
id="form-Reference_Guide-Query_handler_configuration_overview-Improving_Query_Performance_With_DB2_and_rdbms_reindexing">
- <title>Improving Query Performance With
<literal>DB2</literal> and
<parameter>rdbms-reindexing</parameter></title>
- <para>
- If you use <literal>DB2</literal> and
<parameter>rdbms-reindexing</parameter> is set to
<literal>true</literal>, the performance of the queries used while indexing
can be improved by:
- </para>
- </formalpara>
- <procedure>
- <title/>
- <step>
- <para>
- Make statistics on tables by running the following for
<literal>JCR_SITEM</literal> (or <literal>JCR_MITEM</literal>) and
<literal>JCR_SVALUE</literal> (or <literal>JCR_MVALUE</literal>)
tables:
- </para>
- <programlisting><code>RUNSTATS ON TABLE
<scheme>.<table> WITH DISTRIBUTION AND INDEXES
ALL</code></programlisting>
- </step>
- </procedure>
- </section>
- <section
id="sect-Reference_Guide-Configuration-Cluster_ready_indexing">
- <title>Cluster-ready indexing</title>
- <para>
- For both cluster-ready implementations, the JBoss Cache, JGroups and Changes
Filter values must be defined. The shared index requires some kind of remote or shared file
system to be attached to the system (for example NFS or SMB). The indexing directory
(the "index-dir" value) must point to it. Setting
"changesfilter-class" to
"org.exoplatform.services.jcr.impl.core.query.jbosscache.JBossCacheIndexChangesFilter"
enables the shared index implementation.
- </para>
- <programlisting language="XML"
role="XML"><workspace name="ws">
- <query-handler
class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
- <properties>
- <property name="index-dir"
value="/mnt/nfs_drive/index/db1/ws" />
- <property name="changesfilter-class"
-
value="org.exoplatform.services.jcr.impl.core.query.jbosscache.JBossCacheIndexChangesFilter"
/>
- <property name="jbosscache-configuration"
value="jbosscache-indexer.xml" />
- <property name="jgroups-configuration"
value="udp-mux.xml" />
- <property name="jgroups-multiplexer-stack"
value="true" />
- <property name="jbosscache-cluster-name"
value="JCR-cluster-indexer-ws" />
- <property name="max-volatile-time"
value="60" />
- <property name="rdbms-reindexing"
value="true" />
- <property name="reindexing-page-size"
value="1000" />
- <property name="index-recovery-mode"
value="from-coordinator" />
- </properties>
- </query-handler>
-</workspace></programlisting>
- <para>
- To use the cluster-ready strategy based on local indexes, where each
node has its own copy of the index on its local file system, the following configuration must be
applied. The indexing directory must point to a folder on the local file system and
"changesfilter-class" must be set to
"org.exoplatform.services.jcr.impl.core.query.jbosscache.LocalIndexChangesFilter".
- </para>
- <programlisting language="XML"
role="XML"><workspace name="ws">
- <query-handler
class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
- <properties>
- <property name="index-dir"
value="/mnt/nfs_drive/index/db1/ws" />
- <property name="changesfilter-class"
-
value="org.exoplatform.services.jcr.impl.core.query.jbosscache.LocalIndexChangesFilter"
/>
- <property name="jbosscache-configuration"
value="jbosscache-indexer.xml" />
- <property name="jgroups-configuration"
value="udp-mux.xml" />
- <property name="jgroups-multiplexer-stack"
value="true" />
- <property name="jbosscache-cluster-name"
value="JCR-cluster-indexer-ws" />
- <property name="max-volatile-time"
value="60" />
- <property name="rdbms-reindexing"
value="true" />
- <property name="reindexing-page-size"
value="1000" />
- <property name="index-recovery-mode"
value="from-coordinator" />
- </properties>
- </query-handler>
-</workspace>
-</programlisting>
- </section>
- <section
id="sect-Reference_Guide-Configuration-Local_Index_Recovery_Filters">
- <title>Local Index Recovery Filters</title>
- <para>
- A common use case for all cluster-ready applications is the hot joining and
leaving of processing units. All nodes that join a cluster for the first time, or
rejoin after some downtime, must be brought to a synchronized state.
- </para>
- <para>
- When using shared value storages, databases and indexes, cluster nodes
are synchronized at any given time. This is not the case when a local index strategy is
used.
- </para>
- <para>
- If a new node joins a cluster without an index, the index is retrieved or
recreated. Nodes can also be restarted, in which case the index is not empty. By default,
such an existing index is assumed to be up to date, even though it can be outdated.
- </para>
- <para>
- The JBoss Portal Platform JCR offers a mechanism called
<literal>RecoveryFilters</literal> that will automatically retrieve index for
the joining node on start up. This feature is a set of filters that can be defined via
<literal>QueryHandler</literal> configuration:
- </para>
- <programlisting language="XML"><property
name="index-recovery-filter"
value="org.exoplatform.services.jcr.impl.core.query.lucene.DocNumberRecoveryFilter"
/></programlisting>
- <para>
- The number of filters is not limited, so they can be combined:
- </para>
- <programlisting language="XML"><property
name="index-recovery-filter"
value="org.exoplatform.services.jcr.impl.core.query.lucene.DocNumberRecoveryFilter"
/>
- <property name="index-recovery-filter"
value="org.exoplatform.services.jcr.impl.core.query.lucene.SystemPropertyRecoveryFilter"
/>
-</programlisting>
- <para>
- If any one of the filters returns true, the index is re-synchronized. This feature uses
the standard index recovery mode defined by the previously described parameter (which can be
"from-indexing" (default) or "from-coordinator"):
- </para>
- <programlisting language="XML"><property
name="index-recovery-mode" value="from-coordinator"
/>
-</programlisting>
- <para>
- There are multiple filter implementations:
- </para>
- <variablelist
id="vari-Reference_Guide-Local_Index_Recovery_Filters-org.exoplatform.services.jcr.impl.core.query.lucene.DummyRecoveryFilter">
- <varlistentry>
-
<term>org.exoplatform.services.jcr.impl.core.query.lucene.DummyRecoveryFilter</term>
- <listitem>
- <para>
- Always returns true, for cases when index must be force
resynchronized (recovered) each time.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
-
<term>org.exoplatform.services.jcr.impl.core.query.lucene.SystemPropertyRecoveryFilter</term>
- <listitem>
- <para>
- Returns the value of the system property
"<literal>org.exoplatform.jcr.recoveryfilter.forcereindexing</literal>".
Index recovery can therefore be controlled from the top, through
system properties, without changing the configuration.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
-
<term>org.exoplatform.services.jcr.impl.core.query.lucene.ConfigurationPropertyRecoveryFilter</term>
- <listitem>
- <para>
- Returns value of <literal>QueryHandler</literal>
configuration property
"<literal>index-recovery-filter-forcereindexing</literal>".
So index recovery can be controlled from configuration separately for each workspace. For
example:
- </para>
- <programlisting language="XML"><property
name="index-recovery-filter"
value="org.exoplatform.services.jcr.impl.core.query.lucene.ConfigurationPropertyRecoveryFilter"
/>
- <property name="index-recovery-filter-forcereindexing"
value="true" />
-</programlisting>
- </listitem>
- </varlistentry>
- <varlistentry>
-
<term>org.exoplatform.services.jcr.impl.core.query.lucene.DocNumberRecoveryFilter</term>
- <listitem>
- <para>
- Checks the number of documents in index on coordinator side
and self-side. It returns <literal>true</literal> if the count differs.
- </para>
- <para>
- The advantage of this filter compared to others, is that it
will skip reindexing for workspaces where the index was not modified.
- </para>
- <para>
- For example, if there are ten repositories with three
workspaces in each and only one is heavily used in the cluster, this filter will only
reindex those workspaces that have been changed, without affecting other indexes.
- </para>
- <para>
- This greatly reduces start up time.
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
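- <para>
- The way several filters combine can be pictured with a minimal Java sketch. It is an illustration only and does not use the eXo JCR API: the class <literal>RecoveryFilterChainSketch</literal> and its methods are hypothetical names, and each filter is modelled as a plain <literal>BooleanSupplier</literal>. Recovery is requested as soon as one filter returns <literal>true</literal>.
- </para>

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BooleanSupplier;

/** Hypothetical stand-in for a recovery filter chain; not the eXo JCR API. */
final class RecoveryFilterChainSketch {

    /** Returns true when at least one filter asks for the index to be recovered. */
    static boolean needsRecovery(List<BooleanSupplier> filters) {
        for (BooleanSupplier filter : filters) {
            if (filter.getAsBoolean()) {
                return true;
            }
        }
        return false;
    }

    /** Mimics the DocNumberRecoveryFilter idea: fire when document counts differ. */
    static BooleanSupplier docNumberFilter(long coordinatorDocs, long localDocs) {
        return () -> coordinatorDocs != localDocs;
    }

    /** Mimics the SystemPropertyRecoveryFilter idea: read a boolean system property. */
    static BooleanSupplier systemPropertyFilter(String propertyName) {
        return () -> Boolean.getBoolean(propertyName);
    }

    public static void main(String[] args) {
        List<BooleanSupplier> filters = Arrays.asList(
            docNumberFilter(143018L, 143018L), // counts match, so this filter stays quiet
            systemPropertyFilter("org.exoplatform.jcr.recoveryfilter.forcereindexing"));
        System.out.println("Recover index: " + needsRecovery(filters));
    }
}
```

- <para>
- In this sketch the index would be recovered only if the system property is set to <literal>true</literal>, because the document counts are equal.
- </para>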
- </section>
- <section
id="sect-Reference_Guide-Configuration-JBoss_Cache_template_configuration">
- <title>JBoss-Cache template configuration</title>
- <para>
- JBoss-Cache template configuration for query handler is about the same
for both clustered strategies.
- </para>
- <example
id="exam-Reference_Guide-JBoss_Cache_template_configuration-jbosscache_indexer.xml">
- <title>jbosscache-indexer.xml</title>
- <programlisting language="XML"
role="XML"><?xml version="1.0"
encoding="UTF-8"?>
-<jbosscache
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="urn:jboss:jbosscache-core:config:3.1">
- <locking useLockStriping="false"
concurrencyLevel="50000"
lockParentForChildInsertRemove="false"
- lockAcquisitionTimeout="20000" />
- <!-- Configure the TransactionManager -->
- <transaction
transactionManagerLookupClass="org.jboss.cache.transaction.JBossStandaloneJTAManagerLookup" />
- <clustering mode="replication"
clusterName="${jbosscache-cluster-name}">
- <stateRetrieval timeout="20000"
fetchInMemoryState="false" />
- <jgroupsConfig multiplexerStack="jcr.stack" />
- <sync />
- </clustering>
- <!-- Eviction configuration -->
- <eviction wakeUpInterval="5000">
- <default
algorithmClass="org.jboss.cache.eviction.FIFOAlgorithm"
eventQueueSize="1000000">
- <property name="maxNodes"
value="10000" />
- <property name="minTimeToLive"
value="60000" />
- </default>
- </eviction>
-</jbosscache></programlisting>
- </example>
- <para>
- Read more about template configurations <xref
linkend="chap-Reference_Guide-JBoss_Cache_configuration"/>.
- </para>
- </section>
- </section>
- <section
id="sect-Reference_Guide-QueryHandler_configuration-Asynchronous_Reindexing">
- <title>Asynchronous Re-indexing</title>
- <para>
- Managing a large data set using a JCR in a production environment at times
requires special operations on the indexes stored on the file system. One of those maintenance
operations is recreating the index, also called "re-indexing". There are
various use cases in which this is important. They include hardware faults, hard
restarts, data corruption, migrations and JCR updates that bring new index-related
features. Index re-creation is usually requested at server startup or at runtime.
- </para>
- <section
id="sect-Reference_Guide-Asynchronous_Reindexing-On_startup_indexing">
- <title>On startup indexing</title>
- <para>
- A common usecase for updating and re-creating the index is to stop the
server and manually remove indexes for workspaces requiring it. When the server is
re-started, the missing indexes are automatically recovered by re-indexing.
- </para>
- <para>
- The eXo JCR supports direct RDBMS re-indexing, which can be faster than
ordinary re-indexing. It is enabled by setting the <literal>QueryHandler</literal> parameter
<parameter>rdbms-reindexing</parameter> to
<literal>true</literal>.
- </para>
- <para>
- A new feature is asynchronous indexing on startup. Usually startup is
blocked until the indexing process is finished. This block can last a long time,
depending on the amount of data persisted in the repositories. It can be avoided by
indexing asynchronously on startup.
- </para>
- <para>
- Essentially, all indexing operations are performed in the background
without blocking the repository. This is controlled by the value of the
<parameter>async-reindexing</parameter> parameter in
<literal>QueryHandler</literal> configuration.
- </para>
- <para>
- With asynchronous indexation active, the JCR starts with no active
indexes present. Queries on the JCR can still be executed without exceptions, but no results
will be returned until index creation is complete.
- </para>
- <para>
- The index state check is accomplished via
<literal>QueryManagerImpl</literal>:
- </para>
- <para>
-
-<programlisting lang="java">boolean online =
((QueryManagerImpl)Workspace.getQueryManager()).getQueryHandeler().isOnline();</programlisting>
-
- </para>
- <para>
- The <emphasis role="bold">OFFLINE</emphasis> state
means that the index is currently being re-created. When the state changes, a corresponding
log event is printed. When the background index task starts, the index is switched to
<emphasis role="bold">OFFLINE</emphasis> and the following event is
logged:
- </para>
- <programlisting>[INFO] Setting index OFFLINE
(repository/production[system]).</programlisting>
- <para>
- When the indexing process is finished, the following two events are
logged :
- </para>
- <programlisting>[INFO] Created initial index for 143018 nodes
(repository/production[system]).
-[INFO] Setting index ONLINE (repository/production[system]).</programlisting>
- <para>
- Those two log lines indicate the end of the process for the workspace given in
brackets. Calling isOnline() as described above will now return true.
- </para>
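- <para>
- A client that needs search results may want to wait until the index is back <emphasis role="bold">ONLINE</emphasis>. The following is a minimal polling sketch under stated assumptions: <literal>IndexOnlineWaiter</literal>, its method and the retry values are hypothetical, and the state check is passed in as a <literal>BooleanSupplier</literal> so that the example runs without a repository. In a real deployment the supplier would wrap the <literal>isOnline()</literal> call shown above.
- </para>

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper that polls an index state check; not part of eXo JCR. */
final class IndexOnlineWaiter {

    /**
     * Polls the given check until it reports ONLINE or the attempts run out.
     * Returns true if the index came online, false otherwise.
     */
    static boolean waitUntilOnline(BooleanSupplier isOnline, int maxAttempts, long delayMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (isOnline.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt for the caller and stop waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Simulated index that reports ONLINE on the third check.
        int[] checks = {0};
        boolean online = waitUntilOnline(() -> ++checks[0] >= 3, 10, 50L);
        System.out.println("Index online: " + online + " after " + checks[0] + " checks");
    }
}
```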
- </section>
- <section
id="sect-Reference_Guide-Asynchronous_Reindexing-Hot_Asynchronous_Workspace_Reindexing_via_JMX">
- <title>Hot Asynchronous Workspace Re-indexing using JMX</title>
- <para>
- Hard system faults, errors during upgrades, migration issues and
other factors may corrupt the index. Current versions of JCR support the <emphasis
role="bold">Hot Asynchronous Workspace Reindexing</emphasis> feature.
It allows Service Administrators to launch the process in the background, without stopping or
blocking the whole application, by using any JMX-compatible console.
- </para>
- <figure>
- <title id="jmx-jconsole">JMX Jconsole</title>
- <mediaobject>
- <imageobject>
- <imagedata align="center"
fileref="images/jmx-jconsole.png"/>
- </imageobject>
- </mediaobject>
- </figure>
- <para>
- The server can continue working as expected while the index is
recreated.
- </para>
- <para>
- This depends on the flag <parameter>allow
queries</parameter> being passed via JMX interface to the reindex operation
invocation. If the flag is set, the application continues working.
- </para>
- <para>
- However, there is one critical limitation users must be aware of;
<emphasis>the index is frozen while the background task is
running</emphasis>.
- </para>
- <para>
- This means that queries are performed on the version of the index that was present
at the moment the indexing task started, and that data written into the repository
after that moment will not be available through search until the process completes.
- </para>
- <para>
- Data added during re-indexation is also indexed, but will be available
only when reindexing is complete. The JCR makes a snapshot of indexes at the invocation of
the asynchronous indexing task and uses that snapshot for searches.
- </para>
- <para>
- When the operation is finished, the stale index is replaced by the newly
created index, which includes any newly added data.
- </para>
- <para>
- If the <parameter>allow queries</parameter> flag is set to
<literal>false</literal>, then all queries will throw an exception while task
is running. The current state can be acquired using the following JMX operation:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- getHotReindexingState() - returns information about latest
invocation: start time, if in progress or finish time if done.
- </para>
- </listitem>
- </itemizedlist>
- </section>
- <section
id="sect-Reference_Guide-Asynchronous_Reindexing-Notices">
- <title>Notices</title>
- <para>
- Hot re-indexing via JMX cannot be launched if the index is already in
offline mode. Offline mode means that the index is currently involved in another operation,
such as re-indexing at startup or being copied to another node in the cluster.
- </para>
- <para>
- Also, <emphasis>Hot Asynchronous Reindexing via
JMX</emphasis> and <literal>on startup</literal> reindexing are
different features. You therefore cannot get the state of startup reindexing using the
<code>getHotReindexingState</code> command in the JMX interface, but there are
some common JMX operations:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- getIOMode - returns current index IO mode (READ_ONLY /
READ_WRITE), belongs to clustered configuration states;
- </para>
- </listitem>
- <listitem>
- <para>
- getState - returns current state: ONLINE / OFFLINE.
- </para>
- </listitem>
- </itemizedlist>
- </section>
- </section>
- <section
id="sect-Reference_Guide-QueryHandler_configuration-Advanced_tuning">
- <title>Advanced tuning</title>
- <section
id="sect-Reference_Guide-Advanced_tuning-Lucene_tuning">
- <title>Lucene tuning</title>
- <para>
- As mentioned, JCR Indexing is based on the Lucene indexing library as the
underlying search engine. It uses Directories to store index and manages access to index
by Lock Factories.
- </para>
- <para>
- By default, the JCR implementation uses optimal combination of Directory
implementation and Lock Factory implementation.
- </para>
- <para>
- The <literal>SimpleFSDirectory</literal> is used in Windows
environments and the <literal>NIOFSDirectory</literal> implementation is used
in non-Windows systems.
- </para>
- <para>
- <literal>NativeFSLockFactory</literal> is an optimal solution
for a wide variety of cases including clustered environment with NFS shared resources.
- </para>
- <para>
- But those defaults can be overridden in the system properties.
- </para>
- <para>
- Two properties:
<literal>org.exoplatform.jcr.lucene.store.FSDirectoryLockFactoryClass</literal>
and <literal>org.exoplatform.jcr.lucene.FSDirectory.class</literal> control
(and change) the default behavior.
- </para>
- <para>
- The first defines the implementation of abstract Lucene
<literal>LockFactory</literal> class and the second sets implementation class
for <literal>FSDirectory</literal> instances.
- </para>
- <para>
- For more information, refer to the Lucene documentation. But be careful,
for while the JCR allows users to change implementation classes of Lucene internals, it
does not guarantee the stability and functionality of those changes.
- </para>
- </section>
- </section>
- </section>
- <section xmlns=""
id="chap-Reference_Guide-JBossTransactionsService">
- <title>JBossTransactionsService</title>
- <section
id="sect-Reference_Guide-JBossTransactionsService-Introduction">
- <title>Introduction</title>
- <para>
- JBossTransactionsService implements eXo TransactionService and provides access to
<ulink
url="http://www.jboss.org/jbosstm/">JBoss Transaction Service
(JBossTS)</ulink> JTA implementation via eXo container dependency.
- </para>
- <para>
- TransactionService is used in the JCR cache
<emphasis>org.exoplatform.services.jcr.impl.dataflow.persistent.jbosscache.JBossCacheWorkspaceStorageCache</emphasis>
implementation.
- </para>
- </section>
- <section
id="sect-Reference_Guide-JBossTransactionsService-Configuration">
- <title>Configuration</title>
- <para>
- Example configuration:
- </para>
- <programlisting language="XML" role="XML">
<component>
-
<key>org.exoplatform.services.transaction.TransactionService</key>
-
<type>org.exoplatform.services.transaction.jbosscache.JBossTransactionsService</type>
- <init-params>
- <value-param>
- <name>timeout</name>
- <value>3000</value>
- </value-param>
- </init-params>
- </component></programlisting>
- <para>
- timeout - XA transaction timeout in seconds
- </para>
- </section>
- </section>
- <section xmlns=""
id="chap-Reference_Guide-JCR_Query_Usecases">
- <title>JCR Query Use-cases</title>
- <section
id="sect-Reference_Guide-JCR_Query_Usecases-Introduction">
- <title>Introduction</title>
- <para>
- The JCR supports two query languages: SQL and XPath. A query, whether XPath
or SQL, specifies a subset of nodes within a workspace, called the result set. The result
set consists of all the nodes in the workspace that meet the constraints stated in the
query.
- </para>
- </section>
- <section
id="sect-Reference_Guide-JCR_Query_Usecases-Query_Lifecycle">
- <title>Query Lifecycle</title>
- <section
id="sect-Reference_Guide-Query_Lifecycle-Query_Creation_and_Execution">
- <title>Query Creation and Execution</title>
- <example
id="exam-Reference_Guide-Query_Lifecycle-Query_Creation_and_Execution-SQL">
- <title>SQL</title>
- <programlisting language="Java" role="Java">// get
QueryManager
-QueryManager queryManager = workspace.getQueryManager(); 
-// make SQL query
-Query query = queryManager.createQuery("SELECT * FROM nt:base ",
Query.SQL);
-// execute query
-QueryResult result = query.execute();</programlisting>
- </example>
- <example
id="exam-Reference_Guide-Query_Lifecycle-Query_Creation_and_Execution-XPath">
- <title>XPath</title>
- <programlisting language="Java" role="Java">// get
QueryManager
-QueryManager queryManager = workspace.getQueryManager();
-// make XPath query
-Query query = queryManager.createQuery("//element(*,nt:base)",
Query.XPATH);
-// execute query
-QueryResult result = query.execute();</programlisting>
- </example>
- </section>
- <section
id="sect-Reference_Guide-Query_Lifecycle-Query_Result_Processing">
- <title>Query Result Processing</title>
- <programlisting language="Java" role="Java">// fetch
query result
-QueryResult result = query.execute();</programlisting>
- <para>
- To fetch the nodes:
- </para>
- <programlisting language="Java"
role="Java">NodeIterator it = result.getNodes();</programlisting>
- <para>
- The results can be formatted in a table:
- </para>
- <programlisting language="Java" role="Java">// get
column names
-String[] columnNames = result.getColumnNames();
-// get column rows
-RowIterator rowIterator = result.getRows();
-while(rowIterator.hasNext()){
- // get next row
- Row row = rowIterator.nextRow();
- // get all values of row
- Value[] values = row.getValues();
-}</programlisting>
- </section>
- <section id="sect-Reference_Guide-Query_Lifecycle-Scoring">
- <title>Scoring</title>
- <para>
- The result returns a score for each row in the result set. The score
contains a value that indicates a rating of how well the result node matches the query. A
high value means a better matching than a low value. This score can be used for ordering
the result.
- </para>
- <para>
- eXo JCR Scoring is a mapping of Lucene scoring. For a more in-depth
understanding, please study <ulink
url="http://lucene.apache.org/java/2_4_1/scoring.html">Lucene
documentation</ulink>.
- </para>
- <para>
- The <literal>jcr:score</literal> is calculated as;
<literal>(lucene score)*1000f</literal>.
- </para>
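- <para>
- As a worked example of the formula above, the following sketch converts a Lucene score into the corresponding <literal>jcr:score</literal> value. The class and method names are hypothetical and the input score is an arbitrary sample value.
- </para>

```java
/** Hypothetical illustration of the jcr:score formula; not the eXo JCR API. */
final class JcrScoreSketch {

    /** Applies the documented mapping: jcr:score = (lucene score) * 1000f. */
    static float toJcrScore(float luceneScore) {
        return luceneScore * 1000f;
    }

    public static void main(String[] args) {
        float luceneScore = 0.25f; // sample value chosen for the example
        System.out.println("jcr:score = " + toJcrScore(luceneScore));
    }
}
```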
-<!--<para>
- Score may be increased for specified nodes, see <xref
linkend="sect-Reference_Guide-Changing_Priority_of_Node" />
- </para>
- <para>
- Also, see an example in sect-Reference_Guide-Ordering_by_Score />
- </para>--> </section>
- </section>
- <section
id="sect-Reference_Guide-JCR_Query_Usecases-Tips_and_tricks">
- <title>Tips and tricks</title>
- <section xmlns=""
id="sect-Reference_Guide-XPath_queries_containing_node_names_starting_with_a_number">
- <title>XPath queries containing node names starting with a
number</title>
- <para>
- If you execute an XPath request like this...
- </para>
- <programlisting language="Java" role="Java">// get
QueryManager
-QueryManager queryManager = workspace.getQueryManager();
-// make XPath query
-Query query =
queryManager.createQuery("/jcr:root/Documents/Publie/2010//element(*,
exo:article)", Query.XPATH);</programlisting>
- <para>
- ...you will receive an <code>Invalid request</code> error. This is because
XML (and thus XPath) does not allow names starting with a number.
- </para>
- <para>
- Therefore, XPath requests using a node name that starts with a number are invalid.
- </para>
- <para>
- Some possible alternatives are:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- Use an SQL request.
- </para>
- </listitem>
- <listitem>
- <para>
- Use escaping:
- </para>
- <programlisting language="Java" role="Java">//
get QueryManager
-QueryManager queryManager = workspace.getQueryManager();
-// make XPath query
-Query query =
queryManager.createQuery("/jcr:root/Documents/Publie/_x0032_010//element(*,
exo:article)", Query.XPATH);</programlisting>
- </listitem>
- </itemizedlist>
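- <para>
- The escaping used above replaces the leading digit with <literal>_xHHHH_</literal>, where <literal>HHHH</literal> is the four-digit hexadecimal code of the character. A minimal sketch of that transformation follows. <literal>XPathNameEscaper</literal> is a hypothetical helper written for this example. It handles only a leading ASCII digit and is not a complete name-encoding implementation.
- </para>

```java
/** Hypothetical helper that escapes a leading digit in a node name for XPath. */
final class XPathNameEscaper {

    /** Returns the name with a leading ASCII digit rewritten as _xHHHH_. */
    static String escapeLeadingDigit(String name) {
        if (name.isEmpty()) {
            return name;
        }
        char first = name.charAt(0);
        if (first < '0' || first > '9') {
            return name; // nothing to escape
        }
        // '2' has code point 0x32, so "2010" becomes "_x0032_010".
        return String.format("_x%04x_", (int) first) + name.substring(1);
    }

    public static void main(String[] args) {
        String path = "/jcr:root/Documents/Publie/" + escapeLeadingDigit("2010")
            + "//element(*, exo:article)";
        System.out.println(path);
    }
}
```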
- </section>
- </section>
- </section>
- <section xmlns=""
id="chap-Reference_Guide-Searching_Repository_Content">
- <title>Searching Repository Content</title>
- <section
id="sect-Reference_Guide-Searching_Repository_Content-Introduction">
- <title>Introduction</title>
- <para>
- You can find the JCR configuration file here:
<filename><replaceable>JPP_DIST</replaceable>/gatein/gatein.ear/portal.war/portal/WEB-INF/conf/jcr/repository-configuration.xml</filename>.
- </para>
- <para>
- Please refer to <xref
linkend="chap-Reference_Guide-Search_Configuration"/> for more information
about index configuration.
- </para>
- </section>
- <section
id="sect-Reference_Guide-Searching_Repository_Content-Bi_directional_RangeIterator">
- <title>Bi-directional RangeIterator</title>
- <para>
- <literal>QueryResult.getNodes()</literal> returns a bi-directional
<literal>NodeIterator</literal> implementation.
- </para>
- <note>
- <para>
- Bi-directional NodeIterator is <emphasis role="bold">not
supported</emphasis> in two cases:
- </para>
- <orderedlist>
- <listitem>
- <para>
- SQL query: select * from nt:base
- </para>
- </listitem>
- <listitem>
- <para>
- XPath query: //* .
- </para>
- </listitem>
- </orderedlist>
- </note>
- <para>
- <literal>TwoWayRangeIterator</literal> interface:
- </para>
- <programlisting language="Java" role="Java">/**
- * Skip a number of elements in the iterator.
- *
- * @param skipNum the non-negative number of elements to skip
- * @throws java.util.NoSuchElementException if skipped past the first element
- * in the iterator.
- */
-public void skipBack(long skipNum);</programlisting>
- <para>
- Usage:
- </para>
- <programlisting language="Java"
role="Java">NodeIterator iter = queryResult.getNodes();
-while (iter.hasNext()) {
- if (skipForward) {
- iter.skip(10); // Skip 10 nodes in forward direction
- } else if (skipBack) {
- TwoWayRangeIterator backIter = (TwoWayRangeIterator) iter;
- backIter.skipBack(10); // Skip 10 nodes back
- }
- // ... process nodes here, for example with iter.nextNode() ...
-}</programlisting>
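The skip and skipBack semantics described above can be illustrated without a repository. The following is a minimal sketch only: `TwoWayListIterator` is a hypothetical, list-backed stand-in and not the eXo JCR `TwoWayRangeIterator` implementation. Rejecting negative counts with `IllegalArgumentException` is an assumption of this sketch.

```java
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical list-backed iterator that can skip in both directions. */
class TwoWayListIterator<T> {
    private final List<T> items;
    private int position; // index of the element that next() will return

    TwoWayListIterator(List<T> items) {
        this.items = items;
    }

    boolean hasNext() {
        return position < items.size();
    }

    T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return items.get(position++);
    }

    /** Skips forward; fails if that would move past the last element. */
    void skip(long skipNum) {
        if (skipNum < 0) {
            throw new IllegalArgumentException("skipNum must be non-negative");
        }
        if (skipNum > items.size() - position) {
            throw new NoSuchElementException();
        }
        position += (int) skipNum;
    }

    /** Skips backward; fails if that would move before the first element. */
    void skipBack(long skipNum) {
        if (skipNum < 0) {
            throw new IllegalArgumentException("skipNum must be non-negative");
        }
        if (skipNum > position) {
            throw new NoSuchElementException();
        }
        position -= (int) skipNum;
    }
}

class TwoWayIteratorDemo {
    public static void main(String[] args) {
        TwoWayListIterator<String> it =
            new TwoWayListIterator<>(List.of("a", "b", "c", "d", "e"));
        it.skip(3);                    // now positioned at "d"
        System.out.println(it.next()); // prints "d"
        it.skipBack(2);                // two elements back, at "c"
        System.out.println(it.next()); // prints "c"
    }
}
```

Keeping a single cursor index makes both directions symmetric, which is why skipping back past the first element can be reported the same way as skipping forward past the last one.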
- </section>
- <section
id="sect-Reference_Guide-Searching_Repository_Content-Fuzzy_Searches">
- <title>Fuzzy Searches</title>
- <para>
- The JBoss Portal Platform JCR supports features such as Lucene Fuzzy Searches. To
perform a fuzzy search, form your query like the one below:
- </para>
- <programlisting language="Java"
role="Java">QueryManager qman = session.getWorkspace().getQueryManager();
-Query q = qman.createQuery("select * from nt:base where contains(field,
'ccccc~')", Query.SQL);
-QueryResult res = q.execute();</programlisting>
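Lucene fuzzy matching is based on the edit distance between the query term and indexed terms. The sketch below computes the classic Levenshtein distance to show what "close" means for a term such as 'ccccc~'. It is an illustration only: `EditDistanceSketch` is a hypothetical class, and the similarity threshold that Lucene actually applies depends on the Lucene version and query settings.

```java
/** Illustrative Levenshtein distance; not the Lucene implementation. */
class EditDistanceSketch {

    /** Returns the minimum number of single-character edits turning a into b. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows instead of a full matrix
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("ccccc", "cccdc"));           // prints 1
        System.out.println(levenshtein("explatform", "exoplatform")); // prints 1
    }
}
```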
- </section>
- <section
id="sect-Reference_Guide-Searching_Repository_Content-SynonymSearch">
- <title>SynonymSearch</title>
- <para>
- Searching with synonyms is integrated in the
<literal>jcr:contains()</literal> function and uses the same syntax as synonym
searches in web search engines (Google, for example). If a search term is prefixed by a
tilde symbol ( ~ ), synonyms of the search term are taken into consideration. For
example:
- </para>
- <programlisting>SQL: select * from nt:resource where contains(.,
'~parameter')
-
-XPath: //element(*, nt:resource)[jcr:contains(.,
'~parameter')]</programlisting>
- <para>
- This feature is disabled by default and you need to add a configuration parameter to
the query-handler element in your JCR configuration file to enable it.
- </para>
- <programlisting language="XML" role="XML"><param
name="synonymprovider-config-path" value="...your path to the
configuration file..."/>
-<param name="synonymprovider-class"
value="org.exoplatform.services.jcr.impl.core.query.lucene.PropertiesSynonymProvider"/></programlisting>
- <programlisting language="Java" role="Java">/**
- * <code>SynonymProvider</code> defines an interface for a
component that
- * returns synonyms for a given term.
- */
-public interface SynonymProvider {
-
- /**
- * Initializes the synonym provider and passes the file system resource to
- * the synonym provider configuration defined by the configuration value of
- * the <code>synonymProviderConfigPath</code> parameter.
The resource may be
- * <code>null</code> if the configuration parameter is not
set.
- *
- * @param fsr the file system resource to the synonym provider
- * configuration.
- * @throws IOException if an error occurs while initializing the synonym
- * provider.
- */
- public void initialize(InputStream fsr) throws IOException;
-
- /**
- * Returns an array of terms that are considered synonyms for the given
- * <code>term</code>.
- *
- * @param term a search term.
- * @return an array of synonyms for the given
<code>term</code> or an empty
- * array if no synonyms are known.
- */
- public String[] getSynonyms(String term);
-}</programlisting>
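A custom provider only has to implement the two methods of the interface above. The following is a minimal sketch, not the shipped `PropertiesSynonymProvider`: the class name `SimplePropertiesSynonymProvider` is hypothetical, the interface is redeclared locally so the example compiles on its own, and the file format (one `term=synonym1, synonym2` property per term) is an assumption made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Locale;
import java.util.Properties;

/** Local copy of the interface shape shown above, so the sketch is self-contained. */
interface SynonymProvider {
    void initialize(InputStream fsr) throws IOException;

    String[] getSynonyms(String term);
}

/** Hypothetical provider: each property maps a term to a comma-separated synonym list. */
class SimplePropertiesSynonymProvider implements SynonymProvider {
    private final Properties synonyms = new Properties();

    @Override
    public void initialize(InputStream fsr) throws IOException {
        if (fsr != null) { // the resource may be null if no path is configured
            synonyms.load(fsr);
        }
    }

    @Override
    public String[] getSynonyms(String term) {
        String value = synonyms.getProperty(term.toLowerCase(Locale.ROOT));
        if (value == null || value.trim().isEmpty()) {
            return new String[0]; // no synonyms known for this term
        }
        return value.trim().split("\\s*,\\s*");
    }
}

class SynonymProviderDemo {
    public static void main(String[] args) throws IOException {
        String config = "quick=fast, rapid\nparameter=argument\n";
        SynonymProvider provider = new SimplePropertiesSynonymProvider();
        provider.initialize(
            new ByteArrayInputStream(config.getBytes(StandardCharsets.ISO_8859_1)));
        System.out.println(Arrays.toString(provider.getSynonyms("quick")));
        System.out.println(Arrays.toString(provider.getSynonyms("unknown")));
    }
}
```

Returning an empty array rather than `null` for unknown terms follows the contract stated in the interface Javadoc.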
- </section>
- <section
id="sect-Reference_Guide-Searching_Repository_Content-Highlighting">
- <title>Highlighting</title>
- <para>
- An <literal>ExcerptProvider</literal> retrieves text excerpts for a node
in the query result and marks up the words in the text that match the query terms.
- </para>
- <para>
- By default, match highlighting is disabled because it requires that additional
information be written to the search index.
- </para>
- <para>
- To enable this feature, you need to add a configuration parameter to the
<parameter>query-handler</parameter> element in your JCR configuration file:
- </para>
- <programlisting language="XML" role="XML"><param
name="support-highlighting"
value="true"/></programlisting>
- <para>
- Additionally, there is a parameter that controls the format of the excerpt created. In
JCR 1.9, the default is set to
<literal>org.exoplatform.services.jcr.impl.core.query.lucene.DefaultHTMLExcerpt</literal>.
The configuration parameter for this setting is:
- </para>
- <programlisting language="XML" role="XML"><param
name="excerptprovider-class"
value="org.exoplatform.services.jcr.impl.core.query.lucene.DefaultXMLExcerpt"/></programlisting>
- <section
id="sect-Reference_Guide-Highlighting-DefaultXMLExcerpt">
- <title>DefaultXMLExcerpt</title>
- <para>
- This excerpt provider creates an XML fragment of the following form:
- </para>
- <programlisting language="XML"
role="XML"><excerpt>
- <fragment>
- <highlight>exoplatform</highlight> implements both
the mandatory
- XPath and optional SQL <highlight>query</highlight>
syntax.
- </fragment>
- <fragment>
- Before parsing the XPath <highlight>query</highlight>
in
- <highlight>exoplatform</highlight>, the statement is
surrounded
- </fragment>
-</excerpt></programlisting>
- </section>
- <section
id="sect-Reference_Guide-Highlighting-DefaultHTMLExcerpt">
- <title>DefaultHTMLExcerpt</title>
- <para>
- This excerpt provider creates an HTML fragment of the following form:
- </para>
- <programlisting language="HTML"
role="HTML"><div>
- <span>
- <strong>exoplatform</strong> implements both the
mandatory XPath
- and optional SQL <strong>query</strong> syntax.
- </span>
- <span>
- Before parsing the XPath <strong>query</strong> in
- <strong>exoplatform</strong>, the statement is
surrounded
- </span>
-</div></programlisting>
- </section>
- <section id="sect-Reference_Guide-Highlighting-Usage">
- <title>Usage</title>
- <para>
- If you are using XPath, you must use the <code>rep:excerpt()</code>
function in the last location step, just like you would select properties:
- </para>
- <programlisting language="Java"
role="Java">QueryManager qm = session.getWorkspace().getQueryManager();
-Query q = qm.createQuery("//*[jcr:contains(.,
'exoplatform')]/(@Title|rep:excerpt(.))", Query.XPATH);
-QueryResult result = q.execute();
-for (RowIterator it = result.getRows(); it.hasNext(); ) {
- Row r = it.nextRow();
- Value title = r.getValue("Title");
- Value excerpt = r.getValue("rep:excerpt(.)");
-}</programlisting>
- <para>
- The above code searches for nodes that contain the word
<emphasis>exoplatform</emphasis> and then gets the value of the
<parameter>Title</parameter> property and an excerpt for each resultant node.
- </para>
- <para>
- It is also possible to use a relative path in the call to
<code>Row.getValue()</code> while the query statement remains the same.
You may also use a relative path to a string property. The returned value is then an
excerpt based on the string value of that property.
- </para>
- <para>
- Both available excerpt providers create up to three fragments of about 150
characters each.
- </para>
- <para>
- In SQL, the function is called <code>excerpt()</code> without the rep
prefix, but the column in the <literal>RowIterator</literal> will nonetheless
be labelled <code>rep:excerpt(.)</code>.
- </para>
- <programlisting language="Java"
role="Java">QueryManager qm = session.getWorkspace().getQueryManager();
-Query q = qm.createQuery("select excerpt(.) from nt:resource where contains(.,
'exoplatform')", Query.SQL);
-QueryResult result = q.execute();
-for (RowIterator it = result.getRows(); it.hasNext(); ) {
- Row r = it.nextRow();
- Value excerpt = r.getValue("rep:excerpt(.)");
-}</programlisting>
- </section>
- </section>
- <section
id="sect-Reference_Guide-Searching_Repository_Content-SpellChecker">
- <title>SpellChecker</title>
- <para>
- The Lucene-based query handler implementation supports a pluggable spell-checker
mechanism. By default, spell checking is not available; it must be configured first.
- </para>
- <para>
- Information about the <parameter>spellCheckerClass</parameter> parameter
is available in <xref
linkend="chap-Reference_Guide-Search_Configuration"/>.
- </para>
- <para>
- The JCR currently provides an implementation class which uses the <ulink
url="http://wiki.apache.org/jakarta-lucene/SpellChecker">luc...;.
- </para>
- <para>
- The dictionary is derived from the fulltext, indexed content of the workspace and
updated periodically. You can configure the refresh interval by picking one of the
available inner classes of
<literal>org.exoplatform.services.jcr.impl.core.query.lucene.spell.LuceneSpellChecker</literal>:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- <literal>OneMinuteRefreshInterval</literal>
- </para>
- </listitem>
- <listitem>
- <para>
- <literal>FiveMinutesRefreshInterval</literal>
- </para>
- </listitem>
- <listitem>
- <para>
- <literal>ThirtyMinutesRefreshInterval</literal>
- </para>
- </listitem>
- <listitem>
- <para>
- <literal>OneHourRefreshInterval</literal>
- </para>
- </listitem>
- <listitem>
- <para>
- <literal>SixHoursRefreshInterval</literal>
- </para>
- </listitem>
- <listitem>
- <para>
- <literal>TwelveHoursRefreshInterval</literal>
- </para>
- </listitem>
- <listitem>
- <para>
- <literal>OneDayRefreshInterval</literal>
- </para>
- </listitem>
- </itemizedlist>
- <para>
- For example, if you want a refresh interval of six hours, the class name would be:
<literal>org.exoplatform.services.jcr.impl.core.query.lucene.spell.LuceneSpellChecker$SixHoursRefreshInterval</literal>.
- </para>
- <para>
- If you use
<literal>org.exoplatform.services.jcr.impl.core.query.lucene.spell.LuceneSpellChecker</literal>,
the refresh interval will be one hour.
- </para>
- <para>
- The spell checker dictionary is stored as a Lucene index under
<filename><index-dir>/spellchecker</filename>. If this index
does not exist, a background thread creates it on startup. Similarly, the dictionary
refresh is also done in a background thread so as not to block regular queries.
- </para>
- <section id="sect-Reference_Guide-SpellChecker-Usage">
- <title>Usage</title>
- <para>
- You can spell check a fulltext statement either with an XPath or a SQL query:
- </para>
- <programlisting language="Java" role="Java">//
rep:spellcheck('explatform') will always evaluate to true
-Query query =
qm.createQuery("/jcr:root[rep:spellcheck('explatform')]/(rep:spellcheck())",
Query.XPATH);
-RowIterator rows = query.execute().getRows();
-// the above query will always return the root node no matter what string we check
-Row r = rows.nextRow();
-// get the result of the spell checking
-Value v = r.getValue("rep:spellcheck()");
-if (v == null) {
- // no suggestion returned, the spelling is correct or the spell checker
- // does not know how to correct it.
-} else {
- String suggestion = v.getString();
-}</programlisting>
- <para>
- And the same using SQL:
- </para>
- <programlisting language="Java" role="Java">//
SPELLCHECK('explatform') will always evaluate to true
-Query query = qm.createQuery("SELECT rep:spellcheck() FROM nt:base WHERE
jcr:path = '/' AND SPELLCHECK('explatform')",
Query.SQL);
-RowIterator rows = query.execute().getRows();
-// the above query will always return the root node no matter what string we check
-Row r = rows.nextRow();
-// get the result of the spell checking
-Value v = r.getValue("rep:spellcheck()");
-if (v == null) {
- // no suggestion returned, the spelling is correct or the spell checker
- // does not know how to correct it.
-} else {
- String suggestion = v.getString();
-}</programlisting>
- </section>
- </section>
- <section
id="sect-Reference_Guide-Searching_Repository_Content-Similarity">
- <title>Similarity</title>
- <para>
- Starting with version 1.12, JCR allows you to search for nodes that are similar to an
existing node.
- </para>
- <para>
- Similarity is determined by looking up terms that are common to nodes. There are some
conditions that must be met for a term to be considered. This is required to limit the
number of possibly relevant terms.
- </para>
- <para>
- To be considered, terms must:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- Be at least four characters long.
- </para>
- </listitem>
- <listitem>
- <para>
- Occur at least twice in the source node.
- </para>
- </listitem>
- <listitem>
- <para>
- Occur in at least five other nodes.
- </para>
- </listitem>
- </itemizedlist>
- <note>
- <title>Note</title>
- <para>
- The similarity function requires that highlighting support is enabled. Make sure
that the following parameter is set for the query handler in your
<filename>workspace.xml</filename>.
- </para>
- <programlisting language="XML"
role="XML"><param name="support-highlighting"
value="true"/></programlisting>
- </note>
- <para>
- The functions (<code>rep:similar()</code> in XPath and
<code>similar()</code> in SQL) have two arguments:
- </para>
- <variablelist>
- <title/>
- <varlistentry>
- <term>relativePath</term>
- <listitem>
- <para>
- A relative path to a descendant node or a period (<literal>.</literal>)
for the current node.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>absoluteStringPath</term>
- <listitem>
- <para>
- A string literal that contains the path to the node for which to find similar
nodes.
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
- <warning>
- <title>Warning</title>
- <para>
- Relative path is not supported yet.
- </para>
- </warning>
- <example id="exam-Reference_Guide-Similarity-Example">
- <title>Example</title>
- <programlisting>//element(*, nt:resource)[rep:similar(.,
'/parentnode/node.txt/jcr:content')]</programlisting>
- <para>
- Finds <literal>nt:resource</literal> nodes that are similar to the node at the
path <filename>/parentnode/node.txt/jcr:content</filename>.
- </para>
- </example>
- </section>
- </section>
- <section xmlns=""
id="chap-Reference_Guide-Fulltext_Search_And_Affecting_Settings">
- <title>Full Text Search And Affecting Settings</title>
- <formalpara
id="form-Reference_Guide-Fulltext_Search_And_Affecting_Settings-Property_content_indexing">
- <title>Property content indexing</title>
- <para>
- Each property of a node (if it is indexable) is processed with the Lucene analyzer and
stored in the Lucene index. This is called indexing of a property. It allows fulltext
searching of these indexed properties.
- </para>
- </formalpara>
- <section
id="sect-Reference_Guide-Fulltext_Search_And_Affecting_Settings-Lucene_Analyzers">
- <title>Lucene Analyzers</title>
- <para>
- The purpose of analyzers is to transform all strings stored in the index into a
well-defined form. The same analyzers are used when searching, so that the query string
is transformed in the same way as the indexed content.
- </para>
- <para>
- Therefore, performing the same query using different analyzers can return different
results.
- </para>
- <para>
- The example below illustrates how the same string is transformed by different
analyzers.
- </para>
- <table
id="tabl-Reference_Guide-Lucene_Analyzers-The_quick_brown_fox_jumped_over_the_lazy_dogs">
- <title>"The quick brown fox jumped over the lazy
dogs"</title>
- <tgroup cols="2">
- <thead>
- <row>
- <entry> Analyzer </entry>
- <entry> Parsed </entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry> org.apache.lucene.analysis.WhitespaceAnalyzer
</entry>
- <entry> [The] [quick] [brown] [fox] [jumped] [over] [the] [lazy]
[dogs] </entry>
- </row>
- <row>
- <entry> org.apache.lucene.analysis.SimpleAnalyzer </entry>
- <entry> [the] [quick] [brown] [fox] [jumped] [over] [the] [lazy]
[dogs] </entry>
- </row>
- <row>
- <entry> org.apache.lucene.analysis.StopAnalyzer </entry>
- <entry> [quick] [brown] [fox] [jumped] [over] [lazy] [dogs]
</entry>
- </row>
- <row>
- <entry> org.apache.lucene.analysis.standard.StandardAnalyzer
</entry>
- <entry> [quick] [brown] [fox] [jumped] [over] [lazy] [dogs]
</entry>
- </row>
- <row>
- <entry> org.apache.lucene.analysis.snowball.SnowballAnalyzer
</entry>
- <entry> [quick] [brown] [fox] [jump] [over] [lazi] [dog]
</entry>
- </row>
- <row>
- <entry> org.apache.lucene.analysis.standard.StandardAnalyzer
(configured without stop word - jcr default analyzer) </entry>
- <entry> [the] [quick] [brown] [fox] [jumped] [over] [the] [lazy]
[dogs] </entry>
- </row>
- </tbody>
- </tgroup>
- </table>
- <table
id="tabl-Reference_Guide-Lucene_Analyzers-XYampZ_Corporation_xyzexample.com">
- <title>"XY&Z Corporation -
xyz@example.com"</title>
- <tgroup cols="2">
- <thead>
- <row>
- <entry> Analyzer </entry>
- <entry> Parsed </entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry> org.apache.lucene.analysis.WhitespaceAnalyzer
</entry>
- <entry> [XY&Z] [Corporation] [-] [xyz@example.com]
</entry>
- </row>
- <row>
- <entry> org.apache.lucene.analysis.SimpleAnalyzer </entry>
- <entry> [xy] [z] [corporation] [xyz] [example] [com]
</entry>
- </row>
- <row>
- <entry> org.apache.lucene.analysis.StopAnalyzer </entry>
- <entry> [xy] [z] [corporation] [xyz] [example] [com]
</entry>
- </row>
- <row>
- <entry> org.apache.lucene.analysis.standard.StandardAnalyzer
</entry>
- <entry> [xy&z] [corporation] [xyz@example] [com]
</entry>
- </row>
- <row>
- <entry> org.apache.lucene.analysis.snowball.SnowballAnalyzer
</entry>
- <entry> [xy&z] [corpor] [xyz@exampl] [com] </entry>
- </row>
- <row>
- <entry> org.apache.lucene.analysis.standard.StandardAnalyzer
(configured without stop word - jcr default analyzer) </entry>
- <entry> [xy&z] [corporation] [xyz@example] [com]
</entry>
- </row>
- </tbody>
- </tgroup>
- </table>
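The first three rows of both tables can be reproduced with a few lines of plain Java. The sketch below only approximates the tokenizing rules of `WhitespaceAnalyzer`, `SimpleAnalyzer`, and `StopAnalyzer`; it does not use Lucene, `AnalyzerSketch` is a hypothetical class, and its stop-word list is a small illustrative subset rather than Lucene's own list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Plain-Java approximation of three analyzers; not the Lucene classes. */
class AnalyzerSketch {
    // Small illustrative stop-word list (assumption); Lucene's list is longer.
    private static final Set<String> STOP_WORDS = Set.of("the", "a", "an", "and", "of");

    /** Splits on whitespace only, keeping case and punctuation. */
    static List<String> whitespace(String text) {
        return Arrays.asList(text.trim().split("\\s+"));
    }

    /** Splits on every run of non-letters and lower-cases the tokens. */
    static List<String> simple(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    /** Same as simple(), then drops stop words. */
    static List<String> stop(String text) {
        List<String> tokens = new ArrayList<>(simple(text));
        tokens.removeAll(STOP_WORDS);
        return tokens;
    }

    public static void main(String[] args) {
        String sentence = "The quick brown fox jumped over the lazy dogs";
        System.out.println(whitespace(sentence));
        System.out.println(simple(sentence));
        System.out.println(stop(sentence));
        System.out.println(simple("XY&Z Corporation - xyz@example.com"));
    }
}
```

Because the index and the query must be tokenized the same way, swapping one of these functions for another changes which query terms can match at all.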
- <note>
- <para>
- <literal>StandardAnalyzer</literal> is the default analyzer in the JBoss
Portal Platform JCR search engine, but it is configured without stop words.
- </para>
- </note>
- <para>
- You can assign your analyzer as described in <xref
linkend="chap-Reference_Guide-Search_Configuration"/>.
- </para>
- </section>
- <section
id="sect-Reference_Guide-Fulltext_Search_And_Affecting_Settings-Property_Indexing">
- <title>Property Indexing</title>
- <para>
- Different property types are indexed in different ways, and this affects whether a
property can be searched by fulltext.
- </para>
- <para>
- Only two property types are indexed as fulltext searchable:
<parameter>STRING</parameter> and <parameter>BINARY</parameter>.
- </para>
- <table
id="tabl-Reference_Guide-Property_Indexing-Fulltext_search_by_different_properties">
- <title>Fulltext search by different properties</title>
- <tgroup cols="3">
- <thead>
- <row>
- <entry> Property Type </entry>
- <entry> Fulltext search by all properties </entry>
- <entry> Fulltext search by exact property </entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry> STRING </entry>
- <entry> YES </entry>
- <entry> YES </entry>
- </row>
- <row>
- <entry> BINARY </entry>
- <entry> YES </entry>
- <entry> NO </entry>
- </row>
- </tbody>
- </tgroup>
- </table>
- <para>
- For example, the <literal>jcr:data</literal> property (which is
<parameter>BINARY</parameter>) will not be found with a query structured as:
- </para>
- <programlisting>SELECT * FROM nt:resource WHERE CONTAINS(jcr:data,
'some string')</programlisting>
- <para>
- This is because <parameter>BINARY</parameter> properties cannot be
searched by fulltext on an exact property.
- </para>
- <para>
- However, the following query <emphasis>will</emphasis> return some results
(provided, of course, that the node contains the targeted data):
- </para>
- <programlisting>SELECT * FROM nt:resource WHERE CONTAINS( * ,
'some string')</programlisting>
- </section>
- <section
id="sect-Reference_Guide-Fulltext_Search_And_Affecting_Settings-Different_Analyzers">
- <title>Different Analyzers</title>
- <para>
- First, populate the repository with nodes of mixin type
'mix:title' that have different values for the 'jcr:description'
property.
- </para>
- <programlisting>root
- ├── document1 (mix:title) jcr:description = "The quick brown fox jumped over
the lazy dogs"
- ├── document2 (mix:title) jcr:description = "Brown fox live in
forest."
- └── document3 (mix:title) jcr:description = "Fox is a nice animal."
-</programlisting>
- <para>
- The example below shows different analyzers in action. The first instance uses the base
JCR settings, so the string "<emphasis>The quick brown fox jumped over the
lazy dogs</emphasis>" is transformed to the set <emphasis
role="bold">{[the] [quick] [brown] [fox] [jumped] [over] [the] [lazy] [dogs]
}</emphasis>.
- </para>
- <programlisting language="Java" role="Java">// make SQL
query
-QueryManager queryManager = workspace.getQueryManager();
-String sqlStatement = "SELECT * FROM mix:title WHERE CONTAINS(jcr:description,
'the')";
-// create query
-Query query = queryManager.createQuery(sqlStatement, Query.SQL);
-// execute query and fetch result
-QueryResult result = query.execute();</programlisting>
- <para>
- The <literal>NodeIterator</literal> will return
<emphasis>document1</emphasis>.
- </para>
- <para>
- However, if the default analyzer is changed to
<literal>org.apache.lucene.analysis.StopAnalyzer</literal>, the repository is
populated again (the new analyzer must process the node properties), and the same query is
run, it returns nothing, because stop words such as
"<emphasis>the</emphasis>" are excluded from the parsed
string set.
- </para>
- </section>
- </section>
- <section xmlns="" id="chap-Reference_Guide-WebDAV">
- <title>WebDAV</title>
- <section id="sect-Reference_Guide-WebDAV-Introduction">
- <title>Introduction</title>
- <para>
- The <application>WebDAV</application> protocol enables you to use
third party tools to communicate with hierarchical content servers via the HTTP protocol.
It is possible to add and remove documents or a set of documents from a path on the
server.
- </para>
- <para>
- <application>DeltaV</application> is an extension of the WebDAV
protocol that adds document versioning. The
<emphasis>Locking</emphasis> feature protects resources against concurrent
write access. The ordering support lets you change the position of a
resource in a list and sort a directory so that the directory tree is easier to
browse. The full-text search makes it easy to find the necessary documents. You can
search by using two languages: SQL and XPath.
- </para>
- <para>
- In the eXo JCR, the WebDAV layer (based on the code taken from the extension
modules of the reference implementation) is plugged in on top of our JCR implementation.
This makes it possible to browse a workspace using the third party tools regardless of
operating system environments. You can use a Java WebDAV client, such as
<application>DAVExplorer</application> or <application>Internet
Explorer</application> using <menuchoice>
- <guimenu>File</guimenu>
- <guimenuitem>Open as a Web Folder</guimenuitem>
- </menuchoice>.
- </para>
- <para>
- WebDAV is an extension of the REST service. To get the WebDAV server ready,
you must deploy the REST application. Then, you can access any workspace of your
repository by using the following URL:
- </para>
- <para>
- <ulink
url="http://host:port/portal/rest/private/jcr/{RepositoryName}/{WorkspaceName}/{Path}"
type="http"/>
- </para>
- <para>
- When accessing the WebDAV server via <ulink
url="http://localhost:8080/rest/jcr/repository/production"
type="http"/>, you can substitute <ulink
url="http://localhost:8080/rest/jcr/repository/production"
type="http">production</ulink> with <ulink
url="http://localhost:8080/rest/jcr/repository/collaboration"
type="http">collaboration</ulink>.
- </para>
- <para>
- You will be asked to enter your login credentials. These are checked
by the organization service, which can be backed by an InMemory (dummy)
module, a DB module, or an LDAP module. The JCR user session is then created with the
correct JCR credentials.
- </para>
- <note>
- <title>Note:</title>
- <para>
- If you try the "in ECM" option, add
"@ecm" to the user's password. Alternatively, you may modify
jaas.conf by adding the <emphasis role="bold">domain=ecm</emphasis>
option as follows:
- </para>
- <programlisting>exo-domain {
- org.exoplatform.services.security.jaas.BasicLoginModule required domain=ecm;
-};</programlisting>
- </note>
- </section>
- <section id="sect-Reference_Guide-WebDAV-WebDAV_Configuration">
- <title>WebDAV Configuration</title>
- <para>
- The WebDAV configuration file:
- </para>
- <programlisting language="XML"
role="XML"><component>
-
<key>org.exoplatform.services.webdav.WebDavServiceImpl</key>
-
<type>org.exoplatform.services.webdav.WebDavServiceImpl</type>
- <init-params>
-
- <!-- this parameter indicates the default login and password values
- used as credentials for accessing the repository -->
- <!-- value-param>
- <name>default-identity</name>
- <value>admin:admin</value>
- </value-param -->
-
- <!-- this is the value of WWW-Authenticate header -->
- <value-param>
- <name>auth-header</name>
- <value>Basic realm="eXo-Platform Webdav Server
1.6.1"</value>
- </value-param>
-
- <!-- default node type which is used for the creation of collections
-->
- <value-param>
- <name>def-folder-node-type</name>
- <value>nt:folder</value>
- </value-param>
-
- <!-- default node type which is used for the creation of files -->
- <value-param>
- <name>def-file-node-type</name>
- <value>nt:file</value>
- </value-param>
-
- <!-- if MimeTypeResolver can't find the required mime type,
- which conforms with the file extension, and the mimeType header is absent
- in the HTTP request header, this parameter is used
- as the default mime type-->
- <value-param>
- <name>def-file-mimetype</name>
- <value>application/octet-stream</value>
- </value-param>
-
- <!-- This parameter indicates one of the three cases when you update the
content of the resource by PUT command.
- In case of "create-version", PUT command creates the new
version of the resource if this resource exists.
- In case of "replace" - if the resource exists, PUT command
updates the content of the resource and its last modification date.
- In case of "add", the PUT command tries to create the new
resource with the same name (if the parent node allows same-name siblings).-->
-
- <value-param>
- <name>update-policy</name>
- <value>create-version</value>
- <!--value>replace</value -->
- <!-- value>add</value -->
- </value-param>
-
- <!--
- This parameter determines how service responds to a method that attempts to
modify file content.
- In case of "checkout-checkin" value, when a modification
request is applied to a checked-in version-controlled resource, the request is
automatically preceded by a checkout and followed by a checkin operation.
- In case of "checkout" value, when a modification request is
applied to a checked-in version-controlled resource, the request is automatically preceded
by a checkout operation.
- -->
- <value-param>
- <name>auto-version</name>
- <value>checkout-checkin</value>
- <!--value>checkout</value -->
- </value-param>
-
- <!--
- This parameter is responsible for managing Cache-Control header value which will
be returned to the client.
- You can use patterns like "text/*", "image/*"
or wildcard to define the type of content.
- -->
- <value-param>
- <name>cache-control</name>
-
<value>text/xml,text/html:max-age=3600;image/png,image/jpg:max-age=1800;*/*:no-cache;</value>
- </value-param>
-
- <!--
- This parameter determines the absolute path to the folder icon file, which is
shown
- during WebDAV view of the contents
- -->
- <value-param>
- <name>folder-icon-path</name>
- <value>/absolute/path/to/file</value>
- </value-param>
-
- </init-params>
-</component></programlisting>
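The <literal>cache-control</literal> value above is a list of `content-type-patterns:directive` entries separated by semicolons. How such a value could be resolved for a given content type is sketched below. This is an assumption-laden illustration, not the WebDavServiceImpl code: `CacheControlSketch` and its methods are hypothetical, and the "first matching entry wins" rule is an assumption of this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical resolver for a cache-control configuration value. */
class CacheControlSketch {

    /** Parses "type[,type...]:directive;" entries into an ordered pattern-to-directive map. */
    static Map<String, String> parse(String config) {
        Map<String, String> rules = new LinkedHashMap<>();
        for (String entry : config.split(";")) {
            int colon = entry.indexOf(':');
            if (colon < 0) {
                continue; // skip empty or malformed entries
            }
            String directive = entry.substring(colon + 1).trim();
            for (String pattern : entry.substring(0, colon).split(",")) {
                rules.put(pattern.trim(), directive);
            }
        }
        return rules;
    }

    /** Matches "type/subtype" against a pattern where either part may be "*". */
    static boolean matches(String pattern, String contentType) {
        String[] p = pattern.split("/", 2);
        String[] c = contentType.split("/", 2);
        if (p.length != 2 || c.length != 2) {
            return false;
        }
        return (p[0].equals("*") || p[0].equalsIgnoreCase(c[0]))
            && (p[1].equals("*") || p[1].equalsIgnoreCase(c[1]));
    }

    /** Returns the directive of the first matching rule, or null if none matches. */
    static String directiveFor(Map<String, String> rules, String contentType) {
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            if (matches(rule.getKey(), contentType)) {
                return rule.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> rules = parse(
            "text/xml,text/html:max-age=3600;image/png,image/jpg:max-age=1800;*/*:no-cache;");
        System.out.println(directiveFor(rules, "text/html"));       // prints max-age=3600
        System.out.println(directiveFor(rules, "application/pdf")); // prints no-cache
    }
}
```

With an ordered map, the catch-all `*/*` entry only applies when no earlier, more specific pattern matched, which is why it is listed last in the configuration value.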
- </section>
- <section
id="sect-Reference_Guide-WebDAV-Corresponding_WebDav_and_JCR_actions">
- <title>Corresponding WebDAV and JCR actions</title>
- <table>
- <title/>
- <tgroup cols="2">
- <thead>
- <row>
- <entry> WebDav </entry>
- <entry> JCR </entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry> COPY </entry>
- <entry> Workspace.copy(...) </entry>
- </row>
- <row>
- <entry> DELETE </entry>
- <entry> Node.remove() </entry>
- </row>
- <row>
- <entry> GET </entry>
- <entry> Node.getProperty(...); Property.getValue() </entry>
- </row>
- <row>
- <entry> HEAD </entry>
- <entry> Node.getProperty(...); Property.getLength() </entry>
- </row>
- <row>
- <entry> MKCOL </entry>
- <entry> Node.addNode(...) </entry>
- </row>
- <row>
- <entry> MOVE </entry>
- <entry> Session.move(...) or Workspace.move(...) </entry>
- </row>
- <row>
- <entry> PROPFIND </entry>
- <entry> Session.getNode(...); Node.getNode(...);
Node.getNodes(...); Node.getProperties() </entry>
- </row>
- <row>
- <entry> PROPPATCH </entry>
- <entry> Node.setProperty(...); Node.getProperty(...).remove()
</entry>
- </row>
- <row>
- <entry> PUT </entry>
- <entry>
Node.addNode("node","nt:file");
Node.setProperty("jcr:data", "data") </entry>
- </row>
- <row>
- <entry> CHECKIN </entry>
- <entry> Node.checkin() </entry>
- </row>
- <row>
- <entry> CHECKOUT </entry>
- <entry> Node.checkout() </entry>
- </row>
- <row>
- <entry> REPORT </entry>
- <entry> Node.getVersionHistory(); VersionHistory.getAllVersions();
Version.getProperties() </entry>
- </row>
- <row>
- <entry> RESTORE </entry>
- <entry> Node.restore(...) </entry>
- </row>
- <row>
- <entry> UNCHECKOUT </entry>
- <entry> Node.restore(...) </entry>
- </row>
- <row>
- <entry> VERSION-CONTROL </entry>
- <entry> Node.addMixin("mix:versionable")
</entry>
- </row>
- <row>
- <entry> LOCK </entry>
- <entry> Node.lock(...) </entry>
- </row>
- <row>
- <entry> UNLOCK </entry>
- <entry> Node.unlock() </entry>
- </row>
- <row>
- <entry> ORDERPATCH </entry>
- <entry> Node.orderBefore(...) </entry>
- </row>
- <row>
- <entry> SEARCH </entry>
- <entry> Workspace.getQueryManager(); QueryManager.createQuery();
Query.execute() </entry>
- </row>
- </tbody>
- </tgroup>
- </table>
- </section>
- <section id="sect-Reference_Guide-WebDAV-WebDAV_Considerations">
- <title>WebDAV Considerations</title>
-<!-- DOCS NOTE: This content is duplicated in the Site Publisher User Guide to avoid
cross-document linking.
- Any changes here should also be made there--> <para>
- There are some restrictions for WebDAV in different operating systems.
- </para>
- <formalpara
id="form-Reference_Guide-WebDAV_Considerations-Windows_7">
- <title>Windows 7</title>
- <para>
- When attempting to set up a web folder through <guilabel>Add a
Network Location</guilabel> or <guilabel>Map a Network Drive</guilabel>
through <guilabel>My Computer</guilabel>, an error message stating
<guilabel>The folder you entered does not appear to be valid. Please choose
another</guilabel> or <guilabel>Windows cannot access … Check the spelling of
the name. Otherwise, there might be …</guilabel> may be encountered. These errors
may appear when you are using SSL or non-SSL.
- </para>
- </formalpara>
- <para>
- To fix this, do as follows:
- </para>
- <procedure>
- <step>
- <para>
- Go to Windows Registry Editor.
- </para>
- </step>
- <step>
- <para>
- Find the key:
\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlset\services\WebClient\Parameters\BasicAuthLevel
.
- </para>
- </step>
- <step>
- <para>
- Change the value to 2.
- </para>
- </step>
- </procedure>
- <formalpara
id="form-Reference_Guide-WebDAV_Considerations-Microsoft_Office_2010">
- <title>Microsoft Office 2010</title>
- <para>
- If you have:
- </para>
- </formalpara>
- <itemizedlist>
- <listitem>
- <para>
- Microsoft Office 2007/2010 applications installed on a client
computer AND...
- </para>
- </listitem>
- <listitem>
- <para>
- The client computer is connected to a web server configured for Basic
authentication VIA...
- </para>
- </listitem>
- <listitem>
- <para>
- A connection that does not use Secure Sockets Layer (SSL) AND...
- </para>
- </listitem>
- <listitem>
- <para>
- You try to access an Office file that is stored on the remote
server...
- </para>
- </listitem>
- <listitem>
- <para>
- You might experience the following symptoms when you try to open or
to download the file:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- The Office file does not open or download.
- </para>
- </listitem>
- <listitem>
- <para>
- You do not receive a Basic authentication password prompt
when you try to open or to download the file.
- </para>
- </listitem>
- <listitem>
- <para>
- You do not receive an error message when you try to open the
file. The associated Office application starts. However, the selected file does not open.
- </para>
- </listitem>
- </itemizedlist>
- </listitem>
- </itemizedlist>
- <para>
- These outcomes can be circumvented by enabling Basic authentication on the
client machine.
- </para>
- <para>
- To enable Basic authentication on the client computer, follow these steps:
- </para>
- <procedure>
- <step>
- <para>
- Click Start, type <literal>regedit</literal> in the Start
Search box, and then press Enter.
- </para>
- </step>
- <step>
- <para>
- Locate and then click the following registry subkey:
- </para>
- <para>
-
<envar>HKEY_CURRENT_USER\Software\Microsoft\Office\14.0\Common\Internet</envar>
- </para>
- </step>
- <step>
- <para>
- On the <guilabel>Edit</guilabel> menu, point to
<guilabel>New</guilabel>, and then click <guilabel>DWORD
Value</guilabel>.
- </para>
- </step>
- <step>
- <para>
- Type <literal>BasicAuthLevel</literal>, and then press
<keycap>Enter</keycap>.
- </para>
- </step>
- <step>
- <para>
- Right-click <literal>BasicAuthLevel</literal>, and then
click <guilabel>Modify</guilabel>.
- </para>
- </step>
- <step>
- <para>
- In the Value data box, type <literal>2</literal>, and
then click <guilabel>OK</guilabel>.
- </para>
- </step>
- </procedure>
- </section>
- </section>
- <section xmlns="" id="chap-Reference_Guide-FTP">
- <title>FTP</title>
- <section id="sect-Reference_Guide-FTP-Introduction">
- <title>Introduction</title>
- <para>
- The JCR-FTP Server operates as an FTP server with access to content stored
in JCR repositories in the form of <literal>nt:file/nt:folder</literal> nodes
or their subtypes. Any FTP client can connect to a running server. The FTP
server ships with a standard configuration, which can be changed as required.
- </para>
- </section>
- <section id="sect-Reference_Guide-FTP-Configuration_Parameters">
- <title>Configuration Parameters</title>
- <variablelist
id="vari-Reference_Guide-Configuration_Parameters-Parameters">
- <title>Parameters</title>
- <varlistentry>
- <term>command-port:</term>
- <listitem>
- <programlisting language="XML"
role="XML"><value-param>
- <name>command-port</name>
- <value>21</value>
-</value-param></programlisting>
- <para>
- The command channel port. The default value is
<literal>21</literal>.
- </para>
- <para>
- If another FTP server is already installed on your system, or if the
port is protected, change this parameter (to <literal>2121</literal>, for example)
to avoid conflicts.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>data-min-port and data-max-port</term>
- <listitem>
- <programlisting language="XML"
role="XML"><value-param>
- <name>data-min-port</name>
- <value>52000</value>
-</value-param></programlisting>
- <programlisting language="XML"
role="XML"><value-param>
- <name>data-max-port</name>
- <value>53000</value>
-</value-param></programlisting>
- <para>
- These two parameters define the minimum and maximum values of
the port range used by the server for data channels. The FTP protocol requires an
additional data channel to transfer file contents and directory listings.
No other server program should be listening on the ports in this range.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>system</term>
- <listitem>
- <programlisting language="XML"
role="XML"><value-param>
- <name>system</name>
-
- <value>Windows_NT</value>
- or
- <value>UNIX Type: L8</value>
-</value-param></programlisting>
- <para>
- The directory listing format. The supported values are the two shown above.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>client-side-encoding</term>
- <listitem>
- <programlisting language="XML"
role="XML"><value-param>
- <name>client-side-encoding</name>
-
- <value>windows-1251</value>
- or
- <value>KOI8-R</value>
-
-</value-param></programlisting>
- <para>
- This parameter specifies the character encoding used for communication
with the client.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>def-folder-node-type</term>
- <listitem>
- <programlisting language="XML"
role="XML"><value-param>
- <name>def-folder-node-type</name>
- <value>nt:folder</value>
-</value-param></programlisting>
- <para>
- This parameter specifies the node type used when a folder
is created over FTP.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>def-file-node-type</term>
- <listitem>
- <programlisting language="XML"
role="XML"><value-param>
- <name>def-file-node-type</name>
- <value>nt:file</value>
-</value-param></programlisting>
- <para>
- This parameter specifies the node type used when a file is
created over FTP.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>def-file-mime-type</term>
- <listitem>
- <programlisting language="XML"
role="XML"><value-param>
- <name>def-file-mime-type</name>
- <value>application/zip</value>
-</value-param></programlisting>
- <para>
- The MIME type of a created file is chosen by using its file
extension. If the server cannot find a corresponding MIME type, this value is used.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>cache-folder-name</term>
- <listitem>
- <programlisting language="XML"
role="XML"><value-param>
- <name>cache-folder-name</name>
- <value>../temp/ftp_cache</value>
-</value-param></programlisting>
- <para>
- The path of the cache folder.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>upload-speed-limit</term>
- <listitem>
- <programlisting language="XML"
role="XML"><value-param>
- <name>upload-speed-limit</name>
- <value>20480</value>
-</value-param></programlisting>
- <para>
- Restriction of the upload speed. It is measured in bytes.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>download-speed-limit</term>
- <listitem>
- <programlisting language="XML"
role="XML"><value-param>
- <name>download-speed-limit</name>
- <value>20480</value>
-</value-param></programlisting>
- <para>
- Restriction of the download speed. It is measured in bytes.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>timeout</term>
- <listitem>
- <programlisting language="XML"
role="XML"><value-param>
- <name>timeout</name>
- <value>60</value>
-</value-param></programlisting>
- <para>
- Defines the value of a timeout.
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
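The data port range requirement above (no other server program listening between data-min-port and data-max-port) can be checked before the FTP server is started. The following is a minimal sketch only: the function name busy_ports and the loopback host are illustrative assumptions, not part of eXo JCR, and a port that cannot be bound is simply treated as busy.

```python
import socket


def busy_ports(min_port: int, max_port: int, host: str = "127.0.0.1") -> list[int]:
    """Return the ports in [min_port, max_port] that cannot be bound on host.

    Hypothetical helper for illustration; it is not part of eXo JCR.
    """
    busy = []
    for port in range(min_port, max_port + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                # A successful bind means nothing else holds this port right now.
                probe.bind((host, port))
            except OSError:
                busy.append(port)
    return busy


if __name__ == "__main__":
    # 52000 and 53000 are the data-min-port and data-max-port values shown above.
    print(busy_ports(52000, 53000))
```

An empty result suggests the range is free at the moment of the check; it does not reserve the ports.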
- </section>
- </section>
- <section xmlns=""
id="chap-Reference_Guide-Use_External_Backup_Tool">
- <title>Use External Backup Tool</title>
- <section
id="sect-Reference_Guide-Use_External_Backup_Tool-Repository_Suspending">
- <title>Repository Suspending</title>
- <para>
- To keep the repository content consistent with the search index and value storage
during a backup, suspend the repository first. All working threads are suspended until a
resume operation is performed, and the index is flushed.
- </para>
- <para>
- JCR provides the ability to suspend a repository via JMX.
- </para>
- <figure>
- <title id="repositorysuspendcontroller">Repository Suspend
Controller</title>
- <mediaobject>
- <imageobject>
- <imagedata width="444"
fileref="images/repository-suspend-controller.png"/>
- </imageobject>
- </mediaobject>
- </figure>
- <para>
- To suspend the repository, invoke the <literal>suspend()</literal>
operation. The returned result is
"<emphasis>suspended</emphasis>" if all components were suspended
successfully.
- </para>
- <figure>
- <title id="repository-suspend-controller-suspended.">Repository
Suspend Controller Suspended</title>
- <mediaobject>
- <imageobject>
- <imagedata
fileref="images/repository-suspend-controller-suspended.png"/>
- </imageobject>
- </mediaobject>
- </figure>
- <para>
- An "<emphasis>undefined</emphasis>" result means not all
components were successfully suspended. Check the console to review the stack traces.
- </para>
- </section>
- <section
id="sect-Reference_Guide-Use_External_Backup_Tool-Backup">
- <title>Backup</title>
- <para>
- You can back up your content manually or by using third-party software. You should back
up:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- Database.
- </para>
- </listitem>
- <listitem>
- <para>
- Lucene index.
- </para>
- </listitem>
- <listitem>
- <para>
- Value storage (if configured).
- </para>
- </listitem>
- </itemizedlist>
- </section>
- <section
id="sect-Reference_Guide-Use_External_Backup_Tool-Repository_Resuming">
- <title>Repository Resuming</title>
- <para>
- Once the backup is done, invoke the <literal>resume()</literal>
operation to bring the repository back online. The returned result will be
"<emphasis>on-line</emphasis>".
- </para>
- <figure>
- <title id="repository-suspend-controller-online.">Repository
Suspend Controller Online</title>
- <mediaobject>
- <imageobject>
- <imagedata
fileref="images/repository-suspend-controller-online.png"/>
- </imageobject>
- </mediaobject>
- </figure>
- </section>
- </section>
- <section xmlns=""
id="chap-Reference_Guide-eXo_JCR_statistics">
- <title>eXo JCR statistics</title>
- <section
id="sect-Reference_Guide-eXo_JCR_statistics-Statistics_on_the_Database_Access_Layer">
- <title>Statistics on the Database Access Layer</title>
- <para>
- Most of the time spent in eXo JCR is spent accessing the database, so
statistics on the database access layer give a better idea of where that
time goes.
- </para>
- <para>
- These statistics let you identify, without using a profiler,
what is abnormally slow in this layer, which can help you diagnose and fix a problem.
- </para>
- <para>
- If you use
<envar>org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer</envar>
or
<envar>org.exoplatform.services.jcr.impl.storage.jdbc.JDBCWorkspaceDataContainer</envar>
as <envar>WorkspaceDataContainer</envar>, you can get statistics on the time
spent in the database access layer.
- </para>
- <para>
- In eXo JCR, the database access layer is represented by the methods of the
interface
<envar>org.exoplatform.services.jcr.storage.WorkspaceStorageConnection</envar>.
The following figures are available for each method defined in this interface:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- The minimum time spent in the method.
- </para>
- </listitem>
- <listitem>
- <para>
- The maximum time spent in the method.
- </para>
- </listitem>
- <listitem>
- <para>
- The average time spent in the method.
- </para>
- </listitem>
- <listitem>
- <para>
- The total amount of time spent in the method.
- </para>
- </listitem>
- <listitem>
- <para>
- The total number of times the method has been called.
- </para>
- </listitem>
- </itemizedlist>
- <para>
- These figures are also available globally for all the methods, which shows
the overall behavior of this layer.
- </para>
- <para>
- To enable the statistics, set the JVM parameter
<parameter>JDBCWorkspaceDataContainer.statistics.enabled</parameter> to
<emphasis>true</emphasis>. The corresponding CSV file is
<filename>StatisticsJDBCStorageConnection-${creation-timestamp}.csv</filename>.
For more details about how the CSV files are managed, refer to the section
dedicated to the statistics manager.
- </para>
- <para>
- The format of each column header is
<replaceable>${method-alias}</replaceable>-<replaceable>${metric-alias}</replaceable>.
The metric aliases are described in the statistics manager section.
- </para>
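The column header format above can be split back into its two parts when reading a statistics CSV file. This is a minimal sketch, assuming the headers in the file use exactly the metric aliases listed in the statistics manager section (Min, Max, Total, Avg, Times) and that a metric alias never contains a hyphen; the function name split_column_header is hypothetical.

```python
# Metric aliases as listed in the Metric Alias table of the statistics manager section.
METRIC_ALIASES = {"Min", "Max", "Total", "Avg", "Times"}


def split_column_header(header: str) -> tuple[str, str]:
    """Split '<method-alias>-<metric-alias>' at the last hyphen.

    Hypothetical helper for illustration; it is not part of eXo JCR.
    """
    method_alias, separator, metric_alias = header.strip().rpartition("-")
    if not separator or not method_alias or metric_alias not in METRIC_ALIASES:
        raise ValueError(f"not a method-metric column header: {header!r}")
    return method_alias, metric_alias


if __name__ == "__main__":
    print(split_column_header("getItemDataById-Avg"))
```

Splitting at the last hyphen keeps the method alias intact even if it contains other punctuation, such as the parameter lists used for the JCR API statistics.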
- <para>
- The category name for these statistics is
<literal>JDBCStorageConnection</literal>. This name is mostly needed to access
the statistics through JMX.
- </para>
- <table
id="tabl-Reference_Guide-Statistics_on_the_Database_Access_Layer-Method_Alias">
- <title>Method Alias</title>
- <tgroup cols="2">
- <tbody>
- <row>
- <entry> global </entry>
- <entry> This is the alias for all the methods. </entry>
- </row>
- <row>
- <entry> getItemDataById </entry>
- <entry> This is the alias for the method
<emphasis>getItemData(String identifier).</emphasis></entry>
- </row>
- <row>
- <entry> getItemDataByNodeDataNQPathEntry </entry>
- <entry> This is the alias for the method
<emphasis>getItemData(NodeData parentData, QPathEntry
name).</emphasis></entry>
- </row>
- <row>
- <entry> getChildNodesData </entry>
- <entry> This is the alias for the method
<emphasis>getChildNodesData(NodeData parent).</emphasis></entry>
- </row>
- <row>
- <entry> getChildNodesCount </entry>
- <entry> This is the alias for the method
<emphasis>getChildNodesCount(NodeData parent).</emphasis></entry>
- </row>
- <row>
- <entry> getChildPropertiesData </entry>
- <entry> This is the alias for the method
<emphasis>getChildPropertiesData(NodeData parent).</emphasis></entry>
- </row>
- <row>
- <entry> listChildPropertiesData </entry>
- <entry> This is the alias for the method
<emphasis>listChildPropertiesData(NodeData parent).</emphasis></entry>
- </row>
- <row>
- <entry> getReferencesData </entry>
- <entry> This is the alias for the method
<emphasis>getReferencesData(String nodeIdentifier).</emphasis></entry>
- </row>
- <row>
- <entry> commit </entry>
- <entry> This is the alias for the method
<emphasis>commit().</emphasis></entry>
- </row>
- <row>
- <entry> addNodeData </entry>
- <entry> This is the alias for the method
<emphasis>add(NodeData data).</emphasis></entry>
- </row>
- <row>
- <entry> addPropertyData </entry>
- <entry> This is the alias for the method
<emphasis>add(PropertyData data).</emphasis></entry>
- </row>
- <row>
- <entry> updateNodeData </entry>
- <entry> This is the alias for the method
<emphasis>update(NodeData data).</emphasis></entry>
- </row>
- <row>
- <entry> updatePropertyData </entry>
- <entry> This is the alias for the method
<emphasis>update(PropertyData data).</emphasis></entry>
- </row>
- <row>
- <entry> deleteNodeData </entry>
- <entry> This is the alias for the method
<emphasis>delete(NodeData data).</emphasis></entry>
- </row>
- <row>
- <entry> deletePropertyData </entry>
- <entry> This is the alias for the method
<emphasis>delete(PropertyData data).</emphasis></entry>
- </row>
- <row>
- <entry> renameNodeData </entry>
- <entry> This is the alias for the method
<emphasis>rename(NodeData data).</emphasis></entry>
- </row>
- <row>
- <entry> rollback </entry>
- <entry> This is the alias for the method
<emphasis>rollback().</emphasis></entry>
- </row>
- <row>
- <entry> isOpened </entry>
- <entry> This is the alias for the method
<emphasis>isOpened().</emphasis></entry>
- </row>
- <row>
- <entry> close </entry>
- <entry> This is the alias for the method
<emphasis>close().</emphasis></entry>
- </row>
- </tbody>
- </tgroup>
- </table>
- </section>
- <section
id="sect-Reference_Guide-eXo_JCR_statistics-Statistics_on_the_JCR_API_accesses">
- <title>Statistics on the JCR API accesses</title>
- <para>
- To know exactly how your application uses eXo JCR, it can be
useful to record all the JCR API accesses. The records make it easier to create
realistic test scenarios based on pure JCR calls, and to tune your JCR to better fit your
requirements.
- </para>
- <para>
- So that you can configure which parts of eXo JCR
are monitored without changing or rebuilding your code,
this feature relies on the load-time weaving provided by AspectJ.
- </para>
- <para>
- To enable this feature, add the following
JAR files to your classpath:
- </para>
- <itemizedlist>
- <listitem>
- <para>
-
<emphasis>exo.jcr.component.statistics-X.Y.Z</emphasis>.jar corresponding to
your eXo JCR version that you can get from the JBoss maven repository <ulink
url="https://repository.jboss.org/nexus/content/groups/public/org/ex...;.
- </para>
- </listitem>
- <listitem>
- <para>
- aspectjrt-1.6.8.jar that you can get from the main maven repository
<ulink
url="http://repo2.maven.org/maven2/org/aspectj/aspectjrt">
- <uri>http://repo2.maven.org/maven2/org/aspectj/aspectjrt</uri>
- </ulink>.
- </para>
- </listitem>
- </itemizedlist>
- <para>
- You will also need to get
<filename>aspectjweaver-1.6.8.jar</filename> from the main maven repository
<ulink
url="http://repo2.maven.org/maven2/org/aspectj/aspectjweaver"&g...;.
- </para>
- <para>
- To enable the statistics on the JCR API accesses, add the JVM parameter
<parameter>-javaagent:${pathto}/aspectjweaver-1.6.8.jar</parameter> to your
command line. For more details, refer to <ulink
url="http://www.eclipse.org/aspectj/doc/released/devguide/ltw-config...;.
- </para>
- <para>
- By default, the configuration will collect statistics on all the methods of
the internal interfaces
<literal>org.exoplatform.services.jcr.core.ExtendedSession</literal> and
<literal>org.exoplatform.services.jcr.core.ExtendedNode</literal>, and the JCR
API interface <literal>javax.jcr.Property</literal>.
- </para>
- <para>
- To add or remove monitored interfaces, change two configuration
files that are bundled in the JAR
<literal>exo.jcr.component.statistics-X.Y.Z</literal>.jar:
<filename>conf/configuration.xml</filename> and
<filename>META-INF/aop.xml</filename>.
- </para>
- <para>
- The content of
<filename>conf/configuration.xml</filename> is shown below. To add or remove
an interface to monitor, edit the list of fully qualified interface names in the
values of the init param called
<literal>targetInterfaces</literal>.
- </para>
- <programlisting language="XML"
role="XML"><configuration
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.exoplaform.org/xml/ns/kernel_1_2.xsd
http://www.exoplaform.org/xml/ns/kernel_1_2.xsd"
-
xmlns="http://www.exoplaform.org/xml/ns/kernel_1_2.xsd"...
-
- <component>
-
<type>org.exoplatform.services.jcr.statistics.JCRAPIAspectConfig</type>
- <init-params>
- <values-param>
- <name>targetInterfaces</name>
-
<value>org.exoplatform.services.jcr.core.ExtendedSession</value>
-
<value>org.exoplatform.services.jcr.core.ExtendedNode</value>
- <value>javax.jcr.Property</value>
- </values-param>
- </init-params>
- </component>
-</configuration></programlisting>
- <para>
- The content of
<filename>META-INF/aop.xml</filename> is shown below. To add or remove
an interface to monitor, edit the fully qualified interface names in the expression
filter of the pointcut called <literal>JCRAPIPointcut</literal>.
- </para>
- <para>
- By default only JCR API calls from the
<literal>exoplatform</literal> packages are taken into account. This filter
can be modified to add other package names.
- </para>
- <programlisting language="XML"
role="XML"><aspectj>
- <aspects>
- <concrete-aspect
name="org.exoplatform.services.jcr.statistics.JCRAPIAspectImpl"
extends="org.exoplatform.services.jcr.statistics.JCRAPIAspect">
- <pointcut name="JCRAPIPointcut"
- expression="(target(org.exoplatform.services.jcr.core.ExtendedSession)
|| target(org.exoplatform.services.jcr.core.ExtendedNode) || target(javax.jcr.Property))
&& call(public * *(..))" />
- </concrete-aspect>
- </aspects>
- <weaver options="-XnoInline">
- <include within="org.exoplatform..*" />
- </weaver>
-</aspectj></programlisting>
- <para>
- The corresponding CSV files are named
<filename>Statistics<replaceable>${interface-name}</replaceable>-<replaceable>${creation-timestamp}</replaceable>.csv</filename>.
For more details about how the <emphasis>CSV</emphasis> files are managed,
refer to the section dedicated to the statistics manager.
- </para>
- <para>
- The format of each column header is
<replaceable>${method-alias}</replaceable>-<replaceable>${metric-alias}</replaceable>.
The method alias has the form
<replaceable>${method-name}(semicolon-delimited-list-of-parameter-types-to-be-compatible-with-the-CSV-format)</replaceable>.
- </para>
- <para>
- The metric aliases are described in the statistics manager section.
- </para>
- <para>
- The category name for these statistics is
the simple name of the monitored interface (for example,
<literal>ExtendedSession</literal> for
<literal>org.exoplatform.services.jcr.core.ExtendedSession</literal>). This
name is mostly needed to access the statistics through JMX.
- </para>
- <note>
- <title>Performance Consideration</title>
- <para>
- This feature affects the performance of eXo JCR, so
it must be used with caution.
- </para>
- </note>
- </section>
- <section
id="sect-Reference_Guide-eXo_JCR_statistics-Statistics_Manager">
- <title>Statistics Manager</title>
- <para>
- The statistics manager manages all the statistics provided by eXo JCR. It is
responsible for writing the data to the CSV files and for exposing the statistics
through JMX and/or REST.
- </para>
- <para>
- The statistics manager creates a CSV file for each category of
statistics that it manages. The files are named
<emphasis>Statistics${category-name}-${creation-timestamp}.csv</emphasis>.
They are created in the user directory if possible; otherwise they are
created in the temporary directory. The file format is
<envar>CSV</envar> (Comma-Separated Values). A new line is added
regularly (every 5 seconds by default), and one last line is added at JVM exit. Each
line is composed of the 5 figures described below, for each method and globally for
all the methods.
- </para>
- <para>
- <table
id="tabl-Reference_Guide-Statistics_Manager-Metric_Alias">
- <title>Metric Alias</title>
- <tgroup cols="2">
- <tbody>
- <row>
- <entry> Min </entry>
- <entry> The minimum time spent in the method, expressed in
milliseconds. </entry>
- </row>
- <row>
- <entry> Max </entry>
- <entry> The maximum time spent in the method, expressed in
milliseconds. </entry>
- </row>
- <row>
- <entry> Total </entry>
- <entry> The total amount of time spent in the method, expressed
in milliseconds. </entry>
- </row>
- <row>
- <entry> Avg </entry>
- <entry> The average time spent in the method, expressed in
milliseconds. </entry>
- </row>
- <row>
- <entry> Times </entry>
- <entry> The total number of times the method has been called.
</entry>
- </row>
- </tbody>
- </tgroup>
- </table>
- You can disable the persistence of the statistics by setting the JVM
parameter called
<parameter>JCRStatisticsManager.persistence.enabled</parameter> to
<literal>false</literal>. It is set to <literal>true</literal> by
default.
- </para>
- <para>
- You can also define the period of time between each record (that is, line of
data into the file) by setting the JVM parameter called
<parameter>JCRStatisticsManager.persistence.timeout</parameter> to your
expected value expressed in milliseconds. It is set to <literal>5000</literal>
by default.
- </para>
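The two persistence settings above can be assembled into command-line arguments as sketched below. This assumes that "JVM parameter" here means a Java system property passed with -D, which the text does not state explicitly; the function name statistics_jvm_flags is hypothetical.

```python
def statistics_jvm_flags(persistence_enabled: bool = True, timeout_ms: int = 5000) -> list[str]:
    """Build -D flags for the two statistics manager persistence settings.

    Assumption: the JVM parameters are plain system properties set with -D.
    The defaults mirror the documented ones (true and 5000 milliseconds).
    """
    if timeout_ms <= 0:
        raise ValueError("timeout_ms must be a positive number of milliseconds")
    return [
        f"-DJCRStatisticsManager.persistence.enabled={str(persistence_enabled).lower()}",
        f"-DJCRStatisticsManager.persistence.timeout={timeout_ms}",
    ]


if __name__ == "__main__":
    # Example: keep persistence on, but write one record every 10 seconds.
    print(" ".join(statistics_jvm_flags(timeout_ms=10000)))
```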
- <para>
- You can also access the statistics via JMX. The available methods are:
- </para>
- <para>
- <table
id="tabl-Reference_Guide-Statistics_Manager-JMX_Methods">
- <title>JMX Methods</title>
- <tgroup cols="2">
- <tbody>
- <row>
- <entry> getMin </entry>
- <entry> Gives the minimum time spent in the method for
the given category name and statistics name. The expected arguments are the name of the
category of statistics (<literal>JDBCStorageConnection</literal>, for example)
and the name of the expected method, or global for the global value. </entry>
- </row>
- <row>
- <entry> getMax </entry>
- <entry> Gives the maximum time spent in the method for
the given category name and statistics name. The expected arguments are the name of the
category of statistics and the name of the expected method, or global for the global value.
</entry>
- </row>
- <row>
- <entry> getTotal </entry>
- <entry> Gives the total amount of time spent in the method
for the given category name and statistics name. The expected arguments are
the name of the category of statistics and the name of the expected method, or global for
the global value. </entry>
- </row>
- <row>
- <entry> getAvg </entry>
- <entry> Gives the average time spent in the method for
the given category name and statistics name. The expected arguments are the name of the
category of statistics and the name of the expected method, or global for the global value.
</entry>
- </row>
- <row>
- <entry> getTimes </entry>
- <entry> Gives the total number of times the method has been called
for the given category name and statistics name. The expected arguments are
the name of the category of statistics (for example, JDBCStorageConnection) and the name of the
expected method, or global for the global value. </entry>
- </row>
- <row>
- <entry> reset </entry>
- <entry> Resets the statistics for the given category name and
statistics name. The expected arguments are the name of the category of statistics and the
name of the expected method, or global for the global value. </entry>
- </row>
- <row>
- <entry> resetAll </entry>
- <entry> Resets all the statistics for the given category name. The
expected argument is the name of the category of statistics (for example, JDBCStorageConnection).
- </row>
- </tbody>
- </tgroup>
- </table>
- The full name of the related MBean is <literal>exo:service=statistic,
view=jcr</literal>.
- </para>
- </section>
- </section>
- <section xmlns=""
id="chap-Reference_Guide-Checking_repository_integrity_and_consistency">
- <title>Checking repository integrity and consistency</title>
- <section
id="sect-Reference_Guide-Checking_repository_integrity_and_consistency-JMX_based_consistency_tool">
- <title>JMX-based consistency tool</title>
- <para>
- It is important to check the integrity and consistency of the system regularly, especially
if backups are missing or stale. The JBoss Portal Platform JCR implementation offers a
JMX-based tool that performs complex checks.
- </para>
- <para>
- During an inspection, the tool checks every major JCR component, such as the persistent
data layer and the index. The persistent layer includes the JDBC Data Container and
the Value Storages, if they are configured.
- </para>
- <para>
- The database is verified using a set of specialized, domain-specific queries.
The Value Storage tool checks the existence of, and access to, each file.
- </para>
- <para>
- Access to the check tool is exposed via the JMX interface, with the following
operations available:
- </para>
- <table
id="tabl-Reference_Guide-JMX_based_consistency_tool-Available_methods">
- <title>Available methods</title>
- <tgroup cols="2">
- <tbody>
- <row>
- <entry>
- <code>checkRepositoryDataConsistency()</code>
- </entry>
- <entry> Inspect full repository data (db, value storage and search
index) </entry>
- </row>
- <row>
- <entry>
- <code>checkRepositoryDataBaseConsistency()</code>
- </entry>
- <entry> Inspect only DB </entry>
- </row>
- <row>
- <entry>
- <code>checkRepositoryValueStorageConsistency()</code>
- </entry>
- <entry> Inspect only ValueStorage </entry>
- </row>
- <row>
- <entry>
- <code>checkRepositorySearchIndexConsistency()</code>
- </entry>
- <entry> Inspect only SearchIndex </entry>
- </row>
- </tbody>
- </tgroup>
- </table>
- <para>
- All inspection activities and corrupted data details are stored in a file in the
<filename>app</filename> directory and named as per the following convention:
<code> report-<replaceable><repository
name></replaceable>-<replaceable>dd-MMM-yy-HH-mm</replaceable>.txt
</code>.
- </para>
- <para>
- The path to the file is also returned in the result message at the end of the
inspection.
- </para>
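The report naming convention above can be reproduced as follows, for example to locate the newest report from a script. This is a minimal sketch, assuming the MMM part is rendered with English three-letter month abbreviations (in Java this depends on the locale); the function name report_file_name is hypothetical.

```python
from datetime import datetime

# English month abbreviations, assumed to match the MMM part of the pattern.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def report_file_name(repository_name: str, when: datetime) -> str:
    """Build 'report-<repository name>-dd-MMM-yy-HH-mm.txt' for a timestamp."""
    month = MONTHS[when.month - 1]
    return f"report-{repository_name}-{when.day:02d}-{month}-{when:%y-%H-%M}.txt"


if __name__ == "__main__":
    print(report_file_name("repository", datetime(2013, 5, 8, 7, 2)))
```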
- <note>
- <para>
- There are three types of inconsistency (Warning, Error, and Index). Two of them
(Error and Index) are critical:
- </para>
- <itemizedlist>
- <listitem>
- <para>
- Index faults are marked as "Reindex" and can be fixed by
re-indexing the workspace.
- </para>
- </listitem>
- <listitem>
- <para>
- Errors can only be fixed manually.
- </para>
- </listitem>
- <listitem>
- <para>
- Warnings can be normal in some cases, and a production system usually
remains fully functional.
- </para>
- </listitem>
- </itemizedlist>
- </note>
- </section>
- </section>
-<!-- tuning guide
- DOC NOTE: Could possibly be moved to a specific Tuning Guide later -->
<section xmlns=""
id="chap-Reference_Guide-JCR_Performance_Tuning_Guide">
- <title>JCR Performance Tuning Guide</title>
- <section
id="sect-Reference_Guide-JCR_Performance_Tuning_Guide-Introduction">
- <title>Introduction</title>
- <para>
- This section will show you various ways of improving JCR performance.
- </para>
- <para>
- It is intended for Administrators and others who want to use the JCR features more
efficiently.
- </para>
- </section>
- <section
id="sect-Reference_Guide-JCR_Performance_Tuning_Guide-JCR_Performance_and_Scalability">
- <title>JCR Performance and Scalability</title>
- <section
id="sect-Reference_Guide-JCR_Performance_and_Scalability-Cluster_configuration">
- <title>Cluster configuration</title>
- <para>
- The table below contains details about the configuration of the cluster used in
benchmark testing:
- </para>
- <table
id="tabl-Reference_Guide-Cluster_configuration-EC2_network_1Gbit">
- <title>EC2 network: 1Gbit</title>
- <tgroup cols="2">
- <colspec colname="1"/>
- <colspec colname="2"/>
- <spanspec namest="1" nameend="2"
spanname="hspan"/>
- <thead>
- <row>
- <entry> Servers hardware </entry>
- <entry> Specification </entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry> RAM </entry>
- <entry> 7.5 GB </entry>
- </row>
- <row>
- <entry> Processors </entry>
- <entry> 4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute
Units each) </entry>
- </row>
- <row>
- <entry> Storage </entry>
- <entry> 850 GB (2×420 GB plus 10 GB root partition)
</entry>
- </row>
- <row>
- <entry> Architecture </entry>
- <entry> 64-bit </entry>
- </row>
- <row>
- <entry> I/O Performance </entry>
- <entry> High </entry>
- </row>
- <row>
- <entry> API name </entry>
- <entry>
- <literal>m1.large</literal>
- </entry>
- </row>
- <row>
- <entry spanname="hspan">
- <emphasis role="bold">Note:</emphasis>
- </entry>
- </row>
- <row>
- <entry spanname="hspan"> The NFS and statistics (cacti
snmp) servers were located on one physical server. </entry>
- </row>
- <row>
- <entry spanname="hspan">
- <emphasis role="bold">JBoss Enterprise Application
Platform 6 configuration:</emphasis>
- </entry>
- </row>
- <row>
- <entry spanname="hspan">
- <code>JAVA_OPTS: -Dprogram.name=run.sh -server -Xms4g -Xmx4g
-XX:MaxPermSize=512m -Dorg.jboss.resolver.warning=true
-Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000
-XX:+UseParallelGC -Djava.net.preferIPv4Stack=true</code>
- </entry>
- </row>
- </tbody>
- </tgroup>
- </table>
- </section>
- <section
id="sect-Reference_Guide-JCR_Performance_and_Scalability-JCR_Clustered_Performance">
- <title>JCR Clustered Performance</title>
- <para>
- Benchmark test using WebDAV (complex read/write load test) with the same 20K
file. To obtain per-operation results, the test case
threads wrote custom output to a CSV file.
- </para>
- <para>
- <citetitle>Read operation</citetitle>:
- <simplelist>
- <member>Warm-up iterations: 100</member>
- <member>Run iterations: 2000</member>
- <member>Background writing threads: 25</member>
- <member>Reading threads: 225</member>
- </simplelist>
-
- </para>
- <figure>
- <title id="perf_EC2_result">EC2 Performance
Results</title>
- <mediaobject>
- <imageobject>
- <imagedata fileref="images/perf_EC2_results.jpg"/>
- </imageobject>
- </mediaobject>
- </figure>
- <table>
- <title/>
- <tgroup cols="4">
- <thead>
- <row>
- <entry> Nodes count </entry>
- <entry> tps </entry>
- <entry> Responses >2s </entry>
- <entry> Responses >4s </entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry> 1 </entry>
- <entry> 523 </entry>
- <entry> 6.87% </entry>
- <entry> 1.27% </entry>
- </row>
- <row>
- <entry> 2 </entry>
- <entry> 1754 </entry>
- <entry> 0.64% </entry>
- <entry> 0.08% </entry>
- </row>
- <row>
- <entry> 3 </entry>
- <entry> 2388 </entry>
- <entry> 0.49% </entry>
- <entry> 0.09% </entry>
- </row>
- <row>
- <entry> 4 </entry>
- <entry> 2706 </entry>
- <entry> 0.46% </entry>
- <entry> 0.1% </entry>
- </row>
- </tbody>
- </tgroup>
- </table>
- <para>
- <citetitle>Read operation with more threads</citetitle>:
- </para>
- <simplelist>
- <member>Warm-up iterations: 100</member>
- <member>Run iterations: 2000</member>
- <member>Background writing threads: 50</member>
- <member>Reading threads: 450</member>
- </simplelist>
- <figure>
- <title id="perf_EC2_result2">EC2 Performance Results
2</title>
- <mediaobject>
- <imageobject>
- <imagedata fileref="images/perf_EC2_results_2.jpg"/>
- </imageobject>
- </mediaobject>
- </figure>
- <table>
- <title/>
- <tgroup cols="4">
- <thead>
- <row>
- <entry> Nodes count </entry>
- <entry> tps </entry>
- <entry> Responses >2s </entry>
- <entry> Responses >4s </entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry> 1 </entry>
- <entry> 116 </entry>
- <entry> ? </entry>
- <entry> ? </entry>
- </row>
- <row>
- <entry> 2 </entry>
- <entry> 1558 </entry>
- <entry> 6.1% </entry>
- <entry> 0.6% </entry>
- </row>
- <row>
- <entry> 3 </entry>
- <entry> 2242 </entry>
- <entry> 3.1% </entry>
- <entry> 0.38% </entry>
- </row>
- <row>
- <entry> 4 </entry>
- <entry> 2756 </entry>
- <entry> 2.2% </entry>
- <entry> 0.41% </entry>
- </row>
- </tbody>
- </tgroup>
- </table>
- </section>
- </section>
- <section
id="sect-Reference_Guide-JCR_Performance_Tuning_Guide-Performance_Tuning_Guide">
- <title>Performance Tuning Guide</title>
- <section
id="sect-Reference_Guide-Performance_Tuning_Guide-JBoss_AS_Tuning">
- <title>JBoss Enterprise Application Platform 6 Tuning</title>
- <para>
- You can use the <parameter>maxThreads</parameter> parameter to increase the
maximum number of threads that can be launched in an AS instance. This can improve
performance if you need a high level of concurrency. You can also use the
<code>-XX:+UseParallelGC</code> Java option to enable the parallel garbage
collector.
- </para>
- <note>
- <title>Note</title>
- <para>
- Avoid setting <parameter>maxThreads</parameter> too high, because this can
cause an <exceptionname>OutOfMemoryError</exceptionname>. We encountered this error
with <code>maxThreads=1250</code> on a machine with the following specification:
- </para>
- <simplelist>
- <member>7.5 GB memory</member>
- <member>4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units
each)</member>
- <member>850 GB instance storage (2×420 GB plus 10 GB root
partition)</member>
- <member>64-bit platform</member>
- <member>I/O Performance: High</member>
- <member>API name: m1.large</member>
- <member>java -Xmx4g</member>
- </simplelist>
- </note>
- </section>
- <section
id="sect-Reference_Guide-Performance_Tuning_Guide-JCR_Cache_Tuning">
- <title>JCR Cache Tuning</title>
- <para>
- <citetitle>Cache size</citetitle>
- </para>
- <para>
- The JCR cluster implementation is built using JBoss Cache as a distributed, replicated
cache. One particularity of JBoss Cache concerns the remove action: the speed of this
operation depends on the actual size of the cache. The more nodes the cache currently
holds, the more time is needed to remove one particular node (subtree) from it.
- </para>
- <para>
- <citetitle>Eviction</citetitle>
- </para>
- <para>
- Changing the eviction <parameter>wakeUpInterval</parameter> value
does not affect performance. Performance results with values from 500 to 3000 are
approximately equal.
- </para>
- <para>
- <citetitle>Transaction Timeout</citetitle>
- </para>
- <para>
- Using a short timeout for long transactions, such as Export/Import or removing a huge
subtree, may cause a
<exceptionname>TransactionTimeoutException</exceptionname>.
- </para>
- </section>
- <section
id="sect-Reference_Guide-Performance_Tuning_Guide-Clustering">
- <title>Clustering</title>
- <para>
- For performance, it is better to host the load balancer, DB server, and shared NFS on
different computers. If, for some reason, one node gets more load than the others,
you can decrease this load using the load value in the load balancer.
- </para>
- <para>
- <citetitle>JGroups configuration</citetitle>
- </para>
- <para>
- It is recommended to use the "multiplexer stack" feature
present in JGroups. It is set by default in eXo JCR and offers higher performance in a
cluster while also using fewer network connections. If there are two or more clusters in
your network, check that they use different ports and different cluster names.
- </para>
- <para>
- <citetitle>Write performance in cluster</citetitle>
- </para>
- <para>
- The eXo JCR implementation uses the Lucene indexing engine to provide search capabilities.
However, Lucene imposes a limitation on write operations: it can perform indexing in only
one thread. That is why write performance in a cluster is not higher than in a singleton
environment. Data is indexed on the coordinator node, so increasing the write load on a
cluster may lead to a <exceptionname>ReplicationTimeoutException</exceptionname>. It occurs
because writing threads queue in the indexer and, under high load, the timeout for
replication to the coordinator is exceeded.
- </para>
- <para>
- Taking this into consideration, it is recommended to increase the
<parameter>replTimeout</parameter> value in the cache configurations in case of
high write load.
- </para>
- <para>
- <citetitle>Replication timeout</citetitle>
- </para>
- <para>
- Some operations may take a long time. If you get a
<exceptionname>ReplicationTimeoutException</exceptionname>, try increasing the
replication timeout:
- </para>
- <programlisting language="XML" role="XML">
<clustering mode="replication"
clusterName="${jbosscache-cluster-name}">
- ...
- <sync replTimeout="60000" />
- </clustering>
-</programlisting>
- <para>
- The value is set in milliseconds.
- </para>
- </section>
-<!-- <section
id="sect-Reference_Guide-Performance_Tuning_Guide-JVM_parameters">
- <title>JVM parameters</title>
- <para>
- <citetitle>PermGen space size</citetitle>
- </para>
- <para>
- If you intend to use Infinispan, you will have to increase the PermGen
size to at least 256 MB because of the latest versions of JGroups that Infinispan
requires (note that Infinispan is provided only for the community for now; no
support will be provided). If you intend to use JBoss Cache, you can keep using
JGroups 2.6.13.GA, which means that you do not need to increase the PermGen size.
- </para>
-
- </section> --> </section>
- </section>
- <section xmlns=""
id="chap-Reference_Guide-eXo_JCR_with_GateIn">
- <title>eXo JCR with JBoss Portal Platform</title>
- <section xmlns=""
id="sect-Reference_Guide-How_to_use_AS_Managed_DataSource_under_JBoss_AS">
- <title>How to use a Managed DataSource under JBoss Enterprise Application
Platform 6</title>
- <section
id="sect-Reference_Guide-How_to_use_AS_Managed_DataSource_under_JBoss_AS-Configurations_Steps">
- <title>Configuration Steps</title>
- <section
id="sect-Reference_Guide-Configurations_Steps-Declaring_the_datasources_in_the_AS">
- <title>Declaring the Datasources in the AS</title>
- <remark>NEEDINFO - FILE PATHS - I know this isn't right. Where
do these get deployed again?</remark>
- <para>
- To declare the datasources using a JBoss application server, deploy a
<literal>ds</literal> file
(<filename><replaceable>XXX</replaceable>-ds.xml</filename>) into
the <emphasis>deploy</emphasis> directory of the appropriate server profile
(<filename>/server/<replaceable>PROFILE</replaceable>/deploy</filename>,
for example).
- </para>
- <para>
- This file configures all the datasources that JBoss Portal Platform needs (there
should be four, specifically named <emphasis>jdbcjcr_portal</emphasis>,
<emphasis>jdbcjcr_sample-portal</emphasis>,
<emphasis>jdbcidm_portal</emphasis> and
<emphasis>jdbcidm_sample-portal</emphasis>).
- </para>
- <para>
- For example:
- </para>
- <programlisting language="XML"
role="XML"><?xml version="1.0"
encoding="UTF-8"?>
-<datasources>
- <no-tx-datasource>
- <jndi-name>jdbcjcr_portal</jndi-name>
-
<connection-url>jdbc:hsqldb:${jboss.server.data.dir}/data/jdbcjcr_portal</connection-url>
- <driver-class>org.hsqldb.jdbcDriver</driver-class>
- <user-name>sa</user-name>
- <password></password>
- </no-tx-datasource>
-
- <no-tx-datasource>
- <jndi-name>jdbcjcr_sample-portal</jndi-name>
-
<connection-url>jdbc:hsqldb:${jboss.server.data.dir}/data/jdbcjcr_sample-portal</connection-url>
- <driver-class>org.hsqldb.jdbcDriver</driver-class>
- <user-name>sa</user-name>
- <password></password>
- </no-tx-datasource>
-
- <no-tx-datasource>
- <jndi-name>jdbcidm_portal</jndi-name>
-
<connection-url>jdbc:hsqldb:${jboss.server.data.dir}/data/jdbcidm_portal</connection-url>
- <driver-class>org.hsqldb.jdbcDriver</driver-class>
- <user-name>sa</user-name>
- <password></password>
- </no-tx-datasource>
-
- <no-tx-datasource>
- <jndi-name>jdbcidm_sample-portal</jndi-name>
-
<connection-url>jdbc:hsqldb:${jboss.server.data.dir}/data/jdbcidm_sample-portal</connection-url>
- <driver-class>org.hsqldb.jdbcDriver</driver-class>
- <user-name>sa</user-name>
- <password></password>
- </no-tx-datasource>
-</datasources></programlisting>
- <para>
- The properties that can be set for a datasource are described here: <ulink
url="http://docs.jboss.org/jbossas/docs/Server_Configuration_Guide/4...
JDBC DataSources - The non transactional DataSource configuration schema</ulink>
- </para>
- </section>
- <section
id="sect-Reference_Guide-Configurations_Steps-Do_not_bind_datasources_explicitly">
- <title>Do not bind datasources explicitly</title>
- <para>
- Do not let the portal explicitly bind datasources. </para>
- <remark>NEEDINFO - FILE PATHS - I think some of the values have changed
in the referenced file when I look at the new file below. New info
required?</remark>
- <para>Edit the
<filename><replaceable>JPP_HOME</replaceable>/standalone/configuration/gatein/configuration.properties</filename>
file and comment out the following lines in the JCR section:
- </para>
- <programlisting>#gatein.jcr.datasource.driver=org.hsqldb.jdbcDriver
-#gatein.jcr.datasource.url=jdbc:hsqldb:file:${gatein.db.data.dir}/data/jdbcjcr_${name}
-#gatein.jcr.datasource.username=sa
-#gatein.jcr.datasource.password=</programlisting>
- <para>
- Comment out the following lines in the IDM section:
- </para>
- <programlisting>#gatein.idm.datasource.driver=org.hsqldb.jdbcDriver
-#gatein.idm.datasource.url=jdbc:hsqldb:file:${gatein.db.data.dir}/data/jdbcidm_${name}
-#gatein.idm.datasource.username=sa
-#gatein.idm.datasource.password=</programlisting>
- <para>
- Open the <filename>jcr-configuration.xml</filename> and
<filename>idm-configuration.xml</filename> files and comment out references to
the plug-in <literal>InitialContextInitializer</literal>.
- </para>
- <programlisting language="XML"
role="XML"><!-- Commented out because datasources are declared and bound
by the AS, not in eXo -->
-<!--
-<external-component-plugins>
- [...]
-</external-component-plugins>
---></programlisting>
- </section>
- </section>
- </section>
- </section>
- </appendix>
- <xi:include
xmlns:xi="http://www.w3.org/2001/XInclude"
href="appendix-Quickstarts.xml"/>
<xi:include
xmlns:xi="http://www.w3.org/2001/XInclude"
href="Revision_History.xml"/>
</book><?xxe-serial-numbers hgd3icsl jmorgan
(1z141z5 (-) (-) (-) (1z141z6 (1z141z7) (-) (-)) (1z141z8 (1z141z9)
Modified: epp/docs/JPP/trunk/Development_Guide/en-US/Load_Groups.xml
===================================================================
(Binary files differ)
Modified: epp/docs/JPP/trunk/Development_Guide/en-US/Localization.xml
===================================================================
(Binary files differ)
Modified: epp/docs/JPP/trunk/Development_Guide/en-US/Revision_History.xml
===================================================================
--- epp/docs/JPP/trunk/Development_Guide/en-US/Revision_History.xml 2013-05-08 05:35:47
UTC (rev 9274)
+++ epp/docs/JPP/trunk/Development_Guide/en-US/Revision_History.xml 2013-05-08 07:02:26
UTC (rev 9275)
@@ -7,7 +7,21 @@
<title>Revision History</title>
<simpara>
<revhistory>
- <revision>
+ <revision>
+ <revnumber>6.1.0-5</revnumber>
+ <date>Wed May 8 2013</date>
+ <author>
+ <firstname>Jared</firstname>
+ <surname>Morgan</surname>
+ <email/>
+ </author>
+ <revdescription>
+ <simplelist>
+ <member>Base eXo JCR content from Reference Guide ported over to this
guide. Next task is to add the missing content and atomise the content.</member>
+ </simplelist>
+ </revdescription>
+ </revision>
+ <revision>
<revnumber>6.1.0-2</revnumber>
<date>Fri Apr 19 2013</date>
<author>
Modified: epp/docs/JPP/trunk/Development_Guide/en-US/The_eXo_Kernel.xml
===================================================================
(Binary files differ)
Added: epp/docs/JPP/trunk/Development_Guide/en-US/eXo_JCR.xml
===================================================================
(Binary files differ)
Property changes on: epp/docs/JPP/trunk/Development_Guide/en-US/eXo_JCR.xml
___________________________________________________________________
Added: svn:mime-type
+ application/xml