[exo-jcr-commits] exo-jcr SVN: r2874 - in jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr: configuration and 1 other directory.
do-not-reply at jboss.org
Wed Aug 4 09:51:23 EDT 2010
Author: dkatayev
Date: 2010-08-04 09:51:23 -0400 (Wed, 04 Aug 2010)
New Revision: 2874
Modified:
jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/concepts/jcr-applications.xml
jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/concepts/jcr-registry-service.xml
jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/concepts/nodetype-registration.xml
jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/concepts/nodetypes-and-namespaces.xml
jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/configuration/exo-jcr-configuration.xml
jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/configuration/multilanguage-support.xml
jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/configuration/search-configuration.xml
Log:
EXOJCR-869 documentation updated
Modified: jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/concepts/jcr-applications.xml
===================================================================
--- jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/concepts/jcr-applications.xml 2010-08-04 12:33:17 UTC (rev 2873)
+++ jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/concepts/jcr-applications.xml 2010-08-04 13:51:23 UTC (rev 2874)
@@ -3,6 +3,7 @@
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
<chapter id="JCR.eXoJCRApplicationModel">
<?dbhtml filename="ch-jcr-applications.html"?>
+
<title>eXo JCR Application Model</title>
<para>A large picture of interaction between Applications and JCR looks as
@@ -34,8 +35,4 @@
specific Frameworks. It is possible to build a multi-layered (in framework
sense) JCR application, for example Web application uses Web framework that
uses Command framework underneath.</para>
-
- <para>To READ:</para>
-
- <para>Deployment JCR standalone (ver 1_5)</para>
</chapter>
Modified: jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/concepts/jcr-registry-service.xml
===================================================================
--- jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/concepts/jcr-registry-service.xml 2010-08-04 12:33:17 UTC (rev 2873)
+++ jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/concepts/jcr-registry-service.xml 2010-08-04 13:51:23 UTC (rev 2874)
@@ -3,6 +3,7 @@
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
<chapter id="JCR.RegistryService">
<?dbhtml filename="ch-jcr-registry-service.html"?>
+
<title>Registry Service</title>
<section id="Concept">
@@ -27,8 +28,7 @@
<para>The proposed structure of the Registry Service storage.It is divided
into 3 logical groups: services, applications and users:</para>
- <programlisting>/
- exo:registry/ <-- registry "root" (exo:registry)
+ <programlisting> exo:registry/ <-- registry "root" (exo:registry)
exo:services/ <-- service data storage (exo:registryGroup)
service1/
Consumer data (exo:registryEntry)
Modified: jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/concepts/nodetype-registration.xml
===================================================================
--- jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/concepts/nodetype-registration.xml 2010-08-04 12:33:17 UTC (rev 2873)
+++ jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/concepts/nodetype-registration.xml 2010-08-04 13:51:23 UTC (rev 2874)
@@ -3,6 +3,7 @@
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
<chapter id="JCR.NodeTypeRegistration">
<?dbhtml filename="ch-nodetype-registration.html"?>
+
<title>NodeType Registration</title>
<para>eXo JCR implementation supports two ways of Nodetypes
@@ -357,9 +358,9 @@
<title>Node type registration</title>
<para>eXo JCR implementation supports various methods of the node-type
- registration. The most used is registration from xml file <ulink
- url="on JCR startup>Node+types+and+Namespaces">on JCR
- startup>Node+types+and+Namespaces</ulink>.</para>
+ registration. The most used is registration from <link
+ linkend="JCR.NodeTypesandNamespaces">xml file</link> on JCR
+ startup.</para>
<section>
<title>Run time registration from xml file.</title>
Modified: jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/concepts/nodetypes-and-namespaces.xml
===================================================================
--- jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/concepts/nodetypes-and-namespaces.xml 2010-08-04 12:33:17 UTC (rev 2873)
+++ jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/concepts/nodetypes-and-namespaces.xml 2010-08-04 13:51:23 UTC (rev 2874)
@@ -3,6 +3,7 @@
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
<chapter id="JCR.NodeTypesandNamespaces">
<?dbhtml filename="ch-nodetypes-and-namespaces.html"?>
+
<title>Node Types and Namespaces</title>
<section>
@@ -11,10 +12,10 @@
<para>Support of node types and namespaces is required by the JSR-170
specification. Beyond the methods required by the specification, eXo JCR
has its own API extension for the <ulink
- url="JCR.NodeTypeRegistration">Node type
- registration>NodeType+registration</ulink> as well as the ability to
- declaratively define node types in the Repository at the start-up
- time.</para>
+ url="JCR.NodeTypeRegistration"><link
+ linkend="JCR.NodeTypeRegistration">Node type registration</link></ulink>
+ as well as the ability to declaratively define node types in the
+ Repository at the start-up time.</para>
</section>
<section>
Modified: jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/configuration/exo-jcr-configuration.xml
===================================================================
--- jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/configuration/exo-jcr-configuration.xml 2010-08-04 12:33:17 UTC (rev 2873)
+++ jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/configuration/exo-jcr-configuration.xml 2010-08-04 13:51:23 UTC (rev 2874)
@@ -1,415 +1,413 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
-"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
-<chapter id="JCR.eXoJCRconfiguration">
- <?dbhtml filename="ch-exo-jcr-configuration.html"?>
- <title>eXo JCR configuration</title>
- <section>
- <title>Related documents</title>
-
- <itemizedlist>
- <listitem>
- <para><link linkend="JCR.ConfigurationPersister">Configuration
- persister</link></para>
- </listitem>
-
- <listitem>
- <para><link linkend="JCR.SearchConfiguration">Search
- Configuration</link></para>
- </listitem>
-
- <listitem>
- <para><link linkend="ch_jdbc_data_container">JDBC Data Container
- config</link></para>
- </listitem>
-
- <listitem>
- <para><link linkend="ch_external_value_storages">External Value
- Storages</link></para>
- </listitem>
-
- <listitem>
- <para><link linkend="none">Workspace SimpleDB storage</link></para>
- </listitem>
-
- <listitem>
- <para><link linkend="none">Workspace Persistence Storage</link></para>
- </listitem>
- </itemizedlist>
- </section>
-
- <section id ="JCR.eXoJCRconfiguration.PortalAndStandaloneConfiguration">
- <title>Portal and Standalone configuration</title>
-
- <para>Like other eXo services eXo JCR can be configured and used in portal
- or embedded mode (as a service embedded in eXo Portal) and in standalone
- mode.</para>
-
- <para>In Embedded mode, JCR services are registered in the Portal
- container and the second option is to use a Standalone container. The main
- difference between these container types is that the first one is intended
- to be used in a Portal (Web) environment, while the second one can be used
- standalone (see the comprehensive page <link
- linkend="Kernel.ServiceConfigurationForBeginners">Service Configuration
- for Beginners</link> for more details).</para>
-
- <para>The following setup procedure is used to obtain a Standalone
- configuration (find more in <link
- linkend="Kernel.ContainerConfiguration">Container
- configuration</link>):</para>
-
- <para>* Configuration that is set explicitly using
- StandaloneContainer.addConfigurationURL(String url) or
- StandaloneContainer.addConfigurationPath(String path) before
- getInstance()</para>
-
- <para>* Configuration from $base:directory/exo-configuration.xml or
- $base:directory/conf/exo-configuration.xml file. Where $base:directory is
- either AS's home directory in case of J2EE AS environment or just the
- current directory in case of a standalone application.</para>
-
- <para>* /conf/exo-configuration.xml in the current classloader (e.g. war,
- ear archive)</para>
-
- <para>* Configuration from
- $service_jar_file/conf/portal/configuration.xml. WARNING: do not rely on
- some concrete jar's configuration if you have more than one jar containing
- conf/portal/configuration.xml file. In this case choosing a configuration
- is unpredictable.</para>
-
- <para>JCR service configuration looks like:</para>
-
- <programlisting><component>
- <key>org.exoplatform.services.jcr.RepositoryService</key>
- <type>org.exoplatform.services.jcr.impl.RepositoryServiceImpl</type>
-</component>
-<component>
- <key>org.exoplatform.services.jcr.config.RepositoryServiceConfiguration</key>
- <type>org.exoplatform.services.jcr.impl.config.RepositoryServiceConfigurationImpl</type>
- <init-params>
- <value-param>
- <name>conf-path</name>
- <description>JCR repositories configuration file</description>
- <value>jar:/conf/standalone/exo-jcr-config.xml</value>
- </value-param>
- <properties-param>
- <name>working-conf</name>
- <description>working-conf</description>
- <property name="source-name" value="jdbcjcr" />
- <property name="dialect" value="hsqldb" />
- <property name="persister-class-name" value="org.exoplatform.services.jcr.impl.config.JDBCConfigurationPersister" />
- </properties-param>
- </init-params>
-</component></programlisting>
-
- <para><emphasis role="bold">conf-path</emphasis> : a path to a
- RepositoryService JCR Configuration.</para>
-
- <para><emphasis role="bold">working-conf</emphasis> : optional; <link
- linkend="JCR.ConfigurationPersister">JCR configuration persister</link>
- configuration. If there isn't a working-conf the persister will be
- disabled.</para>
- </section>
-
- <section>
- <title>JCR Configuration</title>
-
- <para>The Configuration is defined in an XML file (see DTD below)</para>
-
- <para>JCR Service can use multiple <emphasis
- role="bold">Repositories</emphasis> and each repository can have multiple
- <emphasis role="bold">Workspaces</emphasis>.</para>
-
- <para>From v.1.9 JCR repositories configuration parameters support
- human-readable formats of values. They are all case-insensitive:</para>
-
- <itemizedlist>
- <listitem>
- <para>Numbers formats: K,KB - kilobytes, M,MB - megabytes, G,GB -
- gigabytes, T,TB - terabytes. Examples: 100.5 - digit 100.5, 200k - 200
- Kbytes, 4m - 4 Mbytes, 1.4G - 1.4 Gbytes, 10T - 10 Tbytes</para>
- </listitem>
- </itemizedlist>
-
- <itemizedlist>
- <listitem>
- <para>Time format endings: ms - milliseconds, m - minutes, h - hours,
- d - days, w - weeks, if no ending - seconds. Examples: 500ms - 500
- milliseconds, 20 - 20 seconds, 30m - 30 minutes, 12h - 12 hours, 5d -
- 5 days, 4w - 4 weeks.</para>
- </listitem>
- </itemizedlist>
- </section>
-
- <section>
- <title>Repository service configuration (JCR repositories
- configuration)</title>
-
- <para>Service configuration may be placed in
- jar:/conf/standalone/exo-jcr-config.xml for standalone mode. For portal
- mode it is located in the portal web application
- portal/WEB-INF/conf/jcr/repository-configuration.xml</para>
-
- <para><emphasis role="bold">default-repository</emphasis> - the name of a
- default repository (one returned by
- RepositoryService.getRepository())</para>
-
- <para><emphasis role="bold">repositories</emphasis> - the list of
- repositories</para>
- </section>
-
- <section>
- <title>Repository configuration:</title>
-
- <para><emphasis role="bold">name</emphasis> - the name of a
- repository</para>
-
- <para><emphasis role="bold">default-workspace</emphasis> - the name of a
- workspace obtained using Session's login() or login(Credentials) methods
- (ones without an explicit workspace name)</para>
-
- <para><emphasis role="bold">system-workspace</emphasis> - name of
- workspace where <emphasis role="bold">/jcr:system</emphasis> node is
- placed</para>
-
- <para><emphasis role="bold">security-domain</emphasis> - the name of a
- security domain for JAAS authentication</para>
-
- <para><emphasis role="bold">access-control</emphasis> - the name of an
- access control policy. There can be 3 types: optional - ACL is created
- on-demand(default), disable - no access control, mandatory - an ACL is
- created for each added node(not supported yet)</para>
-
- <para><emphasis role="bold">authentication-policy</emphasis> - the name of
- an authentication policy class</para>
-
- <para><emphasis role="bold">workspaces</emphasis> - the list of
- workspaces</para>
-
- <para><emphasis role="bold">session-max-age</emphasis> - the time after
- which an idle session will be removed (called logout). If not set, the
- idle session will never be removed.</para>
- </section>
-
- <section>
- <title>Workspace configuration:</title>
-
- <para><emphasis role="bold">name</emphasis> - the name of a
- workspace</para>
-
- <para><emphasis role="bold">auto-init-root-nodetype</emphasis> -
- DEPRECATED in JCR 1.9 (use initializer). The node type for root node
- initialization</para>
-
- <para><emphasis role="bold">container</emphasis> - workspace data
- container (physical storage) configuration</para>
-
- <para><emphasis role="bold">initializer</emphasis> - workspace initializer
- configuration</para>
-
- <para><emphasis role="bold">cache</emphasis> - workspace storage cache
- configuration</para>
-
- <para><emphasis role="bold">query-handler</emphasis> - query handler
- configuration</para>
-
- <para><emphasis role="bold">auto-init-permissions</emphasis> - DEPRECATED
- in JCR 1.9 (use initializer). Default permissions of the root node. It is
- defined as a set of semicolon-delimited permissions containing a group of
- space-delimited identities (user, group etc, see Organization service
- documentation for details) and the type of permission. For example any
- read; <emphasis role="bold">:/admin read;</emphasis>:/admin add_node;
- <emphasis role="bold">:/admin set_property;</emphasis>:/admin remove means
- that users from group <emphasis role="bold">admin</emphasis> have all
- permissions and other users have only a 'read' permission.</para>
- </section>
-
- <section>
- <title>Workspace data container configuration:</title>
-
- <para><emphasis role="bold">class</emphasis> - A workspace data container
- class name</para>
-
- <para><emphasis role="bold">properties</emphasis> - the list of properties
- (name-value pairs) for the concrete Workspace data container</para>
-
- <para><emphasis role="bold">value-storages</emphasis> - the list of value
- storage plugins</para>
- </section>
-
- <section id="JCR.ConfigurationPersister.ValueStoragePlugin">
- <title>Value Storage plugin configuration (for data container):</title>
-
- <note>
- <para>The value-storage element is optional. If you don't include it,
- the values will be stored as BLOBs inside the database.</para>
- </note>
-
- <para><emphasis role="bold">value-storage</emphasis> - Optional value
- Storage plugin definition:</para>
-
- <para><emphasis role="bold">class</emphasis>- a value storage plugin class
- name (attribute)</para>
-
- <para><emphasis role="bold">properties</emphasis> - the list of properties
- (name-value pairs) for a concrete Value Storage plugin</para>
-
- <para><emphasis role="bold">filters</emphasis> - the list of filters
- defining conditions when this plugin is applicable</para>
- </section>
-
- <section>
- <title>Initializer configuration (optional):</title>
-
- <para><emphasis role="bold">class</emphasis> - initializer implementation
- class.</para>
-
- <para><emphasis role="bold">properties</emphasis> - the list of properties
- (name-value pairs). Properties are supported:</para>
-
- <para><emphasis role="bold">root-nodetype</emphasis> - The node type for
- root node initialization</para>
-
- <para><emphasis role="bold">root-permissions</emphasis> - Default
- permissions of the root node. It is defined as a set of
- semicolon-delimited permissions containing a group of space-delimited
- identities (user, group etc, see Organization service documentation for
- details) and the type of permission. For example any read; <emphasis
- role="bold">:/admin read;</emphasis>:/admin add_node; <emphasis
- role="bold">:/admin set_property;</emphasis>:/admin remove means that
- users from group <emphasis role="bold">admin</emphasis> have all
- permissions and other users have only a 'read' permission.</para>
-
- <para>Configurable initializer adds a capability to override workspace
- initial startup procedure (used for Clustering). Also it replaces
- workspace element parameters auto-init-root-nodetype and
- auto-init-permissions with root-nodetype and root-permissions.</para>
- </section>
-
- <section>
- <title>Cache configuration:</title>
-
- <para><emphasis role="bold">enabled</emphasis> - if workspace cache is
- enabled</para>
-
- <para><emphasis role="bold">class</emphasis> - cache implementation class,
- optional from 1.9. Default value is
- org.exoplatform.services.jcr.impl.dataflow.persistent.LinkedWorkspaceStorageCacheImpl.</para>
-
- <para>Cache can be configured to use concrete implementation of
- WorkspaceStorageCache interface. JCR core has two implementation to
- use:</para>
-
- <itemizedlist>
- <listitem>
- <para>LinkedWorkspaceStorageCacheImpl - default, with configurable
- read behavior and statistic.</para>
- </listitem>
-
- <listitem>
- <para>WorkspaceStorageCacheImpl - pre 1.9, still can be used.</para>
- </listitem>
- </itemizedlist>
-
- <para><emphasis role="bold">properties</emphasis> - the list of properties
- (name-value pairs) for Workspace cache:</para>
-
- <para><emphasis role="bold">max-size</emphasis> - cache maximum size
- (maxSize prior to v.1.9).</para>
-
- <para><emphasis role="bold">live-time</emphasis> - cached item live time
- (liveTime prior to v.1.9).</para>
-
- <para>From 1.9 LinkedWorkspaceStorageCacheImpl supports additional
- optional parameters.</para>
-
- <para><emphasis role="bold">statistic-period</emphasis> - period (time
- format) of cache statistic thread execution, 5 minutes by default.</para>
-
- <para><emphasis role="bold">statistic-log</emphasis> - if true cache
- statistic will be printed to default logger (log.info), false by
- default.</para>
-
- <para><emphasis role="bold">statistic-clean</emphasis> - if true cache
- statistic will be cleaned after was gathered, false by default.</para>
-
- <para><emphasis role="bold">cleaner-period</emphasis> - period of eldest
- items remover execution, 20 minutes by default.</para>
-
- <para><emphasis role="bold">blocking-users-count</emphasis> - number of
- concurrent users allowed to read cache storage, 0 - unlimited by
- default.</para>
- </section>
-
- <section>
- <title>Query Handler configuration:</title>
-
- <para><emphasis role="bold">class</emphasis> - A Query Handler class
- name</para>
-
- <para><emphasis role="bold">properties</emphasis> - the list of properties
- (name-value pairs) for a Query Handler (indexDir)</para>
-
- <para>Properties and advanced features described in <link
- linkend="JCR.SearchConfiguration">Search Configuration</link>.</para>
- </section>
-
- <section>
- <title>Lock Manager configuration:</title>
-
- <para><emphasis role="bold">time-out</emphasis> - time after which the
- unused global lock will be removed.</para>
-
- <para><emphasis role="bold">persister</emphasis> - a class for storing
- lock information for future use. For example, remove lock after jcr
- restart.</para>
-
- <para><emphasis role="bold">path</emphasis> - a lock folder, each
- workspace has its own.</para>
-
- <para>Additional information about the configuration of the lock you can
- see in <link linkend="TODO.JCR.Locking">JCR Locks Implementation
- Specification</link>.</para>
-
- <programlisting><!ELEMENT repository-service (repositories)>
-<!ATTLIST repository-service default-repository NMTOKEN #REQUIRED>
-<!ELEMENT repositories (repository)>
-<!ELEMENT repository (security-domain,access-control,session-max-age,authentication-policy,workspaces)>
-<!ATTLIST repository
- default-workspace NMTOKEN #REQUIRED
- name NMTOKEN #REQUIRED
- system-workspace NMTOKEN #REQUIRED
->
-<!ELEMENT security-domain (#PCDATA)>
-<!ELEMENT access-control (#PCDATA)>
-<!ELEMENT session-max-age (#PCDATA)>
-<!ELEMENT authentication-policy (#PCDATA)>
-<!ELEMENT workspaces (workspace+)>
-<!ELEMENT workspace (container,initializer,cache,query-handler)>
-<!ATTLIST workspace name NMTOKEN #REQUIRED>
-<!ELEMENT container (properties,value-storages)>
-<!ATTLIST container class NMTOKEN #REQUIRED>
-<!ELEMENT value-storages (value-storage+)>
-<!ELEMENT value-storage (properties,filters)>
-<!ATTLIST value-storage class NMTOKEN #REQUIRED>
-<!ELEMENT filters (filter+)>
-<!ELEMENT filter EMPTY>
-<!ATTLIST filter property-type NMTOKEN #REQUIRED>
-<!ELEMENT initializer (properties)>
-<!ATTLIST initializer class NMTOKEN #REQUIRED>
-<!ELEMENT cache (properties)>
-<!ATTLIST cache
- enabled NMTOKEN #REQUIRED
- class NMTOKEN #REQUIRED
->
-<!ELEMENT query-handler (properties)>
-<!ATTLIST query-handler class NMTOKEN #REQUIRED>
-<!ELEMENT access-manager (properties)>
-<!ATTLIST access-manager class NMTOKEN #REQUIRED>
-<!ELEMENT lock-manager (time-out,persister)>
-<!ELEMENT time-out (#PCDATA)>
-<!ELEMENT persister (properties)>
-<!ELEMENT properties (property+)>
-<!ELEMENT property EMPTY></programlisting>
- </section>
-</chapter>
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
+"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
+<chapter id="JCR.eXoJCRconfiguration">
+ <?dbhtml filename="ch-exo-jcr-configuration.html"?>
+
+ <title>eXo JCR configuration</title>
+
+ <section>
+ <title>Related documents</title>
+
+ <itemizedlist>
+ <listitem>
+ <para><link linkend="JCR.ConfigurationPersister">Configuration
+ persister</link></para>
+ </listitem>
+
+ <listitem>
+ <para><link linkend="JCR.SearchConfiguration">Search
+ Configuration</link></para>
+ </listitem>
+
+ <listitem>
+ <para><link linkend="ch_jdbc_data_container">JDBC Data Container
+ config</link></para>
+ </listitem>
+
+ <listitem>
+ <para><link linkend="ch_external_value_storages">External Value
+ Storages</link></para>
+ </listitem>
+
+ <listitem>
+ <para><link linkend="none">Workspace SimpleDB storage</link></para>
+ </listitem>
+
+ <listitem>
+ <para><link linkend="none">Workspace Persistence Storage</link></para>
+ </listitem>
+ </itemizedlist>
+ </section>
+
+ <section id="JCR.eXoJCRconfiguration.PortalAndStandaloneConfiguration">
+ <title>Portal and Standalone configuration</title>
+
+ <para>Like other eXo services eXo JCR can be configured and used in portal
+ or embedded mode (as a service embedded in eXo Portal) and in standalone
+ mode.</para>
+
+ <para>In Embedded mode, JCR services are registered in the Portal
+ container and the second option is to use a Standalone container. The main
+ difference between these container types is that the first one is intended
+ to be used in a Portal (Web) environment, while the second one can be used
+ standalone (see the comprehensive page <link
+ linkend="Kernel.ServiceConfigurationforBeginners">Service Configuration
+ for Beginners</link> for more details).</para>
+
+ <para>The following setup procedure is used to obtain a Standalone
+ configuration (find more in <link
+ linkend="Kernel.ContainerConfiguration">Container
+ configuration</link>):</para>
+
+ <para>* Configuration that is set explicitly using
+ StandaloneContainer.addConfigurationURL(String url) or
+ StandaloneContainer.addConfigurationPath(String path) before
+ getInstance()</para>
+
+ <para>* Configuration from $base:directory/exo-configuration.xml or
+ $base:directory/conf/exo-configuration.xml file. Where $base:directory is
+ either AS's home directory in case of J2EE AS environment or just the
+ current directory in case of a standalone application.</para>
+
+ <para>* /conf/exo-configuration.xml in the current classloader (e.g. war,
+ ear archive)</para>
+
+ <para>* Configuration from
+ $service_jar_file/conf/portal/configuration.xml. WARNING: do not rely on
+ some concrete jar's configuration if you have more than one jar containing
+ conf/portal/configuration.xml file. In this case choosing a configuration
+ is unpredictable.</para>
+
+ <para>JCR service configuration looks like:</para>
+
+ <programlisting><component>
+ <key>org.exoplatform.services.jcr.RepositoryService</key>
+ <type>org.exoplatform.services.jcr.impl.RepositoryServiceImpl</type>
+</component>
+<component>
+ <key>org.exoplatform.services.jcr.config.RepositoryServiceConfiguration</key>
+ <type>org.exoplatform.services.jcr.impl.config.RepositoryServiceConfigurationImpl</type>
+ <init-params>
+ <value-param>
+ <name>conf-path</name>
+ <description>JCR repositories configuration file</description>
+ <value>jar:/conf/standalone/exo-jcr-config.xml</value>
+ </value-param>
+ <properties-param>
+ <name>working-conf</name>
+ <description>working-conf</description>
+ <property name="source-name" value="jdbcjcr" />
+ <property name="dialect" value="hsqldb" />
+ <property name="persister-class-name" value="org.exoplatform.services.jcr.impl.config.JDBCConfigurationPersister" />
+ </properties-param>
+ </init-params>
+</component></programlisting>
+
+ <para><emphasis role="bold">conf-path</emphasis> : a path to a
+ RepositoryService JCR Configuration.</para>
+
+ <para><emphasis role="bold">working-conf</emphasis> : optional; <link
+ linkend="JCR.ConfigurationPersister">JCR configuration persister</link>
+ configuration. If there isn't a working-conf the persister will be
+ disabled.</para>
+ </section>
+
+ <section>
+ <title>JCR Configuration</title>
+
+ <para>The Configuration is defined in an XML file (see DTD below)</para>
+
+ <para>JCR Service can use multiple <emphasis
+ role="bold">Repositories</emphasis> and each repository can have multiple
+ <emphasis role="bold">Workspaces</emphasis>.</para>
+
+ <para>From v.1.9 JCR repositories configuration parameters support
+ human-readable formats of values. They are all case-insensitive:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para>Numbers formats: K,KB - kilobytes, M,MB - megabytes, G,GB -
+ gigabytes, T,TB - terabytes. Examples: 100.5 - digit 100.5, 200k - 200
+ Kbytes, 4m - 4 Mbytes, 1.4G - 1.4 Gbytes, 10T - 10 Tbytes</para>
+ </listitem>
+ </itemizedlist>
+
+ <itemizedlist>
+ <listitem>
+ <para>Time format endings: ms - milliseconds, m - minutes, h - hours,
+ d - days, w - weeks, if no ending - seconds. Examples: 500ms - 500
+ milliseconds, 20 - 20 seconds, 30m - 30 minutes, 12h - 12 hours, 5d -
+ 5 days, 4w - 4 weeks.</para>
+ </listitem>
+ </itemizedlist>
+ </section>
+
+ <section>
+ <title>Repository service configuration (JCR repositories
+ configuration)</title>
+
+ <para>Service configuration may be placed in
+ jar:/conf/standalone/exo-jcr-config.xml for standalone mode. For portal
+ mode it is located in the portal web application
+ portal/WEB-INF/conf/jcr/repository-configuration.xml</para>
+
+ <para><emphasis role="bold">default-repository</emphasis> - the name of a
+ default repository (one returned by
+ RepositoryService.getRepository())</para>
+
+ <para><emphasis role="bold">repositories</emphasis> - the list of
+ repositories</para>
+ </section>
+
+ <section>
+ <title>Repository configuration:</title>
+
+ <para><emphasis role="bold">name</emphasis> - the name of a
+ repository</para>
+
+ <para><emphasis role="bold">default-workspace</emphasis> - the name of a
+ workspace obtained using Session's login() or login(Credentials) methods
+ (ones without an explicit workspace name)</para>
+
+ <para><emphasis role="bold">system-workspace</emphasis> - name of
+ workspace where <emphasis role="bold">/jcr:system</emphasis> node is
+ placed</para>
+
+ <para><emphasis role="bold">security-domain</emphasis> - the name of a
+ security domain for JAAS authentication</para>
+
+ <para><emphasis role="bold">access-control</emphasis> - the name of an
+ access control policy. There can be 3 types: optional - ACL is created
+ on-demand(default), disable - no access control, mandatory - an ACL is
+ created for each added node(not supported yet)</para>
+
+ <para><emphasis role="bold">authentication-policy</emphasis> - the name of
+ an authentication policy class</para>
+
+ <para><emphasis role="bold">workspaces</emphasis> - the list of
+ workspaces</para>
+
+ <para><emphasis role="bold">session-max-age</emphasis> - the time after
+ which an idle session will be removed (called logout). If not set, the
+ idle session will never be removed.</para>
+ </section>
+
+ <section>
+ <title>Workspace configuration:</title>
+
+ <para><emphasis role="bold">name</emphasis> - the name of a
+ workspace</para>
+
+ <para><emphasis role="bold">auto-init-root-nodetype</emphasis> -
+ DEPRECATED in JCR 1.9 (use initializer). The node type for root node
+ initialization</para>
+
+ <para><emphasis role="bold">container</emphasis> - workspace data
+ container (physical storage) configuration</para>
+
+ <para><emphasis role="bold">initializer</emphasis> - workspace initializer
+ configuration</para>
+
+ <para><emphasis role="bold">cache</emphasis> - workspace storage cache
+ configuration</para>
+
+ <para><emphasis role="bold">query-handler</emphasis> - query handler
+ configuration</para>
+
+ <para><emphasis role="bold">auto-init-permissions</emphasis> - DEPRECATED
+ in JCR 1.9 (use initializer). Default permissions of the root node. It is
+ defined as a set of semicolon-delimited permissions containing a group of
+ space-delimited identities (user, group etc, see Organization service
+ documentation for details) and the type of permission. For example any
+ read; <emphasis role="bold">:/admin read;</emphasis>:/admin add_node;
+ <emphasis role="bold">:/admin set_property;</emphasis>:/admin remove means
+ that users from group <emphasis role="bold">admin</emphasis> have all
+ permissions and other users have only a 'read' permission.</para>
+ </section>
+
+ <section>
+ <title>Workspace data container configuration:</title>
+
+ <para><emphasis role="bold">class</emphasis> - A workspace data container
+ class name</para>
+
+ <para><emphasis role="bold">properties</emphasis> - the list of properties
+ (name-value pairs) for the concrete Workspace data container</para>
+
+ <para><emphasis role="bold">value-storages</emphasis> - the list of value
+ storage plugins</para>
+ </section>
+
+ <section id="JCR.ConfigurationPersister.ValueStoragePlugin">
+ <title>Value Storage plugin configuration (for data container):</title>
+
+ <note>
+ <para>The value-storage element is optional. If you don't include it,
+ the values will be stored as BLOBs inside the database.</para>
+ </note>
+
+ <para><emphasis role="bold">value-storage</emphasis> - Optional value
+ Storage plugin definition:</para>
+
+ <para><emphasis role="bold">class</emphasis>- a value storage plugin class
+ name (attribute)</para>
+
+ <para><emphasis role="bold">properties</emphasis> - the list of properties
+ (name-value pairs) for a concrete Value Storage plugin</para>
+
+ <para><emphasis role="bold">filters</emphasis> - the list of filters
+ defining conditions when this plugin is applicable</para>
+ </section>
+
+ <section>
+ <title>Initializer configuration (optional):</title>
+
+ <para><emphasis role="bold">class</emphasis> - initializer implementation
+ class.</para>
+
+ <para><emphasis role="bold">properties</emphasis> - the list of properties
+ (name-value pairs). The following properties are supported:</para>
+
+ <para><emphasis role="bold">root-nodetype</emphasis> - The node type for
+ root node initialization</para>
+
+ <para><emphasis role="bold">root-permissions</emphasis> - Default
+ permissions of the root node. It is defined as a set of
+ semicolon-delimited permissions, each consisting of an identity (user,
+ group, etc.; see the Organization service documentation for details) and
+ a permission type separated by a space. For example, <emphasis
+ role="bold">any read;*:/admin read;*:/admin add_node;*:/admin
+ set_property;*:/admin remove</emphasis> means that users from the group
+ <emphasis role="bold">admin</emphasis> have all permissions and other
+ users have only the 'read' permission.</para>
+
+ <para>The configurable initializer makes it possible to override the
+ initial workspace startup procedure (used for clustering). It also
+ replaces the workspace element parameters auto-init-root-nodetype and
+ auto-init-permissions with root-nodetype and root-permissions.</para>
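+
+ <para>A minimal sketch of an initializer entry using these two
+ properties is shown below. The ScratchWorkspaceInitializer class name,
+ the nt:unstructured node type and the *:/admin identities are assumptions
+ chosen for illustration.</para>
+
+ <programlisting><!-- assumed initializer implementation class -->
+ <initializer class="org.exoplatform.services.jcr.impl.core.ScratchWorkspaceInitializer">
+   <properties>
+     <!-- assumed root node type -->
+     <property name="root-nodetype" value="nt:unstructured" />
+     <!-- everyone may read; members of the admin group get all permissions -->
+     <property name="root-permissions" value="any read;*:/admin read;*:/admin add_node;*:/admin set_property;*:/admin remove" />
+   </properties>
+ </initializer></programlisting>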
+ </section>
+
+ <section>
+ <title>Cache configuration:</title>
+
+ <para><emphasis role="bold">enabled</emphasis> - whether the workspace
+ cache is enabled</para>
+
+ <para><emphasis role="bold">class</emphasis> - cache implementation class,
+ optional from 1.9. Default value is
+ org.exoplatform.services.jcr.impl.dataflow.persistent.LinkedWorkspaceStorageCacheImpl.</para>
+
+ <para>The cache can be configured to use a concrete implementation of the
+ WorkspaceStorageCache interface. The JCR core provides two
+ implementations:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para>LinkedWorkspaceStorageCacheImpl - the default, with configurable
+ read behavior and statistics.</para>
+ </listitem>
+
+ <listitem>
+ <para>WorkspaceStorageCacheImpl - the pre-1.9 implementation; it can still be used.</para>
+ </listitem>
+ </itemizedlist>
+
+ <para><emphasis role="bold">properties</emphasis> - the list of properties
+ (name-value pairs) for Workspace cache:</para>
+
+ <para><emphasis role="bold">max-size</emphasis> - cache maximum size
+ (maxSize prior to v.1.9).</para>
+
+ <para><emphasis role="bold">live-time</emphasis> - cached item live time
+ (liveTime prior to v.1.9).</para>
+
+ <para>Since 1.9, LinkedWorkspaceStorageCacheImpl supports the following
+ additional optional parameters:</para>
+
+ <para><emphasis role="bold">statistic-period</emphasis> - period (time
+ format) of the cache statistics thread execution, 5 minutes by default.</para>
+
+ <para><emphasis role="bold">statistic-log</emphasis> - if true, cache
+ statistics are printed to the default logger (log.info); false by
+ default.</para>
+
+ <para><emphasis role="bold">statistic-clean</emphasis> - if true, cache
+ statistics are cleared after being gathered; false by default.</para>
+
+ <para><emphasis role="bold">cleaner-period</emphasis> - period at which
+ the eldest items are removed, 20 minutes by default.</para>
+
+ <para><emphasis role="bold">blocking-users-count</emphasis> - number of
+ concurrent users allowed to read the cache storage; 0 (unlimited) by
+ default.</para>
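+
+ <para>For example, a cache entry combining the parameters above could be
+ sketched as follows. The values and their formats (10k, 1h, 5m) are
+ assumptions for illustration, not recommended settings.</para>
+
+ <programlisting><cache enabled="true" class="org.exoplatform.services.jcr.impl.dataflow.persistent.LinkedWorkspaceStorageCacheImpl">
+   <properties>
+     <!-- illustrative size and live time; the value formats are assumed -->
+     <property name="max-size" value="10k" />
+     <property name="live-time" value="1h" />
+     <!-- optional parameters, supported since 1.9 -->
+     <property name="statistic-period" value="5m" />
+     <property name="statistic-log" value="true" />
+   </properties>
+ </cache></programlisting>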
+ </section>
+
+ <section>
+ <title>Query Handler configuration:</title>
+
+ <para><emphasis role="bold">class</emphasis> - A Query Handler class
+ name</para>
+
+ <para><emphasis role="bold">properties</emphasis> - the list of properties
+ (name-value pairs) for a Query Handler (indexDir)</para>
+
+ <para>Properties and advanced features are described in <link
+ linkend="JCR.SearchConfiguration">Search Configuration</link>.</para>
+ </section>
+
+ <section>
+ <title>Lock Manager configuration:</title>
+
+ <para><emphasis role="bold">time-out</emphasis> - time after which an
+ unused global lock is removed.</para>
+
+ <para><emphasis role="bold">persister</emphasis> - a class for storing
+ lock information for future use, for example to remove locks after a JCR
+ restart.</para>
+
+ <para><emphasis role="bold">path</emphasis> - the lock folder; each
+ workspace has its own.</para>
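+
+ <para>A lock manager entry built from these elements might look like the
+ following sketch. The FileSystemLockPersister class name, the class
+ attribute on persister and all values are assumptions for
+ illustration.</para>
+
+ <programlisting><lock-manager>
+   <!-- illustrative time-out; the value format is assumed -->
+   <time-out>15m</time-out>
+   <!-- assumed persister implementation class -->
+   <persister class="org.exoplatform.services.jcr.impl.core.lock.FileSystemLockPersister">
+     <properties>
+       <!-- hypothetical lock folder for the workspace "ws" -->
+       <property name="path" value="target/temp/lock/ws" />
+     </properties>
+   </persister>
+ </lock-manager></programlisting>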
+
+ <programlisting><!ELEMENT repository-service (repositories)>
+<!ATTLIST repository-service default-repository NMTOKEN #REQUIRED>
+<!ELEMENT repositories (repository)>
+<!ELEMENT repository (security-domain,access-control,session-max-age,authentication-policy,workspaces)>
+<!ATTLIST repository
+ default-workspace NMTOKEN #REQUIRED
+ name NMTOKEN #REQUIRED
+ system-workspace NMTOKEN #REQUIRED
+>
+<!ELEMENT security-domain (#PCDATA)>
+<!ELEMENT access-control (#PCDATA)>
+<!ELEMENT session-max-age (#PCDATA)>
+<!ELEMENT authentication-policy (#PCDATA)>
+<!ELEMENT workspaces (workspace+)>
+<!ELEMENT workspace (container,initializer,cache,query-handler)>
+<!ATTLIST workspace name NMTOKEN #REQUIRED>
+<!ELEMENT container (properties,value-storages)>
+<!ATTLIST container class NMTOKEN #REQUIRED>
+<!ELEMENT value-storages (value-storage+)>
+<!ELEMENT value-storage (properties,filters)>
+<!ATTLIST value-storage class NMTOKEN #REQUIRED>
+<!ELEMENT filters (filter+)>
+<!ELEMENT filter EMPTY>
+<!ATTLIST filter property-type NMTOKEN #REQUIRED>
+<!ELEMENT initializer (properties)>
+<!ATTLIST initializer class NMTOKEN #REQUIRED>
+<!ELEMENT cache (properties)>
+<!ATTLIST cache
+ enabled NMTOKEN #REQUIRED
+ class NMTOKEN #REQUIRED
+>
+<!ELEMENT query-handler (properties)>
+<!ATTLIST query-handler class NMTOKEN #REQUIRED>
+<!ELEMENT access-manager (properties)>
+<!ATTLIST access-manager class NMTOKEN #REQUIRED>
+<!ELEMENT lock-manager (time-out,persister)>
+<!ELEMENT time-out (#PCDATA)>
+<!ELEMENT persister (properties)>
+<!ELEMENT properties (property+)>
+<!ELEMENT property EMPTY></programlisting>
+ </section>
+</chapter>
Modified: jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/configuration/multilanguage-support.xml
===================================================================
--- jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/configuration/multilanguage-support.xml 2010-08-04 12:33:17 UTC (rev 2873)
+++ jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/configuration/multilanguage-support.xml 2010-08-04 13:51:23 UTC (rev 2874)
@@ -1,170 +1,171 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
-"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
-<chapter id="JCRMultilanguageSupport">
- <?dbhtml filename="ch-multilanguage-support.html"?>
-
- <title>Multilanguage support in eXo JCR RDB backend</title>
-
- <section>
- <title>Intro</title>
-
- <para>Whenever relational database is used to store multilingual text data
- of eXo Java Content Repository we need to adapt configuration in order to
- support UTF-8 encoding. Here is a short HOWTO instruction for several
- supported RDBMS with examples.</para>
-
- <para>The configuration file you have to modify:
- .../webapps/portal/WEB-INF/conf/jcr/repository-configuration.xml</para>
-
- <note>
- <para>Datasource <parameter>jdbcjcr</parameter> used in examples can be
- configured via <classname>InitialContextInitializer</classname>
- component.</para>
- </note>
- </section>
-
- <section>
- <title>Oracle</title>
-
- <para>In order to run multilanguage JCR on an Oracle backend Unicode
- encoding for characters set should be applied to the database. Other
- Oracle globalization parameters don't make any impact. The only property
- to modify is <constant>NLS_CHARACTERSET</constant>.</para>
-
- <para>We have tested <constant>NLS_CHARACTERSET</constant> =
- <constant>AL32UTF8</constant> and it's works well for many European and
- Asian languages.</para>
-
- <para>Example of database configuration (used for JCR
- testing):<programlisting>NLS_LANGUAGE AMERICAN
-NLS_TERRITORY AMERICA
-NLS_CURRENCY $
-NLS_ISO_CURRENCY AMERICA
-NLS_NUMERIC_CHARACTERS .,
-NLS_CHARACTERSET AL32UTF8
-NLS_CALENDAR GREGORIAN
-NLS_DATE_FORMAT DD-MON-RR
-NLS_DATE_LANGUAGE AMERICAN
-NLS_SORT BINARY
-NLS_TIME_FORMAT HH.MI.SSXFF AM
-NLS_TIMESTAMP_FORMAT DD-MON-RR HH.MI.SSXFF AM
-NLS_TIME_TZ_FORMAT HH.MI.SSXFF AM TZR
-NLS_TIMESTAMP_TZ_FORMAT DD-MON-RR HH.MI.SSXFF AM TZR
-NLS_DUAL_CURRENCY $
-NLS_COMP BINARY
-NLS_LENGTH_SEMANTICS BYTE
-NLS_NCHAR_CONV_EXCP FALSE
-NLS_NCHAR_CHARACTERSET AL16UTF16</programlisting></para>
-
- <warning>
- <para>JCR 1.12.x doesn't use NVARCHAR columns, so that the value of the
- parameter NLS_NCHAR_CHARACTERSET does not matter for JCR.</para>
- </warning>
-
- <para>Create database with Unicode encoding and use Oracle dialect for the
- Workspace Container:</para>
-
- <programlisting><workspace name="collaboration">
- <container class="org.exoplatform.services.jcr.impl.storage.jdbc.JDBCWorkspaceDataContainer">
- <properties>
- <property name="source-name" value="jdbcjcr" />
- <property name="dialect" value="oracle" />
- <property name="multi-db" value="false" />
- <property name="max-buffer-size" value="200k" />
- <property name="swap-directory" value="target/temp/swap/ws" />
- </properties>
- .....</programlisting>
- </section>
-
- <section>
- <title>DB2</title>
-
- <para>DB2 Universal Database (DB2 UDB) supports <ulink
- url="http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp?topic=/com.ibm.db2.udb.admin.doc/doc/c0004821.htm">UTF-8
- and UTF-16/UCS-2</ulink>. When a Unicode database is created, CHAR,
- VARCHAR, LONG VARCHAR data are stored in UTF-8 form. It's enough for JCR
- multi-lingual support.</para>
-
- <para>Example of UTF-8 database creation:<programlisting>DB2 CREATE DATABASE dbname USING CODESET UTF-8 TERRITORY US</programlisting></para>
-
- <para>Create database with UTF-8 encoding and use db2 dialect for
- Workspace Container on DB2 v.9 and higher:<programlisting><workspace name="collaboration">
- <container class="org.exoplatform.services.jcr.impl.storage.jdbc.JDBCWorkspaceDataContainer">
- <properties>
- <property name="source-name" value="jdbcjcr" />
- <property name="dialect" value="db2" />
- <property name="multi-db" value="false" />
- <property name="max-buffer-size" value="200k" />
- <property name="swap-directory" value="target/temp/swap/ws" />
- </properties>
- .....</programlisting></para>
-
- <note>
- <para>For DB2 v.8.x support change the property "dialect" to
- db2v8.</para>
- </note>
- </section>
-
- <section>
- <title>MySQL</title>
-
- <para>JCR MySQL-backend requires special dialect <ulink
- url="http://jira.exoplatform.org/browse/JCR-375">MySQL-UTF8</ulink> to be
- used for internationalization support. But the database default charset
- should be latin1 to use limited index space effectively (1000 bytes for
- MyISAM engine, 767 for InnoDB). If database default charset is multibyte,
- a JCR database initialization error is thrown concerning index creation
- failure. In other words JCR can work on any singlebyte default charset of
- database, with UTF8 supported by MySQL server. But we have tested it only
- on latin1 database default charset.</para>
-
- <para>Repository configuration, workspace container entry
- example:<programlisting><workspace name="collaboration">
- <container class="org.exoplatform.services.jcr.impl.storage.jdbc.JDBCWorkspaceDataContainer">
- <properties>
- <property name="source-name" value="jdbcjcr" />
- <property name="dialect" value="mysql-utf8" />
- <property name="multi-db" value="false" />
- <property name="max-buffer-size" value="200k" />
- <property name="swap-directory" value="target/temp/swap/ws" />
- </properties>
- .....</programlisting></para>
- </section>
-
- <section>
- <title>PostgreSQL</title>
-
- <para>On PostgreSQL-backend multilingual support can be enabled in <ulink
- url="http://www.postgresql.org/docs/8.3/interactive/charset.html">different
- ways</ulink>:<itemizedlist>
- <listitem>
- <para>Using the locale features of the operating system to provide
- locale-specific collation order, number formatting, translated
- messages, and other aspects. UTF-8 is widely used on Linux
- distributions by default, so it can be useful in such case.</para>
- </listitem>
-
- <listitem>
- <para>Providing a number of different character sets defined in the
- PostgreSQL server, including multiple-byte character sets, to
- support storing text any language, and providing character set
- translation between client and server. We recommend to use UTF-8
- database charset, it will allow any-to-any conversations and make
- this issue transparent for the JCR.</para>
- </listitem>
- </itemizedlist></para>
-
- <para>Create database with UTF-8 encoding and use PgSQL dialect for
- Workspace Container:<programlisting><workspace name="collaboration">
- <container class="org.exoplatform.services.jcr.impl.storage.jdbc.JDBCWorkspaceDataContainer">
- <properties>
- <property name="source-name" value="jdbcjcr" />
- <property name="dialect" value="pgsql" />
- <property name="multi-db" value="false" />
- <property name="max-buffer-size" value="200k" />
- <property name="swap-directory" value="target/temp/swap/ws" />
- </properties>
- .....</programlisting></para>
- </section>
-</chapter>
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
+"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
+<chapter id="JCRMultilanguageSupport">
+ <?dbhtml filename="ch-multilanguage-support.html"?>
+
+ <title>Multilanguage support in eXo JCR RDB backend</title>
+
+ <section>
+ <title>Intro</title>
+
+ <para>Whenever a relational database is used to store multilingual text
+ data of the eXo Java Content Repository, the configuration must be adapted
+ to support UTF-8 encoding. Here are short how-to instructions, with
+ examples, for several supported RDBMS.</para>
+
+ <para>The configuration file you have to modify:
+ .../webapps/portal/WEB-INF/conf/jcr/repository-configuration.xml</para>
+
+ <note>
+ <para>Datasource <parameter>jdbcjcr</parameter> used in examples can be
+ configured via <classname>InitialContextInitializer</classname>
+ component.</para>
+ </note>
+ </section>
+
+ <section>
+ <title>Oracle</title>
+
+ <para>In order to run a multilanguage JCR on an Oracle backend, a Unicode
+ character set must be applied to the database. Other Oracle globalization
+ parameters have no impact. The only property to modify is
+ <constant>NLS_CHARACTERSET</constant>.</para>
+
+ <para>We have tested <constant>NLS_CHARACTERSET</constant> =
+ <constant>AL32UTF8</constant> and it works well for many European and
+ Asian languages.</para>
+
+ <para>Example of database configuration (used for JCR
+ testing):<programlisting>NLS_LANGUAGE AMERICAN
+NLS_TERRITORY AMERICA
+NLS_CURRENCY $
+NLS_ISO_CURRENCY AMERICA
+NLS_NUMERIC_CHARACTERS .,
+NLS_CHARACTERSET AL32UTF8
+NLS_CALENDAR GREGORIAN
+NLS_DATE_FORMAT DD-MON-RR
+NLS_DATE_LANGUAGE AMERICAN
+NLS_SORT BINARY
+NLS_TIME_FORMAT HH.MI.SSXFF AM
+NLS_TIMESTAMP_FORMAT DD-MON-RR HH.MI.SSXFF AM
+NLS_TIME_TZ_FORMAT HH.MI.SSXFF AM TZR
+NLS_TIMESTAMP_TZ_FORMAT DD-MON-RR HH.MI.SSXFF AM TZR
+NLS_DUAL_CURRENCY $
+NLS_COMP BINARY
+NLS_LENGTH_SEMANTICS BYTE
+NLS_NCHAR_CONV_EXCP FALSE
+NLS_NCHAR_CHARACTERSET AL16UTF16</programlisting></para>
+
+ <warning>
+ <para>JCR 1.12.x doesn't use NVARCHAR columns, so the value of the
+ parameter NLS_NCHAR_CHARACTERSET does not matter for JCR.</para>
+ </warning>
+
+ <para>Create the database with Unicode encoding and use the Oracle dialect
+ for the Workspace Container:</para>
+
+ <programlisting><workspace name="collaboration">
+ <container class="org.exoplatform.services.jcr.impl.storage.jdbc.JDBCWorkspaceDataContainer">
+ <properties>
+ <property name="source-name" value="jdbcjcr" />
+ <property name="dialect" value="oracle" />
+ <property name="multi-db" value="false" />
+ <property name="max-buffer-size" value="200k" />
+ <property name="swap-directory" value="target/temp/swap/ws" />
+ </properties>
+ .....</programlisting>
+ </section>
+
+ <section>
+ <title>DB2</title>
+
+ <para>DB2 Universal Database (DB2 UDB) supports <ulink
+ url="http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp?topic=/com.ibm.db2.udb.admin.doc/doc/c0004821.htm">UTF-8
+ and UTF-16/UCS-2</ulink>. When a Unicode database is created, CHAR,
+ VARCHAR and LONG VARCHAR data are stored in UTF-8 form. This is sufficient
+ for JCR multilingual support.</para>
+
+ <para>Example of UTF-8 database creation:<programlisting>DB2 CREATE DATABASE dbname USING CODESET UTF-8 TERRITORY US</programlisting></para>
+
+ <para>Create database with UTF-8 encoding and use db2 dialect for
+ Workspace Container on DB2 v.9 and higher:<programlisting><workspace name="collaboration">
+ <container class="org.exoplatform.services.jcr.impl.storage.jdbc.JDBCWorkspaceDataContainer">
+ <properties>
+ <property name="source-name" value="jdbcjcr" />
+ <property name="dialect" value="db2" />
+ <property name="multi-db" value="false" />
+ <property name="max-buffer-size" value="200k" />
+ <property name="swap-directory" value="target/temp/swap/ws" />
+ </properties>
+ .....</programlisting></para>
+
+ <note>
+ <para>To support DB2 v.8.x, change the "dialect" property to
+ db2v8.</para>
+ </note>
+ </section>
+
+ <section>
+ <title>MySQL</title>
+
+ <para>The JCR MySQL backend requires the special dialect <ulink
+ url="http://jira.exoplatform.org/browse/JCR-375">MySQL-UTF8</ulink> (see also <ulink
+ url="http://dev.mysql.com/doc/refman/5.0/en/charset-unicode-utf8.html">MySQL UTF-8 support</ulink>)
+ for internationalization support. However, the database default charset
+ should be latin1 so that the limited index space (1000 bytes for the
+ MyISAM engine, 767 for InnoDB) is used effectively. If the database default
+ charset is multibyte, JCR database initialization fails with an index
+ creation error. In other words, JCR can work with any single-byte database
+ default charset, provided that the MySQL server supports UTF8. We have
+ tested it only with the latin1 database default charset.</para>
+
+ <para>Repository configuration, workspace container entry
+ example:<programlisting><workspace name="collaboration">
+ <container class="org.exoplatform.services.jcr.impl.storage.jdbc.JDBCWorkspaceDataContainer">
+ <properties>
+ <property name="source-name" value="jdbcjcr" />
+ <property name="dialect" value="mysql-utf8" />
+ <property name="multi-db" value="false" />
+ <property name="max-buffer-size" value="200k" />
+ <property name="swap-directory" value="target/temp/swap/ws" />
+ </properties>
+ .....</programlisting></para>
+ </section>
+
+ <section>
+ <title>PostgreSQL</title>
+
+ <para>On PostgreSQL-backend multilingual support can be enabled in <ulink
+ url="http://www.postgresql.org/docs/8.3/interactive/charset.html">different
+ ways</ulink>:<itemizedlist>
+ <listitem>
+ <para>Using the locale features of the operating system to provide
+ locale-specific collation order, number formatting, translated
+ messages, and other aspects. UTF-8 is widely used on Linux
+ distributions by default, so this can be useful in such cases.</para>
+ </listitem>
+
+ <listitem>
+ <para>Providing a number of different character sets defined in the
+ PostgreSQL server, including multiple-byte character sets, to
+ support storing text in any language, and providing character set
+ translation between client and server. We recommend using the UTF-8
+ database charset; it allows any-to-any conversion and makes
+ this issue transparent for the JCR.</para>
+ </listitem>
+ </itemizedlist></para>
+
+ <para>Create database with UTF-8 encoding and use PgSQL dialect for
+ Workspace Container:<programlisting><workspace name="collaboration">
+ <container class="org.exoplatform.services.jcr.impl.storage.jdbc.JDBCWorkspaceDataContainer">
+ <properties>
+ <property name="source-name" value="jdbcjcr" />
+ <property name="dialect" value="pgsql" />
+ <property name="multi-db" value="false" />
+ <property name="max-buffer-size" value="200k" />
+ <property name="swap-directory" value="target/temp/swap/ws" />
+ </properties>
+ .....</programlisting></para>
+ </section>
+</chapter>
Modified: jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/configuration/search-configuration.xml
===================================================================
--- jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/configuration/search-configuration.xml 2010-08-04 12:33:17 UTC (rev 2873)
+++ jcr/branches/1.12.x/docs/reference/en/src/main/docbook/en-US/modules/jcr/configuration/search-configuration.xml 2010-08-04 13:51:23 UTC (rev 2874)
@@ -1,818 +1,837 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
-"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
-<chapter id="JCR.SearchConfiguration">
- <?dbhtml filename="ch-search-configuration.html"?>
-
- <title>Search Configuration</title>
-
- <section>
- <title>XML Configuration</title>
-
- <para>JCR index configuration. You can find this file here:
- <filename>.../portal/WEB-INF/conf/jcr/repository-configuration.xml</filename></para>
-
- <programlisting><repository-service default-repository="db1">
- <repositories>
- <repository name="db1" system-workspace="ws" default-workspace="ws">
- ....
- <workspaces>
- <workspace name="ws">
- ....
- <query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
- <properties>
- <property name="index-dir" value="${java.io.tmpdir}/temp/index/db1/ws" />
- <property name="synonymprovider-class" value="org.exoplatform.services.jcr.impl.core.query.lucene.PropertiesSynonymProvider" />
- <property name="synonymprovider-config-path" value="/synonyms.properties" />
- <property name="indexing-config-path" value="/indexing-configuration.xml" />
- <property name="query-class" value="org.exoplatform.services.jcr.impl.core.query.QueryImpl" />
- </properties>
- </query-handler>
- ...
- </workspace>
- </workspaces>
- </repository>
- </repositories>
-</repository-service></programlisting>
- </section>
-
- <section>
- <title>Configuration parameters</title>
-
- <table>
- <title></title>
-
- <tgroup cols="4">
- <thead>
- <row>
- <entry>Parameter</entry>
-
- <entry>Default</entry>
-
- <entry>Description</entry>
-
- <entry>Since</entry>
- </row>
- </thead>
-
- <tbody>
- <row>
- <entry>index-dir</entry>
-
- <entry>none</entry>
-
- <entry>The location of the index directory. This parameter is
- mandatory. Up to 1.9 this parameter called "indexDir"</entry>
-
- <entry>1.0</entry>
- </row>
-
- <row>
- <entry>use-compoundfile</entry>
-
- <entry>true</entry>
-
- <entry>Advises lucene to use compound files for the index
- files.</entry>
-
- <entry>1.9</entry>
- </row>
-
- <row>
- <entry>min-merge-docs</entry>
-
- <entry>100</entry>
-
- <entry>Minimum number of nodes in an index until segments are
- merged.</entry>
-
- <entry>1.9</entry>
- </row>
-
- <row>
- <entry>volatile-idle-time</entry>
-
- <entry>3</entry>
-
- <entry>Idle time in seconds until the volatile index part is moved
- to a persistent index even though minMergeDocs is not
- reached.</entry>
-
- <entry>1.9</entry>
- </row>
-
- <row>
- <entry>max-merge-docs</entry>
-
- <entry>Integer.MAX_VALUE</entry>
-
- <entry>Maximum number of nodes in segments that will be merged.
- The default value changed in JCR 1.9 to Integer.MAX_VALUE.</entry>
-
- <entry>1.9</entry>
- </row>
-
- <row>
- <entry>merge-factor</entry>
-
- <entry>10</entry>
-
- <entry>Determines how often segment indices are merged.</entry>
-
- <entry>1.9</entry>
- </row>
-
- <row>
- <entry>max-field-length</entry>
-
- <entry>10000</entry>
-
- <entry>The number of words that are fulltext indexed at most per
- property.</entry>
-
- <entry>1.9</entry>
- </row>
-
- <row>
- <entry>cache-size</entry>
-
- <entry>1000</entry>
-
- <entry>Size of the document number cache. This cache maps uuids to
- lucene document numbers</entry>
-
- <entry>1.9</entry>
- </row>
-
- <row>
- <entry>force-consistencycheck</entry>
-
- <entry>false</entry>
-
- <entry>Runs a consistency check on every startup. If false, a
- consistency check is only performed when the search index detects
- a prior forced shutdown.</entry>
-
- <entry>1.9</entry>
- </row>
-
- <row>
- <entry>auto-repair</entry>
-
- <entry>true</entry>
-
- <entry>Errors detected by a consistency check are automatically
- repaired. If false, errors are only written to the log.</entry>
-
- <entry>1.9</entry>
- </row>
-
- <row>
- <entry>query-class</entry>
-
- <entry>QueryImpl</entry>
-
- <entry>Class name that implements the javax.jcr.query.Query
- interface.This class must also extend from the class:
- org.exoplatform.services.jcr.impl.core.query.AbstractQueryImpl.</entry>
-
- <entry>1.9</entry>
- </row>
-
- <row>
- <entry>document-order</entry>
-
- <entry>true</entry>
-
- <entry>If true and the query does not contain an 'order by'
- clause, result nodes will be in document order. For better
- performance when queries return a lot of nodes set to
- 'false'.</entry>
-
- <entry>1.9</entry>
- </row>
-
- <row>
- <entry>result-fetch-size</entry>
-
- <entry>Integer.MAX_VALUE</entry>
-
- <entry>The number of results when a query is executed. Default
- value: Integer.MAX_VALUE (-> all).</entry>
-
- <entry>1.9</entry>
- </row>
-
- <row>
- <entry>excerptprovider-class</entry>
-
- <entry>DefaultXMLExcerpt</entry>
-
- <entry>The name of the class that implements
- org.exoplatform.services.jcr.impl.core.query.lucene.ExcerptProvider
- and should be used for the rep:excerpt() function in a
- query.</entry>
-
- <entry>1.9</entry>
- </row>
-
- <row>
- <entry>support-highlighting</entry>
-
- <entry>false</entry>
-
- <entry>If set to true additional information is stored in the
- index to support highlighting using the rep:excerpt()
- function.</entry>
-
- <entry>1.9</entry>
- </row>
-
- <row>
- <entry>synonymprovider-class</entry>
-
- <entry>none</entry>
-
- <entry>The name of a class that implements
- org.exoplatform.services.jcr.impl.core.query.lucene.SynonymProvider.
- The default value is null (-> not set).</entry>
-
- <entry>1.9</entry>
- </row>
-
- <row>
- <entry>synonymprovider-config-path</entry>
-
- <entry>none</entry>
-
- <entry>The path to the synonym provider configuration file. This
- path interpreted relative to the path parameter. If there is a
- path element inside the SearchIndex element, then this path is
- interpreted relative to the root path of the path. Whether this
- parameter is mandatory depends on the synonym provider
- implementation. The default value is null (-> not set).</entry>
-
- <entry>1.9</entry>
- </row>
-
- <row>
- <entry>indexing-configuration-path</entry>
-
- <entry>none</entry>
-
- <entry>The path to the indexing configuration file.</entry>
-
- <entry>1.9</entry>
- </row>
-
- <row>
- <entry>indexing-configuration-class</entry>
-
- <entry>IndexingConfigurationImpl</entry>
-
- <entry>The name of the class that implements
- org.exoplatform.services.jcr.impl.core.query.lucene.IndexingConfiguration.</entry>
-
- <entry>1.9</entry>
- </row>
-
- <row>
- <entry>force-consistencycheck</entry>
-
- <entry>false</entry>
-
- <entry>If set to true a consistency check is performed depending
- on the parameter forceConsistencyCheck. If set to false no
- consistency check is performed on startup, even if a redo log had
- been applied.</entry>
-
- <entry>1.9</entry>
- </row>
-
- <row>
- <entry>spellchecker-class</entry>
-
- <entry>none</entry>
-
- <entry>The name of a class that implements
- org.exoplatform.services.jcr.impl.core.query.lucene.SpellChecker.</entry>
-
- <entry>1.9</entry>
- </row>
-
- <row>
- <entry>spellchecker-more-popular</entry>
-
- <entry>true</entry>
-
- <entry>If set true - spellchecker return only the suggest words
- that are as frequent or more frequent than the checked word. If
- set false, spellchecker return null (if checked word exit in
- dictionary), or spellchecker will return most close suggest
- word.</entry>
-
- <entry>1.10</entry>
- </row>
-
- <row>
- <entry>spellchecker-min-distance</entry>
-
- <entry>0.55f</entry>
-
- <entry>Minimal distance between checked word and proposed suggest
- word.</entry>
-
- <entry>1.10</entry>
- </row>
-
- <row>
- <entry>errorlog-size</entry>
-
- <entry>50(Kb)</entry>
-
- <entry>The default size of error log file in Kb.</entry>
-
- <entry>1.9</entry>
- </row>
-
- <row>
- <entry>upgrade-index</entry>
-
- <entry>false</entry>
-
- <entry>Allows JCR to convert an existing index into the new
- format. Also it is possible to set this property via system
- property, for example: -Dupgrade-index=true Indexes before JCR
- 1.12 will not run with JCR 1.12. Hence you have to run an
- automatic migration: Start JCR with -Dupgrade-index=true. The old
- index format is then converted in the new index format. After the
- conversion the new format is used. On the next start you don't
- need this option anymore. The old index is replaced and a back
- conversion is not possible - therefore better take a backup of the
- index before. (Only for migrations from JCR 1.9 and
- later.)</entry>
-
- <entry>1.12</entry>
- </row>
-
- <row>
- <entry>analyzer</entry>
-
- <entry>org.apache.lucene.analysis.standard.StandardAnalyzer</entry>
-
- <entry>Class name of a lucene analyzer to use for fulltext
- indexing of text.</entry>
-
- <entry>1.12</entry>
- </row>
- </tbody>
- </tgroup>
- </table>
- </section>
-
- <section>
- <title>Global Search Index</title>
-
- <section>
- <title>Global Search Index Configuration</title>
-
- <para>The global search index is configured in the above-mentioned
- configuration file
- (<filename>portal/WEB-INF/conf/jcr/repository-configuration.xml</filename>)
- in the tag "query-handler".</para>
-
- <programlisting><query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex"></programlisting>
-
- <para>In fact when using Lucene you always should use the same analyzer
- for indexing and for querying - otherwise the results are unpredictable.
- You don't have to worry about this, eXo JCR does this for you
- automatically. If you don't like the StandardAnalyzer configured by
- default just replace it by your own.</para>
-
- <para>If you don't have a handy QueryHandler you will learn how create a
- customized Handler in 5 minutes.</para>
- </section>
-
- <section>
- <title>Customized Search Indexes and Analyzers</title>
-
- <para>By default Exo JCR uses the Lucene standard Analyzer to index
- contents. This analyzer uses some standard filters in the method that
- analyzes the content:<programlisting>public TokenStream tokenStream(String fieldName, Reader reader) {
- StandardTokenizer tokenStream = new StandardTokenizer(reader, replaceInvalidAcronym);
- tokenStream.setMaxTokenLength(maxTokenLength);
- TokenStream result = new StandardFilter(tokenStream);
- result = new LowerCaseFilter(result);
- result = new StopFilter(result, stopSet);
- return result;
- }</programlisting><itemizedlist>
- <listitem>
- <para>The first one (StandardFilter) removes 's (as 's in
- "Peter's") from the end of words and removes dots from
- acronyms.</para>
- </listitem>
-
- <listitem>
- <para>The second one (LowerCaseFilter) normalizes token text to
- lower case.</para>
- </listitem>
-
- <listitem>
- <para>The last one (StopFilter) removes stop words from a token
- stream. The stop set is defined in the analyzer.</para>
- </listitem>
- </itemizedlist></para>
-
- <para>For specific cases, you may wish to use additional filters like
- <phrase>ISOLatin1AccentFilter</phrase>, which replaces accented
- characters in the ISO Latin 1 character set (ISO-8859-1) by their
- unaccented equivalents.</para>
-
- <para>In order to use a different filter, you have to create a new
- analyzer, and a new search index to use the analyzer. You put it in a
- jar, which is deployed with your application.</para>
-
- <section>
- <title>Create the filter</title>
-
- <para>The ISOLatin1AccentFilter is not present in the current Lucene
- version used by eXo. You can use the attached file. You can also
- create your own filter; the relevant method is<programlisting>public final Token next(final Token reusableToken) throws java.io.IOException</programlisting>which
- defines how characters are read and used by the filter.</para>
- </section>
-
- <section>
- <title>Create the analyzer</title>
-
- <para>The analyzer has to extend
- org.apache.lucene.analysis.standard.StandardAnalyzer and override the
- method<programlisting>public TokenStream tokenStream(String fieldName, Reader reader)</programlisting>to
- add your own filters. You can have a look at the example analyzer
- attached to this article.</para>
- </section>
-
- <section>
- <title>Create the search index</title>
-
- <para>Now that we have the analyzer, we have to write the SearchIndex
- that will use it. You have to extend
- org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex. You
- have to write the constructor, to set the right analyzer, and the
- method<programlisting>public Analyzer getAnalyzer() {
- return MyAnalyzer;
- }</programlisting>to return your analyzer. You can see the attached
- SearchIndex.</para>
-
- <note>
- <para>Since version 1.12 the analyzer can be set directly in the
- configuration, so creating a new SearchIndex only to use a new
- analyzer is redundant.</para>
- </note>
- </section>
-
- <section>
- <title>Configure your application to use your SearchIndex</title>
-
- <para>In
- <filename>portal/WEB-INF/conf/jcr/repository-configuration.xml</filename>,
- you have to replace each<programlisting><query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex"></programlisting>by
- your own class<programlisting><query-handler class="mypackage.indexation.MySearchIndex"></programlisting></para>
- </section>
-
- <section>
- <title>Configure your application to use your Analyzer</title>
-
- <para>In
- <filename>portal/WEB-INF/conf/jcr/repository-configuration.xml</filename>,
- you have to add parameter "analyzer" to each query-handler
- config:<programlisting><query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
- <properties>
- ...
- <property name="analyzer" value="org.exoplatform.services.jcr.impl.core.MyAnalyzer"/>
- ...
- </properties>
-</query-handler></programlisting></para>
-
- <para>When you start exo, your SearchIndex will start to index
- contents with the specified filters.</para>
- </section>
- </section>
- </section>
-
- <section>
- <title>Index Adjustments</title>
-
- <section>
- <title>IndexingConfiguration</title>
-
- <para>Starting with version 1.9, the default search index implementation
- in JCR allows you to control which properties of a node are indexed. You
- also can define different analyzers for different nodes.</para>
-
- <para>The configuration parameter is called indexingConfiguration and
- per default is not set. This means all properties of a node are
- indexed.</para>
-
- <para>If you wish to configure the indexing behavior you need to add a
- parameter to the query-handler element in your configuration
- file.</para>
-
- <programlisting><param name="indexing-configuration-path" value="/indexing_configuration.xml"/></programlisting>
- </section>
-
- <section>
- <title>Index rules</title>
-
- <section>
- <title>Node Scope Limit</title>
-
- <para>To optimize the index size you can limit the node scope so that
- <phrase>only certain properties</phrase> of a node type are
- indexed.</para>
-
- <para>With the below configuration only properties named Text are
- indexed for nodes of type nt:unstructured. This configuration also
- applies to all nodes whose type extends from nt:unstructured.</para>
-
- <programlisting><?xml version="1.0"?>
-<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
-<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
- <index-rule nodeType="nt:unstructured">
- <property>Text</property>
- </index-rule>
-</configuration></programlisting>
-
- <para>Please note that you have to declare the <phrase>namespace
- prefixes</phrase> in the configuration element that you are using
- throughout the XML file!</para>
- </section>
-
- <section>
- <title>Index Boost Value</title>
-
- <para>It is also possible to configure a <phrase>boost value</phrase>
- for the nodes that match the index rule. The default boost value is
- 1.0. Higher boost values (a reasonable range is 1.0 - 5.0) will yield
- a higher score value and appear as more relevant.</para>
-
- <programlisting><?xml version="1.0"?>
-<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
-<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
- <index-rule nodeType="nt:unstructured"
- boost="2.0">
- <property>Text</property>
- </index-rule>
-</configuration></programlisting>
-
- <para>If you do not wish to boost the complete node but only certain
- properties you can also provide a boost value for the listed
- properties:<programlisting><?xml version="1.0"?>
-<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
-<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
- <index-rule nodeType="nt:unstructured">
- <property boost="3.0">Title</property>
- <property boost="1.5">Text</property>
- </index-rule>
-</configuration></programlisting></para>
- </section>
-
- <section>
- <title>Conditional Index Rules</title>
-
- <para>You may also add a <phrase>condition</phrase> to the index rule
- and have multiple rules with the same nodeType. The first index rule
- that matches will apply and all remaining ones are
- ignored:<programlisting><?xml version="1.0"?>
-<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
-<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
- <index-rule nodeType="nt:unstructured"
- boost="2.0"
- condition="@priority = 'high'">
- <property>Text</property>
- </index-rule>
- <index-rule nodeType="nt:unstructured">
- <property>Text</property>
- </index-rule>
-</configuration></programlisting></para>
-
- <para>In the above example the first rule only applies if the
- nt:unstructured node has a priority property with a value 'high'. The
- condition syntax supports only the equals operator and a string
- literal.</para>
-
- <para>You may also reference properties in the condition that are not
- on the current node:<programlisting><?xml version="1.0"?>
-<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
-<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
- <index-rule nodeType="nt:unstructured"
- boost="2.0"
- condition="ancestor::*/@priority = 'high'">
- <property>Text</property>
- </index-rule>
- <index-rule nodeType="nt:unstructured"
- boost="0.5"
- condition="parent::foo/@priority = 'low'">
- <property>Text</property>
- </index-rule>
- <index-rule nodeType="nt:unstructured"
- boost="1.5"
- condition="bar/@priority = 'medium'">
- <property>Text</property>
- </index-rule>
- <index-rule nodeType="nt:unstructured">
- <property>Text</property>
- </index-rule>
-</configuration></programlisting></para>
-
- <para>The indexing configuration also allows you to specify the type
- of a node in the condition. Please note however that the type match
- must be exact. It does not consider sub types of the specified node
- type.</para>
-
- <programlisting><?xml version="1.0"?>
-<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
-<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
- <index-rule nodeType="nt:unstructured"
- boost="2.0"
- condition="element(*, nt:unstructured)/@priority = 'high'">
- <property>Text</property>
- </index-rule>
-</configuration></programlisting>
- </section>
-
- <section>
- <title>Exclusion from the Node Scope Index</title>
-
- <para>By default all configured properties are fulltext indexed if
- they are of type STRING and included in the node scope index. A node
- scope search normally finds all nodes of an index. That is, the query
- jcr:contains(., 'foo') returns all nodes that have a string property
- containing the word 'foo'. You can explicitly exclude a property from
- the node scope index:<programlisting><?xml version="1.0"?>
-<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
-<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
- <index-rule nodeType="nt:unstructured">
- <property nodeScopeIndex="false">Text</property>
- </index-rule>
-</configuration></programlisting></para>
- </section>
- </section>
-
- <section>
- <title>Index Aggregates</title>
-
- <para>Sometimes it is useful to include the contents of descendant nodes
- in a single node, to make it easier to search content that is scattered
- across multiple nodes.</para>
-
- <para>JCR allows you to define index aggregates based on relative path
- patterns and primary node types.</para>
-
- <para>The following example creates an index aggregate on nt:file that
- includes the content of the jcr:content node:<programlisting><?xml version="1.0"?>
-<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
-<configuration xmlns:jcr="http://www.jcp.org/jcr/1.0"
- xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
- <aggregate primaryType="nt:file">
- <include>jcr:content</include>
- </aggregate>
-</configuration></programlisting></para>
-
- <para>You can also restrict the included nodes to a certain
- type:<programlisting><?xml version="1.0"?>
-<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
-<configuration xmlns:jcr="http://www.jcp.org/jcr/1.0"
- xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
- <aggregate primaryType="nt:file">
- <include primaryType="nt:resource">jcr:content</include>
- </aggregate>
-</configuration></programlisting></para>
-
- <para>You may also use the * to match all child nodes:<programlisting><?xml version="1.0"?>
-<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
-<configuration xmlns:jcr="http://www.jcp.org/jcr/1.0"
- xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
- <aggregate primaryType="nt:file">
- <include primaryType="nt:resource">*</include>
- </aggregate>
-</configuration></programlisting></para>
-
- <para>If you wish to include nodes up to a certain depth below the
- current node you can add multiple include elements. E.g. the nt:file
- node may contain a complete XML document under
- jcr:content:<programlisting><?xml version="1.0"?>
-<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
-<configuration xmlns:jcr="http://www.jcp.org/jcr/1.0"
- xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
- <aggregate primaryType="nt:file">
- <include>*</include>
- <include>*/*</include>
- <include>*/*/*</include>
- </aggregate>
-</configuration></programlisting></para>
- </section>
-
- <section>
- <title>Property-Level Analyzers</title>
-
- <section>
- <title>Example</title>
-
- <para>In this configuration section you define how a property has to
- be analyzed. If there is an analyzer configuration for a property,
- this analyzer is used for indexing and searching of this property. For
- example:<programlisting><?xml version="1.0"?>
-<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
-<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
- <analyzers>
- <analyzer class="org.apache.lucene.analysis.KeywordAnalyzer">
- <property>mytext</property>
- </analyzer>
- <analyzer class="org.apache.lucene.analysis.WhitespaceAnalyzer">
- <property>mytext2</property>
- </analyzer>
- </analyzers>
-</configuration></programlisting></para>
-
- <para>The configuration above means that the property "mytext" for the
- entire workspace is indexed (and searched) with the Lucene
- KeywordAnalyzer, and property "mytext2" with the WhitespaceAnalyzer.
- Using different analyzers for different languages is particularly
- useful.</para>
-
- <para>The WhitespaceAnalyzer tokenizes a property, the KeywordAnalyzer
- takes the property as a whole.</para>
- </section>
-
- <section>
- <title>Characteristics of Node Scope Searches</title>
-
- <para>When using analyzers, you may encounter an unexpected behavior
- when searching within a property compared to searching within a node
- scope. The reason is that the node scope always uses the global
- analyzer.</para>
-
- <para>Let's suppose that the property "mytext" contains the text
- "testing my analyzers" and that you haven't configured any analyzers
- for the property "mytext" (and have not changed the default analyzer in
- SearchIndex).</para>
-
- <para>If your query is for example:<programlisting>xpath = "//*[jcr:contains(mytext,'analyzer')]"</programlisting></para>
-
- <para>This xpath does not return a hit in the node with the property
- above and default analyzers.</para>
-
- <para>Also a search on the node scope<programlisting>xpath = "//*[jcr:contains(.,'analyzer')]"</programlisting>won't
- give a hit. Note that you can only set specific analyzers on a
- node property, and that node scope indexing/analyzing is always
- done with the globally defined analyzer in the SearchIndex
- element.</para>
-
- <para>Now, if you change the analyzer used to index the "mytext"
- property above to<programlisting><analyzer class="org.apache.lucene.analysis.de.GermanAnalyzer">
- <property>mytext</property>
-</analyzer></programlisting>and you do the same search again, then
- for<programlisting>xpath = "//*[jcr:contains(mytext,'analyzer')]"</programlisting>you
- would get a hit because of the word stemming (analyzers -
- analyzer).</para>
-
- <para>The other search,<programlisting>xpath = "//*[jcr:contains(.,'analyzer')]"</programlisting>still
- would not give a result, since the node scope is indexed with the
- global analyzer, which in this case does not take into account any
- word stemming.</para>
-
- <para>In conclusion, be aware that when using analyzers for specific
- properties, you might find a hit in a property for some search text,
- and you do not find a hit with the same search text in the node scope
- of the property!</para>
-
- <note>
- <para>Both index rules and index aggregates influence how content is
- indexed in JCR. If you change the configuration the existing content
- is not automatically re-indexed according to the new rules. You
- therefore have to manually re-index the content when you change the
- configuration!</para>
- </note>
- </section>
- </section>
- <section>
- <title>Advanced features</title>
- <para>eXo JCR supports some advanced features, which are not specified in JSR 170:</para>
- <itemizedlist>
- <listitem>
- <para>Get a text excerpt with
- <emphasis role="bold">highlighted words</emphasis> that match the query:
- <ulink url="http://wiki.exoplatform.com/xwiki/bin/view/JCR/Searching+Repository+Content">ExcerptProvider</ulink>.</para>
- </listitem>
- <listitem>
- <para>Search for a term and its
- <emphasis role="bold">synonyms</emphasis>:
- <ulink url="http://wiki.exoplatform.com/xwiki/bin/view/JCR/Searching+Repository+Content">SynonymSearch</ulink>.</para>
- </listitem>
- <listitem>
- <para>Search for
- <emphasis role="bold">similar</emphasis> nodes:
- <ulink url="http://wiki.exoplatform.com/xwiki/bin/view/JCR/Searching+Repository+Content">SimilaritySearch</ulink>.</para>
- </listitem>
- <listitem>
- <para>Check the
- <emphasis role="bold">spelling</emphasis> of a fulltext query statement:
- <ulink url="http://wiki.exoplatform.com/xwiki/bin/view/JCR/Searching+Repository+Content">SpellChecker</ulink>.</para>
- </listitem>
- <listitem>
- <para>Define index
- <emphasis role="bold">aggregates and rules</emphasis>: IndexingConfiguration (see this article).</para>
- </listitem>
- </itemizedlist>
- </section>
- </section>
-</chapter>
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
+"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
+<chapter id="JCR.SearchConfiguration">
+ <?dbhtml filename="ch-search-configuration.html"?>
+
+ <title>Search Configuration</title>
+
+ <section>
+ <title>XML Configuration</title>
+
+ <para>The JCR search index is configured in the file
+ <filename>.../portal/WEB-INF/conf/jcr/repository-configuration.xml</filename>:</para>
+
+ <programlisting><repository-service default-repository="db1">
+ <repositories>
+ <repository name="db1" system-workspace="ws" default-workspace="ws">
+ ....
+ <workspaces>
+ <workspace name="ws">
+ ....
+ <query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
+ <properties>
+ <property name="index-dir" value="${java.io.tmpdir}/temp/index/db1/ws" />
+ <property name="synonymprovider-class" value="org.exoplatform.services.jcr.impl.core.query.lucene.PropertiesSynonymProvider" />
+ <property name="synonymprovider-config-path" value="/synonyms.properties" />
+ <property name="indexing-configuration-path" value="/indexing-configuration.xml" />
+ <property name="query-class" value="org.exoplatform.services.jcr.impl.core.query.QueryImpl" />
+ </properties>
+ </query-handler>
+ ...
+ </workspace>
+ </workspaces>
+ </repository>
+ </repositories>
+</repository-service></programlisting>
+ </section>
+
+ <section>
+ <title>Configuration parameters</title>
+
+ <table>
+ <title>Search index configuration parameters</title>
+
+ <tgroup cols="4">
+ <thead>
+ <row>
+ <entry>Parameter</entry>
+
+ <entry>Default</entry>
+
+ <entry>Description</entry>
+
+ <entry>Since</entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry>index-dir</entry>
+
+ <entry>none</entry>
+
+ <entry>The location of the index directory. This parameter is
+ mandatory. Up to version 1.9 this parameter was called "indexDir".</entry>
+
+ <entry>1.0</entry>
+ </row>
+
+ <row>
+ <entry>use-compoundfile</entry>
+
+ <entry>true</entry>
+
+ <entry>Advises Lucene to use compound files for the index
+ files.</entry>
+
+ <entry>1.9</entry>
+ </row>
+
+ <row>
+ <entry>min-merge-docs</entry>
+
+ <entry>100</entry>
+
+ <entry>Minimum number of nodes in an index until segments are
+ merged.</entry>
+
+ <entry>1.9</entry>
+ </row>
+
+ <row>
+ <entry>volatile-idle-time</entry>
+
+ <entry>3</entry>
+
+ <entry>Idle time in seconds until the volatile index part is moved
+ to a persistent index even though min-merge-docs is not
+ reached.</entry>
+
+ <entry>1.9</entry>
+ </row>
+
+ <row>
+ <entry>max-merge-docs</entry>
+
+ <entry>Integer.MAX_VALUE</entry>
+
+ <entry>Maximum number of nodes in segments that will be merged.
+ The default value changed in JCR 1.9 to Integer.MAX_VALUE.</entry>
+
+ <entry>1.9</entry>
+ </row>
+
+ <row>
+ <entry>merge-factor</entry>
+
+ <entry>10</entry>
+
+ <entry>Determines how often segment indices are merged.</entry>
+
+ <entry>1.9</entry>
+ </row>
+
+ <row>
+ <entry>max-field-length</entry>
+
+ <entry>10000</entry>
+
+ <entry>The number of words that are fulltext indexed at most per
+ property.</entry>
+
+ <entry>1.9</entry>
+ </row>
+
+ <row>
+ <entry>cache-size</entry>
+
+ <entry>1000</entry>
+
+ <entry>Size of the document number cache. This cache maps UUIDs to
+ Lucene document numbers.</entry>
+
+ <entry>1.9</entry>
+ </row>
+
+ <row>
+ <entry>force-consistencycheck</entry>
+
+ <entry>false</entry>
+
+ <entry>Runs a consistency check on every startup. If false, a
+ consistency check is only performed when the search index detects
+ a prior forced shutdown.</entry>
+
+ <entry>1.9</entry>
+ </row>
+
+ <row>
+ <entry>auto-repair</entry>
+
+ <entry>true</entry>
+
+ <entry>Errors detected by a consistency check are automatically
+ repaired. If false, errors are only written to the log.</entry>
+
+ <entry>1.9</entry>
+ </row>
+
+ <row>
+ <entry>query-class</entry>
+
+ <entry>QueryImpl</entry>
+
+ <entry>Class name that implements the javax.jcr.query.Query
+ interface. This class must also extend the class
+ org.exoplatform.services.jcr.impl.core.query.AbstractQueryImpl.</entry>
+
+ <entry>1.9</entry>
+ </row>
+
+ <row>
+ <entry>document-order</entry>
+
+ <entry>true</entry>
+
+ <entry>If true and the query does not contain an 'order by'
+ clause, result nodes will be in document order. For better
+ performance when queries return a lot of nodes set to
+ 'false'.</entry>
+
+ <entry>1.9</entry>
+ </row>
+
+ <row>
+ <entry>result-fetch-size</entry>
+
+ <entry>Integer.MAX_VALUE</entry>
+
+ <entry>The number of results initially fetched when a query is
+ executed. Default value: Integer.MAX_VALUE (-> all).</entry>
+
+ <entry>1.9</entry>
+ </row>
+
+ <row>
+ <entry>excerptprovider-class</entry>
+
+ <entry>DefaultXMLExcerpt</entry>
+
+ <entry>The name of the class that implements
+ org.exoplatform.services.jcr.impl.core.query.lucene.ExcerptProvider
+ and should be used for the rep:excerpt() function in a
+ query.</entry>
+
+ <entry>1.9</entry>
+ </row>
+
+ <row>
+ <entry>support-highlighting</entry>
+
+ <entry>false</entry>
+
+ <entry>If set to true additional information is stored in the
+ index to support highlighting using the rep:excerpt()
+ function.</entry>
+
+ <entry>1.9</entry>
+ </row>
+
+ <row>
+ <entry>synonymprovider-class</entry>
+
+ <entry>none</entry>
+
+ <entry>The name of a class that implements
+ org.exoplatform.services.jcr.impl.core.query.lucene.SynonymProvider.
+ The default value is null (-> not set).</entry>
+
+ <entry>1.9</entry>
+ </row>
+
+ <row>
+ <entry>synonymprovider-config-path</entry>
+
+ <entry>none</entry>
+
+ <entry>The path to the synonym provider configuration file. This
+ path is interpreted relative to the path parameter. If there is a
+ path element inside the SearchIndex element, then this path is
+ interpreted relative to the root path of the path. Whether this
+ parameter is mandatory depends on the synonym provider
+ implementation. The default value is null (-> not set).</entry>
+
+ <entry>1.9</entry>
+ </row>
+
+ <row>
+ <entry>indexing-configuration-path</entry>
+
+ <entry>none</entry>
+
+ <entry>The path to the indexing configuration file.</entry>
+
+ <entry>1.9</entry>
+ </row>
+
+ <row>
+ <entry>indexing-configuration-class</entry>
+
+ <entry>IndexingConfigurationImpl</entry>
+
+ <entry>The name of the class that implements
+ org.exoplatform.services.jcr.impl.core.query.lucene.IndexingConfiguration.</entry>
+
+ <entry>1.9</entry>
+ </row>
+
+ <row>
+ <entry>force-consistencycheck</entry>
+
+ <entry>false</entry>
+
+ <entry>If set to true a consistency check is performed depending
+ on the parameter forceConsistencyCheck. If set to false no
+ consistency check is performed on startup, even if a redo log had
+ been applied.</entry>
+
+ <entry>1.9</entry>
+ </row>
+
+ <row>
+ <entry>spellchecker-class</entry>
+
+ <entry>none</entry>
+
+ <entry>The name of a class that implements
+ org.exoplatform.services.jcr.impl.core.query.lucene.SpellChecker.</entry>
+
+ <entry>1.9</entry>
+ </row>
+
+ <row>
+ <entry>spellchecker-more-popular</entry>
+
+ <entry>true</entry>
+
+ <entry>If set to true, the spellchecker returns only suggested words
+ that are as frequent as or more frequent than the checked word. If
+ set to false, the spellchecker returns null if the checked word
+ exists in the dictionary; otherwise it returns the closest
+ suggested word.</entry>
+
+ <entry>1.10</entry>
+ </row>
+
+ <row>
+ <entry>spellchecker-min-distance</entry>
+
+ <entry>0.55f</entry>
+
+ <entry>Minimal distance between the checked word and a proposed
+ suggestion.</entry>
+
+ <entry>1.10</entry>
+ </row>
+
+ <row>
+ <entry>errorlog-size</entry>
+
+ <entry>50 (KB)</entry>
+
+ <entry>The default size of the error log file in KB.</entry>
+
+ <entry>1.9</entry>
+ </row>
+
+ <row>
+ <entry>upgrade-index</entry>
+
+ <entry>false</entry>
+
+ <entry>Allows JCR to convert an existing index into the new
+ format. This property can also be set as a system property, for
+ example: -Dupgrade-index=true. Indexes created before JCR 1.12
+ will not run with JCR 1.12, so you have to run an automatic
+ migration: start JCR with -Dupgrade-index=true. The old index
+ format is then converted into the new index format, which is used
+ from then on. On the next start you no longer need this option.
+ The old index is replaced and a back conversion is not possible,
+ so take a backup of the index beforehand. (Only for migrations
+ from JCR 1.9 and later.)</entry>
+
+ <entry>1.12</entry>
+ </row>
+
+ <row>
+ <entry>analyzer</entry>
+
+ <entry>org.apache.lucene.analysis.standard.StandardAnalyzer</entry>
+
+ <entry>Class name of a Lucene analyzer to use for fulltext
+ indexing of text.</entry>
+
+ <entry>1.12</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
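+
+ <para>As an illustration of how the parameters in the table are set, the
+ sketch below adds a few of them to a query-handler. The values for
+ merge-factor, max-field-length, cache-size and analyzer are simply the
+ defaults from the table written out explicitly; support-highlighting is
+ switched on only as an example, and the index-dir value is copied from
+ the listing in the previous section. Treat it as a starting point and
+ adapt the values to your installation.</para>
+
+ <programlisting><query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
+   <properties>
+     <!-- mandatory: location of the index directory -->
+     <property name="index-dir" value="${java.io.tmpdir}/temp/index/db1/ws" />
+     <!-- defaults from the table, listed explicitly -->
+     <property name="merge-factor" value="10" />
+     <property name="max-field-length" value="10000" />
+     <property name="cache-size" value="1000" />
+     <property name="analyzer" value="org.apache.lucene.analysis.standard.StandardAnalyzer" />
+     <!-- example change: store extra data for rep:excerpt() highlighting -->
+     <property name="support-highlighting" value="true" />
+   </properties>
+ </query-handler></programlisting>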
+ </section>
+
+ <section>
+ <title>Global Search Index</title>
+
+ <section>
+ <title>Global Search Index Configuration</title>
+
+ <para>The global search index is configured in the above-mentioned
+ configuration file
+ (<filename>portal/WEB-INF/conf/jcr/repository-configuration.xml</filename>)
+ in the tag "query-handler".</para>
+
+ <programlisting><query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex"></programlisting>
+
+ <para>When using Lucene you should always use the same analyzer
+ for indexing and for querying, otherwise the results are unpredictable.
+ You don't have to worry about this; eXo JCR does this for you
+ automatically. If you don't like the StandardAnalyzer configured by
+ default, just replace it with your own.</para>
+
+ <para>If you don't have a handy QueryHandler, you will learn how to create a
+ customized handler in 5 minutes.</para>
+ </section>
+
+ <section>
+ <title>Customized Search Indexes and Analyzers</title>
+
+ <para>By default eXo JCR uses the Lucene StandardAnalyzer to index
+ contents. This analyzer uses some standard filters in the method that
+ analyzes the content:<programlisting>public TokenStream tokenStream(String fieldName, Reader reader) {
+ StandardTokenizer tokenStream = new StandardTokenizer(reader, replaceInvalidAcronym);
+ tokenStream.setMaxTokenLength(maxTokenLength);
+ TokenStream result = new StandardFilter(tokenStream);
+ result = new LowerCaseFilter(result);
+ result = new StopFilter(result, stopSet);
+ return result;
+ }</programlisting><itemizedlist>
+ <listitem>
+ <para>The first one (StandardFilter) removes 's (as 's in
+ "Peter's") from the end of words and removes dots from
+ acronyms.</para>
+ </listitem>
+
+ <listitem>
+ <para>The second one (LowerCaseFilter) normalizes token text to
+ lower case.</para>
+ </listitem>
+
+ <listitem>
+ <para>The last one (StopFilter) removes stop words from a token
+ stream. The stop set is defined in the analyzer.</para>
+ </listitem>
+ </itemizedlist></para>
+
+ <para>For specific cases, you may wish to use additional filters like
+ <phrase>ISOLatin1AccentFilter</phrase>, which replaces accented
+ characters in the ISO Latin 1 character set (ISO-8859-1) by their
+ unaccented equivalents.</para>
+
+ <para>In order to use a different filter, you have to create a new
+ analyzer, and a new search index to use the analyzer. You put it in a
+ jar, which is deployed with your application.</para>
+
+ <section>
+ <title>Create the filter</title>
+
+ <para>The ISOLatin1AccentFilter is not present in the current Lucene
+ version used by eXo. You can use the attached file. You can also
+ create your own filter; the relevant method is<programlisting>public final Token next(final Token reusableToken) throws java.io.IOException</programlisting>which
+ defines how characters are read and used by the filter.</para>
+ </section>
+
+ <section>
+ <title>Create the analyzer</title>
+
+ <para>The analyzer has to extend
+ org.apache.lucene.analysis.standard.StandardAnalyzer and override the
+ method<programlisting>public TokenStream tokenStream(String fieldName, Reader reader)</programlisting>to
+ add your own filters. You can have a look at the example analyzer
+ attached to this article.</para>
+ </section>
+
+ <section>
+ <title>Create the search index</title>
+
+ <para>Now that we have the analyzer, we have to write the SearchIndex
+ that will use it. You have to extend
+ org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex. You
+ have to write the constructor, to set the right analyzer, and the
+ method<programlisting>public Analyzer getAnalyzer() {
+ return MyAnalyzer;
+ }</programlisting>to return your analyzer. You can see the attached
+ SearchIndex.</para>
+
+ <note>
+ <para>Since version 1.12 the analyzer can be set directly in the
+ configuration, so creating a new SearchIndex only to use a new
+ analyzer is redundant.</para>
+ </note>
+ </section>
+
+ <section>
+ <title>Configure your application to use your SearchIndex</title>
+
+ <para>In
+ <filename>portal/WEB-INF/conf/jcr/repository-configuration.xml</filename>,
+ you have to replace each<programlisting><query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex"></programlisting>by
+ your own class<programlisting><query-handler class="mypackage.indexation.MySearchIndex"></programlisting></para>
+ </section>
+
+ <section>
+ <title>Configure your application to use your Analyzer</title>
+
+ <para>In
+ <filename>portal/WEB-INF/conf/jcr/repository-configuration.xml</filename>,
+ you have to add parameter "analyzer" to each query-handler
+ config:<programlisting><query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
+ <properties>
+ ...
+ <property name="analyzer" value="org.exoplatform.services.jcr.impl.core.MyAnalyzer"/>
+ ...
+ </properties>
+</query-handler></programlisting></para>
+
+ <para>When you start eXo, your SearchIndex will start to index
+ contents with the specified filters.</para>
+ </section>
+ </section>
+ </section>
+
+ <section>
+ <title>Index Adjustments</title>
+
+ <section>
+ <title>IndexingConfiguration</title>
+
+ <para>Starting with version 1.9, the default search index implementation
+ in JCR allows you to control which properties of a node are indexed. You
+ can also define different analyzers for different nodes.</para>
+
+ <para>The configuration parameter is called indexing-configuration-path
+ and is not set by default. This means all properties of a node are
+ indexed.</para>
+
+ <para>If you wish to configure the indexing behavior you need to add a
+ parameter to the query-handler element in your configuration
+ file.</para>
+
+ <programlisting><property name="indexing-configuration-path" value="/indexing_configuration.xml"/></programlisting>
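+
+ <para>In context, and assuming the same property syntax as in the
+ query-handler listing of the XML Configuration section, the parameter
+ would sit inside the properties element as sketched below. The index-dir
+ value and the file name /indexing_configuration.xml are only examples
+ taken from the listings in this chapter.</para>
+
+ <programlisting><query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
+   <properties>
+     <property name="index-dir" value="${java.io.tmpdir}/temp/index/db1/ws" />
+     <!-- points the search index at the indexing configuration file -->
+     <property name="indexing-configuration-path" value="/indexing_configuration.xml" />
+   </properties>
+ </query-handler></programlisting>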
+ </section>
+
+ <section>
+ <title>Index rules</title>
+
+ <section>
+ <title>Node Scope Limit</title>
+
+ <para>To optimize the index size you can limit the node scope so that
+ <phrase>only certain properties</phrase> of a node type are
+ indexed.</para>
+
+ <para>With the configuration below only properties named Text are
+ indexed for nodes of type nt:unstructured. This configuration also
+ applies to all nodes whose type extends from nt:unstructured.</para>
+
+ <programlisting><?xml version="1.0"?>
+<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
+<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
+ <index-rule nodeType="nt:unstructured">
+ <property>Text</property>
+ </index-rule>
+</configuration></programlisting>
+
+ <para>Please note that you have to declare the <phrase>namespace
+ prefixes</phrase> in the configuration element that you are using
+ throughout the XML file!</para>
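+
+ <para>As a sketch of both points, the rule below lists more than one
+ property and uses a second namespace prefix. The property names Text and
+ jcr:title are illustrative assumptions, not names required by eXo JCR;
+ what matters is that every prefix used in the file (here nt and jcr) is
+ declared on the configuration element.</para>
+
+ <programlisting><?xml version="1.0"?>
+ <!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
+ <configuration xmlns:jcr="http://www.jcp.org/jcr/1.0"
+                xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
+   <index-rule nodeType="nt:unstructured">
+     <!-- only these two properties are indexed for matching nodes -->
+     <property>Text</property>
+     <property>jcr:title</property>
+   </index-rule>
+ </configuration></programlisting>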
+ </section>
+
+ <section>
+ <title>Index Boost Value</title>
+
+ <para>It is also possible to configure a <phrase>boost value</phrase>
+ for the nodes that match the index rule. The default boost value is
+ 1.0. Higher boost values (a reasonable range is 1.0 - 5.0) will yield
+ a higher score value and appear as more relevant.</para>
+
+ <programlisting><?xml version="1.0"?>
+<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
+<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
+ <index-rule nodeType="nt:unstructured"
+ boost="2.0">
+ <property>Text</property>
+ </index-rule>
+</configuration></programlisting>
+
+ <para>If you do not wish to boost the complete node but only certain
+ properties you can also provide a boost value for the listed
+ properties:<programlisting><?xml version="1.0"?>
+<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
+<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
+ <index-rule nodeType="nt:unstructured">
+ <property boost="3.0">Title</property>
+ <property boost="1.5">Text</property>
+ </index-rule>
+</configuration></programlisting></para>
+ </section>
+
+ <section>
+ <title>Conditional Index Rules</title>
+
+ <para>You may also add a <phrase>condition</phrase> to the index rule
+ and have multiple rules with the same nodeType. The first index rule
+ that matches will apply and all remaining ones are
+ ignored:<programlisting><?xml version="1.0"?>
+<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
+<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
+ <index-rule nodeType="nt:unstructured"
+ boost="2.0"
+ condition="@priority = 'high'">
+ <property>Text</property>
+ </index-rule>
+ <index-rule nodeType="nt:unstructured">
+ <property>Text</property>
+ </index-rule>
+</configuration></programlisting></para>
+
+ <para>In the above example, the first rule only applies if the
+ nt:unstructured node has a priority property with the value 'high'. The
+ condition syntax supports only the equals operator and a string
+ literal.</para>
+
+ <para>You may also reference properties in the condition that are not
+ on the current node:<programlisting><?xml version="1.0"?>
+<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
+<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
+ <index-rule nodeType="nt:unstructured"
+ boost="2.0"
+ condition="ancestor::*/@priority = 'high'">
+ <property>Text</property>
+ </index-rule>
+ <index-rule nodeType="nt:unstructured"
+ boost="0.5"
+ condition="parent::foo/@priority = 'low'">
+ <property>Text</property>
+ </index-rule>
+ <index-rule nodeType="nt:unstructured"
+ boost="1.5"
+ condition="bar/@priority = 'medium'">
+ <property>Text</property>
+ </index-rule>
+ <index-rule nodeType="nt:unstructured">
+ <property>Text</property>
+ </index-rule>
+</configuration></programlisting></para>
+
+ <para>The indexing configuration also allows you to specify the type
+ of a node in the condition. Note, however, that the type match must be
+ exact: subtypes of the specified node type are not
+ considered.</para>
+
+ <programlisting><?xml version="1.0"?>
+<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
+<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
+ <index-rule nodeType="nt:unstructured"
+ boost="2.0"
+ condition="element(*, nt:unstructured)/@priority = 'high'">
+ <property>Text</property>
+ </index-rule>
+</configuration></programlisting>
+ </section>
+
+ <section>
+ <title>Exclusion from the Node Scope Index</title>
+
+ <para>By default, all configured properties of type STRING are
+ fulltext indexed and included in the node scope index. A node scope
+ search normally considers every node in the index. That is, the
+ predicate jcr:contains(., 'foo') matches all nodes that have a string
+ property containing the word 'foo'. You can explicitly exclude a
+ property from the node scope index:<programlisting><?xml version="1.0"?>
+<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
+<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
+ <index-rule nodeType="nt:unstructured">
+ <property nodeScopeIndex="false">Text</property>
+ </index-rule>
+</configuration></programlisting></para>
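+
+ <para>The exclusion can be combined with regular properties in the
+ same rule. The sketch below assumes nodes that carry both a Title and
+ a Text property (names borrowed from the earlier examples): Title
+ stays in the node scope index, while Text is excluded from it. Under
+ that assumption, a node scope search would match on Title only,
+ whereas Text should remain reachable through a search restricted to
+ that property.</para>
+
+ <programlisting><?xml version="1.0"?>
+<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
+<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
+  <index-rule nodeType="nt:unstructured">
+    <!-- indexed and part of the node scope index (default behaviour) -->
+    <property>Title</property>
+    <!-- indexed, but left out of the node scope index -->
+    <property nodeScopeIndex="false">Text</property>
+  </index-rule>
+</configuration></programlisting>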
+ </section>
+ </section>
+
+ <section>
+ <title>Index Aggregates</title>
+
+ <para>Sometimes it is useful to include the contents of descendant nodes
+ in a single node, so that content scattered across multiple nodes is
+ easier to search.</para>
+
+ <para>JCR allows you to define index aggregates based on relative path
+ patterns and primary node types.</para>
+
+ <para>The following example creates an index aggregate on nt:file that
+ includes the content of the jcr:content node:<programlisting><?xml version="1.0"?>
+<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
+<configuration xmlns:jcr="http://www.jcp.org/jcr/1.0"
+ xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
+ <aggregate primaryType="nt:file">
+ <include>jcr:content</include>
+ </aggregate>
+</configuration></programlisting></para>
+
+ <para>You can also restrict the included nodes to a certain
+ type:<programlisting><?xml version="1.0"?>
+<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
+<configuration xmlns:jcr="http://www.jcp.org/jcr/1.0"
+ xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
+ <aggregate primaryType="nt:file">
+ <include primaryType="nt:resource">jcr:content</include>
+ </aggregate>
+</configuration></programlisting></para>
+
+ <para>You may also use the * to match all child nodes:<programlisting><?xml version="1.0"?>
+<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
+<configuration xmlns:jcr="http://www.jcp.org/jcr/1.0"
+ xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
+ <aggregate primaryType="nt:file">
+ <include primaryType="nt:resource">*</include>
+ </aggregate>
+</configuration></programlisting></para>
+
+ <para>If you wish to include nodes up to a certain depth below the
+ current node, you can add multiple include elements. For example, the
+ nt:file node may contain a complete XML document under
+ jcr:content:<programlisting><?xml version="1.0"?>
+<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
+<configuration xmlns:jcr="http://www.jcp.org/jcr/1.0"
+ xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
+ <aggregate primaryType="nt:file">
+ <include>*</include>
+ <include>*/*</include>
+ <include>*/*/*</include>
+ </aggregate>
+</configuration></programlisting></para>
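+
+ <para>When the structure below the node is known, the wildcard levels
+ can presumably be narrowed to explicit relative paths. The sketch
+ below rests on the assumption, not stated above, that a path pattern
+ may mix node names and the * wildcard. It aggregates only jcr:content
+ and its direct child nodes.</para>
+
+ <programlisting><?xml version="1.0"?>
+<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
+<configuration xmlns:jcr="http://www.jcp.org/jcr/1.0"
+               xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
+  <aggregate primaryType="nt:file">
+    <include>jcr:content</include>
+    <!-- assumed: a name followed by a wildcard level is a valid pattern -->
+    <include>jcr:content/*</include>
+  </aggregate>
+</configuration></programlisting>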
+ </section>
+
+ <section>
+ <title>Property-Level Analyzers</title>
+
+ <section>
+ <title>Example</title>
+
+ <para>In this configuration section you define how a property has to
+ be analyzed. If there is an analyzer configuration for a property,
+ this analyzer is used for indexing and searching of this property. For
+ example:<programlisting><?xml version="1.0"?>
+<!DOCTYPE configuration SYSTEM "http://www.exoplatform.org/dtd/indexing-configuration-1.0.dtd">
+<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
+ <analyzers>
+ <analyzer class="org.apache.lucene.analysis.KeywordAnalyzer">
+ <property>mytext</property>
+ </analyzer>
+ <analyzer class="org.apache.lucene.analysis.WhitespaceAnalyzer">
+ <property>mytext2</property>
+ </analyzer>
+ </analyzers>
+</configuration></programlisting></para>
+
+ <para>The configuration above means that the property "mytext" for the
+ entire workspace is indexed (and searched) with the Lucene
+ KeywordAnalyzer, and property "mytext2" with the WhitespaceAnalyzer.
+ Using different analyzers for different languages is particularly
+ useful.</para>
+
+ <para>The WhitespaceAnalyzer splits the property value into tokens at
+ whitespace, whereas the KeywordAnalyzer treats the entire value as a
+ single token.</para>
+ </section>
+
+ <section>
+ <title>Characteristics of Node Scope Searches</title>
+
+ <para>When using analyzers, you may encounter unexpected behavior
+ when searching within a property compared to searching within the node
+ scope. The reason is that the node scope always uses the global
+ analyzer.</para>
+
+ <para>Suppose that the property "mytext" contains the text
+ "testing my analyzers" and that you have not configured any analyzer
+ for the property "mytext" (and have not changed the default analyzer
+ in SearchIndex).</para>
+
+ <para>If your query is for example:<programlisting>xpath = "//*[jcr:contains(mytext,'analyzer')]"</programlisting></para>
+
+ <para>With the default analyzers, this XPath query does not return a
+ hit for the node with the property above.</para>
+
+ <para>A search on the node scope<programlisting>xpath = "//*[jcr:contains(.,'analyzer')]"</programlisting>does
+ not give a hit either. Keep in mind that you can only set specific
+ analyzers on a node property, and that node scope indexing and
+ analysis is always done with the globally defined analyzer in the
+ SearchIndex element.</para>
+
+ <para>Now, if you change the analyzer used to index the "mytext"
+ property above to<programlisting><analyzer class="org.apache.lucene.analysis.de.GermanAnalyzer">
+ <property>mytext</property>
+</analyzer></programlisting>and you do the same search again, then
+ for<programlisting>xpath = "//*[jcr:contains(mytext,'analyzer')]"</programlisting>you
+ would get a hit because of the word stemming (analyzers -
+ analyzer).</para>
+
+ <para>The other search,<programlisting>xpath = "//*[jcr:contains(.,'analyzer')]"</programlisting>still
+ would not give a result, since the node scope is indexed with the
+ global analyzer, which in this case does not take into account any
+ word stemming.</para>
+
+ <para>In conclusion, be aware that when analyzers are configured for
+ specific properties, a search text may produce a hit when searching
+ within a property but no hit when searching the node scope of the
+ same node.</para>
+
+ <note>
+ <para>Both index rules and index aggregates influence how content is
+ indexed in JCR. If you change the configuration, the existing content
+ is not automatically re-indexed according to the new rules. You
+ therefore have to re-index the content manually whenever you change
+ the configuration.</para>
+ </note>
+ </section>
+ </section>
+
+ <section>
+ <title>Advanced features</title>
+
+ <para>eXo JCR supports some advanced features that are not specified
+ in JSR 170:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para>Get a text excerpt with <emphasis role="bold">highlighted
+ words</emphasis> that match the query: <ulink
+ url="http://wiki.exoplatform.com/xwiki/bin/view/JCR/Searching+Repository+Content">ExcerptProvider</ulink>.</para>
+ </listitem>
+
+ <listitem>
+ <para>Search for a term and its <emphasis
+ role="bold">synonyms</emphasis>: <ulink
+ url="http://wiki.exoplatform.com/xwiki/bin/view/JCR/Searching+Repository+Content">SynonymSearch</ulink></para>
+ </listitem>
+
+ <listitem>
+ <para>Search for <emphasis role="bold">similar</emphasis> nodes:
+ <ulink
+ url="http://wiki.exoplatform.com/xwiki/bin/view/JCR/Searching+Repository+Content">SimilaritySearch</ulink></para>
+ </listitem>
+
+ <listitem>
+ <para>Check <emphasis role="bold">spelling</emphasis> of a fulltext
+ query statement: <ulink
+ url="http://wiki.exoplatform.com/xwiki/bin/view/JCR/Searching+Repository+Content">SpellChecker</ulink></para>
+ </listitem>
+
+ <listitem>
+ <para>Define index <emphasis role="bold">aggregates and
+ rules</emphasis>: IndexingConfiguration (described in this chapter)</para>
+ </listitem>
+ </itemizedlist>
+ </section>
+ </section>
+</chapter>