From do-not-reply at jboss.org Wed May 8 01:35:05 2013
Content-Type: multipart/mixed; boundary="===============2260912015684865299=="
MIME-Version: 1.0
From: do-not-reply at jboss.org
To: gatein-commits at lists.jboss.org
Subject: [gatein-commits] gatein SVN: r9273 - in
epp/docs/JPP/trunk/Reference_Guide/en-US: modules and 1 other directory.
Date: Wed, 08 May 2013 01:35:05 -0400
Message-ID: <201305080535.r485Z5f4004398@svn01.web.mwc.hst.phx2.redhat.com>
--===============2260912015684865299==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Author: jaredmorgs
Date: 2013-05-08 01:35:04 -0400 (Wed, 08 May 2013)
New Revision: 9273
Modified:
epp/docs/JPP/trunk/Reference_Guide/en-US/Reference_Guide.xml
epp/docs/JPP/trunk/Reference_Guide/en-US/modules/eXoJCR.xml
Log:
eXo JCR portion commented out of the guide
Modified: epp/docs/JPP/trunk/Reference_Guide/en-US/Reference_Guide.xml
===================================================================
--- epp/docs/JPP/trunk/Reference_Guide/en-US/Reference_Guide.xml 2013-05-08 04:22:10 UTC (rev 9272)
+++ epp/docs/JPP/trunk/Reference_Guide/en-US/Reference_Guide.xml 2013-05-08 05:35:04 UTC (rev 9273)
@@ -8,14 +8,11 @@
-
- =
-
+ Web Services for Remote Portlets (WSRP)
-
-
+
Modified: epp/docs/JPP/trunk/Reference_Guide/en-US/modules/eXoJCR.xml
===================================================================
--- epp/docs/JPP/trunk/Reference_Guide/en-US/modules/eXoJCR.xml 2013-05-08 04:22:10 UTC (rev 9272)
+++ epp/docs/JPP/trunk/Reference_Guide/en-US/modules/eXoJCR.xml 2013-05-08 05:35:04 UTC (rev 9273)
@@ -5,29 +5,5629 @@
]>
The Java Content Repository (JCR)
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+
+ Introduction
+
+ eXo JCR usage
+
+ JBoss Portal Platform uses the JCR API to store its own internal data. Using the JCR to store application data is not supported.
+
+
+ The information below is intended to help users understand low-level details of how JBoss Portal Platform works and how it can be fine-tuned.
+
+
+
+ The term JCR refers to the Java Content Repository. The JCR is the data store of JBoss Portal Platform. All content is stored and managed via the JCR.
+
+
+ The eXo JCR included with JBoss Portal Platform &VY; is a JSR-170 compliant implementation of the JCR 1.0 specification. The JCR provides versioning, textual search, access control, and content event monitoring, and is used to store text and binary data for the portal's internal use. The back-end storage of the JCR is configurable and can be a file system or a database.
+
+
+ Concepts
+
+
+ Repository
+
+
+ A repository is a form of data storage. A 'repository' differs from a 'database' in the nature of the information it contains. While a database holds hard data in rigid tables, a repository may access the data in a database by using less rigid metadata. In this sense a repository operates as an 'interpreter' between the database(s) and the user.
+
+
+
+ The data model for the interface (the repository) is rarely the same as the data model used by the repository's underlying storage subsystems (such as a database). However, the repository is able to make persistent data changes in the storage subsystem.
+
+
+
+
+
+ Workspace
+
+
+ The eXo JCR uses 'workspaces' as the main data abstraction in its data model. Content is stored in a workspace as a hierarchy of items, and each workspace has its own hierarchy of items.
+
+
+ Repositories access one or more workspaces. Persistent JCR workspaces consist of a directed acyclic graph of items where the edges represent the parent-child relation.
+
+
+
+
+ Items
+
+
+ An item is either a node or a property. Properties contain the data (either simple values or binary data). The nodes of a workspace give it its structure while the properties hold the data itself.
+
+
+
+ Nodes
+
+
+ Nodes are identified using accepted namespacing conventions. Changed nodes may be versioned through an associated version graph to preserve data integrity.
+
+
+ Nodes can have various properties or child nodes associated with them.
+
+
+
+
+ Properties
+
+
+ Properties hold data as values of predefined types, such as: String, Binary, Long, Boolean, Double, Date, Reference and Path.
+
+
+
+
+
+
+
+ The Data Model
+
+
+ The core of any Content Repository is the data model. The data model defines the 'data elements' (fields, columns, attributes, etc.) that are stored in the CR and the relationships between these elements.
+
+
+ Data elements can be singular pieces of information (the value 3.14, for example), or compound values ('pi' = 3.14). A data model uses concepts like 'nodes', 'arrays' and 'links' to define relationships between data elements.
+
+
+ The use and structure of these elements forms the content repository's 'data model'.
+
+
+
+
+ Data Abstraction
+
+
+ Data abstraction describes the separation between abstract and concrete properties of data stored in a repository. The concrete properties of the data refer to its implementation details.
+
+
+ The concrete properties of the data implementation may be changed without affecting the abstract properties of the data itself, which are read by the data client.
+
+
+ Consider the presentation of data in a list, graph or table. While the presentation (the implementation) may change, the data itself is unaffected, and readers to whom the data is presented can perform a mental abstraction to interpret it correctly, regardless of the implementation.
+
+
+
+
+
+
+
+ Multi-language Support
+
+ Whenever a relational database is used to store multilingual text data in the eXo Java Content Repository, the configuration must be adapted to support UTF-8 encoding. The dialect is detected automatically for certified databases. If detection fails, the dialect can still be set explicitly, as described below.
+
+
+ The following sections describe enabling UTF-8 support with various databases.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ NEEDINFO - FILE PATHS - The path needs to be updated with the equivalent path for JBoss Portal Platform instead of gatein, please see below para. New info required?
+
+ The configuration file to be modified for these changes is JPP_HOME/gatein/gatein.ear/portal.war/WEB-INF/conf/jcr/repository-configuration.xml.
+
+
+
+
+ The datasource jdbcjcr used in the following examples can be configured via the InitialContextInitializer component.
+
+
+
+
+
+ Oracle
+
+ To run a multilanguage JCR on an Oracle back end, a Unicode character set must be applied to the database. Other Oracle globalization parameters do not have any effect. The property to modify is NLS_CHARACTERSET.
+
+
+ The NLS_CHARACTERSET = AL32UTF8 entry has been successfully tested with many European and Asian languages.
+
+
+ Example of database configuration:
+
+ NLS_LANGUAGE AMERICAN
+NLS_TERRITORY AMERICA
+NLS_CURRENCY $
+NLS_ISO_CURRENCY AMERICA
+NLS_NUMERIC_CHARACTERS .,
+NLS_CHARACTERSET AL32UTF8
+NLS_CALENDAR GREGORIAN
+NLS_DATE_FORMAT DD-MON-RR
+NLS_DATE_LANGUAGE AMERICAN
+NLS_SORT BINARY
+NLS_TIME_FORMAT HH.MI.SSXFF AM
+NLS_TIMESTAMP_FORMAT DD-MON-RR HH.MI.SSXFF AM
+NLS_TIME_TZ_FORMAT HH.MI.SSXFF AM TZR
+NLS_TIMESTAMP_TZ_FORMAT DD-MON-RR HH.MI.SSXFF AM TZR
+NLS_DUAL_CURRENCY $
+NLS_COMP BINARY
+NLS_LENGTH_SEMANTICS BYTE
+NLS_NCHAR_CONV_EXCP FALSE
+NLS_NCHAR_CHARACTERSET AL16UTF16
+
+ Create the database with Unicode encoding and use the Oracle dialect for the Workspace Container:
+
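+ The workspace container entry referred to above is not shown in this text. A minimal sketch of what it might look like in repository-configuration.xml follows. The property names and the jdbcjcr datasource come from this guide; the workspace name, the container class name, and the buffer and swap values are assumptions for illustration and should be checked against the installed eXo JCR version.

```xml
<!-- Sketch only: workspace name, container class and sizes are assumed values. -->
<workspace name="collaboration">
  <container class="org.exoplatform.services.jcr.impl.storage.jdbc.JDBCWorkspaceDataContainer">
    <properties>
      <!-- Datasource bound through InitialContextInitializer -->
      <property name="source-name" value="jdbcjcr"/>
      <!-- Explicit dialect; "auto" would rely on autodetection instead -->
      <property name="dialect" value="oracle"/>
      <property name="multi-db" value="false"/>
      <property name="max-buffer-size" value="200k"/>
      <property name="swap-directory" value="target/temp/swap/ws"/>
    </properties>
  </container>
  <!-- other workspace children (initializer, cache, query-handler) omitted -->
</workspace>
```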
+
+
+
+ DB2
+
+ DB2 Universal Database (DB2 UDB) supports UTF-8 and UTF-16/UCS-2. When a Unicode database is created, CHAR, VARCHAR and LONG VARCHAR data are stored in UTF-8 form.
+
+
+ This enables JCR multi-lingual support.
+
+
+ Below is an example of creating a UTF-8 database using the db2 dialect for a workspace container with DB2 version 9 and higher:
+
+ DB2 CREATE DATABASE dbname USING CODESET UTF-8 TERRITORY US
+
+
+
+
+ For DB2 version 8.x support, change the "dialect" property to db2v8.
+
+
+
+
+ MySQL
+
+ Using JCR with a MySQL back end requires the special dialect MySQL-UTF8 for internationalization support.
+
+
+ The database default charset should be latin1 so as to use limited index space effectively (767 bytes for InnoDB).
+
+
+ If the database default charset is multibyte, a JCR database initialization error is encountered concerning index creation failure.
+
+
+ JCR can work with any single-byte database default charset, with UTF8 supported by the MySQL server. However, it has only been tested using the latin1 charset.
+
+
+ An example entry:
+
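+ The entry itself is missing from this text. As a hedged sketch, the relevant part of a workspace container might look as follows. The dialect value is spelled mysql-utf8, as in the dialect list later in this chapter; the surrounding container element is assumed and not shown.

```xml
<!-- Fragment of a workspace <container> element; other properties omitted. -->
<properties>
  <property name="source-name" value="jdbcjcr"/>
  <!-- UTF-8 aware MySQL dialect required for internationalization -->
  <property name="dialect" value="mysql-utf8"/>
</properties>
```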
+
+
+
+ PostgreSQL
+
+ Multilingual support can be enabled with a PostgreSQL back end in different ways:
+
+
+
+
+ Using the locale features of the operating system to provide locale-specific collation order, number formatting, translated messages, and other aspects.
+
+
+ UTF-8 is widely used on Linux distributions by default, so it can be useful in such cases.
+
+
+
+
+ Providing a number of different character sets defined in the PostgreSQL server, including multiple-byte character sets, to support storing text in any language, and providing character set translation between client and server.
+
+
+ Using a UTF-8 database charset is recommended, as it allows any-to-any conversions and makes this issue transparent for the JCR.
+
+
+
+
+ Example of a database with UTF-8 encoding using the PgSQL dialect for the Workspace Container:
+
+
+
+
+
+ Configuring Search
+
+ The search function in JCR can be configured to perform in specific ways. This section discusses configuring the search function to improve search performance and results.
+
+
+ Below is an example of the configuration file that governs search behavior. How searching operates in JCR, and how to customize searches, is covered elsewhere in this guide.
+
+
+ The JCR index configuration file is located at JPP_HOME/gatein/gatein.ear/portal.war/WEB-INF/conf/jcr/repository-configuration.xml.
+
+
+ A code example is included below, followed by a list of the configuration parameters.
+
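+ The original listing is not present in this text. The following is a minimal sketch of a query-handler entry, assuming the usual properties/property layout of repository-configuration.xml. The parameter names are taken from the table below; the index-dir location is a placeholder and the chosen values are illustrative only.

```xml
<query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
  <properties>
    <!-- Mandatory parameter; this location is a placeholder -->
    <property name="index-dir" value="/path/to/jcr/index/ws"/>
    <!-- Defaults written out explicitly for illustration -->
    <property name="max-field-length" value="10000"/>
    <property name="cache-size" value="1000"/>
    <!-- Non-default choices: skip document ordering, enable excerpts -->
    <property name="document-order" value="false"/>
    <property name="support-highlighting" value="true"/>
  </properties>
</query-handler>
```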
+
+
+ The table below outlines some of the configuration parameters available, their default settings, the version of eXo JCR in which they were implemented, and other useful information:
+
+
+ Configuration parameters
+
+ | Parameter | Default | Description | Implemented in Version |
+ |---|---|---|---|
+ | index-dir | none | The location of the index directory. This parameter is mandatory. It is called "indexDir" in versions prior to eXo JCR version 1.9. | 1.0 |
+ | use-compoundfile | true | Advises Lucene to use compound files for the index files. | 1.9 |
+ | min-merge-docs | 100 | The minimum number of nodes in an index until segments are merged. | 1.9 |
+ | volatile-idle-time | 3 | Idle time in seconds until the volatile index part is moved to a persistent index even though min-merge-docs is not reached. | 1.9 |
+ | max-merge-docs | Integer.MAX_VALUE | The maximum number of nodes in segments that will be merged. The default value changed to Integer.MAX_VALUE in eXo JCR version 1.9. | 1.9 |
+ | merge-factor | 10 | Determines how often segment indices are merged. | 1.9 |
+ | max-field-length | 10000 | The maximum number of words per property that are full-text indexed. | 1.9 |
+ | cache-size | 1000 | Size of the document number cache. This cache maps UUIDs to Lucene document numbers. | 1.9 |
+ | force-consistencycheck | false | Runs a consistency check on every start up. If false, a consistency check is only performed when the search index detects a prior forced shutdown. | 1.9 |
+ | auto-repair | true | Errors detected by a consistency check are automatically repaired. If false, errors are only written to the log. | 1.9 |
+ | query-class | QueryImpl | Class name that implements the javax.jcr.query.Query interface. This class must also extend the class org.exoplatform.services.jcr.impl.core.query.AbstractQueryImpl. | 1.9 |
+ | document-order | true | If true and the query does not contain an 'order by' clause, result nodes will be in document order. For better performance set to 'false' when queries return many nodes. | 1.9 |
+ | result-fetch-size | Integer.MAX_VALUE | The number of results fetched when a query is executed. | 1.9 |
+ | excerptprovider-class | DefaultXMLExcerpt | The name of the class that implements org.exoplatform.services.jcr.impl.core.query.lucene.ExcerptProvider. This should be used for the rep:excerpt() function in a query. | 1.9 |
+ | support-highlighting | false | If set to true, additional information is stored in the index to support highlighting using the rep:excerpt() function. | 1.9 |
+ | synonymprovider-class | none | The name of a class that implements org.exoplatform.services.jcr.impl.core.query.lucene.SynonymProvider. The default value is null. | 1.9 |
+ | synonymprovider-config-path | none | The path to the synonym provider configuration file. This path is interpreted relative to the path parameter. If there is a path element inside the SearchIndex element, then this path is interpreted relative to the root path of the path. Whether this parameter is mandatory depends on the synonym provider implementation. The default value is null. | 1.9 |
+ | indexing-configuration-path | none | The path to the indexing configuration file. | 1.9 |
+ | indexing-configuration-class | IndexingConfigurationImpl | The name of the class that implements org.exoplatform.services.jcr.impl.core.query.lucene.IndexingConfiguration. | 1.9 |
+ | enable-consistencycheck | false | If set to true, a consistency check is performed depending on the parameter force-consistencycheck. If set to false, no consistency check is performed on start up, even if a redo log had been applied. | 1.9 |
+ | spellchecker-class | none | The name of a class that implements org.exoplatform.services.jcr.impl.core.query.lucene.SpellChecker. | 1.9 |
+ | errorlog-size | 50 (KB) | The default size of the error log file in KB. | 1.9 |
+ | upgrade-index | false | Allows JCR to convert an existing index into the new format. This property can also be set via a system property. Indexes created prior to eXo JCR 1.12 will not run with eXo JCR 1.12; an automatic migration must be run. Start eXo JCR with -Dupgrade-index=true. The old index format is then converted to the new index format, and after the conversion the new format is used. On subsequent starts this option is no longer needed. The old index is replaced and a back conversion is not possible. It is recommended that a backup of the index be made before conversion. (Only for migrations from JCR 1.9 and later.) | 1.12 |
+ | analyzer | org.apache.lucene.analysis.standard.StandardAnalyzer | Class name of a Lucene analyzer to use for full-text indexing of text. | 1.12 |
+
+
+ Global Search Index
+
+ By default eXo JCR uses the Lucene StandardAnalyzer to index contents. This analyzer applies a set of standard filters in the method that analyzes the content.
+
+
+ Standard Analyzer Filters
+
+
+ Comment #1: The first filter (StandardFilter) removes possessive apostrophes ('s) from the end of words and removes periods (.) from acronyms.
+
+
+ Comment #2: The second filter (LowerCaseFilter) normalizes token text to lower case.
+
+
+ Comment #3: The last filter (StopFilter) removes stop words from a token stream. The stop set is defined in the analyzer.
+
+
+
+ The global search index is configured in the JPP_HOME/gatein/gatein.ear/portal.war/WEB-INF/conf/jcr/repository-configuration.xml configuration file within the "query-handler" tag.
+
+ <query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
+
+
+ The same analyzer should always be used for indexing and for querying in Lucene, otherwise results may be unpredictable. eXo JCR does this automatically. The StandardAnalyzer (configured by default) can, however, be replaced with another.
+
+
+ A customized QueryHandler can also be easily created.
+
+
+ Customized Search Indexes and Analyzers
+
+ By default eXo JCR uses the Lucene StandardAnalyzer to index contents. This analyzer applies a set of standard filters in the method that analyzes the content:
+
+
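+ public TokenStream tokenStream(String fieldName, Reader reader) {
+   // Split the input into tokens using Lucene's standard grammar.
+   StandardTokenizer tokenStream = new StandardTokenizer(reader, replaceInvalidAcronym);
+   tokenStream.setMaxTokenLength(maxTokenLength);
+   // Strip possessive 's and the dots in acronyms.
+   TokenStream result = new StandardFilter(tokenStream);
+   // Normalize token text to lower case.
+   result = new LowerCaseFilter(result);
+   // Drop the stop words listed in the analyzer's stop set.
+   result = new StopFilter(result, stopSet);
+   return result;
+ }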
+
+
+
+ The first one (StandardFilter) removes 's (as in "Peter's") from the end of words and removes dots from acronyms.
+
+
+
+
+ The second one (LowerCaseFilter) normalizes token text to lower case.
+
+
+
+
+ The last one (StopFilter) removes stop words from a token stream. The stop set is defined in the analyzer.
+
+
+
+
+ Additional filters can be used in specific cases. One example is the ISOLatin1AccentFilter filter, which replaces accented characters in the ISO Latin 1 character set (ISO-8859-1) with their unaccented equivalents.
+
+
+ The ISOLatin1AccentFilter is not present in the current Lucene version used by eXo.
+
+
+ In order to use a different filter, a new analyzer must be created, as well as a new search index that uses the analyzer. These are packaged into a jar file, which is then deployed with the application.
+
+
+ Create a new filter, analyzer and search index
+
+
+ Create a new filter with the method:
+
+ public final Token next(final Token reusableToken) throws java.io.IOException
+
+
+ This defines how characters are read and used by the filter.
+
+
+
+
+ Create the analyzer.
+
+
+ The analyzer must extend org.apache.lucene.analysis.standard.StandardAnalyzer and override the tokenStream method.
+
+
+ Override it with the following signature to apply the new filters:
+
+ public TokenStream tokenStream(String fieldName, Reader reader)
+
+
+
+
+ To create the new search index, extend org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex and write the constructor to set the correct analyzer.
+
+
+ Use the method below to return your analyzer:
+
+ public Analyzer getAnalyzer() {
+   // myAnalyzer: field holding the custom analyzer instance set in the constructor.
+   return myAnalyzer;
+ }
+
+
+
+
+
+ In eXo JCR version 1.12 (and later) the analyzer can be set directly in the configuration. For users with this version, creating a new SearchIndex for new analyzers is redundant.
+
+
+
+ To configure an application to use a new SearchIndex, replace the following code:
+
+ <query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
+
+
+
+ in JPP_HOME/gatein/gatein.ear/portal.war/WEB-INF/conf/jcr/repository-configuration.xml with the new class:
+
+ <query-handler class="mypackage.indexation.MySearchIndex">
+
+
+
+ To configure an application to use a new analyzer, add the analyzer parameter to each query-handler configuration in JPP_HOME/gatein/gatein.ear/portal.war/WEB-INF/conf/jcr/repository-configuration.xml:
+
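+ As an illustration of the analyzer parameter, a hedged sketch follows. The class mypackage.indexation.MyAnalyzer is hypothetical (named after the MySearchIndex example above), and the index-dir value is a placeholder.

```xml
<query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
  <properties>
    <property name="index-dir" value="/path/to/jcr/index/ws"/>
    <!-- Hypothetical custom analyzer; must be on the application classpath -->
    <property name="analyzer" value="mypackage.indexation.MyAnalyzer"/>
  </properties>
</query-handler>
```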
+
+
+ The new SearchIndex will start to index contents with the specified filters when the JCR is next started.
+
+
+
+ IndexingConfiguration
+
+ From version 1.9, the default search index implementation in JCR allows user control over which properties of a node are indexed. Different analyzers can also be set for different nodes.
+
+
+ The configuration parameter is called indexingConfiguration and is not set by default, which means that all properties of a node are indexed.
+
+
+ To configure the indexing behavior, add a parameter to the query-handler element in your configuration file.
+
+ <param name="indexing-configuration-path" value="/indexing_configuration.xml"/>
+
+
+
+ Node Scope Limit
+
+ The node scope can be limited so that only certain properties of a node type are indexed. This can optimize the index size.
+
+
+
+ With the configuration below only properties named Text are indexed for nt:unstructured node types. This configuration also applies to all nodes whose type extends from nt:unstructured.
+
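+ The listing is not present in this text. A minimal sketch of such an indexing configuration file is given here, assuming the Jackrabbit-style index-rule syntax that this chapter describes. The nt namespace URI is the standard JCR one.

```xml
<?xml version="1.0"?>
<!-- Sketch of an indexing configuration file, e.g. /indexing_configuration.xml -->
<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
  <!-- Index only the Text property of nt:unstructured nodes and their subtypes -->
  <index-rule nodeType="nt:unstructured">
    <property>Text</property>
  </index-rule>
</configuration>
```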
+
+
+ Namespace Prefixes
+
+ Every namespace prefix used in the XML file must be declared in the configuration element.
+
+
+
+ Indexing Boost Value
+
+ It is also possible to configure a boost value for the nodes that match the index rule. The default boost value is 1.0. Higher boost values (a reasonable range is 1.0 - 5.0) yield a higher score, so matching nodes appear as more relevant.
+
+
+
+
+ If you do not wish to boost the complete node, but only certain properties, you can also provide a boost value for the listed properties:
+
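+ Both forms of boosting can be sketched as follows, assuming the boost attribute is accepted on index-rule and on property elements as in the Jackrabbit-style syntax. The property names Title and Text and the boost values are illustrative.

```xml
<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
  <!-- Alternative 1: boost every nt:unstructured node that matches the rule -->
  <index-rule nodeType="nt:unstructured" boost="2.0">
    <property>Text</property>
  </index-rule>
</configuration>

<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
  <!-- Alternative 2: boost only selected properties -->
  <index-rule nodeType="nt:unstructured">
    <property boost="3.0">Title</property>
    <property boost="1.5">Text</property>
  </index-rule>
</configuration>
```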
+
+
+ Conditional Index Rules
+
+ You may also add a condition to the index rule and have multiple rules with the same nodeType. The first index rule that matches will apply and all remaining ones are ignored:
+
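+ A sketch of two rules on the same node type, where only the first carries a condition. The condition attribute syntax is assumed to follow the Jackrabbit-style configuration; the priority property and its value high match the explanation in this section.

```xml
<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
  <!-- Applies only to nodes whose priority property equals 'high' -->
  <index-rule nodeType="nt:unstructured" boost="2.0"
              condition="@priority = 'high'">
    <property>Text</property>
  </index-rule>
  <!-- Fallback for all other nt:unstructured nodes -->
  <index-rule nodeType="nt:unstructured">
    <property>Text</property>
  </index-rule>
</configuration>
```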
+
+
+
+ In the above example the first rule only applies if the nt:unstructured node has a priority property with the value high. The condition syntax only supports the equals operator and a string literal.
+
+
+ The condition may also reference properties that are not on the current node:
+
+
+
+ The indexing configuration allows the type of a node in the condition to be specified. Note, however, that the type match must be exact. It does not consider subtypes of the specified node type.
+
+
+
+ Exclusion from the Node Scope Index
+
+ All configured properties are full-text indexed by default (if they are of type STRING and included in the node scope index).
+
+
+
+ A node scope search normally covers all indexed properties of a node. That is, jcr:contains(., 'foo') returns all nodes that have a string property containing the word 'foo'.
+
+
+ Properties can be explicitly excluded from the node scope index with:
+
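+ A sketch of such an exclusion, assuming the nodeScopeIndex attribute of the Jackrabbit-style syntax. The property name Text is illustrative.

```xml
<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
  <index-rule nodeType="nt:unstructured">
    <!-- Still searchable by property name, but not via jcr:contains(., ...) -->
    <property nodeScopeIndex="false">Text</property>
  </index-rule>
</configuration>
```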
+
+
+ Index Aggregates
+
+ Sometimes it is useful to include the contents of descendant nodes in a single node, to make it easier to search content that is scattered across multiple nodes.
+
+
+
+ JCR allows the definition of index aggregates based on relative path patterns and primary node types.
+
+
+ The following example creates an index aggregate on nt:file that includes the content of the jcr:content node:
+
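+ A sketch of this aggregate, assuming the aggregate and include elements of the Jackrabbit-style syntax. Both namespace URIs are the standard JCR ones.

```xml
<configuration xmlns:jcr="http://www.jcp.org/jcr/1.0"
               xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
  <!-- Index the content of jcr:content as part of its parent nt:file node -->
  <aggregate primaryType="nt:file">
    <include>jcr:content</include>
  </aggregate>
</configuration>
```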
+
+
+ Included nodes can also be restricted to a certain type:
+
+
+
+ The * wild-card can be used to match all child nodes:
+
+
+
+ Nodes to a certain depth below the current node can be included by adding multiple include elements. The nt:file node may contain a complete XML document under jcr:content for example:
+
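+ The include variants described above (type restriction, wild-card, and depth) can be sketched as alternatives. The syntax is assumed to be the Jackrabbit-style one, and only one aggregate per primary type would normally be used.

```xml
<!-- Alternative 1: include jcr:content only when it is of type nt:resource -->
<aggregate primaryType="nt:file">
  <include primaryType="nt:resource">jcr:content</include>
</aggregate>

<!-- Alternative 2: include all direct child nodes -->
<aggregate primaryType="nt:file">
  <include>*</include>
</aggregate>

<!-- Alternative 3: include descendants down to three levels -->
<aggregate primaryType="nt:file">
  <include>*</include>
  <include>*/*</include>
  <include>*/*/*</include>
</aggregate>
```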
+
+
+ Property-Level Analyzers
+
+ How a property is analyzed can be defined in the following configuration section. If there is an analyzer configuration for a property, that analyzer is used both for indexing and for searching the property. For example:
+
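+ A sketch of such a section follows, assuming the analyzers element of the Jackrabbit-style syntax. The two analyzer classes are the Lucene classes named in this section, and the property names mytext and mytext2 come from the explanation.

```xml
<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
  <analyzers>
    <!-- Treats the whole value of mytext as a single token -->
    <analyzer class="org.apache.lucene.analysis.KeywordAnalyzer">
      <property>mytext</property>
    </analyzer>
    <!-- Splits the value of mytext2 on whitespace -->
    <analyzer class="org.apache.lucene.analysis.WhitespaceAnalyzer">
      <property>mytext2</property>
    </analyzer>
  </analyzers>
</configuration>
```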
+
+
+
+ The configuration above sets the Lucene KeywordAnalyzer to index and search the property "mytext" across the entire workspace, while the "mytext2" property is indexed and searched with the WhitespaceAnalyzer.
+
+
+ The WhitespaceAnalyzer tokenizes a property, whereas the KeywordAnalyzer takes the property as a whole.
+
+
+ Using different analyzers for different languages can be particularly useful.
+
+
+ Characteristics of Node Scope Searches
+
+ Unexpected behavior may be encountered when using analyzers to search within a property compared to searching within a node scope. This is because the node scope always uses the global analyzer.
+
+
+
+ For example: the property "mytext" contains the text "testing my analyzers", but no analyzers have been configured for this property (and the default analyzer in SearchIndex has not been changed).
+
+
+ If the query is:
+
+ xpath = "//*[jcr:contains(mytext,'analyzer')]"
+
+
+ This XPath query does not return the node with the property above, because the default analyzer does not stem "analyzers" to "analyzer".
+
+
+ Also, if a search is done on the node scope as follows:
+
+ xpath = "//*[jcr:contains(.,'analyzer')]"
+
+
+ No result will be returned.
+
+
+ Specific analyzers can only be set on a node property; node scope indexing and analyzing is always done with the globally defined analyzer in the SearchIndex element.
+
+
+ If the analyzer used to index the "mytext" property above is changed to:
+
+ <analyzer class="org.apache.lucene.analysis.de.GermanAnalyzer">
+ <property>mytext</property>
+ </analyzer>
+
+
+ The search below would return a result because of the word stemming (analyzers - analyzer).
+
+ xpath = "//*[jcr:contains(mytext,'analyzer')]"
+
+
+ The second search in the example:
+
+ xpath = "//*[jcr:contains(.,'analyzer')]"
+
+
+ Would still not give a result, since the node scope is indexed with the global analyzer, which in this case does not take into account any word stemming.
+
+
+ Be aware that when using analyzers for specific properties, a result may be found in a property for certain search text, but the same search text in the node scope of the property may not find a result.
+
+
+
+ Both index rules and index aggregates influence how content is indexed in JCR. If the configuration is changed, the existing content is not automatically re-indexed according to the new rules.
+
+
+ Content must be manually re-indexed when the configuration is changed.
+
+
+
+
+ Advanced features
+
+ eXo JCR supports some advanced features, which are not specified in JSR 170:
+
+
+
+
+ Get a text excerpt with highlighted words that match the query.
+
+
+
+
+ Search a term and its synonyms.
+
+
+
+
+ Search similar nodes.
+
+
+
+
+ Check spelling of a full text query statement.
+
+
+
+
+ Define index aggregates and rules: IndexingConfiguration.
+
+
+
+
+
+
+ Configuring the JDBC Data Container
+
+ Introduction
+
+ The eXo JCR persistent data container can work in two configuration modes:
+
+
+
+
+ Multi-database: One database for each workspace (used in standalone eXo JCR service mode)
+
+
+
+
+ Single-database: All workspaces persisted in one database (used in embedded eXo JCR service mode, e.g. in eXo portal)
+
+
+
+
+ The data container uses a JDBC driver to communicate with the actual database software, so any JDBC-enabled data storage can be used with the eXo JCR implementation.
+
+
+ Currently the data container is tested with the following RDBMS:
+
+
+ Supported databases
+
+ | Database | Driver Version |
+ |---|---|
+ | IBM DB2 9.7 (FP5) | IBM DB2 JDBC Universal Driver Architecture 4.13.80 |
+ | Oracle 11g R1 (11.1.0.7.0) | Oracle JDBC Driver 11.1.0.7 |
+ | Oracle 11g R1 RAC (11.1.0.7.0) | Oracle JDBC Driver 11.1.0.7 |
+ | Oracle 11g R2 (11.2.0.3.0) | Oracle JDBC Driver v11.2.0.3.0 |
+ | Oracle 11g R2 RAC (11.2.0.3.0) | Oracle JDBC Driver v11.2.0.3.0 |
+ | MySQL 5.1 | MySQL Connector/J 5.1.21 |
+ | MySQL 5.5 | MySQL Connector/J 5.1.21 |
+ | Microsoft SQL Server 2008 | Microsoft SQL Server JDBC Driver 3.0.1301.101, Microsoft SQL Server JDBC Driver 4.0.2206.100 |
+ | Microsoft SQL Server 2008 R2 | Microsoft SQL Server JDBC Driver 3.0.1301.101, Microsoft SQL Server JDBC Driver 4.0.2206.100 |
+ | PostgreSQL 8.4.8 | JDBC4 Postgresql Driver, Version 8.4-703 |
+ | PostgreSQL 9.1.0 | JDBC4 Postgresql Driver, Version 9.1-903 |
+ | Sybase ASE 15.7 | Sybase jConnect JDBC driver v7 |
+
+
+
+
+
+ Isolation Levels
+
+ The JCR requires at least the READ_COMMITTED isolation level; other RDBMS configurations can cause side effects and issues. Make sure the proper isolation level is configured on the database server side.
+
+
+
+
+ One more mandatory JCR requirement for underlying databases is a case-sensitive collation. Microsoft SQL Server 2005 and 2008 customers must configure their server with a collation that suits their needs and requirements, but it must be case sensitive. For more information, refer to the Microsoft SQL Server documentation page "Selecting a SQL Server Collation".
+
+
+
+
+ Be aware that JCR does not support the MyISAM storage engine for the MySQL relational database management system.
+
+
+
+ Each database supports the ANSI SQL standards but also has its own specifics. Therefore each database has its own configuration setting in the eXo JCR, in the form of a database dialect parameter. More detailed configuration of the database can be set by editing the metadata SQL-script files.
+
+ NEEDINFO - FILE PATHS - The path needs to be updated with the equivalent path for JBoss Portal Platform instead of gatein, please see below para. New info required?
+
+ You can find the SQL scripts in the conf/storage/ directory of the JPP_HOME/modules/org/gatein/lib/main/exo.jcr.component.core-&JCR_VERSION;.jar file.
+
+
+ The following tables show the correspondence between the scrip=
ts and databases:
+
+
+
+ If a non-ANSI node name is used, you must use a database with MultiLanguage support. Some JDBC drivers need additional parameters for establishing a Unicode-friendly connection. For example, with MySQL it is necessary to add an additional parameter for the JDBC driver at the end of the JDBC URL:
+
+
+ Example Parameter
+ jdbc:mysql://exoua.dnsalias.net/portal?characterEncoding=utf8
+
+
+ There are preconfigured configuration files for HSQLDB. Look for these files in the /conf/portal and /conf/standalone folders of the exo.jcr.component.core-&JCR_VERSION;.jar file or in the source distribution of the eXo JCR implementation.
+
+
+ By default, the configuration files are located in the service jars: /conf/portal/configuration.xml (eXo services including the JCR Repository Service) and exo-jcr-config.xml (repositories configuration). In JBoss Portal Platform, the JCR is configured in the portal web application: portal/WEB-INF/conf/jcr/jcr-configuration.xml (JCR Repository Service and related services) and repository-configuration.xml (repositories configuration).
+
+
+ Read more about .
+
+
+
+ Multi-database Configuration
+
+ In a multi-database configuration, each workspace in a repository must be configured separately. Databases may reside on remote servers as required.
+
+
+
+
+
+ Configure the data containers in the org.exoplatform.services.naming.InitialContextInitializer service. This is the JNDI context initializer, which registers (binds) naming resources (DataSources) for data containers.
+
+
+ For example (two data containers: jdbcjcr - local HSQLDB, jdbcjcr1 - remote MySQL):
+
+
+
+
+
+
+ Configure the database connection parameters:
+
+
+
+
+ driverClassName, e.g. "org.hsqldb.jdbcDriver", "com.mysql.jdbc.Driver", "org.postgresql.Driver"
+
+
+
+
+ url, e.g. "jdbc:hsqldb:file:target/temp/data/portal", "jdbc:mysql://exoua.dnsalias.net/jcr"
+
+
+
+
+ username, e.g. "sa", "exoadmin"
+
+
+
+
+ password, e.g. "", "exo12321"
+
+
+
+
+
+
+ Optionally, set connection pool configuration parameters (org.apache.commons.dbcp.BasicDataSourceFactory):
+
+
+
+
+ maxActive, e.g. 50
+
+
+
+
+ maxIdle, e.g. 5
+
+
+
+
+ initialSize, e.g. 5
+
+
+
+
+ and others, according to the Apache DBCP configuration
+
+
+
+
+
+
+ Configure the repository service. Each workspace will =
be configured for its own data container.
+
+
+ For example (two workspaces ws =
- jdbcjcr, ws1 - jdbcjcr1):
+
+
+
+
+
+
+ source-name: The name of a javax.sql.DataSource configured in the InitialContextInitializer component (named sourceName prior to JCR 1.9);
+
+
+
+
+ dialect: A database dialect, one of hsqldb, mysql, mysql-utf8, pgsql, oracle, oracle-oci, mssql, sybase, derby, db2, db2v8, or auto for dialect autodetection;
+
+
+
+
+ multi-db: Set this parameter to "true" to enable the multi-database container;
+
+
+
+
+ max-buffer-size: A threshold (in bytes) above which the content of a javax.jcr.Value is swapped to a file in temporary storage, for example when swapping pending changes.
+
+
+
+
+ swap-directory: A path =
in the file system used to swap the pending changes.
+
+
+
+
+
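+ The parameters above can be assembled into workspace container entries, as in the following Python sketch. It is illustrative only: the helper names workspace_element and render are hypothetical, the max-buffer-size and swap-directory values are borrowed from the Physical Data Storage Configuration example later in this guide, and the generated XML is a fragment rather than a complete repository configuration.

```python
import xml.etree.ElementTree as ET

CONTAINER_CLASS = "org.exoplatform.services.jcr.impl.storage.jdbc.JDBCWorkspaceDataContainer"


def workspace_element(name: str, source_name: str, dialect: str, swap_dir: str) -> ET.Element:
    """Build one <workspace> entry bound to its own data source (multi-db = true)."""
    workspace = ET.Element("workspace", {"name": name})
    container = ET.SubElement(workspace, "container", {"class": CONTAINER_CLASS})
    properties = ET.SubElement(container, "properties")
    for key, value in (
        ("source-name", source_name),   # DataSource bound by InitialContextInitializer
        ("dialect", dialect),
        ("multi-db", "true"),           # each workspace uses its own database
        ("max-buffer-size", "200K"),
        ("swap-directory", swap_dir),
    ):
        ET.SubElement(properties, "property", {"name": key, "value": value})
    return workspace


def render(workspaces) -> str:
    """Serialize the workspace entries under a single <workspaces> element."""
    root = ET.Element("workspaces")
    root.extend(workspaces)
    return ET.tostring(root, encoding="unicode")
```

+ For the two workspaces described here, render([workspace_element("ws", "jdbcjcr", "hsqldb", "target/temp/swap/ws"), workspace_element("ws1", "jdbcjcr1", "mysql", "target/temp/swap/ws1")]) produces one container per workspace, each pointing at a different source-name.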
+
+ This procedure configures two workspaces that are persisted in two different databases (ws in HSQLDB and ws1 in MySQL).
+
+
+
+ Single-database Configuration
+
+ Configuring a single-database data container is easier than co=
nfiguring a multi-database data container as only one naming resource must =
be configured.
+
+
+ jdbcjcr Data Container
+
+
+
+
+ Configure the repository workspaces to use this one database. The multi-db parameter must be set to false.
+
+
+ For example (two workspaces ws - jdbcjcr, ws1 - jdbcjcr):
+
+
+ Example
+
+
+
+
+ This configures two persistent workspaces in one database (Pos=
tgreSQL).
+
+
+ Configuration without DataSource
+
+ It is possible to configure the repository without binding javax.sql.DataSource in the JNDI service if you have a dedicated JDBC driver implementation with special features such as XA transactions or statement and connection pooling:
+
+
+
+
+
+ Remove the configuration in InitialContex=
tInitializer for your database and configure a new one directly i=
n the workspace container.
+
+
+
+
+ Remove the source-name parameter and add the following lines instead, supplying your own values for the JDBC driver, database URL and username.
+
+
+
+
+ Connection Pooling
+
+ Ensure the JDBC driver provides connection pooling. Co=
nnection pooling is strongly recommended for use with the JCR to prevent a =
database overload.
+
+
+ <workspace name=3D"ws" auto-init-root-nodetype=3D"nt:unstructured">
+   <container class=3D"org.exoplatform.services.jcr.impl.storage.jdbc.JDBCWorkspaceDataContainer">
+     <properties>
+       <property name=3D"dialect" value=3D"hsqldb"/>
+       <property name=3D"driver" value=3D"org.hsqldb.jdbcDriver"/>
+       <property name=3D"url" value=3D"jdbc:hsqldb:file:target/temp/data/portal"/>
+       <property name=3D"username" value=3D"su"/>
+       <property name=3D"password" value=3D""/>
+ ......
+
+
+ Dynamic Workspace Creation
+
+ Workspaces can be added dynamically during runtime.
+
+
+ This can be performed in two steps:
+
+
+
+
+
+ ManageableRepository.configWorkspace(Work=
spaceEntry wsConfig): Register a new configuration in RepositoryC=
ontainer and create a WorkspaceContainer.
+
+
+
+
+ ManageableRepository.createWorkspace(String workspaceName): Creates a new workspace.
+
+
+
+
+
+
+ Simple and Complex queries
+
+ eXo JCR provides two ways to interact with the database:
+
+
+
+
+
+ JDBCStorageConnection
+
+
+
+ Uses simple queries. Simple queries do not use subqueries or left or right joins. They are implemented in such a way as to support as many database dialects as possible.
+
+
+
+
+
+ CQJDBCStorageConnection
+
+
+
+ Uses complex queries. Complex queries are optimized to reduce the number of database calls.
+
+
+
+
+
+ Simple queries will be used if you choose org.exoplatform.services.jcr.impl.storage.jdbc.JDBCWorkspaceDataContainer:
+
+ <workspaces>
+ <workspace name=3D"ws" auto-init-root-nodetype=3D"nt:u=
nstructured">
+ <container class=3D"org.exoplatform.services.jcr.impl.storage.=
jdbc.JDBCWorkspaceDataContainer">
+ ...
+ </workspace>
+</workspaces>
+
+
+ Complex queries will be used if you choose org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer:
+
+ <workspaces>
+ <workspace name=3D"ws" auto-init-root-nodetype=3D"nt:u=
nstructured">
+ <container class=3D"org.exoplatform.services.jcr.impl.storage.=
jdbc.optimisation.CQJDBCWorkspaceDataContainer">
+ ...
+ </workspace>
+</workspaces>
+
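+ The trade-off between the two query styles can be illustrated with a small, self-contained Python sketch. The two-table schema (item, prop) and the function names are hypothetical and are not the eXo JCR schema or API; the sketch only shows how a join-free approach needs one database call per child node, while a single join returns the same data in one call.

```python
import sqlite3


def build_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical item/property layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE item (id TEXT PRIMARY KEY, parent_id TEXT, name TEXT);
        CREATE TABLE prop (item_id TEXT, name TEXT, value TEXT);
        INSERT INTO item VALUES ('root', NULL, ''), ('a', 'root', 'a'), ('b', 'root', 'b');
        INSERT INTO prop VALUES ('a', 'jcr:primaryType', 'nt:unstructured'),
                                ('b', 'jcr:primaryType', 'nt:folder');
        """
    )
    return conn


def children_simple(conn, parent_id):
    """Join-free style: one query for the children, then one query per child."""
    calls = 1
    child_ids = [row[0] for row in conn.execute(
        "SELECT id FROM item WHERE parent_id = ? ORDER BY id", (parent_id,))]
    result = {}
    for child_id in child_ids:
        calls += 1
        result[child_id] = dict(conn.execute(
            "SELECT name, value FROM prop WHERE item_id = ?", (child_id,)))
    return result, calls


def children_complex(conn, parent_id):
    """Join style: a single LEFT JOIN returns children together with their properties."""
    rows = conn.execute(
        "SELECT i.id, p.name, p.value FROM item i "
        "LEFT JOIN prop p ON p.item_id = i.id "
        "WHERE i.parent_id = ? ORDER BY i.id", (parent_id,))
    result = {}
    for child_id, name, value in rows:
        props = result.setdefault(child_id, {})
        if name is not None:  # a child without properties yields NULL columns
            props[name] = value
    return result, 1
```

+ Both functions return the same child nodes and properties; the first reports one call plus one per child, the second a single call. The simple form avoids joins and is therefore easier to port across dialects, which matches the motivation given above.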
+
+ Force Query Hints
+
+ Some databases, such as Oracle and MySQL, support hints to increase query performance. The eXo JCR has a separate Complex Query implementation for the Oracle database dialect, which uses query hints to increase performance for a few important queries.
+
+
+ To enable this option, use the following configuration propert=
y:
+
+ <workspace name=3D"ws" auto-init-root-nodetype=3D"nt:unstructured">
+   <container class=3D"org.exoplatform.services.jcr.impl.storage.jdbc.JDBCWorkspaceDataContainer">
+     <properties>
+       <property name=3D"dialect" value=3D"oracle"/>
+       <property name=3D"force.query.hints" value=3D"true" />
+ ......
+
+ Query hints are only used for Complex Queries with the Oracle =
dialect. For all other dialects this parameter is ignored.
+
+
+
+ Notes for Microsoft Windows users
+
+ The current configuration of eXo JCR uses Apache DBCP connection pool (or=
g.apache.commons.dbcp.BasicDataSourceFactory).
+
+
+ Setting a high value for the maxActive parameter in the configuration.xml file causes the pool (the JDBC driver, for example) to use a large number of TCP/IP ports on the client machine. As a result, the data container can throw exceptions such as "Address already in use".
+
+
+ To solve this problem, configure the client machine's networking software to use shorter timeouts for open TCP/IP ports.
+
+
+ This is done by editing two registry keys within the HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters node. Both of these keys are unset by default. To set the keys as =
required:
+
+
+
+
+
+ Set the MaxUserPort registry key to =3Ddword:00001b58. This sets the maximum number of open ports to 7000 (the default is 5000); a higher value can also be used.
+
+
+
+
+ Set TcpTimedWaitDelay to =3Ddword:0000001e. This sets the TIME_WAIT parameter to 30 seconds (the default is 240).
+
+
+
+
+ Sample Registry File
+ Windows Registry Editor Version 5.00
+
+[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters]
+"MaxUserPort"=3Ddword:00001b58
+"TcpTimedWaitDelay"=3Ddword:0000001e
+
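+ The dword values in the registry file are hexadecimal. The following Python sketch converts between the registry notation and decimal so that other values can be chosen; the function names are hypothetical and the sketch does not touch the registry itself.

```python
def dword_to_int(dword: str) -> int:
    """Convert a .reg value such as 'dword:00001b58' to its decimal integer."""
    prefix, _, hex_digits = dword.partition(":")
    if prefix != "dword" or not hex_digits:
        raise ValueError(f"not a dword value: {dword!r}")
    return int(hex_digits, 16)


def int_to_dword(value: int) -> str:
    """Format a non-negative 32-bit integer the way a .reg file expects it."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value does not fit in 32 bits")
    return f"dword:{value:08x}"
```

+ With these helpers, dword:00001b58 decodes to 7000 ports and dword:0000001e to 30 seconds, matching the values described above.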
+
+
+
+ External Value Storages
+
+ Introduction
+
+ JCR values are stored in the Workspace Data container by defau=
lt. The eXo JCR offers an additional option of storing JCR values separatel=
y from the Workspace Data container which can help keep Binary Large Object=
s (BLOBs) separate.
+
+
+ Tree-based storage is recommended in most cases.
+
+
+
+ Tree File Value Storage
+
+ Tree File Value Storage holds values in files organized as a directory tree on the file system. The path property points to the root directory where the files are stored.
+
+
+ This is the recommended type of external storage because it can contain a large number of files, limited only by the free space on the disk or volume.
+
+
+ However, using Tree File Value Storage can make value deletion slower, because unused tree nodes must also be removed.
+
+
+ Tree File Value Storage Configuration
+
+
+ Comment #1: The id is the value storage unique identifier, used for linking with propertie=
s stored in a workspace container.
+
+
+ Comment #2: the path is a location where value files will be stored.
+
+
+
+ Each file value storage can have filters for incoming values. A filter can match values by property-type, property-name and ancestor-path. It can also match the size of the values stored (min-value-size), in bytes.
+
+
+ In the previous example, a filter with property-type and min-value-size has been used. As a result, the storage is used for binary values larger than 1 MB.
+
+
+ It is recommended that properties with large values are stored=
in file value storage only.
+
+
+ The example below shows a value storage with a different location for large files (a min-value-size filter of 20 MB).
+
+
+ A value storage uses ORed logic in the process of filter selec=
tion. This means the first filter in the list will be called first and if i=
t is not matched the next will be called, and so on.
+
+
+ In this example, a value that matches the 20 MB min-value-size filter will be stored in the path "data/20Mvalues". All other values will be stored in "data/values".
+
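+ The first-match selection described above can be sketched in Python. This is an illustration under assumptions, not the eXo JCR implementation: the ValueStorage class and select_storage function are hypothetical, sizes are taken to be binary megabytes, a filter is treated as matching when the value is strictly larger than min-value-size, and the two storages are reconstructed from the description because the configuration listing is not reproduced here.

```python
from dataclasses import dataclass
from typing import List, Optional

MB = 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


@dataclass
class ValueStorage:
    path: str
    property_type: str
    min_value_size: Optional[int] = None  # bytes; None means no size criterion

    def accepts(self, property_type: str, size: int) -> bool:
        # All criteria of a single filter must hold together.
        if property_type != self.property_type:
            return False
        return self.min_value_size is None or size > self.min_value_size


def select_storage(storages: List[ValueStorage], property_type: str, size: int) -> Optional[str]:
    """Return the path of the first storage whose filter matches, in list order."""
    for storage in storages:
        if storage.accepts(property_type, size):
            return storage.path
    return None  # no filter matched: the value stays in the workspace data container


STORAGES = [
    ValueStorage("data/20Mvalues", "Binary", 20 * MB),  # checked first
    ValueStorage("data/values", "Binary"),              # fallback for smaller binaries
]
```

+ With this ordering, a 25 MB binary value goes to data/20Mvalues and a 5 MB binary value goes to data/values. Reversing the list would send every binary value to data/values, which is why the more specific filter is listed first.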
+
+
+
+ Disabling value storage
+
+ The JCR allows you to disable value storage by adding the foll=
owing property into its configuration.
+
+ <property name=3D"enabled" value=3D"false" />
+
+ Warning
+
+ It is recommended that this functionality be used for internal and testing purposes only, and with caution, as all stored values will be inaccessible.
+
+
+
+
+
+ Workspace Data Container
+
+ Each Workspace of the JCR has its own persistent storage to hold t=
hat workspace's items data. The eXo JCR can be configured so that it c=
an use one or more workspaces that are logical units of the repository cont=
ent.
+
+
+ The physical data storage mechanism is configured using the mandatory container element. The type of container is given by the class attribute, whose value is the fully qualified name of a subclass of org.exoplatform.services.jcr.storage.WorkspaceDataContainer.
+
+
+ Physical Data Storage Configuration
+ <container class=3D"org.exoplatform.services.jcr.impl.storage.jdbc.JDBCWorkspaceDataContainer">
+   <properties>
+     <property name=3D"source-name" value=3D"jdbcjcr1"/>
+     <property name=3D"dialect" value=3D"hsqldb"/>
+     <property name=3D"multi-db" value=3D"true"/>
+     <property name=3D"max-buffer-size" value=3D"200K"/>
+     <property name=3D"swap-directory" value=3D"target/temp/swap/ws"/>
+     <property name=3D"lazy-node-iterator-page-size" value=3D"50"/>
+     <property name=3D"acl-bloomfilter-false-positive-probability" value=3D"0.1d"/>
+     <property name=3D"acl-bloomfilter-elements-number" value=3D"1000000"/>
+   </properties>
+
+ source-name: The JDBC data source name which is registered in JNDI by InitialContextInitializer. This was known as sourceName in versions prior to 1.9.
+
+
+ dialect: The database dialect. Must be one of the following: hsqldb, mysql, mysql-utf8, pgsql, oracle, oracle-oci, mssql, sybase, derby, db2 or db2v8.
+
+
+ multi-db: This parameter, if true, enables multi-database container.
+
+
+ max-buffer-size: A threshold in byt=
es. If a value size is greater than this setting, then it will be spooled t=
o a temporary file.
+
+
+ swap-directory: A location where th=
e value will be spooled if no value storage is configured but a ma=
x-buffer-size is exceeded.
+
+
+ lazy-node-iterator-page-size: Settings for the "lazy" child node iterator. Defines the page size, that is, the number of nodes retrieved from persistent storage at once.
+
+
+ acl-bloomfilter-false-positive-probability: ACL Bloom-filter settings. ACL Bloom-filter desired false positive=
probability. Range [0..1]. Default value 0.1d.
+
+
+ acl-bloomfilter-elements-number: AC=
L Bloom-filter settings. Expected number of ACL-elements in the Bloom-filte=
r. Default value 1000000.
+
+
+
+
+ Bloom filters are not supported by all cache implementations. So far, only the Infinispan implementation supports them.
+
+
+ Bloom filters are used to avoid reading nodes that definitely do not have an ACL. The acl-bloomfilter-false-positive-probability and acl-bloomfilter-elements-number parameters are used to configure such filters.
+
+
+ For more information about Bloom filters, see http://en.wikipedia.org/wiki/Bloom_filter.
+
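+ To get a feel for what the two Bloom filter parameters imply, the following Python sketch applies the standard textbook sizing formulas for a classic Bloom filter. It is an estimate only: the function name bloom_parameters is hypothetical, and the filter used internally by the JCR may be sized differently.

```python
import math


def bloom_parameters(expected_elements: int, false_positive_probability: float):
    """Return (bits, hash_functions) for a classic Bloom filter.

    bits   = ceil(-n * ln(p) / (ln 2)^2)
    hashes = round((bits / n) * ln 2), at least 1
    """
    if expected_elements <= 0 or not 0.0 < false_positive_probability < 1.0:
        raise ValueError("need n > 0 and 0 < p < 1")
    bits = math.ceil(
        -expected_elements * math.log(false_positive_probability) / math.log(2) ** 2
    )
    hashes = max(1, round(bits / expected_elements * math.log(2)))
    return bits, hashes
```

+ For the default values (1000000 elements and a false positive probability of 0.1), the formulas give roughly 4.8 million bits, which is about 0.6 MB, and 3 hash functions. Lowering the probability increases both numbers.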
+
+
+ The eXo JCR provides a production-ready, JDBC-based Workspace Data Container backed by a relational database.
+
+
+ The Workspace Data Container can support external storage for javax.jcr.Value (which can be the case for BLOB values, for example) using the optional value-storages element.
+
+
+ The Data Container will try to read or write a Value using the und=
erlying value storage plug-in if the filter criteria (see below) match the =
current property.
+
+
+ External Value Storage Configuration
+ <value-storages>
+ <value-storage id=3D"Storage #1" class=3D"org.exoplatf=
orm.services.jcr.impl.storage.value.fs.TreeFileValueStorage">
+ <properties>
+ <property name=3D"path" value=3D"data/values"=
/>
+ </properties>
+ <filters>
+ <filter property-type=3D"Binary" min-value-size=3D"1M"/><!-- Values larger than 1 MB -->
+ </filters>
+.........
+</value-storages>
+
+ The value-storage class must be a subclass of org.exoplatform.services.jcr.storage.value.ValueStoragePlugin, and properties are optional plug-in specific parameters.
+
+
+ filters: Each file value storage ca=
n have the filter(s) for incoming values. If there are several filter crite=
ria, they all have to match (AND-Condition).
+
+
+
+ A filter can match values by property type (property-t=
ype), property name (property-name), ancestor path (ancestor-path) and/or t=
he size of values stored (min-value-size, e.g. 1M, 4.2G, 100 (bytes)).
+
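+ The size notation used by min-value-size can be illustrated with a small Python parser. This is a sketch under an explicit assumption: the suffixes K, M and G are interpreted as powers of 1024, which may differ from how the JCR parses them. The function name parse_size is hypothetical.

```python
import re

# Assumption: binary multiples; a bare number means bytes.
_UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
_SIZE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([KMG]?)\s*$", re.IGNORECASE)


def parse_size(text: str) -> int:
    """Convert a size such as '1M', '4.2G' or '100' to a whole number of bytes."""
    match = _SIZE.match(text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return int(float(number) * _UNITS[unit.upper()])
```

+ Under this assumption, "1M" is 1048576 bytes and "100" is 100 bytes; fractional values such as "4.2G" are truncated to whole bytes.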
+
+ In the code sample above, the filter uses property-type and min-value-size only. This means that the storage is used only for binary values larger than 1 MB.
+
+
+ It is recommended that you store properties with large=
values in a file value storage only.
+
+
+
+ Configuring Cluster
+
+ Launching Cluster
+
+ Configuring JCR to use external configuration
+
+
+
+ To manually configure a repository, create a new c=
onfiguration file (exo-jcr-configuration.xml for examp=
le). For details, see .
+
+
+ The configuration file must be formatted as follow=
s:
+
+
+ External Configuration
+<repository-service default-repository=3D"repository1">
+  <repositories>
+    <repository name=3D"repository1" system-workspace=3D"ws1" default-workspace=3D"ws1">
+      <security-domain>exo-domain</security-domain>
+      <access-control>optional</access-control>
+      <authentication-policy>org.exoplatform.services.jcr.impl.core.access.JAASAuthenticator</authentication-policy>
+      <workspaces>
+        <workspace name=3D"ws1">
+          <container class=3D"org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer">
+            <properties>
+              <property name=3D"source-name" value=3D"jdbcjcr" />
+              <property name=3D"dialect" value=3D"oracle" />
+              <property name=3D"multi-db" value=3D"false" />
+              <property name=3D"update-storage" value=3D"false" />
+              <property name=3D"max-buffer-size" value=3D"200k" />
+              <property name=3D"swap-directory" value=3D"../temp/swap/production" />
+            </properties>
+            <value-storages>
+              ...
+            </value-storages>
+          </container>
+          <initializer class=3D"org.exoplatform.services.jcr.impl.core.ScratchWorkspaceInitializer">
+            <properties>
+              <property name=3D"root-nodetype" value=3D"nt:unstructured" />
+            </properties>
+          </initializer>
+          <cache enabled=3D"true" class=3D"org.exoplatform.services.jcr.impl.dataflow.persistent.jbosscache.JBossCacheWorkspaceStorageCache">
+            ...
+          </cache>
+          <query-handler class=3D"org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
+            ...
+          </query-handler>
+          <lock-manager class=3D"org.exoplatform.services.jcr.impl.core.lock.jbosscache.CacheableLockManagerImpl">
+            ...
+          </lock-manager>
+        </workspace>
+        <workspace name=3D"ws2">
+          ...
+        </workspace>
+        <workspace name=3D"wsN">
+          ...
+        </workspace>
+      </workspaces>
+    </repository>
+  </repositories>
+</repository-service>
+
+ Comment #1: Refer to .
+
+
+ Comment #3: Refer to .
+
+
+ Comment #4: Refer to .
+
+
+
+
+
+ Then, update the RepositoryServiceConfiguration component configuration in the exo-configuration.xml file to reference your file:
+
+ <component>
+ <key>org.exoplatform.services.jcr.config.RepositoryServiceConfigu=
ration</key>
+ <type>org.exoplatform.services.jcr.impl.config.RepositoryServiceC=
onfigurationImpl</type>
+ <init-params>
+ <value-param>
+ <name>conf-path</name>
+ <description>JCR configuration file</description>
+ <value>exo-jcr-configuration.xml</value>
+ </value-param>
+ </init-params>
+</component>
+
+
+
+
+
+ Requirements
+
+ Environment requirements
+
+
+
+ Every node of the cluster must mount the same Network File System (NFS) share with read and write permissions.
+
+
+
+
+ Every node of the cluster must use the same database.
+
+
+
+
+ Corresponding clusters on different nodes must have the same name.
+
+
+ Example
+
+ If the Indexer cluster in the production workspace on the first node is named production_indexer_cluster, then the indexer clusters in the production workspace on all other nodes must also be named production_indexer_cluster.
+
+
+
+
+
+
+ Configuration requirements
+
+ The configuration of every workspace in the repository mus=
t contain the following elements:
+
+
+ Value Storage configuration
+ <value-storages=
>
+ <value-storage id=3D"system" class=3D"org.exoplatform=
.services.jcr.impl.storage.value.fs.TreeFileValueStorage">
+ <properties>
+ <property name=3D"path" value=3D"/mnt/tornado/temp/values/production" /> <!-- path within NFS where ValueStorage will hold its data -->
+ </properties>
+ <filters>
+ <filter property-type=3D"Binary" />
+ </filters>
+ </value-storage>
+</value-storages>
+
+
+ Cache configuration
+<cache enabled=3D"true" class=3D"org.exoplatform.services.jcr.impl.dataflow.persistent.jbosscache.JBossCacheWorkspaceStorageCache">
+  <properties>
+    <property name=3D"jbosscache-configuration" value=3D"jar:/conf/portal/test-jbosscache-data.xml" /> <!-- path to JBoss Cache configuration for data storage -->
+    <property name=3D"jgroups-configuration" value=3D"jar:/conf/portal/udp-mux.xml" /> <!-- path to JGroups configuration -->
+    <property name=3D"jbosscache-cluster-name" value=3D"JCR_Cluster_cache_production" /> <!-- JBoss Cache data storage cluster name -->
+    <property name=3D"jgroups-multiplexer-stack" value=3D"true" />
+  </properties>
+</cache>
+
+
+ Indexer configuration
+<query-handler class=3D"org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
+  <properties>
+    <property name=3D"changesfilter-class" value=3D"org.exoplatform.services.jcr.impl.core.query.jbosscache.JBossCacheIndexChangesFilter" />
+    <property name=3D"index-dir" value=3D"/mnt/tornado/temp/jcrlucenedb/production" /> <!-- path within NFS where the indexer will hold its data -->
+    <property name=3D"jbosscache-configuration" value=3D"jar:/conf/portal/test-jbosscache-indexer.xml" /> <!-- path to JBoss Cache configuration for indexer -->
+    <property name=3D"jgroups-configuration" value=3D"jar:/conf/portal/udp-mux.xml" /> <!-- path to JGroups configuration -->
+    <property name=3D"jbosscache-cluster-name" value=3D"JCR_Cluster_indexer_production" /> <!-- JBoss Cache indexer cluster name -->
+    <property name=3D"jgroups-multiplexer-stack" value=3D"true" />
+  </properties>
+</query-handler>
+
+
+ Lock Manager configuration
+<lock-manager class=3D"org.exoplatform.services.jcr.impl.core.lock.jbosscache.CacheableLockManagerImpl">
+  <properties>
+    <property name=3D"time-out" value=3D"15m" />
+    <property name=3D"jbosscache-configuration" value=3D"jar:/conf/portal/test-jbosscache-lock.xml" /> <!-- path to JBoss Cache configuration for lock manager -->
+    <property name=3D"jgroups-configuration" value=3D"jar:/conf/portal/udp-mux.xml" /> <!-- path to JGroups configuration -->
+    <property name=3D"jgroups-multiplexer-stack" value=3D"true" />
+    <property name=3D"jbosscache-cluster-name" value=3D"JCR_Cluster_lock_production" /> <!-- JBoss Cache locks cluster name -->
+
+    <property name=3D"jbosscache-cl-cache.jdbc.table.name" value=3D"jcrlocks_production"/> <!-- the name of the DB table where lock data will be stored -->
+    <property name=3D"jbosscache-cl-cache.jdbc.table.create" value=3D"true"/>
+    <property name=3D"jbosscache-cl-cache.jdbc.table.drop" value=3D"false"/>
+    <property name=3D"jbosscache-cl-cache.jdbc.table.primarykey" value=3D"jcrlocks_production_pk"/>
+    <property name=3D"jbosscache-cl-cache.jdbc.fqn.column" value=3D"fqn"/>
+    <property name=3D"jbosscache-cl-cache.jdbc.node.column" value=3D"node"/>
+    <property name=3D"jbosscache-cl-cache.jdbc.parent.column" value=3D"parent"/>
+    <property name=3D"jbosscache-cl-cache.jdbc.datasource" value=3D"jdbcjcr"/>
+  </properties>
+</lock-manager>
+
+
+
+
+
+ Configuring JBoss Cache
+
+ Indexer, lock manager and data container configuration
+
+ Each of these components uses instances of JBoss Cache for caching in a clustered environment, so each has its own transport and must be configured correctly. Workspaces usually have similar configurations that differ only in cluster names (and possibly some other parameters). The simplest way to configure them is to define a separate configuration file for each component in each workspace:
+
+ <property name=3D"jbosscache-configuration" value=3D"conf/standalone/test-jbosscache-lock-db1-ws1.xml" />
+
+ But if there are several workspaces, configuring them this way can be painful and hard to manage. eXo JCR offers a template-based configuration for JBoss Cache instances. You can have one template for the Lock Manager, one for the Indexer and one for the data container, and use them in all the workspaces, defining the map of substitution parameters in the main configuration file. Define ${jbosscache-<parameter name>} inside the XML template and list the correct value in the JCR configuration file just below "jbosscache-configuration", as shown:
+
+
+ Template:
+
+ ...
+<clustering mode=3D"replication" clusterName=3D"${jbosscache-cluster-name}">
+  <stateRetrieval timeout=3D"20000" fetchInMemoryState=3D"false" />
+...
+
+ and JCR configuration file:
+
+ ...
+<property name=3D"jbosscache-configuration" value=3D"jar=
:/conf/portal/jbosscache-lock.xml" />
+<property name=3D"jbosscache-cluster-name" value=3D"JCR-=
cluster-locks-db1-ws" />
+...
+
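+ The substitution mechanism can be pictured with the following Python sketch. It only mimics the idea of replacing ${jbosscache-...} placeholders with the values of same-named properties; the function name fill_template is hypothetical and the JCR performs the real substitution internally.

```python
import re

# Placeholders of the form ${jbosscache-<parameter name>}
_PLACEHOLDER = re.compile(r"\$\{(jbosscache-[^}]+)\}")


def fill_template(template: str, properties: dict) -> str:
    """Replace each ${jbosscache-...} placeholder with the matching property value."""
    def replace(match):
        name = match.group(1)
        if name not in properties:
            raise KeyError(f"no value configured for {name}")
        return properties[name]

    return _PLACEHOLDER.sub(replace, template)
```

+ Given the template line with clusterName set to ${jbosscache-cluster-name} and a property jbosscache-cluster-name with the value JCR-cluster-locks-db1-ws, the result carries that cluster name. A placeholder without a matching property is reported as an error here, so that a forgotten property is noticed early.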
+
+ JGroups configuration
+
+ JGroups is used by JBoss Cache for network communications and transport in a clustered environment. If the jgroups-configuration property is defined in the component configuration, it will be injected into the JBoss Cache instance on start up.
+
+ <property name=3D"jgroups-configuration" value=3D"your/path/to/modified-udp.xml" />
+
+ As outlined above, each component (lock manager, data containe=
r and query handler) for each workspace requires its own clustered environm=
ent. In other words, they have their own clusters with unique names.
+
+
+ By default, each cluster performs multicasts on a separate port. This configuration adds much unnecessary overhead to the cluster. This is why JGroups offers a multiplexer feature, which makes it possible to use a single channel for a set of clusters.
+
+
+ The multiplexer reduces network overhead and increases the performance and stability of the application. To enable the multiplexer stack, define an appropriate configuration file (udp-mux.xml is shipped with eXo JCR) and set "jgroups-multiplexer-stack" to "true".
+
+ <property name=3D"jgroups-configuration" value=3D"jar:/conf/portal/udp-mux.xml" />
+<property name=3D"jgroups-multiplexer-stack" value=3D"true" />
+
+
+ Sharing JBoss Cache instances
+
+ As a single JBoss Cache instance can be demanding on resources, and the default setup will have an instance each for the indexer, the lock manager and the data container on each workspace, an environment that uses multiple workspaces may benefit from sharing a JBoss Cache instance between several instances of the same type (the lock manager instance, for example).
+
+
+ This feature is disabled by default and can be enabled at the =
component configuration level by setting the jbosscache-shareabl=
e property to true:
+
+ <property name=3D"jbosscache-shareable" value=3D"true" />
+
+ Once enabled, this feature allows the JBoss Cache instance used by a component to be re-used by other components of the same type with the same JBoss Cache configuration (with the exception of the eviction configuration, which can differ).
+
+
+ This means that all the parameters of type jbosscache-<PARAM_NAME> must be identical between components of the same type in different workspaces.
+
+
+ Therefore, if you can use the same values for the parameters i=
n each workspace, you only need three JBoss Cache instances (one instance e=
ach for the indexer, lock manager and data container) running at once. This=
can relieve resource stress significantly.
+
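+ The effect of sharing on the number of running instances can be estimated with a short Python sketch. It is a simplification with a hypothetical function name: components are grouped by type and by their jbosscache-* parameter values, and the eviction exception mentioned above is ignored.

```python
def count_cache_instances(components, shareable: bool) -> int:
    """Count JBoss Cache instances for a list of (workspace, type, params) tuples.

    Without sharing, every component gets its own instance. With sharing,
    components of the same type and identical jbosscache-* parameters share one.
    """
    components = list(components)
    if not shareable:
        return len(components)
    groups = {
        (component_type, tuple(sorted(params.items())))
        for _workspace, component_type, params in components
    }
    return len(groups)
```

+ For four workspaces that each define an indexer, a lock manager and a data container with identical parameters per type, the count drops from twelve instances to three when sharing is enabled. If one workspace uses a different parameter value for its lock manager, that component needs its own instance and the count is four.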
+
+
+ Shipped JBoss Cache configuration templates
+
+ The eXo JCR implementation is shipped with ready-to-use JBoss Cache configuration templates for the JCR's components. They are located in the JPP_HOME/gatein/gatein.ear/portal.war/WEB-INF/conf/jcr/jbosscache directory, inside either the cluster or local directory.
+
+
+ Data container template
+
+ The data container template is config.xml:
+
+<?xml version=3D"1.0" encoding=3D"UTF-8"?>
+<jbosscache xmlns:xsi=3D"http://www.w3.org/2001/XMLSchema-instance" xmlns=3D"urn:jboss:jbosscache-core:config:3.1">
+
+  <locking useLockStriping=3D"false" concurrencyLevel=3D"50000" lockParentForChildInsertRemove=3D"false"
+    lockAcquisitionTimeout=3D"20000" />
+
+  <clustering mode=3D"replication" clusterName=3D"${jbosscache-cluster-name}">
+    <stateRetrieval timeout=3D"20000" fetchInMemoryState=3D"false" />
+    <jgroupsConfig multiplexerStack=3D"jcr.stack" />
+    <sync />
+  </clustering>
+
+  <!-- Eviction configuration -->
+  <eviction wakeUpInterval=3D"5000">
+    <default algorithmClass=3D"org.jboss.cache.eviction.LRUAlgorithm"
+      actionPolicyClass=3D"org.exoplatform.services.jcr.impl.dataflow.persistent.jbosscache.ParentNodeEvictionActionPolicy"
+      eventQueueSize=3D"1000000">
+      <property name=3D"maxNodes" value=3D"1000000" />
+      <property name=3D"timeToLive" value=3D"120000" />
+    </default>
+  </eviction>
+</jbosscache>
+
+
+ Lock manager template
+
+ The lock manager template is lock-config.xml:
+
+
+
+
+ Query handler (indexer) template
+
+ The query handler template is called indexer-con=
fig.xml:
+
+
+
+
+
+
+ LockManager
+
+ The LockManager stores lock objects. It can lock or release object=
s as required. It is also responsible for removing stale locks.
+
+
+ The LockManager in JBoss Portal Platform is implemented with org.exoplatform.services.jcr.impl.core.lock.jbosscache.CacheableLockM=
anagerImpl.
+
+
+ It is enabled by adding lock-manager-configuration to workspace-configuration.
+
+
+ For example:
+
+
+
+ CacheableLockManagerImpl
+
+ CacheableLockManagerImpl stores lock objects in JBoss Cache (which uses JDBCCacheLoader to store locks in a database). This means its locks are replicable and can affect an entire cluster rather than just a single node.
+
+
+ The length of time LockManager allows a lock to remain in plac=
e can be configured with the "time-out" proper=
ty.
+
+
+ The LockRemover thread periodically polls LockManager for lock=
s that have passed the time-out limit and must be removed.
+
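+ The polling behaviour can be sketched in Python as follows. This is an illustration under assumptions: the function names are hypothetical, the time-out notation is limited to whole numbers with an s, m or h suffix (as in "15m"), and a lock's age is measured from a single recorded timestamp, which may not match how the LockManager tracks lock activity.

```python
import re

_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_time_out(text: str) -> int:
    """Convert a time-out such as '15m' to seconds."""
    match = re.fullmatch(r"(\d+)([smh])", text.strip())
    if match is None:
        raise ValueError(f"unrecognised time-out: {text!r}")
    return int(match.group(1)) * _SECONDS[match.group(2)]


def expired_locks(locks: dict, time_out: str, now: float) -> list:
    """Return the ids of locks older than the time-out.

    locks maps a lock id to the time (in seconds) at which the lock was recorded.
    """
    limit = parse_time_out(time_out)
    return sorted(lock_id for lock_id, recorded in locks.items() if now - recorded > limit)
```

+ A remover thread would call a function like expired_locks on every polling cycle and release the locks it returns.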
+
+ The time-out for LockRemover is set as follows (the default va=
lue is 30m):
+
+
+
+ There are a number of ways to configure CacheableLo=
ckManagerImpl. Each involves configuring JBoss Cache and JDBCCa=
cheLoader.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Refer to http://community.jboss.org/wiki/JBossCacheJDBCCacheLoad=
er for more information about JBoss Cache and JDBCCacheLoader.
+
+
+ Simple JBoss Cache Configuration
+
+ One method to configure the LockManager is to put a JBoss =
Cache configuration file path into CacheableLockManagerImpl.
+
+
+
+ This is not the most efficient method for configuring =
the LockManager as it requires a JBoss Cache configuration file for each Lo=
ckManager configuration in each workspace of each repository. The configura=
tion set up can subsequently become quite difficult to manage.
+
+
+ This method is useful, however, if a single, specially=
configured LockManager is required.
+
+
+
+ The required configuration is shown in the example below:
+
+
+
+ Sample content of the jbosscache-=
lock-config.xml file specified in the jbosscache=
-configuration property is shown in the code example below.
+
+
+ Sample Content of the jbosscache-lock-config.xml File
+
+
+ Comment #1: The cluster name at clu=
stering mode=3D"replication" clusterName=3D"JBoss-Cache-Lock=
-Cluster_Name" must be unique;
+
+
+ Comment #2: The cache.jdbc.table.na=
me must be unique per datasource.
+
+
+ Comment #3: The cache.jdbc.node.typ=
e and cache.jdbc.fqn.type parameters mus=
t be configured according to the database in use. Refer to the table below =
for information about data types.
+
+
+
+ Data Types in Different Databases
+
+
+
+ Database name    Node data type    FQN data type
+ default          BLOB              VARCHAR(512)
+ HSQLDB           OBJECT            VARCHAR(512)
+ MySQL            LONGBLOB          VARCHAR(512)
+ ORACLE           BLOB              VARCHAR2(512)
+ PostgreSQL       bytea             VARCHAR(512)
+ MSSQL            VARBINARY(MAX)    VARCHAR(512)
+ DB2              BLOB              VARCHAR(512)
+ Sybase           IMAGE             VARCHAR(512)
+
+
+
+
+
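+ The table can be turned into a simple lookup, as in the Python sketch below. The mapping from dialect identifiers (such as pgsql or oracle-oci) to table rows is an assumption made for illustration, and the function name jdbc_cache_types is hypothetical; the data types themselves are taken from the table above.

```python
# (node data type, FQN data type) per database, from the table above.
DATA_TYPES = {
    "default": ("BLOB", "VARCHAR(512)"),
    "hsqldb": ("OBJECT", "VARCHAR(512)"),
    "mysql": ("LONGBLOB", "VARCHAR(512)"),
    "oracle": ("BLOB", "VARCHAR2(512)"),
    "pgsql": ("bytea", "VARCHAR(512)"),
    "mssql": ("VARBINARY(MAX)", "VARCHAR(512)"),
    "db2": ("BLOB", "VARCHAR(512)"),
    "sybase": ("IMAGE", "VARCHAR(512)"),
}


def jdbc_cache_types(dialect: str) -> dict:
    """Return the cache.jdbc.node.type and cache.jdbc.fqn.type values for a dialect.

    Dialect variants such as mysql-utf8, oracle-oci or db2v8 are matched by prefix
    (an assumption); unknown dialects fall back to the default row.
    """
    key = dialect.lower()
    node_type, fqn_type = DATA_TYPES["default"]
    for name, types in DATA_TYPES.items():
        if name != "default" and key.startswith(name):
            node_type, fqn_type = types
            break
    return {"cache.jdbc.node.type": node_type, "cache.jdbc.fqn.type": fqn_type}
```

+ For example, jdbc_cache_types("oracle-oci") returns BLOB and VARCHAR2(512), while an unlisted dialect such as derby falls back to the default row.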
+
+ Template JBoss Cache Configuration
+
+ Another method to configure LockManager is to use a JBoss =
Cache configuration template for all LockManagers.
+
+
+ Below is an example test-jbosscache-lock.xml template file:
+
+
+
+ The parameters that will populate the above file are shown=
below:
+
+
+ JBoss Cache Configuration Parameters
+
+
+ Comment #1: The jgroups-configuration has been moved to a separate configuration file (udp-mux.xml, shown below). In this case, udp-mux.xml is a common configuration for all JGroups components (QueryHandler, cache, LockManager), but this is not a requirement of the configuration method.
+
+
+ Comment #2: The jbosscache-cl-cache=
.jdbc.fqn.column and jbosscache-cl-cache.jdbc.node.t=
ype parameters are not explicitly defined as cache.j=
dbc.fqn.type and cache.jdbc.node.type ar=
e defined in the JBoss Cache configuration.
+
+
+
+ Refer to for information about setting these parameters, or set them to AUTO and the data type will be detected automatically.
+
+
+ udp-mux.xml:
+
+
+
+
+ Lock Migration
+
+ There are three options available:
+
+
+ Lock Migration Options
+
+ When the new Shareable Cache feature is not going to be used and all locks should be kept after migration.
+
+
+
+
+
+ Ensure that the same lock tables are used in the configuration
+
+
+
+
+ Start the server
+
+
+
+
+
+
+ When the new Shareable Cache feature is not going to be used and all locks should be removed after migration.
+
+
+
+
+
+ Ensure that the same lock tables are used in the configuration
+
+
+
+
+ Start the server WITH system property:
+
+ -Dorg.exoplatform.jcr.locks.force.remove=
=3Dtrue
+
+
+
+
+ Stop the server
+
+
+
+
+ Start the server WITHOUT system proper=
ty:
+
+ -Dorg.exoplatform.jcr.locks.force.remove
+
+
+
+
+
+
+ When the new Shareable Cache feature will be used (in this case all locks are removed after migration).
+
+
+
+
+
+ Start the server WITH system property:
+
+ -Dorg.exoplatform.jcr.locks.force.remove=
=3Dtrue
+
+
+
+
+ Stop the server.
+
+
+
+
+ Start the server WITHOUT system proper=
ty:
+
+ -Dorg.exoplatform.jcr.locks.force.remove
+
+
+
+ Optional:
+
+ Manually remove old tables for lock.
+
+
+
+
+
+
+
+
+
+
+ Configuring QueryHandler
+
+ Indexing in clustered environment
+
+ JCR offers indexing strategies for clustered environments that either use the advantages of running in a single JVM or make the best use of all the resources available in the cluster. JCR uses the Lucene library as its underlying search and indexing engine, but Lucene has several limitations that greatly restrict how far the advantages of a cluster can be exploited. For this reason, eXo JCR offers two strategies suited to its own use cases: clustered with a shared index and clustered with local indexes. Each one has its pros and cons.
+
+
+ The clustered implementation with local indexes combines an in-memory buffer index directory with delayed file-system flushing. This index is called "Volatile" and it is also consulted in searches. Under certain conditions the volatile index is flushed to persistent storage (the file system) as a new index directory. This gives very good performance for write operations.
+
+
+ Local Index Diagram
+
+
+
+
+
+
+
+ As this implementation is designed for clustered environments, it has additional mechanisms for data delivery within the cluster. Text extraction jobs are done on the same node that performs the content operation (that is, the write operation). Prepared "documents" (the Lucene term for a block of data ready for indexing) are replicated within the cluster nodes and processed by the local indexes, so each cluster instance has the same index content. When a new node joins the cluster it has no initial index, so one must be created. There are several supported ways of doing this. The simplest is to copy the index manually, but this is not the intended approach. If no initial index is found, JCR uses automated scenarios. They are controlled via configuration (see the "index-recovery-mode" parameter) and offer either full re-indexing from the database or copying from another cluster node.
+
+
+ Keeping multiple index copies, one on each instance, can be costly, so a shared index can be used instead (see the diagram below).
+
+
+ Shared Index Diagram
+
+
+
+
+
+
+
+ This indexing strategy combines the advantages of an in-memory index with a shared persistent index, offering "near" real-time search capabilities. This means that newly added content is accessible via search almost immediately. The strategy allows nodes to index data in their own volatile (in-memory) indexes, but persistent indexes are managed by a single "coordinator" node only. Each cluster instance has read access to the shared index and performs queries by combining its results with those found in its own in-memory index. Note that a shared folder must be configured in your system environment (for example, a mounted NFS folder). In some extremely rare cases, this strategy can leave the volatile indexes of cluster instances slightly different for a while; within a few seconds they will be up to date.
+
+
+ See more about .
+
+
+
+ Configuration
+
+ Query-handler configuration overview
+
+ Configuration example:
+
+<workspace name="ws">
+  <query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
+    <properties>
+      <property name="index-dir" value="shareddir/index/db1/ws" />
+      <property name="changesfilter-class"
+        value="org.exoplatform.services.jcr.impl.core.query.jbosscache.JBossCacheIndexChangesFilter" />
+      <property name="jbosscache-configuration" value="jbosscache-indexer.xml" />
+      <property name="jgroups-configuration" value="udp-mux.xml" />
+      <property name="jgroups-multiplexer-stack" value="true" />
+      <property name="jbosscache-cluster-name" value="JCR-cluster-indexer-ws" />
+      <property name="max-volatile-time" value="60" />
+      <property name="rdbms-reindexing" value="true" />
+      <property name="reindexing-page-size" value="1000" />
+      <property name="index-recovery-mode" value="from-coordinator" />
+      <property name="index-recovery-filter" value="org.exoplatform.services.jcr.impl.core.query.lucene.DocNumberRecoveryFilter" />
+    </properties>
+  </query-handler>
+</workspace>
+
+
+ Configuration properties
+
+
+
+ Property name
+ Description
+
+
+
+
+ index-dir
+ path to index
+
+
+ changesfilter-class
+ changes filter class; it selects the clustered indexing strategy (JBossCacheIndexChangesFilter for a shared index, LocalIndexChangesFilter for local indexes)
+
+
+ jbosscache-configuration
+ template of JBoss-cache configuration for all quer=
y-handlers in repository
+
+
+ jgroups-configuration
+ jgroups-configuration is template configuration fo=
r all components (search, cache, locks) [Add link to document describing te=
mplate configurations]
+
+
+ jgroups-multiplexer-stack
+ [TODO about jgroups-multiplexer-stack - add link t=
o JBoss doc]
+
+
+ jbosscache-cluster-name
+ cluster name (must be unique)
+
+
+ max-volatile-time
+ max time to live for Volatile Index
+
+
+ rdbms-reindexing
+ indicates that the RDBMS re-indexing mechanism should be used if possible; the default value is true
+
+
+ reindexing-page-size
+ maximum number of nodes that can be retrieved from storage for re-indexing purposes; the default value is 100
+
+
+ index-recovery-mode
+ If the parameter is set to from-indexing, a full indexing is launched automatically (default behavior); if it is set to from-coordinator, the index is retrieved from the coordinator
+
+
+ index-recovery-filter
+ Defines the implementation class or classes of RecoveryFilters, the index synchronization mechanism for the Local Index strategy.
+
+
+ async-reindexing
+ Controls the process of re-indexing on JCR's =
startup. If this flag is set, indexing will be launched asynchronously, wit=
hout blocking the JCR. Default is "false".
+
+
+
+
+
+ Improving Query Performance With PostgreSQL and rdbms-reindexing
+
+ If you use PostgreSQL and rdbms-reindexing is set to true, the performance of the queries used while indexing can be improved by:
+
+
+
+
+
+
+ Set the parameter "enable_seqscan" to "off"
+
+
+ OR
+
+
+ Set "default_statistics_target" to at least "50".
+
+
+
+
+ Restart the database server and run ANALYZE on the JCR_SVALUE (or JCR_MVALUE) table.
+
+
+
+
+ Improving Query Performance With DB2 a=
nd rdbms-reindexing
+
+ If you use DB2 and rdbms=
-reindexing is set to true, the performance =
of the queries used while indexing can be improved by:
+
+
+
+
+
+
+ Collect statistics on the JCR_SITEM (or JCR_MITEM) and JCR_SVALUE (or JCR_MVALUE) tables by running the following for each table:
+
+ RUNSTATS ON TABLE <scheme>.<table> WITH DISTRIBUTION AND INDEXES ALL
+
+
+
+
+ Cluster-ready indexing
+
+ For both cluster-ready implementations, the JBoss Cache, JGroups and Changes Filter values must be defined. The shared index requires some kind of remote or shared file system to be attached to the system (for example NFS or SMB), and the indexing directory (the "index-dir" value) must point to it. Setting "changesfilter-class" to "org.exoplatform.services.jcr.impl.core.query.jbosscache.JBossCacheIndexChangesFilter" enables the shared index implementation.
+
+<workspace name="ws">
+  <query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
+    <properties>
+      <property name="index-dir" value="/mnt/nfs_drive/index/db1/ws" />
+      <property name="changesfilter-class"
+        value="org.exoplatform.services.jcr.impl.core.query.jbosscache.JBossCacheIndexChangesFilter" />
+      <property name="jbosscache-configuration" value="jbosscache-indexer.xml" />
+      <property name="jgroups-configuration" value="udp-mux.xml" />
+      <property name="jgroups-multiplexer-stack" value="true" />
+      <property name="jbosscache-cluster-name" value="JCR-cluster-indexer-ws" />
+      <property name="max-volatile-time" value="60" />
+      <property name="rdbms-reindexing" value="true" />
+      <property name="reindexing-page-size" value="1000" />
+      <property name="index-recovery-mode" value="from-coordinator" />
+    </properties>
+  </query-handler>
+</workspace>
+
+ To use the cluster-ready strategy based on local indexes, where each node has its own copy of the index on its local file system, the following configuration must be applied. The indexing directory must point to a folder on the local file system, and "changesfilter-class" must be set to "org.exoplatform.services.jcr.impl.core.query.jbosscache.LocalIndexChangesFilter".
+
+<workspace name="ws">
+  <query-handler class="org.exoplatform.services.jcr.impl.core.query.lucene.SearchIndex">
+    <properties>
+      <property name="index-dir" value="/mnt/nfs_drive/index/db1/ws" />
+      <property name="changesfilter-class"
+        value="org.exoplatform.services.jcr.impl.core.query.jbosscache.LocalIndexChangesFilter" />
+      <property name="jbosscache-configuration" value="jbosscache-indexer.xml" />
+      <property name="jgroups-configuration" value="udp-mux.xml" />
+      <property name="jgroups-multiplexer-stack" value="true" />
+      <property name="jbosscache-cluster-name" value="JCR-cluster-indexer-ws" />
+      <property name="max-volatile-time" value="60" />
+      <property name="rdbms-reindexing" value="true" />
+      <property name="reindexing-page-size" value="1000" />
+      <property name="index-recovery-mode" value="from-coordinator" />
+    </properties>
+  </query-handler>
+</workspace>
+
+
+
+ Local Index Recovery Filters
+
+ A common use case for all cluster-ready applications is the hot joining and leaving of processing units. All nodes that join a cluster for the first time, or rejoin after some downtime, must be brought to a synchronized state.
+
+
+ When using shared value storages, databases and indexes, cluster nodes are synchronized at any given time. This is not the case when a local index strategy is used.
+
+
+ If a new node joins a cluster without an index, the index is retrieved or recreated. Nodes can also be restarted, in which case the index is not empty. By default the existing index is assumed to be up to date, even though it can be outdated.
+
+
+ The JBoss Portal Platform JCR offers a mechanism called RecoveryFilters that automatically retrieves the index for the joining node on start up. This feature is a set of filters that can be defined via the QueryHandler configuration:
+
+ <property name=3D"index-r=
ecovery-filter" value=3D"org.exoplatform.services.jcr.impl.core.q=
uery.lucene.DocNumberRecoveryFilter" />
+
+ The number of filters is not limited, so they can be combined:
+
+ <property name=3D"index-r=
ecovery-filter" value=3D"org.exoplatform.services.jcr.impl.core.q=
uery.lucene.DocNumberRecoveryFilter" />
+ <property name=3D"index-recovery-filter" value=3D"or=
g.exoplatform.services.jcr.impl.core.query.lucene.SystemPropertyRecoveryFil=
ter" />
+
+
+ If any one of them fires, the index is re-synchronized. This feature uses the standard index recovery mode defined by the previously described parameter (which can be "from-indexing" (default) or "from-coordinator"):
+
+ <property name=3D"index-r=
ecovery-mode" value=3D"from-coordinator" />
+
+
+ There are multiple filter implementations:
+
+ <property name=3D"i=
ndex-recovery-filter" value=3D"org.exoplatform.services.jcr.impl.=
core.query.lucene.ConfigurationPropertyRecoveryFilter" />
+ <property name=3D"index-recovery-filter-forcereindexing" =
value=3D"true" />
+
+
+
+
+ org.exoplatform.services.jcr.impl.core.query.lucene.DocN=
umberRecoveryFilter
+
+
+ Checks the number of documents in index on coo=
rdinator side and self-side. It returns true if the coun=
t differs.
+
+
+ The advantage of this filter compared to other=
s, is that it will skip reindexing for workspaces where the index was not m=
odified.
+
+
+ For example, if there are ten repositories with three workspaces each and only one workspace is heavily used in the cluster, this filter will only reindex the workspaces that have been changed, without affecting other indexes.
+
+
+ This greatly reduces start up time.
+
+
+
+
+
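+ The way combined filters decide on re-synchronization can be sketched in plain Java. This is only an illustration under stated assumptions: RecoveryCheck, DocCountCheck and RecoveryDecision are hypothetical names invented for this sketch and are not the eXo JCR classes. DocCountCheck mirrors the idea behind DocNumberRecoveryFilter (compare the document count on the coordinator with the local count), and needsResync reflects the rule that the index is re-synchronized as soon as any configured filter fires.
+

```java
import java.util.List;

// Hypothetical stand-in for a recovery filter; not the eXo JCR interface.
interface RecoveryCheck {
    // Returns true when the local index should be re-synchronized.
    boolean fires();
}

// Mirrors the idea of DocNumberRecoveryFilter: compare document counts.
class DocCountCheck implements RecoveryCheck {
    private final long coordinatorDocs;
    private final long localDocs;

    DocCountCheck(long coordinatorDocs, long localDocs) {
        this.coordinatorDocs = coordinatorDocs;
        this.localDocs = localDocs;
    }

    @Override
    public boolean fires() {
        // A differing count means the local index is out of date.
        return coordinatorDocs != localDocs;
    }
}

class RecoveryDecision {
    // The index is re-synchronized as soon as any configured check fires.
    static boolean needsResync(List<RecoveryCheck> checks) {
        for (RecoveryCheck check : checks) {
            if (check.fires()) {
                return true;
            }
        }
        return false;
    }
}
```

+ A workspace whose counts match on both sides is skipped, which is why this kind of check avoids re-indexing unchanged workspaces.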
+
+ JBoss-Cache template configuration
+
+ JBoss-Cache template configuration for query handler is ab=
out the same for both clustered strategies.
+
+
+ jbosscache-indexer.xml
+<?xml version="1.0" encoding="UTF-8"?>
+<jbosscache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:jboss:jbosscache-core:config:3.1">
+  <locking useLockStriping="false" concurrencyLevel="50000" lockParentForChildInsertRemove="false"
+    lockAcquisitionTimeout="20000" />
+  <!-- Configure the TransactionManager -->
+  <transaction transactionManagerLookupClass="org.jboss.cache.transaction.JBossStandaloneJTAManagerLookup" />
+  <clustering mode="replication" clusterName="${jbosscache-cluster-name}">
+    <stateRetrieval timeout="20000" fetchInMemoryState="false" />
+    <jgroupsConfig multiplexerStack="jcr.stack" />
+    <sync />
+  </clustering>
+  <!-- Eviction configuration -->
+  <eviction wakeUpInterval="5000">
+    <default algorithmClass="org.jboss.cache.eviction.FIFOAlgorithm" eventQueueSize="1000000">
+      <property name="maxNodes" value="10000" />
+      <property name="minTimeToLive" value="60000" />
+    </default>
+  </eviction>
+</jbosscache>
+
+
+ Read more about template configurations .
+
+
+
+
+ Asynchronous Re-indexing
+
+ Managing a large data set with a JCR in a production environment sometimes requires special operations on the indexes stored on the file system. One of these maintenance operations is recreating the index, also called "re-indexing". There are various use cases where it is important to do this, including hardware faults, hard restarts, data corruption, migrations and JCR updates that bring new index-related features. Index re-creation is usually requested at server startup or at runtime.
+
+
+ On startup indexing
+
+ A common usecase for updating and re-creating the index is=
to stop the server and manually remove indexes for workspaces requiring it=
. When the server is re-started, the missing indexes are automatically reco=
vered by re-indexing.
+
+
+ The eXo JCR supports direct RDBMS re-indexing, which can be faster than ordinary re-indexing. It is configured by setting the QueryHandler parameter rdbms-reindexing to true.
+
+
+ A new feature is asynchronous indexing on startup. Usually startup is blocked until the indexing process is finished. This block can last for any period of time, depending on the amount of data persisted in the repositories. It can be avoided by using asynchronous indexing at startup.
+
+
+ Essentially, all indexing operations are performed in the =
background without blocking the repository. This is controlled by the value=
of the async-reindexing parameter in Query=
Handler configuration.
+
+
+ With asynchronous indexing active, the JCR starts with no active indexes present. Queries on the JCR can still be executed without exceptions, but no results are returned until index creation has completed.
+
+
+ The index state check is accomplished via QueryMa=
nagerImpl:
+
+
+ boolean online = ((QueryManagerImpl) workspace.getQueryManager()).getQueryHandeler().isOnline();
+
+
+
+ The OFFLINE state means that the index is currently being re-created. When the state changes, a corresponding log event is printed. When the background index task starts, the index is switched to OFFLINE, with the following log event:
+
+ [INFO] Setting index OFFLINE (repository/productio=
n[system]).
+
+ When the indexing process is finished, the following two e=
vents are logged :
+
+ [INFO] Created initial index for 143018 nodes (rep=
ository/production[system]).
+[INFO] Setting index ONLINE (repository/production[system]).
+
+ These two log lines indicate the end of the process for the workspace given in brackets. Calling isOnline() as mentioned above will then also return true.
+
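+ Code that must wait for an asynchronously built index can poll the online state with a timeout. The following is a minimal sketch, not part of the JCR API: IndexStateWaiter and waitUntilOnline are hypothetical names, and the index check is passed in as a BooleanSupplier so that the sketch compiles with the JDK alone. In a real deployment the supplier would presumably wrap the isOnline() call described in this section.
+

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper; the name and signature are assumptions for illustration.
class IndexStateWaiter {

    // Polls the supplier until it reports true (index ONLINE) or the timeout elapses.
    // Returns true if the index came online in time, false otherwise.
    static boolean waitUntilOnline(BooleanSupplier isOnline, long timeoutMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (isOnline.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(pollMillis);
        }
    }
}
```

+ Polling is used here only because it needs nothing beyond the state check; the log events listed above remain the primary way to follow the process.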
+
+
+ Hot Asynchronous Workspace Re-indexing using JMX
+
+ Some hard system faults, errors during upgrades, migration issues and other factors may corrupt the index. Current versions of the JCR support the Hot Asynchronous Workspace Reindexing feature. It allows Service Administrators to launch the process in the background, without stopping or blocking the whole application, by using any JMX-compatible console.
+
+
+ JMX Jconsole
+
+
+
+
+
+
+
+ The server can continue working as expected while the inde=
x is recreated.
+
+
+ This depends on the allow queries flag being passed via the JMX interface to the reindex operation invocation. If the flag is set, the application continues working.
+
+
+ However, there is one critical limitation users must be aware of: the index is frozen while the background task is running.
+
+
+ This means that queries are performed on the version of the index present at the moment the indexing task is started, and that data written into the repository after the task starts will not be available through search until the process completes.
+
+
+ Data added during re-indexation is also indexed, but will =
be available only when reindexing is complete. The JCR makes a snapshot of =
indexes at the invocation of the asynchronous indexing task and uses that s=
napshot for searches.
+
+
+ When the operation is finished, the stale index is replaced by the newly created index, which includes any newly added data.
+
+
+ If the allow queries flag is set to=
false, then all queries will throw an exception while t=
ask is running. The current state can be acquired using the following JMX o=
peration:
+
+
+
+
+ getHotReindexingState() - returns information abou=
t latest invocation: start time, if in progress or finish time if done.
+
+
+
+
+
+ Notices
+
+ Hot re-indexing via JMX cannot be launched if the index is already in offline mode. This means that the index is currently involved in some other operation, such as re-indexing at startup or copying to another node in the cluster.
+
+
+ Also, Hot Asynchronous Reindexing via JMX and on-startup reindexing are different features, so you cannot get the state of startup reindexing using the getHotReindexingState command in the JMX interface. There are, however, some common JMX operations:
+
+
+
+
+ getIOMode - returns current index IO mode (READ_ON=
LY / READ_WRITE), belongs to clustered configuration states;
+
+
+
+
+ getState - returns current state: ONLINE / OFFLINE.
+
+
+
+
+
+
+ Advanced tuning
+
+ Lucene tuning
+
+ As mentioned, JCR Indexing is based on the Lucene indexing=
library as the underlying search engine. It uses Directories to store inde=
x and manages access to index by Lock Factories.
+
+
+ By default, the JCR implementation uses optimal combinatio=
n of Directory implementation and Lock Factory implementation.
+
+
+ The SimpleFSDirectory is used in Window=
s environments and the NIOFSDirectory implementation is =
used in non-Windows systems.
+
+
+ NativeFSLockFactory is an optimal solut=
ion for a wide variety of cases including clustered environment with NFS sh=
ared resources.
+
+
+ But those defaults can be overridden in the system propert=
ies.
+
+
+ Two properties: org.exoplatform.jcr.lucene.store.=
FSDirectoryLockFactoryClass and org.exoplatform.jcr.luce=
ne.FSDirectory.class control (and change) the default behavior.
+
+
+ The first defines the implementation of abstract Lucene LockFactory class and the second sets implementation class=
for FSDirectory instances.
+
+
+ For more information, refer to the Lucene documentation. B=
ut be careful, for while the JCR allows users to change implementation clas=
ses of Lucene internals, it does not guarantee the stability and functional=
ity of those changes.
+
+
+
+
+
+ JBossTransactionsService
+
+ Introduction
+
+ JBossTransactionsService implements eXo TransactionService and provides access to the JBoss Transaction Service (JBossTS) JTA implementation via an eXo container dependency.
+
+
+ TransactionService is used in the JCR cache implementation org.exoplatform.services.jcr.impl.dataflow.persistent.jbosscache.JBossCacheWorkspaceStorageCache.
+
+
+
+ Configuration
+
+ Example configuration:
+
+ <component>
+ <key>org.exoplatform.services.transaction.TransactionService<=
/key>
+ <type>org.exoplatform.services.transaction.jbosscache.JBossTrans=
actionsService</type>
+ <init-params>
+ <value-param>
+ <name>timeout</name>
+ <value>3000</value>
+ </value-param>
+ </init-params>
+ </component>
+
+ timeout - XA transaction timeout in seconds
+
+
+
+
+ JCR Query Use-cases
+
+ Introduction
+
+ The JCR supports two query languages: SQL and XPath. A query, whether XPath or SQL, specifies a subset of nodes within a workspace, called the result set. The result set consists of all the nodes in the workspace that meet the constraints stated in the query.
+
+
+
+ Query Lifecycle
+
+ Query Creation and Execution
+
+ SQL
+ // get QueryMana=
ger
+QueryManager queryManager =3D workspace.getQueryManager();
+// make SQL query
+Query query =3D queryManager.createQuery("SELECT * FROM nt:base "=
;, Query.SQL);
+// execute query
+QueryResult result =3D query.execute();
+
+
+ XPath
+ // get QueryMana=
ger
+QueryManager queryManager =3D workspace.getQueryManager(); =
+// make XPath query
+Query query =3D queryManager.createQuery("//element(*,nt:base)",=
Query.XPATH);
+// execute query
+QueryResult result =3D query.execute();
+
+
+
+ Query Result Processing
+ // fetch query res=
ult
+QueryResult result =3D query.execute();
+
+ To fetch the nodes:
+
+ NodeIterator it =
=3D result.getNodes();
+
+ The results can be formatted in a table:
+
+ // get column names
+String[] columnNames =3D result.getColumnNames();
+// get column rows
+RowIterator rowIterator =3D result.getRows();
+while(rowIterator.hasNext()){
+ // get next row
+ Row row =3D rowIterator.nextRow();
+ // get all values of row
+ Value[] values =3D row.getValues();
+}
+
+
+ Scoring
+
+ The result returns a score for each row in the result set. The score is a value that rates how well the result node matches the query: a higher value means a better match than a lower one. This score can be used for ordering the result.
+
+
+ eXo JCR Scoring is a mapping of Lucene scoring. For a more=
in-depth understanding, please study Lucene documentation.
+
+
+ The jcr:score is calculated as (lucene score)*1000f.
+
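+ As a worked instance of the formula above, a Lucene score of 0.25 corresponds to a jcr:score of 250. The helper below is a hypothetical one-liner written for illustration only (JcrScore and fromLucene are not eXo JCR names).
+

```java
// Hypothetical helper illustrating the scaling only; not an eXo JCR class.
class JcrScore {
    // jcr:score = (Lucene score) * 1000f
    static float fromLucene(float luceneScore) {
        return luceneScore * 1000f;
    }
}
```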
+
+
+
+ Tips and tricks
+
+ XPath queries containing node names starting with a number
+
+ If you execute an XPath request like this...
+
+ // get QueryManager
+QueryManager queryManager =3D workspace.getQueryManager(); =
+// make XPath query
+Query query =3D queryManager.createQuery("/jcr:root/Documents/Publie/=
2010//element(*, exo:article)", Query.XPATH);
+
+ ...you will receive an Invalid request error. This is becau=
se XML (and thus XPath) does not allow names starting with a number.
+
+
+ Therefore, XPath requests using a node name that starts with a number ar=
e invalid.
+
+
+ Some possible alternatives are:
+
+
+
+
+ Use an SQL request.
+
+
+
+
+ Use escaping:
+
+ // get QueryMa=
nager
+QueryManager queryManager =3D workspace.getQueryManager(); =
+// make XPath query
+Query query =3D queryManager.createQuery("/jcr:root/Documents/Publie/=
_x0032_010//element(*, exo:article)", Query.XPATH);
+
+
+
+
+
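+ The escaped name in the example above replaces the leading digit 2 with _x0032_, that is, the digit's Unicode code point written as four hexadecimal digits. A minimal sketch of that single rule follows. XPathNames and escapeLeadingDigit are hypothetical names, not part of the JCR API, and the helper deliberately handles only a leading ASCII digit rather than every character that is invalid in an XML name.
+

```java
// Hypothetical helper; handles only the leading-digit case discussed above.
class XPathNames {

    // Encodes a leading ASCII digit as _xHHHH_ (four hex digits of its code point)
    // and leaves every other name unchanged.
    static String escapeLeadingDigit(String name) {
        if (name.isEmpty()) {
            return name;
        }
        char first = name.charAt(0);
        if (first < '0' || first > '9') {
            return name;
        }
        return String.format("_x%04x_", (int) first) + name.substring(1);
    }

    // Builds the statement from the example, escaping the last path segment.
    static String articleQuery(String folderName) {
        return "/jcr:root/Documents/Publie/" + escapeLeadingDigit(folderName)
                + "//element(*, exo:article)";
    }
}
```

+ Escaping each path segment separately keeps the slashes and the element() test untouched.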
+
+ Searching Repository Content
+
+ Introduction
+
+ You can find the JCR configuration file here: JP=
P_DIST/gatein/gatein.ear/portal.war/portal/WEB-INF/conf/jcr/r=
epository-configuration.xml.
+
+
+ Please refer to for more information about index configuration.
+
+
+
+ Bi-directional RangeIterator
+
+ QueryResult.getNodes() will return bi-directional NodeIterator implementation.
+
+
+
+ The bi-directional NodeIterator is not supported in two cases:
+
+
+
+
+ SQL query: select * from nt:base
+
+
+
+
+ XPath query: //* .
+
+
+
+
+
+ TwoWayRangeIterator interface:
+
+ /**
+ * Skip a number of elements backwards in the iterator.
+ *
+ * @param skipNum the non-negative number of elements to skip
+ * @throws java.util.NoSuchElementException if skipped past the first element
+ * in the iterator.
+ */
+public void skipBack(long skipNum);
+
+ Usage:
+
+ NodeIterator iter = queryResult.getNodes();
+while (iter.hasNext()) {
+  if (skipForward) {
+    iter.skip(10); // Skip 10 nodes in forward direction
+  } else if (skipBack) {
+    TwoWayRangeIterator backIter = (TwoWayRangeIterator) iter;
+    backIter.skipBack(10); // Skip 10 nodes back
+  }
+  .......
+}
+
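+ The skip and skipBack semantics can be tried out without a repository using a small list-backed iterator. This is a sketch under assumptions: ListTwoWayIterator is a hypothetical class written for illustration and is not the eXo JCR implementation. It assumes that skipBack moves the cursor back by skipNum positions and, as the Javadoc above states, throws NoSuchElementException when that would go past the first element.
+

```java
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical list-backed iterator; not the eXo JCR TwoWayRangeIterator.
class ListTwoWayIterator<T> {
    private final List<T> items;
    private int position = 0; // index of the element next() will return

    ListTwoWayIterator(List<T> items) {
        this.items = items;
    }

    boolean hasNext() {
        return position < items.size();
    }

    T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return items.get(position++);
    }

    // Moves the cursor forward; fails if that would pass the last element.
    void skip(long skipNum) {
        if (skipNum < 0) {
            throw new IllegalArgumentException("skipNum must be non-negative");
        }
        if (skipNum > items.size() - position) {
            throw new NoSuchElementException();
        }
        position += (int) skipNum;
    }

    // Moves the cursor backward; fails if that would pass the first element.
    void skipBack(long skipNum) {
        if (skipNum < 0) {
            throw new IllegalArgumentException("skipNum must be non-negative");
        }
        if (skipNum > position) {
            throw new NoSuchElementException();
        }
        position -= (int) skipNum;
    }
}
```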
+
+ Fuzzy Searches
+
+ The JBoss Portal Platform JCR supports features such as Lucene Fuzzy Se=
arches. To perform a fuzzy search, form your query like the one below:
+
+ QueryManager qman =
=3D session.getWorkspace().getQueryManager();
+Query q =3D qman.createQuery("select * from nt:base where contains(fi=
eld, 'ccccc~')", Query.SQL);
+QueryResult res =3D q.execute();
+
+
+ SynonymSearch
+
+ Searching with synonyms is integrated in the jcr:contains() function and uses the same syntax as synonym searches in web search=
engines (Google, for example). If a search term is prefixed by a tilde sym=
bol ( ~ ), synonyms of the search term are taken into consideration. For ex=
ample:
+
+ SQL: select * from nt:resource where contains(., '~parameter')
+
+XPath: //element(*, nt:resource)[jcr:contains(., '~parameter')]
+
+ This feature is disabled by default and you need to add a configuration=
parameter to the query-handler element in your JCR configuration file to e=
nable it.
+
+ <param name="synonymprovider-config-path" value="..your path to configuration file....."/>
+<param name="synonymprovider-class" value="org.exoplatform.services.jcr.impl.core.query.lucene.PropertiesSynonymProvider"/>
+ /**
+ * <code>SynonymProvider</code> defines an interface for a com=
ponent that
+ * returns synonyms for a given term.
+ */
+public interface SynonymProvider {
+
+ /**
+ * Initializes the synonym provider and passes the file system resource=
to
+ * the synonym provider configuration defined by the configuration valu=
e of
+ * the <code>synonymProviderConfigPath</code> parameter. Th=
e resource may be
+ * <code>null</code> if the configuration parameter is not =
set.
+ *
+ * @param fsr the file system resource to the synonym provider
+ * configuration.
+ * @throws IOException if an error occurs while initializing the synonym
+ * provider.
+ */
+ public void initialize(InputStream fsr) throws IOException;
+
+ /**
+ * Returns an array of terms that are considered synonyms for the given
+ * <code>term</code>.
+ *
+ * @param term a search term.
+ * @return an array of synonyms for the given <code>term</code=
> or an empty
+ * array if no synonyms are known.
+ */
+ public String[] getSynonyms(String term);
+}
+
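+ A custom provider only has to implement the two methods of the interface listed above. The sketch below is an illustration under assumptions, not the shipped PropertiesSynonymProvider: SimpleSynonymProvider is a hypothetical class, the interface is repeated locally so the example compiles on its own, and the file format (one term=synonym1,synonym2 entry per line in Java properties syntax) is assumed for this example only.
+

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Properties;

// Local copy of the interface shown above, so the sketch is self-contained.
interface SynonymProvider {
    void initialize(InputStream fsr) throws IOException;

    String[] getSynonyms(String term);
}

// Hypothetical provider; the "term=syn1,syn2" file format is an assumption.
class SimpleSynonymProvider implements SynonymProvider {
    private final Map<String, String[]> synonyms = new HashMap<>();

    @Override
    public void initialize(InputStream fsr) throws IOException {
        synonyms.clear();
        if (fsr == null) {
            return; // the configuration parameter was not set
        }
        Properties props = new Properties();
        props.load(fsr);
        for (String term : props.stringPropertyNames()) {
            String value = props.getProperty(term).trim();
            if (!value.isEmpty()) {
                synonyms.put(term.toLowerCase(Locale.ROOT), value.split("\\s*,\\s*"));
            }
        }
    }

    @Override
    public String[] getSynonyms(String term) {
        String[] found = synonyms.get(term.toLowerCase(Locale.ROOT));
        // An empty array, never null, when no synonyms are known.
        return found == null ? new String[0] : found.clone();
    }
}
```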
+
+ Highlighting
+
+ An ExcerptProvider retrieves text excerpts for a nod=
e in the query result and marks up the words in the text that match the que=
ry terms.
+
+
+ By default, match highlighting is disabled because it requires that additional information is written to the search index.
+
+
+ To enable this feature, you need to add a configuration parameter to th=
e query-handler element in your JCR configuration fi=
le:
+
+ <param name=3D"=
;support-highlighting" value=3D"true"/>
+
+ Additionally, there is a parameter that controls the format of the exce=
rpt created. In JCR 1.9, the default is set to org.exoplatform.ser=
vices.jcr.impl.core.query.lucene.DefaultHTMLExcerpt. The configur=
ation parameter for this setting is:
+
+ <param name=3D"=
;excerptprovider-class" value=3D"org.exoplatform.services.jcr.imp=
l.core.query.lucene.DefaultXMLExcerpt"/>
+
+ DefaultXMLExcerpt
+
+ This excerpt provider creates an XML fragment of the following form:
+
+ <excerpt>
+ <fragment>
+ <highlight>exoplatform</highlight> implements both the=
mandatory
+ XPath and optional SQL <highlight>query</highlight> sy=
ntax.
+ </fragment>
+ <fragment>
+ Before parsing the XPath <highlight>query</highlight> =
in
+ <highlight>exoplatform</highlight>, the statement is s=
urrounded
+ </fragment>
+</excerpt>
+
+
+ DefaultHTMLExcerpt
+
+ This excerpt provider creates an HTML fragment of the following form:
+
+ <div>
+ <span>
+ <strong>exoplatform</strong> implements both the manda=
tory XPath
+ and optional SQL <strong>query</strong> syntax.
+ </span>
+ <span>
+ Before parsing the XPath <strong>query</strong> in
+ <strong>exoplatform</strong>, the statement is surroun=
ded
+ </span>
+</div>
+
+
+ Usage
+
+ If you are using XPath, you must use the rep:excerpt() fu=
nction in the last location step, just like you would select properties:
+
+ QueryManager qm =
=3D session.getWorkspace().getQueryManager();
+Query q = qm.createQuery("//*[jcr:contains(., 'exoplatform')]/(@Title|rep:excerpt(.))", Query.XPATH);
+QueryResult result =3D q.execute();
+for (RowIterator it =3D result.getRows(); it.hasNext(); ) {
+ Row r =3D it.nextRow();
+ Value title =3D r.getValue("Title");
+ Value excerpt =3D r.getValue("rep:excerpt(.)");
+}
+
+ The above code searches for nodes that contain the word exop=
latform and then gets the value of the Title property and an excerpt for each resultant node.
+
+
+ It is also possible to use a relative path in the call Row.getVa=
lue() while the query statement still remains the same. Also, you ma=
y use a relative path to a string property. The returned value will then be=
an excerpt based on string value of the property.
+
+
+ Both available excerpt providers will create fragments of about 150 ch=
aracters and up to three fragments.
+
+
+ In SQL, the function is called excerpt() without the rep =
prefix, but the column in the RowIterator will nonethele=
ss be labelled rep:excerpt(.).
+
+ QueryManager qm =
=3D session.getWorkspace().getQueryManager();
+Query q =3D qm.createQuery("select excerpt(.) from nt:resource where =
contains(., 'exoplatform')", Query.SQL);
+QueryResult result =3D q.execute();
+for (RowIterator it =3D result.getRows(); it.hasNext(); ) {
+ Row r =3D it.nextRow();
+ Value excerpt =3D r.getValue("rep:excerpt(.)");
+}
+
+
+
+ SpellChecker
+
+ The lucene based query handler implementation supports a pluggable spel=
l-checker mechanism. By default, spell checking is not available, it must b=
e configured first.
+
+
+ Information about the spellCheckerClass paramete=
r is available in .
+
+
+ The JCR currently provides an implementation class which uses the lucene-spellch=
ecker.
+
+
+ The dictionary is derived from the fulltext, indexed content of the wor=
kspace and updated periodically. You can configure the refresh interval by =
picking one of the available inner classes of org.exoplatform.serv=
ices.jcr.impl.core.query.lucene.spell.LuceneSpellChecker:
+
+
+
+
+ OneMinuteRefreshInterval
+
+
+
+
+ FiveMinutesRefreshInterval
+
+
+
+
+ ThirtyMinutesRefreshInterval
+
+
+
+
+ OneHourRefreshInterval
+
+
+
+
+ SixHoursRefreshInterval
+
+
+
+
+ TwelveHoursRefreshInterval
+
+
+
+
+ OneDayRefreshInterval
+
+
+
+
+ For example, if you want a refresh interval of six hours, the class nam=
e would be; org.exoplatform.services.jcr.impl.core.query.lucene.sp=
ell.LuceneSpellChecker$SixHoursRefreshInterval.
+
+
+ If you use org.exoplatform.services.jcr.impl.core.query.lucene=
.spell.LuceneSpellChecker, the refresh interval will be one hour.
+
+
+ The spell checker dictionary is stored as a lucene index under <index-dir>/spellchecker. If this index does not exist, =
a background thread will create it on start up. Similarly, the dictionary r=
efresh is also done in a background thread so as not to block regular queri=
es.
+
+
+ Usage
+
+ You can spell check a fulltext statement either with an XPath or a SQL=
query:
+
+ // rep:spellcheck(=
'explatform') will always evaluate to true
+Query query =3D qm.createQuery("/jcr:root[rep:spellcheck('explat=
form')]/(rep:spellcheck())", Query.XPATH);
+RowIterator rows =3D query.execute().getRows();
+// the above query will always return the root node no matter what string =
we check
+Row r =3D rows.nextRow();
+// get the result of the spell checking
+Value v =3D r.getValue("rep:spellcheck()");
+if (v =3D=3D null) {
+ // no suggestion returned, the spelling is correct or the spell checker
+ // does not know how to correct it.
+} else {
+ String suggestion =3D v.getString();
+}
+
+ And the same using SQL:
+
+ // SPELLCHECK('explatform') will always evaluate to true
+Query query = qm.createQuery("SELECT rep:spellcheck() FROM nt:base WHERE jcr:path = '/' AND SPELLCHECK('explatform')", Query.SQL);
+RowIterator rows =3D query.execute().getRows();
+// the above query will always return the root node no matter what string =
we check
+Row r =3D rows.nextRow();
+// get the result of the spell checking
+Value v =3D r.getValue("rep:spellcheck()");
+if (v =3D=3D null) {
+ // no suggestion returned, the spelling is correct or the spell checker
+ // does not know how to correct it.
+} else {
+ String suggestion =3D v.getString();
+}
+
+
+
+ Similarity
+
+ Starting with version 1.12, JCR allows you to search for nodes that are similar to an existing node.
+
+
+ Similarity is determined by looking up terms that are common to nodes. A term must meet some conditions to be considered. This is required to limit the number of possibly relevant terms.
+
+
+ To be considered, terms must:
+
+
+
+
+ Be at least four characters long.
+
+
+
+
+ Occur at least twice in the source node.
+
+
+
+
+ Occur in at least five other nodes.
+
+
+
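+ The three conditions above can be sketched as a simple filter. This is only an illustration of the rules as stated, not the actual implementation used by the query handler; the class SimilarityTermFilter, its method, and the two input maps are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: applies the three documented conditions to
// term statistics that the caller is assumed to have collected already.
class SimilarityTermFilter {
    static final int MIN_LENGTH = 4;             // at least four characters long
    static final int MIN_SOURCE_OCCURRENCES = 2; // at least twice in the source node
    static final int MIN_OTHER_NODES = 5;        // in at least five other nodes

    // sourceTermCounts: term -> occurrences in the source node (hypothetical input)
    // otherNodeCounts:  term -> number of other nodes containing it (hypothetical input)
    static List<String> relevantTerms(Map<String, Integer> sourceTermCounts,
                                      Map<String, Integer> otherNodeCounts) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : sourceTermCounts.entrySet()) {
            String term = e.getKey();
            int inOtherNodes = otherNodeCounts.getOrDefault(term, 0);
            if (term.length() >= MIN_LENGTH
                    && e.getValue() >= MIN_SOURCE_OCCURRENCES
                    && inOtherNodes >= MIN_OTHER_NODES) {
                result.add(term);
            }
        }
        return result;
    }
}
```

+ A term that fails any one of the three checks is dropped, which keeps the set of candidate terms small.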
+
+ Note
+
+ The similarity function requires highlighting support to be enabled. Make sure that the following parameter is set for the query handler in your workspace.xml.
+
+ <param name="support-highlighting" value="true"/>
+
+
+ The functions (rep:similar() in XPath and similar() in SQL) have two arguments:
+
+
+
+
+ relativePath
+
+
+ A relative path to a descendant node or a period (.) for the current node.
+
+
+
+
+ absoluteStringPath
+
+
+ A string literal that contains the path to the node for which to fin=
d similar nodes.
+
+
+
+
+
+ Warning
+
+ Relative path is not supported yet.
+
+
+
+ Example
+ //element(*, nt:resource)[rep:similar(., '/parentnode/node.txt/jcr:content')]
+
+ Finds nt:resource nodes that are similar to the node at the path /parentnode/node.txt/jcr:content.
+
+
+
+
+
+ Full Text Search And Affecting Settings
+
+ Property content indexing
+
+ Each property of a node (if it is indexable) is processed with the Luce=
ne analyzer and stored in the Lucene index. This is called indexing of a pr=
operty. It allows fulltext searching of these indexed properties.
+
+
+
+ Lucene Analyzers
+
+ The purpose of analyzers is to transform all strings stored in the index into a well-defined, normalized form. The same analyzer is used when searching, so that the query string is transformed in the same way as the indexed text.
+
+
+ Therefore, performing the same query using different analyzers can retu=
rn different results.
+
+
+ The example below illustrates how the same string is transformed by dif=
ferent analyzers.
+
+
+
+
+ StandardAnalyzer is the default analyzer in the JBoss Portal Platform JCR search engine, but it does not use stop words.
+
+
+
+ You can assign your analyzer as described in .
+
+
+
+ Property Indexing
+
+ Different property types are indexed in different ways, which affects whether a property can be found by a fulltext search across all properties or by a fulltext search on that specific property.
+
+
+ Only two property types are indexed as fulltext searchable: STRING and BINARY.
+
+
+ Fulltext search by different properties
+
+
+
+ Property Type
+ Fulltext search by all properties
+ Fulltext search by exact property
+
+
+
+
+ STRING
+ YES
+ YES
+
+
+ BINARY
+ YES
+ NO
+
+
+
+
+
+ For example, the jcr:data property (which is BINARY) will not be found with a query structured as:
+
+ SELECT * FROM nt:resource WHERE CONTAINS(jcr:data, 'some string')
+
+ This is because BINARY properties cannot be found by a fulltext search on an exact property.
+
+
+ However, the following query will return results (provided, of course, that the node contains the targeted data):
+
+ SELECT * FROM nt:resource WHERE CONTAINS( * , 'some string')
+
+
+ Different Analyzers
+
+ First, populate the repository with nodes of mixin type 'mix:title' that have different values of the 'jcr:description' property.
+
+ root
+ ├── document1 (mix:title) jcr:description = "The quick brown fox jumped over the lazy dogs"
+ ├── document2 (mix:title) jcr:description = "Brown fox live in forest."
+ └── document3 (mix:title) jcr:description = "Fox is a nice animal."
+
+
+ The example below shows different analyzers in action. The first instance uses the base JCR settings, so the string "The quick brown fox jumped over the lazy dogs" is transformed to the set {[the] [quick] [brown] [fox] [jumped] [over] [the] [lazy] [dogs]}.
+
+// make SQL query
+QueryManager queryManager = workspace.getQueryManager();
+String sqlStatement = "SELECT * FROM mix:title WHERE CONTAINS(jcr:description, 'the')";
+// create query
+Query query = queryManager.createQuery(sqlStatement, Query.SQL);
+// execute query and fetch result
+QueryResult result = query.execute();
+
+ The NodeIterator returned by result.getNodes() will contain document1.
+
+
+ However, if the default analyzer is changed to org.apache.lucene.analysis.StopAnalyzer, the repository is populated again (the new analyzer must process the node properties), and the same query is run, it will return nothing, because stop words like "the" are excluded from the parsed string set.
+
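+ The effect of stop words can be illustrated without Lucene. The sketch below is a simplified stand-in, not the Lucene StandardAnalyzer or StopAnalyzer: the class StopWordSketch, its tokenization rule, and its short stop-word list are all assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Simplified stand-in for an analyzer; it does not use Lucene.
class StopWordSketch {
    // Illustrative stop-word list (an assumption, not Lucene's actual list).
    static final Set<String> STOP_WORDS =
        new HashSet<>(Arrays.asList("the", "a", "is", "in"));

    // Lower-cases the text, splits it on non-letters, and optionally
    // drops the stop words.
    static List<String> tokenize(String text, boolean removeStopWords) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (raw.isEmpty()) {
                continue;
            }
            if (removeStopWords && STOP_WORDS.contains(raw)) {
                continue;
            }
            tokens.add(raw);
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "The quick brown fox jumped over the lazy dogs";
        System.out.println(tokenize(text, false)); // "the" is kept
        System.out.println(tokenize(text, true));  // "the" is dropped
    }
}
```

+ With stop words removed, the token "the" is never indexed, so a query for 'the' has nothing to match.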
+
+
+
+ WebDAV
+
+ Introduction
+
+ The WebDAV protocol enables you to =
use third party tools to communicate with hierarchical content servers via =
the HTTP protocol. It is possible to add and remove documents or a set of d=
ocuments from a path on the server.
+
+
+ DeltaV is an extension of the WebDAV protocol that allows managing document versioning. The Locking feature guarantees protection against multiple access when writing resources. The ordering support allows changing the position of a resource in the list and sorting the directory so that the directory tree can be viewed conveniently. The full-text search makes it easy to find the necessary documents. You can search by using two languages: SQL and XPath.
+
+
+ In the eXo JCR, the WebDAV layer (based on the code taken from=
the extension modules of the reference implementation) is plugged in on to=
p of our JCR implementation. This makes it possible to browse a workspace u=
sing the third party tools regardless of operating system environments. You=
can use a Java WebDAV client, such as DAVExplorer or Internet Explorer using
+ File
+ Open as a Web Folder
+ .
+
+
+ WebDAV is an extension of the REST service. To get the WebDAV server ready, you must deploy the REST application. Then, you can access any workspace of your repository by using the following URL:
+
+
+
+
+
+ When accessing the WebDAV server via , you can substit=
ute production with collaboration.
+
+
+ You will be asked to enter your login credentials. These are checked against the organization service, which can be backed by an InMemory (dummy) module, a DB module, or an LDAP module. The JCR user session is then created with the correct JCR credentials.
+
+
+ Note:
+
+ If you try the "in ECM" option, add "@ecm" to the user's password. Alternatively, you may modify jaas.conf by adding the domain=ecm option as follows:
+
+ exo-domain {
+   org.exoplatform.services.security.jaas.BasicLoginModule required domain=ecm;
+};
+
+
+
+ WebDAV Configuration
+
+ The WebDAV configuration file:
+
+ <component>
+   <key>org.exoplatform.services.webdav.WebDavServiceImpl</key>
+   <type>org.exoplatform.services.webdav.WebDavServiceImpl</type>
+   <init-params>
+
+     <!-- this parameter indicates the default login and password values
+          used as credentials for accessing the repository -->
+     <!-- value-param>
+       <name>default-identity</name>
+       <value>admin:admin</value>
+     </value-param -->
+
+     <!-- this is the value of WWW-Authenticate header -->
+     <value-param>
+       <name>auth-header</name>
+       <value>Basic realm="eXo-Platform Webdav Server 1.6.1"</value>
+     </value-param>
+
+     <!-- default node type which is used for the creation of collections -->
+     <value-param>
+       <name>def-folder-node-type</name>
+       <value>nt:folder</value>
+     </value-param>
+
+     <!-- default node type which is used for the creation of files -->
+     <value-param>
+       <name>def-file-node-type</name>
+       <value>nt:file</value>
+     </value-param>
+
+     <!-- if MimeTypeResolver can't find the required mime type,
+          which conforms with the file extension, and the mimeType header is absent
+          in the HTTP request header, this parameter is used
+          as the default mime type -->
+     <value-param>
+       <name>def-file-mimetype</name>
+       <value>application/octet-stream</value>
+     </value-param>
+
+     <!-- This parameter indicates one of the three cases when you update the content of the resource by PUT command.
+          In case of "create-version", PUT command creates the new version of the resource if this resource exists.
+          In case of "replace" - if the resource exists, PUT command updates the content of the resource and its last modification date.
+          In case of "add", the PUT command tries to create the new resource with the same name (if the parent node allows same-name siblings). -->
+
+     <value-param>
+       <name>update-policy</name>
+       <value>create-version</value>
+       <!--value>replace</value -->
+       <!-- value>add</value -->
+     </value-param>
+
+     <!--
+          This parameter determines how service responds to a method that attempts to modify file content.
+          In case of "checkout-checkin" value, when a modification request is applied to a checked-in version-controlled resource, the request is automatically preceded by a checkout and followed by a checkin operation.
+          In case of "checkout" value, when a modification request is applied to a checked-in version-controlled resource, the request is automatically preceded by a checkout operation.
+     -->
+     <value-param>
+       <name>auto-version</name>
+       <value>checkout-checkin</value>
+       <!--value>checkout</value -->
+     </value-param>
+
+     <!--
+          This parameter is responsible for managing Cache-Control header value which will be returned to the client.
+          You can use patterns like "text/*", "image/*" or wildcard to define the type of content.
+     -->
+     <value-param>
+       <name>cache-control</name>
+       <value>text/xml,text/html:max-age=3600;image/png,image/jpg:max-age=1800;*/*:no-cache;</value>
+     </value-param>
+
+     <!--
+          This parameter determines the absolute path to the folder icon file, which is shown
+          during WebDAV view of the contents
+     -->
+     <value-param>
+       <name>folder-icon-path</name>
+       <value>/absolute/path/to/file</value>
+     </value-param>
+
+   </init-params>
+ </component>
+
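+ The cache-control value above pairs comma-separated content type patterns with a Cache-Control directive, and separates the pairs with semicolons. The sketch below shows one way such a value could be read and applied. It is not the WebDAV service's own code: the class CacheControlSketch is hypothetical, and the first-match-wins rule in resolve is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch, not the WebDavServiceImpl implementation.
class CacheControlSketch {

    // Parses "type,type:directive;type:directive;" into an ordered
    // pattern -> directive map (declaration order is preserved).
    static Map<String, String> parse(String config) {
        Map<String, String> rules = new LinkedHashMap<>();
        for (String entry : config.split(";")) {
            entry = entry.trim();
            int colon = entry.indexOf(':');
            if (entry.isEmpty() || colon < 0) {
                continue; // skip empty or malformed entries
            }
            String directive = entry.substring(colon + 1).trim();
            for (String pattern : entry.substring(0, colon).split(",")) {
                rules.put(pattern.trim(), directive);
            }
        }
        return rules;
    }

    // A pattern matches when both halves are equal or are the "*" wildcard.
    static boolean matches(String pattern, String mimeType) {
        String[] p = pattern.split("/", 2);
        String[] m = mimeType.split("/", 2);
        if (p.length != 2 || m.length != 2) {
            return false;
        }
        return (p[0].equals("*") || p[0].equalsIgnoreCase(m[0]))
            && (p[1].equals("*") || p[1].equalsIgnoreCase(m[1]));
    }

    // Assumption: the first matching pattern wins; returns null if none match.
    static String resolve(Map<String, String> rules, String mimeType) {
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            if (matches(rule.getKey(), mimeType)) {
                return rule.getValue();
            }
        }
        return null;
    }
}
```

+ With the value shown in the configuration, text/html resolves to max-age=3600, image/png to max-age=1800, and any other type falls through to the */* entry and gets no-cache.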
+
+ Corresponding WebDAV and JCR actions
+
+
+
+ WebDAV Considerations
+
+ There are some restrictions for WebDAV in different operating =
systems.
+
+
+ Windows 7
+
+ When attempting to set up a web folder through Add a Network Location or Map a Network Drive through My Computer, you may encounter an error message stating The folder you entered does not appear to be valid. Please choose another or Windows cannot access … Check the spelling of the name. Otherwise, there might be …. These errors may appear with both SSL and non-SSL connections.
+
+
+
+ To fix this, do as follows:
+
+
+
+
+ Go to Windows Registry Editor.
+
+
+
+
+ Find the key \HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\WebClient\Parameters and locate its BasicAuthLevel value.
+
+
+
+
+ Change the value to 2.
+
+
+
+
+ Microsoft Office 2010
+
+ If you have:
+
+
+
+
+
+ Microsoft Office 2007/2010 applications installed on a=
client computer AND...
+
+
+
+
+ The client computer is connected to a web server confi=
gured for Basic authentication VIA...
+
+
+
+
+ A connection that does not use Secure Sockets Layer (S=
SL) AND...
+
+
+
+
+ You try to access an Office file that is stored on the=
remote server...
+
+
+
+
+ You might experience the following symptoms when you t=
ry to open or to download the file:
+
+
+
+
+ The Office file does not open or download.
+
+
+
+
+ You do not receive a Basic authentication pass=
word prompt when you try to open or to download the file.
+
+
+
+
+ You do not receive an error message when you t=
ry to open the file. The associated Office application starts. However, the=
selected file does not open.
+
+
+
+
+
+
+ These outcomes can be circumvented by enabling Basic authentic=
ation on the client machine.
+
+
+ To enable Basic authentication on the client computer, follow =
these steps:
+
+
+
+
+ Click Start, type regedit in the St=
art Search box, and then press Enter.
+
+
+
+
+ Locate and then click the following registry subkey:
+
+
+ HKEY_CURRENT_USER\Software\Microsoft\Office\14.0\Common\Internet
+
+
+
+
+ On the Edit menu, point to New, and then click DWORD Value.
+
+
+
+
+ Type BasicAuthLevel, and then press=
Enter.
+
+
+
+
+ Right-click BasicAuthLevel, and the=
n click Modify.
+
+
+
+
+ In the Value data box, type 2, and =
then click OK.
+
+
+
+
+
+
+ FTP
+
+ Introduction
+
+ The JCR-FTP Server operates as an FTP server with access to content stored in JCR repositories in the form of nt:file/nt:folder nodes or their successors. Any FTP client can connect to a running server. The FTP server ships with a standard configuration, which can be changed as required.
+
+
+
+ Configuration Parameters
+
+ Parameters
+
+ command-port:
+
+ <value-param>
+   <name>command-port</name>
+   <value>21</value>
+ </value-param>
+
+ The value of the command channel port. The default value is 21.
+
+
+ If another FTP server is already installed on your system, or if port 21 is protected, change this parameter (to 2121, for example) to avoid conflicts.
+
+
+
+
+ data-min-port and data-max-port
+
+ <value-param>
+   <name>data-min-port</name>
+   <value>52000</value>
+ </value-param>
+ <value-param>
+   <name>data-max-port</name>
+   <value>53000</value>
+ </value-param>
+
+ These two parameters define the minimum and maximum values of the range of ports used by the server for data channels. The FTP protocol requires an additional data channel, which is used to transfer file contents and directory listings. No other server program should be listening on the ports in this range.
+
+
+
+
+ system
+
+ <value-param>
+   <name>system</name>
+
+   <value>Windows_NT</value>
+   <!-- or -->
+   <!-- value>UNIX Type: L8</value -->
+ </value-param>
+
+ The supported formats for directory listings.
+
+
+
+
+ client-side-encoding
+
+ <value-param>
+   <name>client-side-encoding</name>
+
+   <value>windows-1251</value>
+   <!-- or -->
+   <!-- value>KOI8-R</value -->
+
+ </value-param>
+
+ This parameter specifies the encoding that is used for communication with the client.
+
+
+
+
+ def-folder-node-type
+
+ <value-param>
+ <name>def-folder-node-type</name>
+ <value>nt:folder</value>
+</value-param>
+
+ This parameter specifies the type of a node, when =
an FTP-folder is created.
+
+
+
+
+ def-file-node-type
+
+ <value-param>
+ <name>def-file-node-type</name>
+ <value>nt:file</value>
+</value-param>
+
+ This parameter specifies the type of a node, when =
an FTP-file is created.
+
+
+
+
+ def-file-mime-type
+
+ <value-param>
+   <name>def-file-mime-type</name>
+   <value>application/zip</value>
+ </value-param>
+
+ The mime type of a created file is chosen by using its file extension. If the server cannot find the corresponding mime type, this value is used.
+
+
+
+
+ cache-folder-name
+
+ <value-param>
+ <name>cache-folder-name</name>
+ <value>../temp/ftp_cache</value>
+</value-param>
+
+ The Path of the cache folder.
+
+
+
+
+ upload-speed-limit
+
+ <value-param>
+ <name>upload-speed-limit</name> =
+ <value>20480</value>
+</value-param>
+
+ Restriction of the upload speed. It is measured in=
bytes.
+
+
+
+
+ download-speed-limit
+
+ <value-param>
+ <name>download-speed-limit</name>
+ <value>20480</value> =
+</value-param>
+
+ Restriction of the download speed. It is measured =
in bytes.
+
+
+
+
+ timeout
+
+ <value-param>
+ <name>timeout</name>
+ <value>60</value>
+</value-param>
+
+ Defines the value of a timeout.
+
+
+
+
+
+
+
+ Use External Backup Tool
+
+ Repository Suspending
+
+ To keep the repository content consistent with the search index and value storage during a backup, the repository should be suspended. This means that all working threads are suspended until a resume operation is performed, and the index is flushed.
+
+
+ JCR provides the ability to suspend a repository via JMX.
+
+
+ Repository Suspend Contr=
oller
+
+
+
+
+
+
+
+ To suspend the repository, invoke the suspend() operation. The returned result will be "suspended" if all components were suspended successfully.
+
+
+ Repository =
Suspend Controller Suspended
+
+
+
+
+
+
+
+ An "undefined" result means not all comp=
onents were successfully suspended. Check the console to review the stack t=
races.
+
+
+
+ Backup
+
+ You can back up your content manually or by using third-party software. You should back up:
+
+
+
+
+ Database.
+
+
+
+
+ Lucene index.
+
+
+
+
+ Value storage (if configured).
+
+
+
+
+
+ Repository Resuming
+
+ Once the backup is done, invoke the resume() operation to switch the repository back on-line. The returned result will be "on-line".
+
+
+ Repository Sus=
pend Controller Online
+
+
+
+
+
+
+
+
+
+ eXo JCR statistics
+
+ Statistics on the Database Access Layer
+
+ Most of the time spent in eXo JCR is spent in database access, so statistics on the database access layer give a better idea of where that time goes.
+
+
+ These statistics allow you to identify, without using a profiler, what is abnormally slow in this layer, which can help you diagnose and fix a problem.
+
+
+ If you use org.exoplatform.services.jcr.impl.storage.jdbc.optimisation.CQJDBCWorkspaceDataContainer or org.exoplatform.services.jcr.impl.storage.jdbc.JDBCWorkspaceDataContainer as WorkspaceDataContainer, you can get statistics on the time spent in the database access layer.
+
+
+ The database access layer (in eXo JCR) is represented by the methods of the interface org.exoplatform.services.jcr.storage.WorkspaceStorageConnection. For each method defined in this interface, the following figures are available:
+
+
+
+
+ The minimum time spent in the method.
+
+
+
+
+ The maximum time spent in the method.
+
+
+
+
+ The average time spent in the method.
+
+
+
+
+ The total amount of time spent in the method.
+
+
+
+
+ The total number of times the method has been called.
+
+
+
+
+ These figures are also available globally for all the methods, which shows the global behavior of this layer.
+
+
+ To enable the statistics, set the JVM parameter JDBCWorkspaceDataContainer.statistics.enabled to true. The corresponding CSV file is StatisticsJDBCStorageConnection-${creation-timestamp}.csv. For more details about how the CSV files are managed, refer to the section dedicated to the statistics manager.
+
+
+ The format of each column header is ${method-alias}-${metric-alias}. The metric aliases are described in the statistics manager section.
+
+
+ The name of the category of statistics corresponding to these statistics is JDBCStorageConnection. This name is mostly needed to access the statistics through JMX.
+
+
+ Method Alias
+
+
+
+ global
+ This is the alias for all the methods.
+
+
+ getItemDataById
+ This is the alias for the method getItemDa=
ta(String identifier).
+
+
+ getItemDataByNodeDataNQPathEntry
+ This is the alias for the method getItemDa=
ta(NodeData parentData, QPathEntry name).
+
+
+ getChildNodesData
+ This is the alias for the method getChildN=
odesData(NodeData parent).
+
+
+ getChildNodesCount
+ This is the alias for the method getChildN=
odesCount(NodeData parent).
+
+
+ getChildPropertiesData
+ This is the alias for the method getChildP=
ropertiesData(NodeData parent).
+
+
+ listChildPropertiesData
+ This is the alias for the method listChild=
PropertiesData(NodeData parent).
+
+
+ getReferencesData
+ This is the alias for the method getRefere=
ncesData(String nodeIdentifier).
+
+
+ commit
+ This is the alias for the method commit().
+
+
+ addNodeData
+ This is the alias for the method add(NodeD=
ata data).
+
+
+ addPropertyData
+ This is the alias for the method add(Prope=
rtyData data).
+
+
+ updateNodeData
+ This is the alias for the method update(No=
deData data).
+
+
+ updatePropertyData
+ This is the alias for the method update(Pr=
opertyData data).
+
+
+ deleteNodeData
+ This is the alias for the method delete(No=
deData data).
+
+
+ deletePropertyData
+ This is the alias for the method delete(Pr=
opertyData data).
+
+
+ renameNodeData
+ This is the alias for the method rename(No=
deData data).
+
+
+ rollback
+ This is the alias for the method rollback(=
).
+
+
+ isOpened
+ This is the alias for the method isOpened(=
).
+
+
+ close
+ This is the alias for the method close().
+
+
+
+
+
+
+ Statistics on the JCR API accesses
+
+ To know exactly how your application uses eXo JCR, it can be useful to record all the JCR API accesses. This makes it easy to create real-life test scenarios based on pure JCR calls, and to tune your JCR to better fit your requirements.
+
+
+ To let you configure which part of eXo JCR is monitored without changing your code or rebuilding anything, this feature relies on the Load-time Weaving provided by AspectJ.
+
+
+ To enable this feature, add the following jar files to your classpath:
+
+
+
+
+ exo.jcr.component.statistics-X.Y.Z.jar corresponding to your eXo JCR version, which you can get from the JBoss maven repository https://repository.jboss.org/nexus/content/groups/public/org/exoplatform/jcr/exo.jcr.component.statistics/.
+
+
+
+
+ aspectjrt-1.6.8.jar, which you can get from the main maven repository http://repo2.maven.org/maven2/org/aspectj/aspectjrt.
+
+
+
+
+ You will also need to get aspectjweaver-1.6.8.jar from the main maven repository http://repo2.maven.org/maven2/org/aspectj/aspectjweaver.
+
+
+ At this stage, to enable the statistics on the JCR API accesses, add the JVM parameter -javaagent:${pathto}/aspectjweaver-1.6.8.jar to your command line. For more details, refer to http://www.eclipse.org/aspectj/doc/released/devguide/ltw-configuration.html.
+
+
+ By default, the configuration will collect statistics on all t=
he methods of the internal interfaces org.exoplatform.services.jcr=
.core.ExtendedSession and org.exoplatform.services.jcr.c=
ore.ExtendedNode, and the JCR API interface javax.jcr.Pr=
operty.
+
+
+ To add or remove interfaces to monitor, change two configuration files that are bundled in exo.jcr.component.statistics-X.Y.Z.jar: conf/configuration.xml and META-INF/aop.xml.
+
+
+ The content below is that of conf/configuration.xml. Modify it to add or remove the fully qualified names of the interfaces to monitor in the list of values of the init param called targetInterfaces.
+
+ <configuration xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.exoplaform.org/xml/ns/kernel_1_2.xsd http://www.exoplaform.org/xml/ns/kernel_1_2.xsd"
+   xmlns="http://www.exoplaform.org/xml/ns/kernel_1_2.xsd">
+
+   <component>
+     <type>org.exoplatform.services.jcr.statistics.JCRAPIAspectConfig</type>
+     <init-params>
+       <values-param>
+         <name>targetInterfaces</name>
+         <value>org.exoplatform.services.jcr.core.ExtendedSession</value>
+         <value>org.exoplatform.services.jcr.core.ExtendedNode</value>
+         <value>javax.jcr.Property</value>
+       </values-param>
+     </init-params>
+   </component>
+ </configuration>
+
+ The content below is that of META-INF/aop.xml. Modify it to add or remove the fully qualified names of the interfaces to monitor in the expression filter of the pointcut called JCRAPIPointcut.
+
+
+ By default, only JCR API calls from the exoplatform packages are taken into account. This filter can be modified to add other package names.
+
+ <aspectj>
+   <aspects>
+     <concrete-aspect name="org.exoplatform.services.jcr.statistics.JCRAPIAspectImpl" extends="org.exoplatform.services.jcr.statistics.JCRAPIAspect">
+       <pointcut name="JCRAPIPointcut"
+         expression="(target(org.exoplatform.services.jcr.core.ExtendedSession) || target(org.exoplatform.services.jcr.core.ExtendedNode) || target(javax.jcr.Property)) &amp;&amp; call(public * *(..))" />
+     </concrete-aspect>
+   </aspects>
+   <weaver options="-XnoInline">
+     <include within="org.exoplatform..*" />
+   </weaver>
+ </aspectj>
+
+ The corresponding CSV files are named Statistics${interface-name}-${creation-timestamp}.csv. For more details about how the CSV files are managed, refer to the section dedicated to the statistics manager.
+
+
+ The format of each column header is ${method-alias}-${metric-alias}. The method alias is of the form ${method-name}(semicolon-delimited-list-of-parameter-types-to-be-compatible-with-the-CSV-format).
+
+
+ The metric aliases are described in the statistics manager section.
+
+
+ The name of the category of statistics corresponding to these statistics is the simple name of the monitored interface (for example, ExtendedSession for org.exoplatform.services.jcr.core.ExtendedSession). This name is mostly needed to access the statistics through JMX.
+
+
+ Performance Consideration
+
+ Note that this feature affects the performance of eXo JCR, so it must be used with caution.
+
+
+
+
+ Statistics Manager
+
+ The statistics manager manages all the statistics provided by eXo JCR. It is responsible for printing the data into the CSV files and for exposing the statistics through JMX and/or REST.
+
+
+ The statistics manager creates a CSV file for each category of statistics that it manages. The files are named Statistics${category-name}-${creation-timestamp}.csv. They are created in the user directory if possible; otherwise, they are created in the temporary directory. The files are in CSV (Comma-Separated Values) format. A new line is added regularly (every 5 seconds by default), and one last line is added at JVM exit. Each line is composed of the 5 figures described below, for each method and globally for all the methods.
+
+
+
+ Metric Alias
+
+
+
+ Min
+ The minimum time spent in the method, expressed in milliseconds.
+
+
+ Max
+ The maximum time spent in the method, expressed in milliseconds.
+
+
+ Total
+ The total amount of time spent in the method, expressed in milliseconds.
+
+
+ Avg
+ The average time spent in the method, expressed in milliseconds.
+
+
+ Times
+ The total number of times the method has been called.
+
+
+
+
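+ To make the five metrics concrete, the sketch below accumulates them for one method and builds a column header in the ${method-alias}-${metric-alias} format. The class MethodStats is hypothetical and is not the eXo JCR statistics implementation; it only mirrors the definitions in the table above.

```java
// Hypothetical accumulator that mirrors the five documented metrics.
class MethodStats {
    private long min = Long.MAX_VALUE;
    private long max = 0;
    private long total = 0;
    private long times = 0;

    // Records one call that took elapsedMillis milliseconds.
    synchronized void record(long elapsedMillis) {
        min = Math.min(min, elapsedMillis);
        max = Math.max(max, elapsedMillis);
        total += elapsedMillis;
        times++;
    }

    synchronized long getMin()   { return times == 0 ? 0 : min; }
    synchronized long getMax()   { return max; }
    synchronized long getTotal() { return total; }
    synchronized double getAvg() { return times == 0 ? 0.0 : (double) total / times; }
    synchronized long getTimes() { return times; }

    // Column header in the documented ${method-alias}-${metric-alias} format.
    static String columnHeader(String methodAlias, String metricAlias) {
        return methodAlias + "-" + metricAlias;
    }

    public static void main(String[] args) {
        MethodStats stats = new MethodStats();
        stats.record(10);
        stats.record(30);
        stats.record(20);
        System.out.println(columnHeader("global", "Avg") + " = " + stats.getAvg());
    }
}
```

+ Avg is derived from Total and Times rather than stored, so it cannot drift out of step with the other figures.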
+ You can disable the persistence of the statistics by setting the JVM parameter JCRStatisticsManager.persistence.enabled to false. It is set to true by default.
+
+
+ You can also define the period of time between each record (that is, each line of data in the file) by setting the JVM parameter JCRStatisticsManager.persistence.timeout to the expected value, expressed in milliseconds. It is set to 5000 by default.
+
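+ Assuming that these JVM parameters are passed as system properties (for example with -DJCRStatisticsManager.persistence.timeout=10000), they could be read as sketched below. The class StatisticsSettings is hypothetical; it only shows the documented defaults and is not how eXo JCR itself reads the values.

```java
// Hypothetical reader for the two documented parameters, assuming they
// are supplied as -D system properties on the JVM command line.
class StatisticsSettings {
    static final String ENABLED_KEY = "JCRStatisticsManager.persistence.enabled";
    static final String TIMEOUT_KEY = "JCRStatisticsManager.persistence.timeout";

    // Persistence is enabled unless the property is explicitly set to false.
    static boolean persistenceEnabled() {
        return Boolean.parseBoolean(System.getProperty(ENABLED_KEY, "true"));
    }

    // Period between two records, in milliseconds (5000 by default).
    static long persistenceTimeoutMillis() {
        return Long.parseLong(System.getProperty(TIMEOUT_KEY, "5000"));
    }

    public static void main(String[] args) {
        System.out.println(persistenceEnabled() + " " + persistenceTimeoutMillis());
    }
}
```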
+
+ You can also access the statistics via JMX. The available methods are:
+
+
+
+ JMX Methods
+
+
+
+ getMin
+ Returns the minimum time spent in the method corresponding to the given category name and statistics name. The expected arguments are the name of the category of statistics (for example, JDBCStorageConnection) and the name of the expected method, or global for the global value.
+
+
+ getMax
+ Returns the maximum time spent in the method corresponding to the given category name and statistics name. The expected arguments are the name of the category of statistics and the name of the expected method, or global for the global value.
+
+
+ getTotal
+ Returns the total amount of time spent in the method corresponding to the given category name and statistics name. The expected arguments are the name of the category of statistics and the name of the expected method, or global for the global value.
+
+
+ getAvg
+ Returns the average time spent in the method corresponding to the given category name and statistics name. The expected arguments are the name of the category of statistics and the name of the expected method, or global for the global value.
+
+
+ getTimes
+ Returns the total number of times the method corresponding to the given category name and statistics name has been called. The expected arguments are the name of the category of statistics (for example, JDBCStorageConnection) and the name of the expected method, or global for the global value.
+
+
+ reset
+ Resets the statistics for the given category name and statistics name. The expected arguments are the name of the category of statistics and the name of the expected method, or global for the global value.
+
+
+ resetAll
+ Resets all the statistics for the given category name. The expected argument is the name of the category of statistics (for example, JDBCStorageConnection).
+
+
+
+
+ The full name of the related MBean is exo:service=statistic, view=jcr.
+
+
+
+
+ Checking repository integrity and consistency
+
+ JMX-based consistency tool
+
+ It is important to check the integrity and consistency of the system regularly, especially if backups are missing or stale. The JBoss Portal Platform JCR implementation offers a JMX-based consistency checking tool.
+
+
+ During an inspection, the tool checks every major JCR component, such as the persistent data layer and the index. The persistent layer includes the JDBC Data Container and Value Storages, if they are configured.
+
+
+ The database is verified using a set of specialized, domain-specific queries. The Value Storage check verifies the existence of, and access to, each file.
+
+
+ Access to the check tool is exposed via the JMX interface, with the fol=
lowing operations available:
+
+
+ Available methods
+
+
+
+
+ checkRepositoryDataConsistency()
+
+ Inspect full repository data (db, value storage and =
search index)
+
+
+
+ checkRepositoryDataBaseConsistency()
+
+ Inspect only DB
+
+
+
+ checkRepositoryValueStorageConsistency()
+
+ Inspect only ValueStorage
+
+
+
+ checkRepositorySearchIndexConsistency()
+
+ Inspect only SearchIndex
+
+
+
+
+
+ All inspection activities and corrupted data details are stored in a file in the app directory, named according to the following convention: report-<repository name>-dd-MMM-yy-HH-mm.txt.
+
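+ The naming convention above can be reproduced with a standard date pattern, which helps when locating a report from a script. The sketch below is an illustration only: the class ReportFileNames is hypothetical, and the use of English month abbreviations and of the default time zone are assumptions.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical helper that follows the documented naming convention
// report-<repository name>-dd-MMM-yy-HH-mm.txt.
class ReportFileNames {
    static String reportFileName(String repositoryName, Date when) {
        // Assumption: English month abbreviations, default time zone.
        SimpleDateFormat format = new SimpleDateFormat("dd-MMM-yy-HH-mm", Locale.ENGLISH);
        return "report-" + repositoryName + "-" + format.format(when) + ".txt";
    }

    public static void main(String[] args) {
        // "repository" is a placeholder repository name.
        System.out.println(reportFileName("repository", new Date()));
    }
}
```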
+
+ The path to the file is also returned in the result message at the end of the inspection.
+
+
+
+ There are three types of inconsistency (Warning, Error, and Index), two of which are critical (Error and Index):
+
+
+
+
+ Index faults are marked as "Reindex" and can be fixed by re-indexing the workspace.
+
+
+
+
+ Errors can only be fixed manually.
+
+
+
+
+ Warnings can be a normal situation in some cases, and a production system will usually remain fully functional.
+
+
+
+
+
+
-
+ DOC NOTE: Could possibly be moved to a specific Tuning Guide later=
-->
+ JCR Performance Tuning Guide
+
+ Introduction
+
+ This section will show you various ways of improving JCR performance.
+
+
+ It is intended for Administrators and others who want to use the JCR fe=
atures more efficiently.
+
+
+
+ JCR Performance and Scalability
+
+ Cluster configuration
+
+ The table below contains details about the configuration of the cluste=
r used in benchmark testing:
+
+
+
+
+
+ Performance Tuning Guide
+
+ JBoss Enterprise Application Platform 6 Tuning
+
+ You can use the maxThreads parameter to increase the maximum number of threads that can be launched in the AS instance. This can improve performance if you need a high level of concurrency. You can also use the -XX:+UseParallelGC JVM option to use the parallel garbage collector.
+
+
+ Note
+
+ Beware of setting maxThreads too high, as this can cause an OutOfMemoryError. We encountered it with maxThreads=1250 on the following machine:
+
+
+ 7.5 GB memory
+ 4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each)
+ 850 GB instance storage (2×420 GB plus 10 GB root partition)
+ 64-bit platform
+ I/O Performance: High
+ API name: m1.large
+ java -Xmx4g
+
+
+
+
+ JCR Cache Tuning
+
+ Cache size
+
+
+ The JCR cluster implementation uses JBoss Cache as a distributed, replicated cache. One peculiarity concerns the remove action: its speed depends on the current size of the cache. The more nodes the cache holds, the longer it takes to remove a particular node (subtree) from it.
+
+
+ Eviction
+
+
+ Changing the eviction wakeUpInterval value does not affect performance. Results with values from 500 up to 3000 are approximately equal.
+
+
+ Transaction Timeout
+
+
+ Using a short timeout for long transactions, such as Export/Import or removing a huge subtree, may cause a TransactionTimeoutException.
+
+
+
+ Clustering
+
+ For best performance, host the load balancer, the DB server and the shared NFS on different computers. If, for some reason, one node receives more load than the others, you can reduce it using the load value in the load balancer.
+
+
+ JGroups configuration
+
+
+ It is recommended to use the "multiplexer stack" feature present in JGroups. It is enabled by default in eXo JCR and offers higher performance in a cluster while using fewer network connections. If there are two or more clusters in your network, check that they use different ports and different cluster names.
+
+
+ Write performance in cluster
+
+
+ The eXo JCR implementation uses the Lucene indexing engine to provide search capabilities. However, Lucene limits write operations: it can perform indexing in only one thread. That is why write performance in a cluster is no higher than in a singleton environment. Data is indexed on the coordinator node, so increasing the write load on the cluster may lead to a ReplicationTimeoutException. This occurs because writing threads queue in the indexer and, under high load, the timeout for replication to the coordinator is exceeded.
+
+
+ Taking this into consideration, it is recommended to increase the replTimeout value in the cache configurations when the write load is high.
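+ The queueing argument can be illustrated with some simple arithmetic. The Java sketch below uses a deliberately simplified model in which every queued write takes the same time in the single indexing thread, so the last write waits for all the others. The per-write cost, the queue lengths and the 60000 ms timeout are hypothetical figures chosen for illustration; they are not measurements from the benchmark.

```java
public class IndexQueueSketch {

    // Worst-case wait of the last write when all writes are indexed one by one.
    public static long worstCaseWaitMillis(int queuedWrites, long indexMillisPerWrite) {
        return (long) queuedWrites * indexMillisPerWrite;
    }

    // True when that worst-case wait is longer than the replication timeout.
    public static boolean exceedsTimeout(int queuedWrites, long indexMillisPerWrite,
                                         long replTimeoutMillis) {
        return worstCaseWaitMillis(queuedWrites, indexMillisPerWrite) > replTimeoutMillis;
    }

    public static void main(String[] args) {
        long replTimeoutMillis = 60000; // hypothetical timeout
        long indexMillisPerWrite = 200; // hypothetical cost of indexing one write
        for (int queued : new int[] {100, 400}) {
            System.out.println(queued + " queued writes -> wait "
                    + worstCaseWaitMillis(queued, indexMillisPerWrite) + " ms, exceeds timeout: "
                    + exceedsTimeout(queued, indexMillisPerWrite, replTimeoutMillis));
        }
    }
}
```

+ In this model the wait grows linearly with the number of queued writes, which is why a larger replTimeout only buys headroom and does not raise write throughput.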
+
+
+ Replication timeout
+
+
+ Some operations may take a long time. If you get a ReplicationTimeoutException, try increasing the replication timeout:
+
+<clustering mode="replication" clusterName="${jbosscache-cluster-name}">
+  ...
+  <sync replTimeout="60000" />
+</clustering>
+
+
+ The value is set in milliseconds.
+
+
+
+
+
+ eXo JCR with JBoss Portal Platform
+
+ How to use a Managed DataSource under JBoss Enterprise Application Platform 6
+
+ Configurations Steps
+
+ Declaring the Datasources in the AS
+ NEEDINFO - FILE PATHS - I know this isn't right. Where do these get deployed again?
+
+ To declare the datasources using a JBoss application server, deploy a ds file (XXX-ds.xml) into the deploy directory of the appropriate server profile (/server/PROFILE/deploy, for example).
+
+
+ This file configures all the datasources that JBoss Portal Platform needs. There should be four, specifically named jdbcjcr_portal, jdbcjcr_sample-portal, jdbcidm_portal and jdbcidm_sample-portal.
+
+
+ For example:
+
+<?xml version="1.0" encoding="UTF-8"?>
+<datasources>
+  <no-tx-datasource>
+    <jndi-name>jdbcjcr_portal</jndi-name>
+    <connection-url>jdbc:hsqldb:${jboss.server.data.dir}/data/jdbcjcr_portal</connection-url>
+    <driver-class>org.hsqldb.jdbcDriver</driver-class>
+    <user-name>sa</user-name>
+    <password></password>
+  </no-tx-datasource>
+
+  <no-tx-datasource>
+    <jndi-name>jdbcjcr_sample-portal</jndi-name>
+    <connection-url>jdbc:hsqldb:${jboss.server.data.dir}/data/jdbcjcr_sample-portal</connection-url>
+    <driver-class>org.hsqldb.jdbcDriver</driver-class>
+    <user-name>sa</user-name>
+    <password></password>
+  </no-tx-datasource>
+
+  <no-tx-datasource>
+    <jndi-name>jdbcidm_portal</jndi-name>
+    <connection-url>jdbc:hsqldb:${jboss.server.data.dir}/data/jdbcidm_portal</connection-url>
+    <driver-class>org.hsqldb.jdbcDriver</driver-class>
+    <user-name>sa</user-name>
+    <password></password>
+  </no-tx-datasource>
+
+  <no-tx-datasource>
+    <jndi-name>jdbcidm_sample-portal</jndi-name>
+    <connection-url>jdbc:hsqldb:${jboss.server.data.dir}/data/jdbcidm_sample-portal</connection-url>
+    <driver-class>org.hsqldb.jdbcDriver</driver-class>
+    <user-name>sa</user-name>
+    <password></password>
+  </no-tx-datasource>
+</datasources>
+
+ The properties that can be set for a datasource are described in: Configuring JDBC DataSources - The non transactional DataSource configuration schema
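+ Once the datasources are declared in the AS, code running inside the server can check that they are bound. The Java sketch below is one possible way to do this with a plain JNDI lookup; the class and method names are hypothetical, and the exact lookup string (for example, whether a java: prefix is required) is an assumption that depends on how the server binds the datasource.

```java
import java.util.Optional;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

public class DataSourceLookupSketch {

    // Returns the DataSource bound under jndiName, or empty if nothing usable is bound.
    public static Optional<DataSource> find(String jndiName) {
        try {
            Object bound = new InitialContext().lookup(jndiName);
            return bound instanceof DataSource
                    ? Optional.of((DataSource) bound)
                    : Optional.empty();
        } catch (NamingException e) {
            // Raised when the name is not bound or no JNDI provider is available.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // The four names declared in the ds file; adjust the form to your server's binding.
        String[] names = {
            "jdbcjcr_portal", "jdbcjcr_sample-portal", "jdbcidm_portal", "jdbcidm_sample-portal"
        };
        for (String name : names) {
            System.out.println(name + " bound: " + find(name).isPresent());
        }
    }
}
```

+ Run outside an application server, every lookup reports the name as not bound, because no JNDI provider is configured there.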
+
+
+
+ Do not bind datasources explicitly
+
+ Do not let the portal explicitly bind datasources.
+ NEEDINFO - FILE PATHS - I think some of the values have changed in the referenced file when I look at the new file below. New info required?
+ Edit the JPP_HOME/standalone/configuration/gatein/configuration.properties file and comment out the following lines in the JCR section:
+
+#gatein.jcr.datasource.driver=org.hsqldb.jdbcDriver
+#gatein.jcr.datasource.url=jdbc:hsqldb:file:${gatein.db.data.dir}/data/jdbcjcr_${name}
+#gatein.jcr.datasource.username=sa
+#gatein.jcr.datasource.password=
+
+ Comment out the following lines in the IDM section:
+
+#gatein.idm.datasource.driver=org.hsqldb.jdbcDriver
+#gatein.idm.datasource.url=jdbc:hsqldb:file:${gatein.db.data.dir}/data/jdbcidm_${name}
+#gatein.idm.datasource.username=sa
+#gatein.idm.datasource.password=
+
+ Open the jcr-configuration.xml and idm-configuration.xml files and comment out references to the InitialContextInitializer plug-in.
+
+<!-- Commented out because the datasources are declared and bound by the AS, not by eXo -->
+<!--
+<external-component-plugins>
+ [...]
+</external-component-plugins>
+-->
+
+
+
+
--===============2260912015684865299==--