Author: rhauch
Date: 2009-06-09 15:46:34 -0400 (Tue, 09 Jun 2009)
New Revision: 1017
Removed:
trunk/docs/reference/src/main/docbook/en-US/content/core/configuration.xml
Modified:
trunk/docs/reference/src/main/docbook/en-US/content/core/connector.xml
trunk/docs/reference/src/main/docbook/en-US/content/core/sequencing.xml
trunk/docs/reference/src/main/docbook/en-US/custom.dtd
trunk/docs/reference/src/main/docbook/en-US/master.xml
Log:
Updated the sequencing chapter of the Reference Guide.
Deleted: trunk/docs/reference/src/main/docbook/en-US/content/core/configuration.xml
===================================================================
--- trunk/docs/reference/src/main/docbook/en-US/content/core/configuration.xml 2009-06-09
18:48:18 UTC (rev 1016)
+++ trunk/docs/reference/src/main/docbook/en-US/content/core/configuration.xml 2009-06-09
19:46:34 UTC (rev 1017)
@@ -1,371 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!--
- ~ JBoss DNA (
http://www.jboss.org/dna)
- ~
- ~ See the COPYRIGHT.txt file distributed with this work for information
- ~ regarding copyright ownership. Some portions may be licensed
- ~ to Red Hat, Inc. under one or more contributor license agreements.
- ~ See the AUTHORS.txt file in the distribution for a full listing of
- ~ individual contributors.
- ~
- ~ JBoss DNA is free software. Unless otherwise indicated, all code in JBoss DNA
- ~ is licensed to you under the terms of the GNU Lesser General Public License as
- ~ published by the Free Software Foundation; either version 2.1 of
- ~ the License, or (at your option) any later version.
- ~
- ~ JBoss DNA is distributed in the hope that it will be useful,
- ~ but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
- ~ or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License
- ~ for more details.
- ~
- ~ You should have received a copy of the GNU Lesser General Public License
- ~ along with this distribution; if not, write to:
- ~ Free Software Foundation, Inc.
- ~ 51 Franklin Street, Fifth Floor
- ~ Boston, MA 02110-1301 USA
- -->
-<!DOCTYPE preface PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
-<!ENTITY % CustomDTD SYSTEM "../../custom.dtd">
-%CustomDTD;
-]>
-<chapter id="configuration">
- <title>Configuring and Using JBoss DNA</title>
- <para>Using JBoss DNA within your application is actually quite straightforward.
As you'll see in this chapter,
- the first step is setting up JBoss DNA and starting the
<code>JcrEngine</code>. After that, you obtain the
- <code>javax.jcr.Repository</code> instance for a named repository and just
use the standard JCR API throughout your
- application.
- </para>
- <sect1 id="jcr_engine">
- <title>JBoss DNA's JcrEngine</title>
- <para>
- JBoss DNA encapsulates everything necessary to run one or more JCR repositories into a
single &JcrEngine; instance.
- This includes all underlying repository sources, the pools of connections to the
sources, the sequencers,
- the MIME type detector(s), and the &Repository; implementations.
- </para>
- <para>
- Obtaining a &JcrEngine; instance is very easy - assuming that you have a valid
&JcrConfiguration; instance. We'll see
- how to get one of those in a little bit, but if you have one then all you have to do
is build and start the engine:
- </para>
- <programlisting role="JAVA"><![CDATA[
-JcrConfiguration config = ...
-JcrEngine engine = config.build();
-engine.start();
- ]]></programlisting>
- <para>
- Obtaining a JCR &Repository; instance is a matter of simply asking the engine for
it by the name defined in the configuration:
- </para>
- <programlisting role="JAVA"><![CDATA[
-javax.jcr.Repository repository = engine.getRepository("Name of repository");
- ]]></programlisting>
- <para>
- At this point, your application can proceed by working with the JCR API.
- </para>
- <para>
- And, once you're finished with the &JcrEngine;, you should shut it down:
- </para>
- <programlisting role="JAVA"><![CDATA[
-engine.shutdown();
-engine.awaitTermination(3,TimeUnit.SECONDS); // optional
- ]]></programlisting>
- <para>
- When the <code>shutdown()</code> method is called, the &Repository;
instances managed by the engine are marked as being shut down,
- and they will not be able to create new &Session;s. However, any existing
&Session;s or ongoing operations (e.g., event notifications)
- present at the time of the <code>shutdown()</code> call will be allowed to
finish.
- In essence, <code>shutdown()</code> is a
<emphasis>graceful</emphasis> request, and since it may take some time to
complete,
- you can wait until the shutdown has completed by simply calling
<code>awaitTermination(...)</code> as shown above.
- This method will block until the engine has indeed shut down or until the supplied time
duration has passed (whichever comes first).
- And, yes, you can call the <code>awaitTermination(...)</code> method
repeatedly if needed.
- </para>
- </sect1>
- <sect1 id="jcr_configuration">
- <title>JcrConfiguration</title>
- <para>
- The previous section assumed the existence of a &JcrConfiguration;. Creating
an instance is not all that difficult.
- In fact, there's only one no-argument constructor, so actually creating the
instance is a piece of cake. What can be a little more challenging,
- though, is setting up the &JcrConfiguration; instance, which must define the
following components:
- <itemizedlist>
- <listitem>
- <para><emphasis role="strong"><code>Repository
sources</code></emphasis> are the POJO objects that each describe a
particular
- location where content is stored. Each repository source object is an instance of a
JBoss DNA connector, and is configured
- with the properties of that particular source. JBoss DNA's &RepositorySource;
classes are analogous to JDBC's &DataSource; classes -
- they are implemented by specific connectors (aka, "drivers") for specific
kinds of repository sources (aka, "databases").
- Similarly, a &RepositorySource; instance is analogous to a &DataSource;
instance, with bean properties for each configurable
- parameter. Therefore, each repository source definition must supply the name of the
&RepositorySource; class, any
- bean properties, and, optionally, the classpath that should be used to load the
class. </para>
- </listitem>
- <listitem>
- <para><emphasis
role="strong"><code>Repositories</code></emphasis> define
the JCR repositories that are available. Each
- repository has a unique name that is used to obtain the &Repository; instance
from the &JcrEngine;'s <code>getRepository(String)</code>
- method, but each repository definition also can include the predefined namespaces
(other than those automatically defined by
- JBoss DNA), various options, and the node types that are to be available in the
repository without explicit registration
- through the JCR API.</para>
- </listitem>
- <listitem>
- <para><emphasis
role="strong"><code>Sequencers</code></emphasis> define the
particular sequencers that are available for use.
- Each sequencer definition provides the path expressions governing which nodes in the
repository should be sequenced when those nodes change,
- and where the resulting output generated by the sequencer should be placed. The
definition also must state the name of
- the sequencer class, any bean properties and, optionally, the classpath that should
be used to load the class.</para>
- </listitem>
- <listitem>
- <para><emphasis role="strong"><code>MIME type
detectors</code></emphasis> define the particular MIME type detector(s) that
should
- be made available. A MIME type detector does exactly what the name implies: it
attempts to determine the MIME type given a
- "filename" and contents. JBoss DNA automatically uses a detector that
uses the file extension to identify the MIME type,
- but also provides an implementation that uses an external library to identify the
MIME type based upon the contents.
- The definition must state the name of the detector class, any bean properties and,
optionally, the classpath that should
- be used to load the class.</para>
- </listitem>
- </itemizedlist>
- </para>
- <para>
- There are three ways to set up a &JcrConfiguration; instance:
- <itemizedlist>
- <listitem>
- <para><emphasis role="strong"><code>Load from a
file</code></emphasis> is conceptually the easiest and requires the least
amount
- of Java code, but it does require a configuration file.</para>
- </listitem>
- <listitem>
- <para><emphasis role="strong"><code>Load from a
configuration repository</code></emphasis> is not much more complicated than
loading
- from a file, but it does allow multiple &JcrEngine; instances (usually in
different processes perhaps on different machines)
- to easily access their (shared) configuration. And technically, loading the
configuration from a file really just creates an
- &InMemoryRepositorySource;, imports the configuration file into that source, and
then proceeds with this approach.</para>
- </listitem>
- <listitem>
- <para><emphasis role="strong"><code>Programmatic
configuration</code></emphasis> is always possible, even if the configuration
is loaded
- from a file or repository. Using the &JcrConfiguration;'s API, you can
define (or update or remove) all of the definitions that make
- up a configuration.</para>
- </listitem>
- </itemizedlist>
- </para>
- <para>
- Each of these approaches has its own advantages, so the choice of which one to
use is entirely up to you.
- </para>
- <sect2 id="loading_from_file">
- <title>Loading from a configuration file</title>
- <para>
- Loading the JBoss DNA configuration from a file is actually very simple:
- </para>
- <programlisting role="JAVA"><![CDATA[
-JcrConfiguration config = new JcrConfiguration();
-config.loadFrom(file);
- ]]></programlisting>
- <para>
- where the <code>file</code> parameter can actually be a &File;
instance, a &URL; to the file, an &InputStream;
- containing the contents of the file, or even a &String; containing the contents
of the file.
- </para>
- <note>
- <para>The <code>loadFrom(...)</code> method can be called any
number of times, but each time it is called it completely wipes
- out any current notion of the configuration and replaces it with the configuration
found in the file.
- </para>
- </note>
- <para>
- There is an optional second parameter that defines the &Path; within the
configuration file identifying the parent node of the various
- configuration nodes. If not specified, it assumes "/". This makes it
possible for the configuration content to be
- located at a different location in the hierarchical structure. (This is not often
required, but when it is required
- this second parameter is very useful.)
- </para>
- <para>
- Here is the configuration file that is used in the repository example (though it has
been simplified a bit and most comments
- have been removed for clarity):
- </para>
- <programlisting role="JAVA"><![CDATA[
-<?xml version="1.0" encoding="UTF-8"?>
-<configuration
xmlns:dna="http://www.jboss.org/dna/1.0"
xmlns:jcr="http://www.jcp.org/jcr/1.0">
- <!--
- Define the JCR repositories
- -->
- <dna:repositories>
- <!--
- Define a JCR repository that accesses the 'Cars' source directly.
- This of course is optional, since we could access the same content through
'vehicles'.
- -->
- <dna:repository jcr:name="car repository"
dna:source="Cars">
- <options jcr:primaryType="dna:options">
- <jaasLoginConfigName jcr:primaryType="dna:option"
dna:value="dna-jcr"/>
- </options>
- </dna:repository>
- </dna:repositories>
- <!--
- Define the sources for the content. These sources are directly accessible using the
DNA-specific Graph API.
- -->
- <dna:sources jcr:primaryType="nt:unstructured">
- <dna:source jcr:name="Cars"
dna:classname="org.jboss.dna.graph.connector.inmemory.InMemoryRepositorySource"
dna:retryLimit="3" dna:defaultWorkspaceName="workspace1"/>
- <dna:source jcr:name="Aircraft"
dna:classname="org.jboss.dna.graph.connector.inmemory.InMemoryRepositorySource">
- <!-- Define the name of the workspace used by default. Optional, but
convenient. -->
- <defaultWorkspaceName>workspace2</defaultWorkspaceName>
- </dna:source>
- </dna:sources>
- <!--
- Define the sequencers. This is an optional section. For this example, we're not
using any sequencers.
- -->
- <dna:sequencers>
- <!--dna:sequencer jcr:name="Image Sequencer"
dna:classname="org.jboss.dna.sequencer.image.ImageMetadataSequencer">
- <dna:description>Image metadata sequencer</dna:description>
- <dna:pathExpression>/foo/source =>
/foo/target</dna:pathExpression>
- <dna:pathExpression>/bar/source =>
/bar/target</dna:pathExpression>
- </dna:sequencer-->
- </dna:sequencers>
- <dna:mimeTypeDetectors>
- <dna:mimeTypeDetector jcr:name="Detector"
dna:description="Standard extension-based MIME type detector"/>
- </dna:mimeTypeDetectors>
-</configuration>
- ]]></programlisting>
- </sect2>
- <sect2 id="loading_from_repository">
- <title>Loading from a configuration repository</title>
- <para>
- Loading the JBoss DNA configuration from an existing repository is also pretty
straightforward. Simply create and configure the
- &RepositorySource; instance to point to the desired repository, and then call the
<code>loadFrom(&RepositorySource; source)</code>
- method:
- </para>
- <programlisting role="JAVA"><![CDATA[
-RepositorySource configSource = ...
-JcrConfiguration config = new JcrConfiguration();
-config.loadFrom(configSource);
- ]]></programlisting>
- <para>
- This really is a more advanced way to define your configuration, so we won't go
into how you configure a &RepositorySource;.
- For more information, consult the &ReferenceGuide;.
- </para>
- <note>
- <para>The <code>loadFrom(...)</code> method can be called any
number of times, but each time it is called it completely wipes
- out any current notion of the configuration and replaces it with the configuration
found in the repository.
- </para>
- </note>
- <para>
- There is an optional second parameter that defines the name of the workspace in the
supplied source where the configuration content
- can be found. It is not needed if the workspace is the source's default
workspace.
- There is an optional third parameter that defines the &Path; within the
configuration repository identifying the parent node of the various
- configuration nodes. If not specified, it assumes "/". This makes it
possible for the configuration content to be
- located at a different location in the hierarchical structure. (This is not often
required, but when it is required
- this third parameter is very useful.)
- </para>
- </sect2>
- <sect2 id="programmatic_configuration">
- <title>Programmatic configuration</title>
- <para>
- Defining the configuration programmatically is not terribly complicated, although for
obvious reasons it results in more verbose Java code.
- But this approach is very useful and often the easiest approach when the
configuration must change or is a reflection of other
- dynamic information.
- </para>
- <para>
- The &JcrConfiguration; class was designed to have an easy-to-use API that makes
it easy to configure each of the different kinds of
- components, especially when using an IDE with code completion. Here are several
examples:
- </para>
- <sect3 id="programmatically_configuring_sources">
- <title>Repository sources</title>
- <para>Each repository source definition must include the name of the
&RepositorySource; class as well as each bean property
- that should be set on the object:
- </para>
- <programlisting role="JAVA"><![CDATA[
-JcrConfiguration config = ...
-config.repositorySource("source A")
- .usingClass(InMemoryRepositorySource.class)
- .setDescription("The repository for our content")
- .setProperty("defaultWorkspaceName", workspaceName);
- ]]></programlisting>
- <para>
- This example defines an in-memory source with the name "source A", a
description, and a single "defaultWorkspaceName" bean property.
- Different &RepositorySource; implementations will define which bean properties are
required and optional.
- Of course, the class can be specified as a Class reference or a string (followed by
whether the class should be loaded from
- the classpath or from a specific classpath).
- </para>
- <note>
- <para>Each time <code>repositorySource(String)</code> is called,
it will either load the existing definition with the supplied
- name or will create a new definition if one does not already exist. To remove a
definition, simply call <code>remove()</code>
- on the result of <code>repositorySource(String)</code>.
- The set of existing definitions can be accessed with the
<code>repositorySources()</code> method.
- </para>
- </note>
- </sect3>
- <sect3 id="programmatically_configuring_repositories">
- <title>Repositories</title>
- <para>Each repository must be defined to use a named repository source, but all
other aspects (e.g., namespaces, node types, options)
- are optional.</para>
- <programlisting role="JAVA"><![CDATA[
-JcrConfiguration config = ...
-config.repository("repository A")
- .addNodeTypes("myCustomNodeTypes.cnd")
- .setSource("source 1")
- .registerNamespace("acme","http://www.example.com/acme")
- .setOption(JcrRepository.Option.JAAS_LOGIN_CONFIG_NAME, "dna-jcr");
- ]]></programlisting>
- <para>
- This example defines a repository that uses the "source 1" repository
source (which could be a federated source, an in-memory source,
- a database store, or any other source). Additionally, this example adds the node
types in the "myCustomNodeTypes.cnd" file as those
- that will be made available when the repository is accessed. It also defines the
"http://www.example.com/acme" namespace,
- and finally sets the "JAAS_LOGIN_CONFIG_NAME" option to define the name of
the JAAS login configuration that should be used by
- the JBoss DNA repository.
- </para>
- <note>
- <para>Each time <code>repository(String)</code> is called, it will
either load the existing definition with the supplied
- name or will create a new definition if one does not already exist. To remove a
definition, simply call <code>remove()</code>
- on the result of <code>repository(String)</code>.
- The set of existing definitions can be accessed with the
<code>repositories()</code> method.
- </para>
- </note>
- </sect3>
- <sect3 id="programmatically_configuring_sequencers">
- <title>Sequencers</title>
- <para>Each defined sequencer must specify the name of the &StreamSequencer;
implementation class as well as the path expressions
- defining which nodes should be sequenced and the output paths defining where the
sequencer output should be placed (often as a function
- of the input path expression).</para>
- <programlisting role="JAVA"><![CDATA[
-JcrConfiguration config = ...
-config.sequencer("Image Sequencer")
- .usingClass("org.jboss.dna.sequencer.image.ImageMetadataSequencer")
- .loadedFromClasspath()
- .setDescription("Sequences image files to extract the characteristics of the
image")
-
.sequencingFrom("//(*.(jpg|jpeg|gif|bmp|pcx|png|iff|ras|pbm|pgm|ppm|psd)[*])/jcr:content[@jcr:data]")
- .andOutputtingTo("/images/$1");
- ]]></programlisting>
- <para>
- This shows an example of a sequencer definition named "Image Sequencer"
that uses the &ImageMetadataSequencer; class
- (loaded from the classpath), that is to sequence the "jcr:data" property
on any new or changed nodes that are named
- "jcr:content" below a parent node with a name ending in ".jpg",
".jpeg", ".gif", ".bmp", ".pcx", ".png", ".iff",
".ras",
- ".pbm", ".pgm", ".ppm" or ".psd". The
output of the sequencing operation should be placed at the "/images/$1" node,
- where the "$1" value is captured as the name of the parent node. (The
capture groups work the same way as regular expressions;
- see the &ReferenceGuide; for more details.)
- Of course, the class can be specified as a Class reference or a string (followed by
whether the class should be loaded from
- the classpath or from a specific classpath).
- </para>
- <note>
- <para>Each time <code>sequencer(String)</code> is called, it will
either load the existing definition with the supplied
- name or will create a new definition if one does not already exist. To remove a
definition, simply call <code>remove()</code>
- on the result of <code>sequencer(String)</code>.
- The set of existing definitions can be accessed with the
<code>sequencers()</code> method.
- </para>
- </note>
- </sect3>
- <sect3 id="programmatically_configuring_mime_type_detectors">
- <title>MIME type detectors</title>
- <para>Each defined MIME type detector must specify the name of the
&MimeTypeDetector; implementation class as well as any
- other bean properties required by the implementation.</para>
- <programlisting role="JAVA"><![CDATA[
-JcrConfiguration config = ...
-config.mimeTypeDetector("Extension Detector")
- .usingClass(org.jboss.dna.graph.mimetype.ExtensionBasedMimeTypeDetector.class);
- ]]></programlisting>
- <para>
- Of course, the class can be specified as a Class reference or a string (followed by
whether the class should be loaded from
- the classpath or from a specific classpath).
- </para>
- <note>
- <para>Each time <code>mimeTypeDetector(String)</code> is called,
it will either load the existing definition with the supplied
- name or will create a new definition if one does not already exist. To remove a
definition, simply call <code>remove()</code>
- on the result of <code>mimeTypeDetector(String)</code>.
- The set of existing definitions can be accessed with the
<code>mimeTypeDetectors()</code> method.
- </para>
- </note>
- </sect3>
- </sect2>
- </sect1>
- <sect1 id="using_dna_whats_next">
- <title>What's next</title>
- <para>
- This chapter outlined how you configure JBoss DNA, how you then access a
<code>javax.jcr.Repository</code> instance,
- and use the standard JCR API to interact with the repository. The
- <link linkend="downloading_and_running">next chapter </link>
walks you through downloading
- and running the JBoss DNA examples.
- </para>
- </sect1>
-</chapter>
Modified: trunk/docs/reference/src/main/docbook/en-US/content/core/connector.xml
===================================================================
--- trunk/docs/reference/src/main/docbook/en-US/content/core/connector.xml 2009-06-09
18:48:18 UTC (rev 1016)
+++ trunk/docs/reference/src/main/docbook/en-US/content/core/connector.xml 2009-06-09
19:46:34 UTC (rev 1017)
@@ -260,6 +260,7 @@
As for testing, you probably will want to add more dependencies, such as those listed
here:
</para>
<programlisting role="XML"><![CDATA[
+<!-- DNA-related unit testing utilities and classes -->
<dependency>
<groupId>org.jboss.dna</groupId>
<artifactId>dna-graph</artifactId>
Modified: trunk/docs/reference/src/main/docbook/en-US/content/core/sequencing.xml
===================================================================
--- trunk/docs/reference/src/main/docbook/en-US/content/core/sequencing.xml 2009-06-09
18:48:18 UTC (rev 1016)
+++ trunk/docs/reference/src/main/docbook/en-US/content/core/sequencing.xml 2009-06-09
19:46:34 UTC (rev 1017)
@@ -30,229 +30,233 @@
]>
<chapter id="sequencing_framework">
<title>Sequencing framework</title>
- <para>As we've mentioned before, JBoss DNA is able to work with existing JCR
repositories. Your client applications
- make changes to the information in those repositories, and JBoss DNA automatically
uses its sequencers to extract
- additional information from the uploaded files.</para>
<para>
- This chapter discusses the sequencing features of JBoss DNA and the components that are
involved.
+ Many repositories are used (at least in part) to manage files and other artifacts,
including
+ service definitions, policy files, images, media, documents, presentations, application
components,
+ reusable libraries, configuration files, application installations, database schemas,
management scripts, and so on.
+ Unlocking the information buried within all of those files is what JBoss DNA sequencing
is all about.
+ As files are loaded into the repository, JBoss DNA can automatically sequence these
files to extract
+ from their content meaningful information that can be stored in the repository, where
it can then be
+ searched, accessed, and analyzed using the JCR API.
</para>
- <sect1 id="sequencing-service">
- <title>Sequencing Service</title>
- <para>The JBoss DNA <emphasis>sequencing service</emphasis> is the
component that manages the <emphasis>sequencers</emphasis>,
- reacting to changes in JCR repositories and then running the appropriate sequencers.
- This involves processing the changes on a node, determining which (if any)
sequencers should be run on that node,
- and for each sequencer constructing the execution environment, calling the
sequencer, and saving the information
- generated by the sequencer.</para>
- <note>
- <para>Configuring JBoss DNA services is a bit more manual than is ideal. As
you'll see, JBoss DNA uses dependency
- injection to allow a great deal of flexibility in how it can be configured and
customized. But this flexibility
- makes it more difficult for you to use. We understand this, and will soon provide
a much easier way to set up
- and manage JBoss DNA. Current plans are to use the <ulink
url="http://www.jboss.org/jbossmc">JBoss Microcontainer</ulink>
- along with a configuration repository.</para>
- </note>
- <para>To set up the sequencing service, an instance is created, and dependent
components are injected into
- the object. This includes among other things:
- <itemizedlist>
- <listitem>
- <para>An <emphasis>execution context</emphasis> that defines the
context in which the service runs, including
- a factory for JCR sessions given names of the repository and workspace. This
factory must be configured,
- and is how JBoss DNA knows about your JCR repositories and how to connect to
them. More on this a bit later.</para>
- </listitem>
- <listitem>
- <para>An optional <emphasis>factory for class loaders</emphasis>
used to load sequencers. If no factory is supplied,
- the service uses the current thread's context class loader (or if that is
null, the class loader that loaded the
- sequencing service class).</para>
- </listitem>
- <listitem>
- <para>An &ExecutorService; used to execute the sequencing activities. If
none
- is supplied, a new single-threaded executor is created by calling
<code>Executors.newSingleThreadExecutor()</code>.
- (This can easily be changed by subclassing and overriding the
<code>SequencerService.createDefaultExecutorService()</code>
method.)</para>
- </listitem>
- <listitem>
- <para>Filters for sequencers and events. By default, all sequencers are
considered for "node added", "property added"
- and "property changed" events.</para>
- </listitem>
- </itemizedlist>
- </para>
- <para>As mentioned above, the &JcrExecutionContext; provides access to a
&SessionFactory; that is used
- by JBoss DNA to establish sessions to your JCR repositories. Two implementations
are available:
- <itemizedlist>
- <listitem>
- <para>The &JndiSessionFactory; looks up JCR &Repository; instances
in JNDI using
- names that are supplied when creating sessions. This implementation also
has methods to set the
- JCR &Credentials; for a given workspace name.</para>
- </listitem>
- <listitem>
- <para>The &SimpleSessionFactory; has methods to register the JCR
&Repository; instances
- with names, as well as methods to set the JCR &Credentials; for a given
workspace name.</para>
- </listitem>
- </itemizedlist>
- You can use the &JcrExecutionContext; and use one of these &SessionFactory;
implementations or another
- implementation that you provide.</para>
- <para>Here's an example of how to instantiate and configure the
&SequencingService;:</para>
+ <sect1 id="sequencers">
+ <title>Sequencers</title>
+ <para>
+ Sequencers are just POJOs that implement a specific interface, and their job is to
process a stream of
+ data (supplied by JBoss DNA) to extract meaningful content that usually takes the form
of a structured graph.
+ Exactly what content is extracted is up to each sequencer
+ implementation. For example, JBoss DNA comes with an <link
linkend="image-sequencer">image sequencer</link>
+ that extracts the simple metadata from different kinds of image files (e.g., JPEG,
GIF, PNG, etc.).
+ Another example is the <link linkend="cnd-sequencer">Compact Node
Definition (CND) sequencer</link>
+ that processes the CND files to extract and produce a structured representation of the
node type definitions,
+ property definitions, and child node definitions contained within the file.
+ </para>
+ <para>
+ Sequencers are configured to identify the kinds of nodes that the sequencers can work
against.
+ When content in the repository changes, JBoss DNA looks to see which (if any)
sequencers might be able
+ to run on the changed content. If any sequencer configurations do match, those
sequencers are run
+ against the content, and the structured graph output of the sequencers is then written
back into the repository
+ (at a location dictated by the sequencer configuration). And once that information is
in the repository,
+ it can be easily found and accessed via the standard JCR API.
+ </para>
+ <para>
+ In other words, JBoss DNA uses sequencers to help you extract more meaning from the
artifacts you already
+ are managing, and makes it much easier for applications to find and use all that
valuable information.
+ All without your applications doing anything extra.
+ </para>
+ </sect1>
+ <sect1 id="stream-sequencers">
+ <title>Stream Sequencers</title>
+ <para>
+ The &StreamSequencer; interface defines the single method that must be implemented
by a sequencer:
+ </para>
<programlisting>
-&SimpleSessionFactory; sessionFactory = new &SimpleSessionFactory;();
-sessionFactory.registerRepository("Main Repository", this.repository);
-&Credentials; credentials = new &SimpleCredentials;("jsmith",
"secret".toCharArray());
-sessionFactory.registerCredentials("Main Repository/Workspace1", credentials);
-// Now create the JCR execution context, with a reference to the session factory
-// and the name of the repository from which sessions will be obtained ...
-ExecutionContext executionContext = new
&JcrExecutionContext;(sessionFactory,"Main Repository");
+public interface &StreamSequencer; {
-// Create the sequencing service, passing in the execution context ...
-&SequencingService; sequencingService = new &SequencingService;();
-sequencingService.setExecutionContext(executionContext);
+ /**
+ * Sequence the data found in the supplied stream, placing the output
+ * information into the supplied output.
+ *
+ * @param stream the stream with the data to be sequenced; never null
+ * @param output the output from the sequencing operation; never null
+ * @param context the context for the sequencing operation; never null
+ */
+ void sequence( &InputStream; stream, &SequencerOutput; output,
&StreamSequencerContext; context );
+}
</programlisting>
- <para>After the sequencing service is created and configured, it must be started.
The &SequencingService;
- has an <emphasis>administration object</emphasis> (that is an instance
of &ServiceAdministrator;)
- with <code>start()</code>, <code>pause()</code>, and
<code>shutdown()</code> methods. The latter method will
- close the queue for sequencing, but will allow sequencing operations already
running to complete normally.
- To wait until all sequencing operations have completed, simply call the
<code>awaitTermination</code> method
- and pass it the maximum amount of time you want to wait.</para>
- <programlisting>
-sequencingService.getAdministrator().start();
-</programlisting>
- <para>The JBoss DNA services are utilizing resources and threads that must be
released before your application is ready to shut down.
- The safe way to do this is to simply obtain the &ServiceAdministrator; for each
service (via the <code>getAdministrator()</code> method)
- and call <code>shutdown()</code>. As previously mentioned, the shutdown
method will simply prevent new work from being processed
- and will not wait for existing work to be completed. If you want to wait until the
service completes all its work, you must wait
- until the service terminates. Here's an example that shows how this is
done:</para>
- <programlisting>
-// Shut down the service and wait until it's all shut down ...
-sequencingService.getAdministrator().shutdown();
-sequencingService.getAdministrator().awaitTermination(5, TimeUnit.SECONDS);
+ <para>
+ Implementations are responsible for processing the content in the supplied
&InputStream; and generating
+ structured content using the supplied &SequencerOutput; interface.
+ The &StreamSequencerContext; provides additional details about the information
that is being sequenced,
+ including the location and properties of the node being sequenced, the MIME type
+ of the node being sequenced, and a &Problems; object where the sequencer can
record problems that aren't
+ severe enough to warrant throwing an exception. The &StreamSequencerContext; also
provides access
+ to the &ValueFactories; that can be used to create &Path;, &Name;, and any
other value objects.
+ </para>
+ <para>The &SequencerOutput; interface is fairly easy to use, and its job is
to hide from the sequencer
+ all the specifics about where the output is being written. Therefore, the interface
has only a few methods
+ for implementations to call.
+ Two methods set the property values on a node, while a third sets references to
other nodes in the repository.
+ Use these methods to describe the properties of the nodes you want to create, using
relative paths for the nodes and
+ valid JCR property names for properties and references. JBoss DNA will ensure that
nodes are created or updated
+ whenever they're needed.
+ </para>
+ <programlisting>
+public interface &SequencerOutput; {
-// Shut down the observation service ...
-observationService.getAdministrator().shutdown();
-observationService.getAdministrator().awaitTermination(5, TimeUnit.SECONDS);
-</programlisting>
- </sect1>
- <sect1 id="sequencer-configuration">
- <title>Sequencer Configurations</title>
- <para>The sequencing service must also be configured with the sequencers that it
will use. This is done using the
- <code>addSequencer(SequencerConfig)</code> method and passing a
&SequencerConfig; instance that
- you create. Here's the code that defines 3 sequencer configurations: 1 that
places image metadata into
- "<code><![CDATA[/images/<filename>]]></code>",
another that places MP3 metadata into
"<code><![CDATA[/mp3s/<filename>]]></code>",
- and a third that places a structure that represents the classes, methods, and
attributes found within Java source into
-
"<code><![CDATA[/java/<filename>]]></code>".</para>
- <programlisting>
-String name = "Image Sequencer";
-String desc = "Sequences image files to extract the characteristics of the
image";
-String classname = "org.jboss.dna.sequencer.images.ImageMetadataSequencer";
-String[] classpath = null; // Use the current classpath
-String[] pathExpressions =
{"//(*.(jpg|jpeg|gif|bmp|pcx|png)[*])/jcr:content[@jcr:data] =>
/images/$1"};
-&SequencerConfig; imageSequencerConfig = new &SequencerConfig;(name, desc,
classname,
- classpath, pathExpressions);
-sequencingService.addSequencer(imageSequencerConfig);
+ /**
+ * Set the supplied property on the supplied node. The allowable
+ * values are any of the following:
+ * - primitives (which will be autoboxed)
+ * - String instances
+ * - String arrays
+ * - byte arrays
+ * - InputStream instances
+ * - Calendar instances
+ *
+ * @param nodePath the path to the node containing the property;
+ * may not be null
+ * @param property the name of the property to be set
+ * @param values the value(s) for the property; may be empty if
+ * any existing property is to be removed
+ */
+ void setProperty( String nodePath, String property, Object... values );
+ void setProperty( &Path; nodePath, &Name; property, Object... values );
-name = "MP3 Sequencer";
-desc = "Sequences MP3 files to extract the ID3 tags from the audio file";
-classname = "org.jboss.dna.sequencer.mp3.Mp3MetadataSequencer";
-pathExpressions = {"//(*.mp3[*])/jcr:content[@jcr:data] => /mp3s/$1"};
-&SequencerConfig; mp3SequencerConfig = new &SequencerConfig;(name, desc,
classname,
- classpath, pathExpressions);
-sequencingService.addSequencer(mp3SequencerConfig);
-
-name = "Java Sequencer";
-desc = "Sequences java files to extract the characteristics of the Java
source";
-classname = "org.jboss.dna.sequencer.java.JavaMetadataSequencer";
-pathExpressions = {"//(*.java[*])/jcr:content[@jcr:data] => /java/$1"};
-&SequencerConfig; javaSequencerConfig = new &SequencerConfig;(name, desc,
classname,
- classpath, pathExpressions);
-this.sequencingService.addSequencer(javaSequencerConfig);
-</programlisting>
- <para>Each configuration defines several things, including the name,
description, and sequencer implementation class.
- The configuration also defines the classpath information, which can be passed to the
&ClassLoaderFactory; to get
- a Java &ClassLoader; with which the sequencer class can be loaded. (If no
classpath information is provided, as is done
- in the code above, the application class loader is used.) The configuration also
specifies the path expressions that
- identify the nodes that should be sequenced with the sequencer and where to store
the output generated by the sequencer.
- Path expressions are pretty straightforward but are quite powerful, so before we go
any further with the example,
- let's dive into path expressions in more detail.</para>
- <sect2 id="path_expressions">
- <title>Path Expressions</title>
- <para>Path expressions consist of two parts: a selection criteria (or an input
path) and an output path:</para>
- <programlisting><![CDATA[ inputPath => outputPath
]]></programlisting>
- <para>The <emphasis>inputPath</emphasis> part defines an expression
for the path of a node that is to be sequenced.
- Input paths consist of '<code>/</code>' separated segments,
where each segment represents a pattern for a single node's
- name (including the same-name-sibling indexes) and
'<code>@</code>' signifies a property name.</para>
- <para>Let's first look at some simple examples:</para>
- <table frame='all'>
- <title>Simple Input Path Examples</title>
- <tgroup cols='2' align='left' colsep='1'
rowsep='1'>
- <colspec colname='c1' colwidth="1*"/>
- <colspec colname='c2' colwidth="1*"/>
- <thead>
- <row>
- <entry>Input Path</entry>
- <entry>Description</entry>
- </row>
- </thead>
- <tbody>
- <row><entry>/a/b</entry><entry>Match node
"<code>b</code>" that is a child of the top level node
"<code>a</code>". Neither node
- may have any same-name-sibilings.</entry></row>
- <row><entry>/a/*</entry><entry>Match any child node of the
top level node "<code>a</code>".</entry></row>
- <row><entry>/a/*.txt</entry><entry>Match any child node of
the top level node "<code>a</code>" that also has a name ending in
"<code>.txt</code>".</entry></row>
- <row><entry>/a/*.txt</entry><entry>Match any child node of
the top level node "<code>a</code>" that also has a name ending in
"<code>.txt</code>".</entry></row>
- <row><entry>/a/b@c</entry><entry>Match the property
"<code>c</code>" of node
"<code>/a/b</code>".</entry></row>
- <row><entry>/a/b[2]</entry><entry>The second child named
"<code>b</code>" below the top level node
"<code>a</code>".</entry></row>
- <row><entry>/a/b[2,3,4]</entry><entry>The second, third or
fourth child named "<code>b</code>" below the top level node
"<code>a</code>".</entry></row>
- <row><entry>/a/b[*]</entry><entry>Any (and every) child
named "<code>b</code>" below the top level node
"<code>a</code>".</entry></row>
- <row><entry>//a/b</entry><entry>Any node named
"<code>b</code>" that exists below a node named
"<code>a</code>", regardless
- of where node "<code>a</code>" occurs. Again, neither
node may have any same-name-sibilings.</entry></row>
- </tbody>
- </tgroup>
- </table>
- <para>With these simple examples, you can probably discern the most important
rules. First, the '<code>*</code>' is a wildcard character
- that matches any character or sequence of characters in a node's name (or index
if appearing in between square brackets), and
- can be used in conjunction with other characters (e.g.,
"<code>*.txt</code>").</para>
- <para>Second, square brackets (i.e., '<code>[</code>' and
'<code>]</code>') are used to match a node's same-name-sibiling
index.
- You can put a single non-negative number or a comma-separated list of non-negative
numbers. Use '0' to match a node that has no
- same-name-sibilings, or any positive number to match the specific
same-name-sibling.</para>
- <para>Third, combining two delimiters (e.g.,
"<code>//</code>") matches any sequence of nodes, regardless of what
their names are
- or how many nodes. Often used with other patterns to identify nodes at any level
matching other patterns.
- Three or more sequential slash characters are treated as two.</para>
- <para>Many input paths can be created using just these simple rules. However,
input paths can be more complicated. Here are some
- more examples:</para>
- <table frame='all'>
- <title>More Complex Input Path Examples</title>
- <tgroup cols='2' align='left' colsep='1'
rowsep='1'>
- <colspec colname='c1' colwidth="1*"/>
- <colspec colname='c2' colwidth="1*"/>
- <thead>
- <row>
- <entry>Input Path</entry>
- <entry>Description</entry>
- </row>
- </thead>
- <tbody>
- <row><entry>/a/(b|c|d)</entry><entry>Match children of the
top level node "<code>a</code>" that are named
"<code>a</code>",
- "<code>b</code>" or
"<code>c</code>". None of the nodes may have same-name-sibling
indexes.</entry></row>
- <row><entry>/a/b[c/d]</entry><entry>Match node
"<code>b</code>" child of the top level node
"<code>a</code>", when node
- "<code>b</code>" has a child named
"<code>c</code>", and "<code>c</code>" has a
child named "<code>d</code>".
- Node "<code>b</code>" is the selected node, while nodes
"<code>b</code>" and "<code>b</code>" are used
as criteria but are not
- selected.</entry></row>
- <row><entry>/a(/(b|c|d|)/e)[f/g/@something]</entry><entry>Match
node "<code>/a/b/e</code>",
"<code>/a/c/e</code>", "<code>/a/d/e</code>",
- or "<code>/a/e</code>" when they also have a child
"<code>f</code>" that itself has a child
"<code>g</code>" with property
- "<code>something</code>". None of the nodes may have
same-name-sibling indexes.</entry></row>
- </tbody>
- </tgroup>
- </table>
- <para>These examples show a few more advanced rules. Parentheses (i.e.,
'<code>(</code>' and '<code>)</code>') can be
used
- to define a set of options for names, as shown in the first and third rules.
Whatever part of the selected node's path
- appears between the parentheses is captured for use within the output path. Thus,
the first input path in the previous table
- would match node "<code>/a/b</code>", and "b" would
be captured and could be used within the output path using
"<code>$1</code>",
- where the number used in the output path identifies the parentheses.</para>
- <para>Square brackets can also be used to specify criteria on a node's
properties or children. Whatever appears in between the square
- brackets does not appear in the selected node.</para>
- <para>Let's go back to the previous code fragment and look at the first
path expression:</para>
- <programlisting><![CDATA[
//(*.(jpg|jpeg|gif|bmp|pcx|png)[*])/jcr:content[@jcr:data] => /images/$1
]]></programlisting>
- <para>This matches a node named
"<code>jcr:content</code>" with property
"<code>jcr:data</code>" but no siblings with the same name,
- and that is a child of a node whose name ends with
"<code>.jpg</code>", "<code>.jpeg</code>",
"<code>.gif</code>", "<code>.bmp</code>",
"<code>.pcx</code>",
- or "<code>.png</code>" that may have any same-name-sibling
index. These nodes can appear at any level in the repository.
- Note how the input path capture the filename (the segment containing the file
extension), including any same-name-sibling index.
- This filename is then used in the output path, which is where the sequenced content
is placed.</para>
- </sect2>
+ /**
+ * Set the supplied reference on the supplied node.
+ *
+ * @param nodePath the path to the node containing the property;
+ * may not be null
+ * @param property the name of the property to be set
+ * @param paths the paths to the referenced property, which may be
+ * absolute paths or relative to the sequencer output node;
+ * may be empty if any existing property is to be removed
+ */
+ void setReference( String nodePath, String property, String... paths );
+}
+ </programlisting>
+ <note>
+ <para>
+ JBoss DNA will create nodes of type <code>nt:unstructured</code> unless
you specify the value for the
+ <code>jcr:primaryType</code> property. You can also specify the
values for the <code>jcr:mixinTypes</code> property
+ if you want to add mixins to any node.
+ </para>
+ </note>
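+ <para>
+ As a rough illustration of the calls described above, the following sketch records what a
+ sequencer might write for one image. The <code>Output</code> interface and the
+ <code>RecordingOutput</code> class are hypothetical stand-ins that only mirror the String-based
+ <code>setProperty</code> signature; they are not JBoss DNA types. The
+ <code>image:width</code> and <code>image:height</code> property names are likewise assumed for the example.
+ </para>

```java
import java.util.Arrays;

public class SequencerOutputSketch {

    /** Hypothetical stand-in that mirrors the String-based setProperty method. */
    interface Output {
        void setProperty(String nodePath, String property, Object... values);
    }

    /** Records every call as one line of text so the result can be inspected. */
    static class RecordingOutput implements Output {
        final StringBuilder log = new StringBuilder();

        @Override
        public void setProperty(String nodePath, String property, Object... values) {
            log.append(nodePath).append("/@").append(property)
               .append(" = ").append(Arrays.toString(values)).append('\n');
        }
    }

    /** Describes a single node using a relative path, as a sequencer would. */
    static void describeImage(Output output, int width, int height) {
        // Setting jcr:primaryType overrides the default nt:unstructured node type.
        output.setProperty("image:metadata", "jcr:primaryType", "image:metadata");
        // Primitive values are autoboxed when passed as varargs (assumed property names).
        output.setProperty("image:metadata", "image:width", width);
        output.setProperty("image:metadata", "image:height", height);
    }

    public static void main(String[] args) {
        RecordingOutput output = new RecordingOutput();
        describeImage(output, 640, 480);
        System.out.print(output.log);
    }
}
```

+ <para>
+ The node path is relative because the sequencer does not know where its output will be stored;
+ that location comes from the output path of the matching path expression.
+ </para>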
</sect1>
+ <sect1 id="path-expressions">
+ <title>Path Expressions</title>
+ <para>
+ Each sequencer must be configured to describe the areas or types of content that the
sequencer is capable
+ of handling. This is done with path expressions that
+ identify the nodes (or node patterns) that should be sequenced and where to store
the output generated by the sequencer.
+ We'll see how to fully configure a sequencer in the <link
linkend="configuration">next chapter</link>,
+ but before then let's dive into path expressions in more detail.
+ </para>
+ <para>
+ A path expression consists of two parts: a selection criterion (or an input path) and an
output path:
+ </para>
+ <programlisting><![CDATA[ inputPath => outputPath
]]></programlisting>
+ <para>
+ The <emphasis>inputPath</emphasis> part defines an expression for the path
of a node that is to be sequenced.
+ Input paths consist of '<code>/</code>' separated segments, where
each segment represents a pattern for a single node's
+ name (including the same-name-sibling indexes) and
'<code>@</code>' signifies a property name.
+ </para>
+ <para>
+ Let's first look at some simple examples:
+ </para>
+ <table frame='all'>
+ <title>Simple Input Path Examples</title>
+ <tgroup cols='2' align='left' colsep='1'
rowsep='1'>
+ <colspec colname='c1' colwidth="1*"/>
+ <colspec colname='c2' colwidth="1*"/>
+ <thead>
+ <row>
+ <entry>Input Path</entry>
+ <entry>Description</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row><entry>/a/b</entry><entry>Match node
"<code>b</code>" that is a child of the top level node
"<code>a</code>". Neither node
+ may have any same-name-siblings.</entry></row>
+ <row><entry>/a/*</entry><entry>Match any child node of the
top level node "<code>a</code>".</entry></row>
+ <row><entry>/a/*.txt</entry><entry>Match any child node of
the top level node "<code>a</code>" that also has a name ending in
"<code>.txt</code>".</entry></row>
+ <row><entry>/a/b@c</entry><entry>Match the property
"<code>c</code>" of node
"<code>/a/b</code>".</entry></row>
+ <row><entry>/a/b[2]</entry><entry>The second child named
"<code>b</code>" below the top level node
"<code>a</code>".</entry></row>
+ <row><entry>/a/b[2,3,4]</entry><entry>The second, third or
fourth child named "<code>b</code>" below the top level node
"<code>a</code>".</entry></row>
+ <row><entry>/a/b[*]</entry><entry>Any (and every) child
named "<code>b</code>" below the top level node
"<code>a</code>".</entry></row>
+ <row><entry>//a/b</entry><entry>Any node named
"<code>b</code>" that exists below a node named
"<code>a</code>", regardless
+ of where node "<code>a</code>" occurs. Again, neither
node may have any same-name-siblings.</entry></row>
+ </tbody>
+ </tgroup>
+ </table>
+ <para>
+ With these simple examples, you can probably discern the most important rules. First,
the '<code>*</code>' is a wildcard character
+ that matches any character or sequence of characters in a node's name (or index
if appearing in between square brackets), and
+ can be used in conjunction with other characters (e.g.,
"<code>*.txt</code>").
+ </para>
+ <para>
+ Second, square brackets (i.e., '<code>[</code>' and
'<code>]</code>') are used to match a node's same-name-sibling
index.
+ You can specify a single non-negative number or a comma-separated list of non-negative
numbers. Use '0' to match a node that has no
+ same-name-siblings, or any positive number to match a specific same-name-sibling.
+ </para>
+ <para>
+ Third, combining two delimiters (e.g., "<code>//</code>")
matches any sequence of nodes, regardless of their names
+ or how many there are. It is often combined with other patterns to select matching nodes
at any level of the repository.
+ Three or more sequential slash characters are treated as two.
+ </para>
+ <para>
+ Many input paths can be created using just these simple rules. However, input paths
can be more complicated. Here are some
+ more examples:
+ </para>
+ <table frame='all'>
+ <title>More Complex Input Path Examples</title>
+ <tgroup cols='2' align='left' colsep='1'
rowsep='1'>
+ <colspec colname='c1' colwidth="1*"/>
+ <colspec colname='c2' colwidth="1*"/>
+ <thead>
+ <row>
+ <entry>Input Path</entry>
+ <entry>Description</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row><entry>/a/(b|c|d)</entry><entry>Match children of the
top level node "<code>a</code>" that are named
"<code>b</code>",
+ "<code>c</code>" or "<code>d</code>".
None of the nodes may have same-name-sibling indexes.</entry></row>
+ <row><entry>/a/b[c/d]</entry><entry>Match node
"<code>b</code>" child of the top level node
"<code>a</code>", when node
+ "<code>b</code>" has a child named
"<code>c</code>", and "<code>c</code>" has a
child named "<code>d</code>".
+ Node "<code>b</code>" is the selected node, while nodes
"<code>c</code>" and "<code>d</code>" are used
as criteria but are not
+ selected.</entry></row>
+ <row><entry>/a(/(b|c|d|)/e)[f/g/@something]</entry><entry>Match
node "<code>/a/b/e</code>",
"<code>/a/c/e</code>", "<code>/a/d/e</code>",
+ or "<code>/a/e</code>" when they also have a child
"<code>f</code>" that itself has a child
"<code>g</code>" with property
+ "<code>something</code>". None of the nodes may have
same-name-sibling indexes.</entry></row>
+ </tbody>
+ </tgroup>
+ </table>
+ <para>
+ These examples show a few more advanced rules. Parentheses (i.e.,
'<code>(</code>' and '<code>)</code>') can be
used
+ to define a set of options for names, as shown in the first and third examples.
Whatever part of the selected node's path
+ appears between the parentheses is captured for use within the output path. Thus,
the first input path in the previous table
+ would match node "<code>/a/b</code>", and "b" would
be captured and could be used within the output path using
"<code>$1</code>",
+ where the number used in the output path identifies which set of parentheses to use.
+ </para>
+ <para>
+ Square brackets can also be used to specify criteria on a node's properties or
children. Whatever appears in between the square
+ brackets does not appear in the selected node.
+ </para>
+ <para>
+ Let's now look at a more realistic path expression, one that selects image
files:
+ </para>
+ <programlisting><![CDATA[
//(*.(jpg|jpeg|gif|bmp|pcx|png)[*])/jcr:content[@jcr:data] => /images/$1
]]></programlisting>
+ <para>
+ This matches a node named "<code>jcr:content</code>" with
property "<code>jcr:data</code>" but no siblings with the same
name,
+ and that is a child of a node whose name ends with
"<code>.jpg</code>", "<code>.jpeg</code>",
"<code>.gif</code>", "<code>.bmp</code>",
"<code>.pcx</code>",
+ or "<code>.png</code>" that may have any same-name-sibling
index. These nodes can appear at any level in the repository.
+ Note how the input path captures the filename (the segment containing the file
extension), including any same-name-sibling index.
+ This filename is then used in the output path, which is where the sequenced content
is placed.
+ </para>
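+ <para>
+ To make the capture-and-substitute behavior concrete, here is a minimal sketch that approximates
+ the image path expression with a hand-written regular expression. It is not the matcher JBoss DNA
+ uses; the <code>PathExpressionSketch</code> class and its <code>outputPathFor</code> method are
+ hypothetical, and the <code>[@jcr:data]</code> property criterion is ignored because it cannot be
+ checked from a path string alone.
+ </para>

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PathExpressionSketch {

    // Hand-written approximation of the input path
    //   //(*.(jpg|jpeg|gif|bmp|pcx|png)[*])/jcr:content
    // The leading group allows any number of ancestor segments ("//"), and
    // group 1 captures the file segment with its optional same-name-sibling index.
    private static final Pattern INPUT = Pattern.compile(
        "^(?:/[^/]+)*/([^/]*\\.(?:jpg|jpeg|gif|bmp|pcx|png)(?:\\[\\d+\\])?)/jcr:content$");

    /** Returns the output path for a matching node path, or null when nothing matches. */
    public static String outputPathFor(String nodePath) {
        Matcher matcher = INPUT.matcher(nodePath);
        // "$1" in the output path corresponds to the first captured group.
        return matcher.matches() ? "/images/" + matcher.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(outputPathFor("/files/photos/sunset.jpg/jcr:content"));
        System.out.println(outputPathFor("/files/scan.png[2]/jcr:content"));
        System.out.println(outputPathFor("/files/notes.txt/jcr:content"));
    }
}
```

+ <para>
+ With these inputs the first node maps to <code>/images/sunset.jpg</code>, the second keeps its
+ index and maps to <code>/images/scan.png[2]</code>, and the text file is not selected at all.
+ </para>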
+ </sect1>
<sect1 id="sequencer-library">
<title>Out-of-the-box sequencers</title>
<para>
@@ -309,19 +313,15 @@
<programlisting role="XML"><![CDATA[
<dependency>
<groupId>org.jboss.dna</groupId>
- <artifactId>dna-common</artifactId>
- <version>0.3</version>
-</dependency>
-<dependency>
- <groupId>org.jboss.dna</groupId>
<artifactId>dna-graph</artifactId>
- <version>0.3</version>
+ <version>0.5</version>
</dependency>
]]></programlisting>
<para>These are minimum dependencies required for compiling a sequencer. Of
course, you'll have to add
other dependencies that your sequencer needs.</para>
<para>As for testing, you probably will want to add more dependencies, such as
those listed here:</para>
<programlisting role="XML"><![CDATA[
+<!-- DNA-related unit testing utilities and classes -->
<dependency>
<groupId>org.jboss.dna</groupId>
<artifactId>dna-graph</artifactId>
@@ -330,6 +330,14 @@
<scope>test</scope>
</dependency>
<dependency>
+ <groupId>org.jboss.dna</groupId>
+ <artifactId>dna-common</artifactId>
+ <version>0.5</version>
+ <type>test-jar</type>
+ <scope>test</scope>
+</dependency>
+<!-- Unit testing -->
+<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.4</version>
@@ -353,8 +361,7 @@
<artifactId>log4j</artifactId>
<version>1.2.14</version>
<scope>test</scope>
-</dependency>
- ]]></programlisting>
+</dependency> ]]></programlisting>
<para>Testing JBoss DNA sequencers does not require a JCR repository or the
JBoss DNA services. (For more detail,
see the <link linkend="testing_custom_sequencers">testing
section</link>.) However, if you want to do
integration testing with a JCR repository and the JBoss DNA services, you'll
need additional dependencies for these libraries.</para>
@@ -374,81 +381,16 @@
<scope>test</scope>
</dependency>
]]></programlisting>
- <para>At this point, your project should be set up correctly, and you're
ready to move on to
- <link linkend="custom_sequencer_implementation">writing the Java
implementation</link> for your sequencer.</para>
- </sect2>
- <sect2 id="custom_sequencer_implementation">
- <title>Implementing the StreamSequencer interface</title>
- <para>After creating the project and setting up the dependencies, the next
step is to create a Java class that implements
- the &StreamSequencer; interface. This interface is very straightforward and
involves a single method:</para>
- <programlisting>
-public interface &StreamSequencer; {
-
- /**
- * Sequence the data found in the supplied stream, placing the output
- * information into the supplied map.
- *
- * @param stream the stream with the data to be sequenced; never null
- * @param output the output from the sequencing operation; never null
- * @param context the context for the sequencing operation; never null
- */
- void sequence( &InputStream; stream, &SequencerOutput; output,
&SequencerContext; context );
-}
-</programlisting>
- <para>The job of a stream sequencer is to process the data in the supplied
stream, and place into the &SequencerOutput;
- any information that is to go into the JCR repository. JBoss DNA figures out when
your sequencer should be called
- (of course, using the sequencing configuration you'll add in a bit), and then
makes sure the generated information
- is saved in the correct place in the repository.
- </para>
- <para>The &SequencerContext; provides information about
- the current sequencing operation, including the location and properties of the node
being sequenced, the MIME type
- of the node being sequenced, and a location to record problems that aren't
severe enough to warrant throwing an exception.
- </para>
- <para>The &SequencerOutput; class is fairly easy to use. There are
basically two methods you need to call.
- One method sets the property values, while the other sets references to other nodes
in the repository. Use these
- methods to describe the properties of the nodes you want to create, using relative
paths for the nodes and
- valid JCR property names for properties and references. JBoss DNA will ensure that
nodes are created or updated
- whenever they're needed.</para>
- <programlisting>
-public interface &SequencerOutput; {
-
- /**
- * Set the supplied property on the supplied node. The allowable
- * values are any of the following:
- * - primitives (which will be autoboxed)
- * - String instances
- * - String arrays
- * - byte arrays
- * - InputStream instances
- * - Calendar instances
- *
- * @param nodePath the path to the node containing the property;
- * may not be null
- * @param property the name of the property to be set
- * @param values the value(s) for the property; may be empty if
- * any existing property is to be removed
- */
- void setProperty( String nodePath, String property, Object... values );
-
- /**
- * Set the supplied reference on the supplied node.
- *
- * @param nodePath the path to the node containing the property;
- * may not be null
- * @param property the name of the property to be set
- * @param paths the paths to the referenced property, which may be
- * absolute paths or relative to the sequencer output node;
- * may be empty if any existing property is to be removed
- */
- void setReference( String nodePath, String property, String... paths );
-}
-</programlisting>
- <para>JBoss DNA will create nodes of type
<code>nt:unstructured</code> unless you specify the value for the
- <code>jcr:primaryType</code> property. You can also specify the
values for the <code>jcr:mixinTypes</code> property
- if you want to add mixins to any node.</para>
- <para>For a complete example of a sequencer, let's look at the
&ImageMetadataSequencer;
- implementation:</para>
- <programlisting>
+ <para>
+ At this point, your project should be set up correctly, and you're ready to move
on to
+ writing your custom implementation of the &StreamSequencer; interface. As stated
earlier, this should be fairly
+ straightforward: process the stream and generate the output that's appropriate
for the kind of file being
+ sequenced.
+ </para>
+ <para>
+ Let's look at an example. Here is the complete code for the
&ImageMetadataSequencer; implementation:
+ </para>
+ <programlisting>
public class &ImageMetadataSequencer; implements &StreamSequencer; {
public static final String METADATA_NODE = "image:metadata";
@@ -493,20 +435,26 @@
output.setProperty(METADATA_NODE, IMAGE_HEIGHT, metadata.getHeight());
output.setProperty(METADATA_NODE, IMAGE_BITS_PER_PIXEL,
metadata.getBitsPerPixel());
output.setProperty(METADATA_NODE, IMAGE_PROGRESSIVE,
metadata.isProgressive());
- output.setProperty(METADATA_NODE, IMAGE_NUMBER_OF_IMAGES,
metadata.getNumberOfImages());
- output.setProperty(METADATA_NODE, IMAGE_PHYSICAL_WIDTH_DPI,
metadata.getPhysicalWidthDpi());
- output.setProperty(METADATA_NODE, IMAGE_PHYSICAL_HEIGHT_DPI,
metadata.getPhysicalHeightDpi());
- output.setProperty(METADATA_NODE, IMAGE_PHYSICAL_WIDTH_INCHES,
metadata.getPhysicalWidthInch());
- output.setProperty(METADATA_NODE, IMAGE_PHYSICAL_HEIGHT_INCHES,
metadata.getPhysicalHeightInch());
+ output.setProperty(METADATA_NODE, IMAGE_NUMBER_OF_IMAGES,
+ metadata.getNumberOfImages());
+ output.setProperty(METADATA_NODE, IMAGE_PHYSICAL_WIDTH_DPI,
+ metadata.getPhysicalWidthDpi());
+ output.setProperty(METADATA_NODE, IMAGE_PHYSICAL_HEIGHT_DPI,
+ metadata.getPhysicalHeightDpi());
+ output.setProperty(METADATA_NODE, IMAGE_PHYSICAL_WIDTH_INCHES,
+ metadata.getPhysicalWidthInch());
+ output.setProperty(METADATA_NODE, IMAGE_PHYSICAL_HEIGHT_INCHES,
+ metadata.getPhysicalHeightInch());
}
}
}
</programlisting>
- <para>
- Notice how the image metadata is extracted and the output graph is generated. A
single node is created with the name <code>image:metadata</code>
- and with the <code>image:metadata</code> node type. No mixins are
defined for the node, but several properties are set on the node
- using the values obtained from the image metadata. After this method returns, the
constructed graph will be saved to the repository
- in all of the places defined by its configuration. (This is why only relative paths
are used in the sequencer.)
+ <para>
+ Notice how the image metadata is extracted and the output graph is generated. A
single node is created with the name
+ <code>image:metadata</code>
+ and with the <code>image:metadata</code> node type. No mixins are
defined for the node, but several properties are set on the node
+ using the values obtained from the image metadata. After this method returns, the
constructed graph will be saved to the repository
+ in all of the places defined by its configuration. (This is why only relative
paths are used in the sequencer.)
</para>
</sect2>
<sect2 id="testing_custom_sequencers">
@@ -563,38 +511,18 @@
that <link linkend="using_dna">configure JBoss DNA</link> to
use a custom sequencer, and to then upload
content using the JCR API, verifying that the custom sequencer did run. However,
remember that JBoss DNA
runs sequencers asynchronously in the background, and you must synchronize your
tests to ensure that the
- sequencers have a chance to run before checking the results. (One way of doing this
(although, granted, not always reliable) is to wait for a second
- after uploading your content, shutdown the &SequencingService; and await its
termination,
- and then check that the sequencer output has been saved to the JCR repository. For
an example of this technique,
- see the <code>SequencingClientTest</code> unit test in the example
application.)
+ sequencers have a chance to run before checking the results.
</para>
</sect2>
- <sect2 id="deploying_custom_sequencers">
- <title>Deploying custom sequencers</title>
- <para>The first step of deploying a sequencer consists of adding/changing the
sequencer configuration (e.g., &SequencerConfig;)
- in the &SequencingService;. This was covered in the <link
linkend="sequencing_service">previous chapter</link>.
- </para>
- <para>
- The second step is to make the sequencer implementation available to JBoss DNA. At
this time, the JAR containing
- your new sequencer, as well as any JARs that your sequencer depends on, should be
placed on your application classpath.</para>
- <note>
- <para>A future goal of JBoss DNA is to allow sequencers, connectors, and
other extensions to be easily deployed into
- a runtime repository. This process will not only be much simpler, but it will
also provide JBoss DNA
- with the information necessary to update configurations and create the
appropriate class loaders for each extension.
- Having separate class loaders for each extension helps prevent the pollution of
the common classpath,
- facilitates an isolated runtime environment to eliminate any dependency
conflicts, and may potentially
- enable hot redeployment of newer extension versions.
- </para>
- </note>
- </sect2>
</sect1>
<sect1>
<title>Summary</title>
<para>
- In this chapter, we described how JBoss DNA sequences files as they're uploaded
into a repository.
- And one of the things we mentioned was that each sequencer is handed (with other
inputs) the MIME type of the file it is to process.
- How does DNA know what the MIME type is?
- JBoss DNA uses <emphasis>MIME type detectors</emphasis>, and this is the
topic of the <link linkend="mimetypes">next chapter</link>.
+ In this chapter, we described how JBoss DNA sequences files as they're uploaded
into a repository. We've also learned
+ in previous chapters about the JBoss DNA <link
linkend="execution-context">execution contexts</link>,
+ <link linkend="graph-model">graph model</link>, and <link
linkend="connector-framework">connectors</link>.
+ In the <link linkend="jcr">next part</link> we'll put all
these pieces together to learn how
+ to set up a JBoss DNA repository and access it using the JCR API.
</para>
</sect1>
</chapter>
Modified: trunk/docs/reference/src/main/docbook/en-US/custom.dtd
===================================================================
--- trunk/docs/reference/src/main/docbook/en-US/custom.dtd 2009-06-09 18:48:18 UTC (rev 1016)
+++ trunk/docs/reference/src/main/docbook/en-US/custom.dtd 2009-06-09 19:46:34 UTC (rev 1017)
@@ -82,6 +82,7 @@
<!ENTITY UrlEncoder "<ulink
url='&API;common/text/UrlEncoder.html'><classname>UrlEncoder</classname></ulink>">
<!ENTITY XmlNameEncoder "<ulink
url='&API;common/text/XmlNameEncoder.html'><classname>XmlNameEncoder</classname></ulink>">
<!ENTITY XmlValueEncoder "<ulink
url='&API;common/text/XmlValueEncoder.html'><classname>XmlValueEncoder</classname></ulink>">
+<!ENTITY Problems "<ulink
url='&API;common/collection/Problems.html'><interface>Problems</interface></ulink>">
<!-- Types in dna-graph -->
@@ -141,7 +142,7 @@
<!ENTITY RequestProcessor "<ulink
url='&API;graph/request/processor/RequestProcessor.html'><classname>RequestProcessor</classname></ulink>">
<!ENTITY StreamSequencer "<ulink
url='&API;graph/sequencer/StreamSequencer.html'><interface>StreamSequencer</interface></ulink>">
<!ENTITY SequencerOutput "<ulink
url='&API;graph/sequencer/SequencerOutput.html'><interface>SequencerOutput</interface></ulink>">
-<!ENTITY SequencerContext "<ulink
url='&API;graph/sequencer/SequencerContext.html'><interface>SequencerContext</interface></ulink>">
+<!ENTITY StreamSequencerContext "<ulink
url='&API;graph/sequencer/StreamSequencerContext.html'><interface>StreamSequencerContext</interface></ulink>">
<!ENTITY MimeTypeDetector "<ulink
url='&API;graph/mimetype/MimeTypeDetector.html'><interface>MimeTypeDetector</interface></ulink>">
<!ENTITY MockSequencerOutput "<ulink
url='&API;graph/sequencer/MockSequencerOutput.html'><interface>MockSequencerOutput</interface></ulink>">
<!ENTITY MockSequencerContext "<ulink
url='&API;graph/sequencer/MockSequencerContext.html'><interface>MockSequencerContext</interface></ulink>">
Modified: trunk/docs/reference/src/main/docbook/en-US/master.xml
===================================================================
--- trunk/docs/reference/src/main/docbook/en-US/master.xml 2009-06-09 18:48:18 UTC (rev 1016)
+++ trunk/docs/reference/src/main/docbook/en-US/master.xml 2009-06-09 19:46:34 UTC (rev 1017)
@@ -85,7 +85,7 @@
<xi:include
xmlns:xi="http://www.w3.org/2001/XInclude"
href="content/core/graph.xml"/>
<xi:include
xmlns:xi="http://www.w3.org/2001/XInclude"
href="content/core/connector.xml"/>
<xi:include
xmlns:xi="http://www.w3.org/2001/XInclude"
href="content/core/sequencing.xml"/>
- <xi:include
xmlns:xi="http://www.w3.org/2001/XInclude"
href="content/core/configuration.xml"/>
+ <xi:include
xmlns:xi="http://www.w3.org/2001/XInclude"
href="content/jcr/configuration.xml"/>
</part>
<part id="jcr-part">
<title>JBoss DNA JCR</title>