Hibernate SVN: r10866 - branches/Lucene_Integration/HibernateExt/metadata/doc/reference/en/modules - hibernate-commits

Thursday, 23 November 2006


Author: epbernard
Date: 2006-11-23 17:41:27 -0500 (Thu, 23 Nov 2006)
New Revision: 10866

Modified:
   branches/Lucene_Integration/HibernateExt/metadata/doc/reference/en/modules/lucene.xml
Log:
Hibernate Search documentation

Modified:
branches/Lucene_Integration/HibernateExt/metadata/doc/reference/en/modules/lucene.xml
===================================================================
--- branches/Lucene_Integration/HibernateExt/metadata/doc/reference/en/modules/lucene.xml	2006-11-23 22:30:01 UTC (rev 10865)
+++ branches/Lucene_Integration/HibernateExt/metadata/doc/reference/en/modules/lucene.xml	2006-11-23 22:41:27 UTC (rev 10866)
@@ -1,91 +1,63 @@
 <?xml version="1.0" encoding="ISO-8859-1"?>
-<chapter id="lucene" revision="1">
-  <title>Hibernate Lucene Integration</title>
+<chapter id="lucene" revision="2">
+  <title>Hibernate Search: Apache <trademark>Lucene</trademark>
+  Integration</title>
 
-  <para>Lucene is a high-performance Java search engine library available from
-  the Apache Software Foundation. Hibernate Annotations includes a package of
-  annotations that allows you to mark any domain model object as indexable and
-  have Hibernate maintain a Lucene index of any instances persisted via
-  Hibernate.</para>
+  <para><ulink url="http://lucene.apache.org">Apache Lucene</ulink> is a
+  high-performance Java search engine library available at the Apache Software
+  Foundation. Hibernate Annotations includes a package of annotations that
+  allows you to mark any domain model object as indexable and have Hibernate
+  maintain a Lucene index of any instances persisted via Hibernate. Apache
+  Lucene is also integrated with the Hibernate query facility.</para>
 
-  <para>Hibernate Lucene is a work in progress and new features are cooking in
+  <para>Hibernate Search is a work in progress and new features are cooking in
   this area. So expect some compatibility changes in subsequent
   versions.</para>
 
-  <section id="lucene-mapping">
-    <title>Mapping the entities to the index</title>
+  <section id="lucene-architecture">
+    <title>Architecture</title>
 
-    <para>First, we must declare a persistent class as indexable. This is done
-    by annotating the class with <literal>@Indexed</literal>:</para>
+    <para>Hibernate Search is made of an indexing engine and an index search
+    engine. Both are backed by Apache Lucene.</para>
 
-    <programlisting>@Entity
-@Indexed(index="indexes/essays")
-public class Essay {
-    ...
-}</programlisting>
+    <para>When an entity is inserted, updated or removed to/from the database,
+    <productname>Hibernate Search</productname> will keep track of this event
+    (through the Hibernate event system) and schedule an index update. When
+    out of transaction, the update is executed right after the actual database
+    operation. It is however recommended, for both your database and Hibernate
+    Search, to execute your operation in a transaction (whether JDBC or JTA).
+    When in a transaction, the index update is scheduled for the transaction
+    commit (and discarded in case of transaction rollback). You can think of
+    this as the regular (infamous) autocommit vs transactional behavior. From
+    a performance perspective, the <emphasis>in transaction</emphasis> mode is
+    recommended. All the index updates are handled for you without you having
+    to use the Apache Lucene APIs.</para>
 
-    <para>The <literal>index</literal> attribute tells Hibernate what the
-    lucene directory name is (usually a directory on your file system). If you
-    wish to define a base directory for all lucene indexes, you can use the
-    <literal>hibernate.lucene.default.indexDir</literal> property in your
-    configuration file.</para>
+    <para>To interact with Apache Lucene indexes, Hibernate Search has the
+    notion of <classname>DirectoryProvider</classname>. A directory provider
+    will manage a given Lucene <classname>Directory</classname> type. You can
+    configure directory providers to adjust the directory target.</para>
 
-    <para>Lucene indexes contain four kinds of fields:
-    <emphasis>keyword</emphasis> fields, <emphasis>text</emphasis> fields,
-    <emphasis>unstored</emphasis> fields and <emphasis>unindexed</emphasis>
-    fields. Hibernate Annotations provides annotations to mark a property of
-    an entity as one of the first three kinds of indexed fields.</para>
-
-    <programlisting>@Entity
-@Indexed(index="indexes/essays")
-public class Essay {
-    ...
-
-    @Id
-    @Keyword(id=true)
-    public Long getId() { return id; }
-    
-    @Text(name="Abstract")
-    public String getSummary() { return summary; }
-    
-    @Lob
-    @Unstored
-    public String getText() { return text; }
-    
-}</programlisting>
-
-    <para>These annotations define an index with three fields:
-    <literal>id</literal>, <literal>Abstract</literal> and
-    <literal>text</literal>. Note that by default the field name is
-    decapitalized, following the JavaBean specification.</para>
-
-    <para>Note: you <emphasis>must</emphasis> specify
-    <literal>@Keyword(id=true)</literal> on the identifier property of your
-    entity class.</para>
-
-    <para>Lucene has the notion of <emphasis>boost factor</emphasis>. It's a
-    way to give more weigth to a field or to an indexed element over an other
-    during the indexation process. You can use <literal>@Boost</literal> at
-    the field or the class level.</para>
-
-    <para>The analyzer class used to index the elements is configurable
-    through the <literal>hibernate.lucene.analyzer</literal> property. If none
-    defined,
-    <classname>org.apache.lucene.analysis.standard.StandardAnalyzer</classname>
-    is used as the default.</para>
+    <para><productname>Hibernate Search</productname> can also use a Lucene
+    index to search an entity and return a (list of) managed entities, saving you
+    from the tedious Object / Lucene Document mapping and low level Lucene
+    APIs. The application code uses the unified
+    <classname>org.hibernate.Query</classname> API exactly the way an HQL or
+    native query would be done.</para>
   </section>
 
   <section id="lucene-configuration">
     <title>Configuration</title>
 
     <section id="lucene-configuration-directory">
-      <title>directory configuration</title>
+      <title>Directory configuration</title>
 
-      <para>Lucene has a notion of Directory where the index is stored. The
-      Directory implementation can be customized but Lucene comes bundled with
-      a file system and a full memory implementation. Hibernate Lucene has the
-      notion of <literal>DirectoryProvider</literal> that handle the
-      configuration and the initialization of the Lucene Directory.</para>
+      <para>Apache Lucene has a notion of Directory where the index is stored.
+      The Directory implementation can be customized but Lucene comes bundled
+      with a file system and a full memory implementation.
+      <productname>Hibernate Search</productname> has the notion of
+      <literal>DirectoryProvider</literal> that handles the configuration and
+      the initialization of the Lucene Directory.</para>
 
       <table>
         <title>List of built-in Directory Providers</title>
@@ -103,19 +75,19 @@
 
           <tbody>
             <row>
-              <entry>org.hibernate.lucene.store.FSDirectoryProvider</entry>
+              <entry>org.hibernate.search.store.FSDirectoryProvider</entry>
 
               <entry>File system based directory. The directory used will be
-              &lt;indexBase&gt;/&lt;<literal>@Index.name</literal>&gt;</entry>
+              &lt;indexBase&gt;/&lt;<literal>@Indexed.name</literal>&gt;</entry>
 
               <entry><literal>indexBase</literal>: Base directory</entry>
             </row>
 
             <row>
-              <entry>org.hibernate.lucene.store.RAMDirectoryProvider</entry>
+              <entry>org.hibernate.search.store.RAMDirectoryProvider</entry>
 
               <entry>Memory based directory, the directory will be uniquely
-              indentified by the <literal>@Index.name</literal>
+              identified by the <literal>@Indexed.name</literal>
               element</entry>
 
               <entry>none</entry>
@@ -132,17 +104,17 @@
       <para>Each indexed entity is associated to a Lucene index (an index can
       be shared by several entities but this is not usually the case). You can
       configure the index through properties prefixed by
-      <literal><literal>hibernate.lucene.&lt;indexname&gt;</literal></literal>.
+      <constant>hibernate.search.</constant><replaceable>indexname</replaceable>.
       Default properties inherited to all indexes can be defined using the
-      prefix hibernate.lucene.default.</para>
+      prefix <constant>hibernate.search.default.</constant></para>
 
       <para>To define the directory provider of a given index, you use the
-      <literal>hibernate.lucene.&lt;indexname&gt;.directory_provider</literal></para>
+      <constant>hibernate.search.<replaceable>indexname</replaceable>.directory_provider</constant></para>
 
-      <programlisting>hibernate.lucene.default.directory_provider org.hibernate.lucene.store.FSDirectoryProvider
-hibernate.lucene.default.indexDir=/usr/lucene/indexes
+      <programlisting>hibernate.search.default.directory_provider org.hibernate.search.store.FSDirectoryProvider
+hibernate.search.default.indexDir=/usr/lucene/indexes
 
-hibernate.lucene.Rules.directory_provider org.hibernate.lucene.store.RAMDirectoryProvider
+hibernate.search.Rules.directory_provider org.hibernate.search.store.RAMDirectoryProvider
 </programlisting>
 
       <para>applied on</para>
@@ -162,32 +134,537 @@
       and base directory, and override those defaults later on a per index
       basis.</para>
 
-      <para>Writing your own DirectoryProvider, you can benefit this
-      configuration mechanism too.</para>
+      <para>Writing your own <classname>DirectoryProvider</classname>, you can
+      benefit from this configuration mechanism too.</para>
     </section>
 
-    <section id="lucene-configuration-event">
+    <section id="lucene-configuration-event" revision="1">
       <title>Enabling automatic indexing</title>
 
-      <para>Finally, we enable the <literal>LuceneEventListener</literal> for
-      the three Hibernate events that occur after changes are committed to the
+      <para>Finally, we enable the <literal>FullTextIndexEventListener</literal> for
+      the three Hibernate events that occur after changes are applied to the
       database.</para>
 
       <programlisting>&lt;hibernate-configuration&gt;
     ...
-    &lt;event type="post-commit-update" 
-        &lt;listener  
-            class="org.hibernate.lucene.event.LuceneEventListener"/&gt;
+    &lt;event type="post-update"&gt;
+        &lt;listener class="org.hibernate.search.event.FullTextIndexEventListener"/&gt;
     &lt;/event&gt;
-    &lt;event type="post-commit-insert" 
-        &lt;listener 
-            class="org.hibernate.lucene.event.LuceneEventListener"/&gt;
+    &lt;event type="post-insert"&gt;
+        &lt;listener class="org.hibernate.search.event.FullTextIndexEventListener"/&gt;
     &lt;/event&gt;
-    &lt;event type="post-commit-delete" 
-        &lt;listener 
-            class="org.hibernate.lucene.event.LuceneEventListener"/&gt;
+    &lt;event type="post-delete"&gt;
+        &lt;listener class="org.hibernate.search.event.FullTextIndexEventListener"/&gt;
     &lt;/event&gt;
 &lt;/hibernate-configuration&gt;</programlisting>
     </section>
   </section>
+
+  <section id="lucene-mapping" revision="1">
+    <title>Mapping entities to the index structure</title>
+
+    <para>All the metadata information related to indexed entities is
+    described through Java annotations. There is no need for XML mapping
+    files nor a list of indexed entities. The list is discovered at startup
+    time by scanning the Hibernate mapped entities.</para>
+
+    <para>First, we must declare a persistent class as indexable. This is done
+    by annotating the class with <literal>@Indexed</literal> (all entities not
+    annotated with <literal>@Indexed</literal> will be ignored by the indexing
+    process):</para>
+
+    <programlisting>@Entity
+<emphasis role="bold">@Indexed(index="indexes/essays")</emphasis>
+public class Essay {
+    ...
+}</programlisting>
+
+    <para>The <literal>index</literal> attribute tells Hibernate what the
+    Lucene directory name is (usually a directory on your file system). If you
+    wish to define a base directory for all Lucene indexes, you can use the
+    <literal>hibernate.search.default.indexDir</literal> property in your
+    configuration file. Each entity instance will be represented by a Lucene
+    <classname>Document</classname> inside the given index (aka
+    Directory).</para>
+
+    <para>For each property (or attribute) of your entity, you have the
+    ability to describe how it will be indexed. The default (i.e. no annotation)
+    means that the property is completely ignored by the indexing process.
+    <literal>@Field</literal> declares a property as indexed. When
+    indexing an element to a Lucene document you can specify how it is
+    indexed:</para>
+
+    <itemizedlist>
+      <listitem>
+        <para><literal>name</literal>: describes under which name the property
+        should be stored in the Lucene Document. The default value is the
+        property name (following the JavaBeans convention)</para>
+      </listitem>
+
+      <listitem>
+        <para><literal>store</literal>: describes whether or not the property
+        is stored in the Lucene index. You can store the value
+        <literal>Store.YES</literal> (consuming more space in the index),
+        store it in a compressed way <literal>Store.COMPRESS</literal> (this
+        does consume more CPU), or avoid any storage
+        <literal>Store.NO</literal> (this is the default value). When a
+        property is stored, you can retrieve it from the Lucene Document (note
+        that this is not related to whether the element is indexed or
+        not).</para>
+      </listitem>
+
+      <listitem>
+        <para><literal>index</literal>: describes how the element is indexed (i.e. the process used
+        to index the property and the type of information stored). The
+        different values are <literal>Index.NO</literal> (no indexing, i.e.
+        cannot be found by a query), <literal>Index.TOKENIZED</literal> (use
+        an analyzer to process the property),
+        <literal>Index.UN_TOKENIZED</literal> (no analyzer pre-processing),
+        <literal>Index.NO_NORMS</literal> (do not store the normalization
+        data).</para>
+      </listitem>
+    </itemizedlist>
+
+    <para>These attributes are part of the <literal>@Field</literal>
+    annotation.</para>
+
+    <para>Whether or not you want to store the data depends on how you wish to
+    use the index query result. As of today, for a pure <productname>Hibernate
+    Search</productname> usage, storing is not necessary. Whether or not you
+    want to tokenize a property depends on whether you wish to search
+    the element as is, or only a normalized part of it. It makes sense to
+    tokenize a text field, but not a date field (or an id
+    field).</para>
+
+    <para>Finally, the id property of an entity is a special property used by
+    <productname>Hibernate Search</productname> to ensure index uniqueness of a
+    given entity. By design, an id has to be stored and must not be tokenized.
+    To mark a property as index id, use the <literal>@DocumentId</literal>
+    annotation.</para>
+
+    <programlisting>@Entity
+@Indexed(index="indexes/essays")
+public class Essay {
+    ...
+
+    @Id
+    <emphasis role="bold">@DocumentId</emphasis>
+    public Long getId() { return id; }
+    
+    <emphasis role="bold">@Field(name="Abstract", index=Index.TOKENIZED, store=Store.YES)</emphasis>
+    public String getSummary() { return summary; }
+    
+    @Lob
+    <emphasis role="bold">@Field(index=Index.TOKENIZED)</emphasis>
+    public String getText() { return text; }
+    
+}</programlisting>
+
+    <para>These annotations define an index with three fields:
+    <literal>id</literal>, <literal>Abstract</literal> and
+    <literal>text</literal>. Note that by default the field name is
+    decapitalized, following the JavaBean specification.</para>
+
+    <para>Note: you <emphasis>must</emphasis> specify
+    <literal>@DocumentId</literal> on the identifier property of your entity
+    class.</para>
+
+    <para>Lucene has the notion of <emphasis>boost factor</emphasis>. It's a
+    way to give more weight to a field or to an indexed element over another
+    during the indexing process. You can use <literal>@Boost</literal> at
+    the field or the class level.</para>
+
+    <programlisting>@Entity
+@Indexed(index="indexes/essays")
+<emphasis role="bold">@Boost(2)</emphasis>
+public class Essay {
+    ...
+
+    @Id
+    @DocumentId
+    public Long getId() { return id; }
+    
+    @Field(name="Abstract", index=Index.TOKENIZED, store=Store.YES)
+    <emphasis role="bold">@Boost(2.5f)</emphasis>
+    public String getSummary() { return summary; }
+    
+    @Lob
+    @Field(index=Index.TOKENIZED)
+    public String getText() { return text; }
+    
+}</programlisting>
+
+    <para>In our example, Essay's probability to reach the top of the search
+    list will be multiplied by 2 and the summary field will be 2.5 times more
+    important than the text field. Note that this explanation is actually
+    wrong, but it is simple and close enough to the reality. Please check the
+    Lucene documentation or the excellent <citetitle>Lucene In
+    Action</citetitle> from Otis Gospodnetic and Erik Hatcher.</para>
+
+    <para>The analyzer class used to index the elements is configurable
+    through the <literal>hibernate.search.analyzer</literal> property. If none
+    defined,
+    <classname>org.apache.lucene.analysis.standard.StandardAnalyzer</classname>
+    is used as the default.</para>
+  </section>
+
+  <section id="lucene-bridge">
+    <title>Property/Field Bridge</title>
+
+    <para>All fields of a full text index in Lucene have to be represented as
+    Strings. Your Java properties therefore have to be indexed in String form. For
+    most of your properties, <productname>Hibernate Search</productname> does
+    the translation job for you thanks to a built-in set of bridges. In some
+    cases, though, you need fine-grained control over the translation
+    process.</para>
+
+    <section>
+      <title>Built-in bridges</title>
+
+      <para><literal>Hibernate Search</literal> comes bundled with a set of
+      built-in bridges between a Java property type and its full text
+      representation.</para>
+
+      <para><literal>Null</literal> elements are not indexed (Lucene does not
+      support null elements and it does not make much sense either)</para>
+
+      <variablelist>
+        <varlistentry>
+          <term>null</term>
+
+          <listitem>
+            <para>null elements are not indexed. Lucene does not support null
+            elements and this does not make much sense either.</para>
+          </listitem>
+        </varlistentry>
+
+        <varlistentry>
+          <term>java.lang.String</term>
+
+          <listitem>
+            <para>Strings are indexed as is</para>
+          </listitem>
+        </varlistentry>
+
+        <varlistentry>
+          <term>short, Short, int, Integer, long, Long, float, Float,
+          double, Double, BigInteger, BigDecimal</term>
+
+          <listitem>
+            <para>Numbers are converted to their String representation. Note
+            that numbers cannot be compared by Lucene (i.e. used in range
+            queries) out of the box: they have to be padded <footnote>
+                <para>Using a Range query is debatable and has drawbacks; an
+                alternative approach is to use a Filter query which will
+                filter the result query to the appropriate range.</para>
+
+                <para><productname>Hibernate Search</productname> will support
+                a padding mechanism</para>
+              </footnote></para>
+          </listitem>
+        </varlistentry>
+
+        <varlistentry>
+          <term>java.util.Date</term>
+
+          <listitem>
+            <para>Dates are stored as yyyyMMddHHmmssSSS in GMT time
+            (20061107210300012 for Nov 7th of 2006 4:03:00 PM and 12ms EST). You
+            shouldn't really bother with the internal format. What is
+            important is that when using a DateRange Query, you should know
+            that the dates have to be expressed in GMT time.</para>
+
+            <para>Usually, storing the date up to the millisecond is not
+            necessary. <literal>@DateBridge</literal> defines the appropriate
+            resolution you are willing to store in the index
+            (<literal><literal>@DateBridge(resolution=Resolution.DAY)</literal></literal>).
+            The date pattern will then be truncated accordingly.</para>
+
+            <programlisting>@Entity @Indexed 
+public class Meeting {
+    @Field(index=Index.UN_TOKENIZED)
+    <emphasis role="bold">@DateBridge(resolution=Resolution.MINUTE)</emphasis>
+    private Date date;
+    ...
+}</programlisting>
+
+            <warning>
+              <para>A Date whose resolution is lower than
+              <literal>MILLISECOND</literal> cannot be a
+              <literal>@DocumentId</literal></para>
+            </warning>
+          </listitem>
+        </varlistentry>
+      </variablelist>
+
+      <para></para>
+    </section>
+
+    <section>
+      <title>Custom Bridge</title>
+
+      <para>It can happen that the built-in bridges of Hibernate Search do
+      not cover some of your property types, or that the String representation
+      used is not what you expect.</para>
+
+      <section>
+        <title>StringBridge</title>
+
+        <para>The simplest custom solution is to give <productname>Hibernate
+        Search</productname> an implementation of your expected
+        <emphasis>object to String</emphasis> bridge. To do so you need to
+        implement the
+        <literal>org.hibernate.search.bridge.StringBridge</literal>
+        interface</para>
+
+        <programlisting>/**
+ * Padding Integer bridge.
+ * All numbers will be padded with 0 to match 5 digits
+ *
+ * @author Emmanuel Bernard
+ */
+public class PaddedIntegerBridge implements <emphasis role="bold">StringBridge</emphasis> {
+
+    private int PADDING = 5;
+
+    <emphasis role="bold">public String objectToString(Object object)</emphasis> {
+        String rawInteger = ( (Integer) object ).toString();
+        if (rawInteger.length() &gt; PADDING) throw new IllegalArgumentException( "Try to pad on a number too big" );
+        StringBuilder paddedInteger = new StringBuilder( );
+        for ( int padIndex = rawInteger.length() ; padIndex &lt; PADDING ; padIndex++ ) {
+            paddedInteger.append('0');
+        }
+        return paddedInteger.append( rawInteger ).toString();
+    }
+}</programlisting>
+
+        <para>Then any property or field can use this bridge thanks to the
+        <literal>@FieldBridge</literal> annotation</para>
+
+        <programlisting><emphasis role="bold">@FieldBridge(impl = PaddedIntegerBridge.class)</emphasis>
+private Integer length;</programlisting>
+
+        <para>Parameters can be passed to the Bridge implementation making it
+        more flexible. The Bridge implementation implements a
+        <classname>ParameterizedBridge</classname> interface, and the
+        parameters are passed through the <literal>@FieldBridge</literal>
+        annotation.</para>
+
+        <programlisting>public class PaddedIntegerBridge implements StringBridge, <emphasis
+            role="bold">ParameterizedBridge</emphasis> {
+
+    public static String PADDING_PROPERTY = "padding";
+    private int padding = 5; //default
+
+    <emphasis role="bold">public void setParameterValues(Map parameters)</emphasis> {
+        Object padding = parameters.get( PADDING_PROPERTY );
+        if (padding != null) this.padding = (Integer) padding;
+    }
+
+    public String objectToString(Object object) {
+        String rawInteger = ( (Integer) object ).toString();
+        if (rawInteger.length() &gt; padding) throw new IllegalArgumentException( "Try to pad on a number too big" );
+        StringBuilder paddedInteger = new StringBuilder( );
+        for ( int padIndex = rawInteger.length() ; padIndex &lt; padding ; padIndex++ ) {
+            paddedInteger.append('0');
+        }
+        return paddedInteger.append( rawInteger ).toString();
+    }
+}
+
+
+//property
+@FieldBridge(impl = PaddedIntegerBridge.class, 
+        <emphasis role="bold">params = @Parameter(name="padding", value="10")</emphasis> )
+private Integer length;</programlisting>
+
+        <para>The <classname>ParameterizedBridge</classname> interface can be
+        implemented by <classname>StringBridge</classname>,
+        <classname>TwoWayStringBridge</classname>,
+        <classname>FieldBridge</classname> implementations (see
+        below).</para>
+
+        <para>If you expect to use your bridge implementation for an id
+        property (i.e. annotated with <literal>@DocumentId</literal>), you need
+        to use a slightly extended version of <literal>StringBridge</literal>
+        named <classname>TwoWayStringBridge</classname>. <literal>Hibernate
+        Search</literal> needs to read the string representation of the
+        identifier and generate the object out of it. There is no difference
+        in the way the <literal>@FieldBridge</literal> annotation is
+        used.</para>
+
+        <programlisting>public class PaddedIntegerBridge implements TwoWayStringBridge, ParameterizedBridge {
+
+    public static String PADDING_PROPERTY = "padding";
+    private int padding = 5; //default
+
+    public void setParameterValues(Map parameters) {
+        Object padding = parameters.get( PADDING_PROPERTY );
+        if (padding != null) this.padding = (Integer) padding;
+    }
+
+    public String objectToString(Object object) {
+        String rawInteger = ( (Integer) object ).toString();
+        if (rawInteger.length() &gt; padding) throw new IllegalArgumentException( "Try to pad on a number too big" );
+        StringBuilder paddedInteger = new StringBuilder( );
+        for ( int padIndex = rawInteger.length() ; padIndex &lt; padding ; padIndex++ ) {
+            paddedInteger.append('0');
+        }
+        return paddedInteger.append( rawInteger ).toString();
+    }
+
+    <emphasis role="bold">public Object stringToObject(String stringValue)</emphasis> {
+        return new Integer(stringValue);
+    }
+}
+
+
+//id property
+@DocumentId
+@FieldBridge(impl = PaddedIntegerBridge.class,
+            params = @Parameter(name="padding", value="10") )
+private Integer id;</programlisting>
+
+        <para>It is critically important for the two-way process to be
+        idempotent (ie object = stringToObject( objectToString( object ) )
+        ).</para>
+      </section>
+
+      <section>
+        <title>FieldBridge</title>
+
+        <para>Some use cases require more than a simple object to string
+        translation when mapping a property to a Lucene index. To give you
+        the most flexibility you can also implement a bridge as a
+        <classname>FieldBridge</classname>. This interface gives you a property
+        value and lets you map it the way you want in your Lucene
+        <classname>Document</classname>. This interface is very similar in its
+        concept to the <productname>Hibernate</productname>
+        <classname>UserType</classname>.</para>
+
+        <para>You can for example store a given property in two different
+        document fields</para>
+
+        <programlisting>/**
+ * Store the date in 3 different fields: year, month, day
+ * to ease Range Query per year, month or day
+ * (eg get all the elements of december for the last 5 years)
+ *
+ * @author Emmanuel Bernard
+ */
+public class DateSplitBridge implements FieldBridge {
+    private final static TimeZone GMT = TimeZone.getTimeZone("GMT");
+
+    <emphasis role="bold">public void set(String name, Object value, Document document, Field.Store store, Field.Index index, Float boost) {</emphasis>
+        Date date = (Date) value;
+        Calendar cal = GregorianCalendar.getInstance( GMT );
+        cal.setTime( date );
+        int year = cal.get( Calendar.YEAR );
+        int month = cal.get( Calendar.MONTH ) + 1;
+        int day = cal.get( Calendar.DAY_OF_MONTH );
+        //set year
+        Field field = new Field( name + ".year", String.valueOf(year), store, index );
+        if ( boost != null ) field.setBoost( boost );
+        document.add( field );
+        //set month and pad it if needed
+        field = new Field( name + ".month", ( month &lt; 10 ? "0" : "" ) + String.valueOf(month), store, index );
+        if ( boost != null ) field.setBoost( boost );
+        document.add( field );
+        //set day and pad it if needed
+        field = new Field( name + ".day", ( day &lt; 10 ? "0" : "" ) + String.valueOf(day), store, index );
+        if ( boost != null ) field.setBoost( boost );
+        document.add( field );
+    }
+}
+
+
+//property
+<emphasis role="bold">@FieldBridge(impl = DateSplitBridge.class)</emphasis>
+private Date date;</programlisting>
+
+        <para></para>
+      </section>
+    </section>
+  </section>
+
+  <section id="lucene-query">
+    <title>Querying</title>
+
+    <para>The second most important capability of <productname>Hibernate
+    Search</productname> is the ability to execute a Lucene query and retrieve
+    entities managed by a Hibernate session, providing the power of Lucene
+    without leaving the Hibernate paradigm, and giving another dimension to the
+    Hibernate classic search mechanisms (HQL, Criteria query, native SQL
+    query).</para>
+
+    <para>To access the <productname>Hibernate Search</productname> querying
+    facilities, you have to use a Hibernate
+    <classname>FullTextSession</classname>. A FullTextSession wraps a regular
+    <classname>org.hibernate.Session</classname> to provide query and indexing
+    capabilities.</para>
+
+    <programlisting>Session session = sessionFactory.openSession();
+...
+FullTextSession fullTextSession = Search.createFullTextSession(session);</programlisting>
+
+    <para>The search facility is built on native Lucene queries.</para>
+
+    <programlisting>org.apache.lucene.queryParser.QueryParser parser = new QueryParser("title", new StopAnalyzer() );
+
+org.apache.lucene.search.Query luceneQuery = parser.parse( "summary:Festina OR brand:Seiko" );
+<emphasis role="bold">org.hibernate.Query fullTextQuery = fullTextSession.createFullTextQuery( luceneQuery );</emphasis>
+
+List result = fullTextQuery.list(); //return a list of managed objects</programlisting>
+
+    <para>The Hibernate query built on top of the Lucene query is a regular
+    <literal>org.hibernate.Query</literal>, so you are in the same paradigm as
+    the other Hibernate query facilities (HQL, Native or Criteria). The
+    regular <literal>list()</literal>, <literal>uniqueResult()</literal>,
+    <literal>iterate()</literal> and <literal>scroll()</literal> can be
+    used.</para>
+
+    <para>If you expect a reasonable number of results and expect to work on all
+    of them, <methodname>list()</methodname> or
+    <methodname>uniqueResult()</methodname> are recommended.
+    <methodname>list()</methodname> works best if the entity
+    <literal>batch-size</literal> is set up properly. Note that Hibernate
+    Search has to process all Lucene Hits elements when using
+    <methodname>list()</methodname>, <methodname>uniqueResult()</methodname>
+    and <methodname>iterate()</methodname>. If you wish to minimize Lucene
+    document loading, <methodname>scroll()</methodname> is more appropriate.
+    Don't forget to close the <classname>ScrollableResults</classname> object
+    when you're done, since it keeps Lucene resources.</para>
+
+    <para>An efficient way to work with queries is to use pagination. The
+    pagination API is exactly the one available in
+    <classname>org.hibernate.Query</classname>:</para>
+
+    <programlisting><emphasis role="bold">org.hibernate.Query fullTextQuery = fullTextSession.createFullTextQuery( luceneQuery );</emphasis>
+fullTextQuery.setFirstResult(30);
+fullTextQuery.setMaxResults(20);
+fullTextQuery.list(); //will return a list of 20 elements starting from the 30th</programlisting>
+
+    <para>Only the relevant Lucene Documents are accessed.</para>
+  </section>
+
+  <section id="lucene-index">
+    <title>Indexing</title>
+
+    <para>It is sometimes useful to index an object even if this object is
+    not inserted nor updated to the database. This is especially true when you
+    want to build your index the first time. You can achieve that goal using
+    the <classname>FullTextSession</classname>.</para>
+
+    <programlisting>FullTextSession fullTextSession = Search.createFullTextSession(session);
+Transaction tx = fullTextSession.beginTransaction();
+for (Customer customer : customers) {
+    <emphasis role="bold">fullTextSession.index(customer);</emphasis>
+}
+tx.commit(); //indexes are written at commit time</programlisting>
+
+    <para>For maximum efficiency, Hibernate Search batches index operations
+    and executes them at commit time (Note: you don't need to use
+    <classname>org.hibernate.Transaction</classname> in a JTA
+    environment).</para>
+  </section>
 </chapter>
\ No newline at end of file
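
Both the padded-integer bridge and the date-split bridge documented in this diff come down to left-padding a number with zeros so that string comparison in the index matches numeric order. The following is a minimal sketch of that padding logic only, assuming plain Java with no Hibernate Search or Lucene classes on the classpath. The class name BridgePaddingSketch and the helpers pad and splitDate are hypothetical names chosen for illustration and are not part of the committed code. One point worth keeping in mind when writing such code inline: in Java, `cond ? "0" : "" + x` parses as `cond ? "0" : ("" + x)`, so a padding conditional has to be parenthesized (or moved into a helper, as here) before it is concatenated.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Hypothetical stand-alone sketch; not the Hibernate Search bridge API.
class BridgePaddingSketch {

    /** Left-pads a non-negative integer with zeros up to the given width. */
    static String pad(int value, int width) {
        String raw = Integer.toString(value);
        if (raw.length() > width) {
            throw new IllegalArgumentException("Try to pad on a number too big");
        }
        StringBuilder padded = new StringBuilder();
        for (int i = raw.length(); i < width; i++) {
            padded.append('0');
        }
        return padded.append(raw).toString();
    }

    /** Splits a date into year, zero-padded month and zero-padded day, in GMT. */
    static String[] splitDate(Date date) {
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("GMT"));
        cal.setTime(date);
        return new String[] {
            String.valueOf(cal.get(Calendar.YEAR)),
            pad(cal.get(Calendar.MONTH) + 1, 2),   // Calendar.MONTH is zero-based
            pad(cal.get(Calendar.DAY_OF_MONTH), 2)
        };
    }

    public static void main(String[] args) {
        System.out.println(pad(42, 5));                                 // 00042
        System.out.println(String.join("-", splitDate(new Date(0L))));  // 1970-01-01
    }
}
```

Routing both the month and the day through one helper sidesteps the operator-precedence pitfall entirely and keeps the "number too big" check in a single place.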


    
