[hibernate-commits] Hibernate SVN: r15642 - search/trunk/doc/reference/en/modules.

Tue Dec 2 11:33:47 EST 2008

Author: hardy.ferentschik
Date: 2008-12-02 11:33:47 -0500 (Tue, 02 Dec 2008)
New Revision: 15642

Modified:
   search/trunk/doc/reference/en/modules/batchindex.xml
   search/trunk/doc/reference/en/modules/lucene-native.xml
   search/trunk/doc/reference/en/modules/optimize.xml
Log:
HSEARCH-303

Modified: search/trunk/doc/reference/en/modules/batchindex.xml
===================================================================

--- search/trunk/doc/reference/en/modules/batchindex.xml	2008-12-02 15:11:04 UTC (rev 15641)
+++ search/trunk/doc/reference/en/modules/batchindex.xml	2008-12-02 16:33:47 UTC (rev 15642)
@@ -22,8 +22,8 @@
   ~ 51 Franklin Street, Fifth Floor
   ~ Boston, MA  02110-1301  USA
   -->
-
-<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"> 
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
+"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
 <chapter id="search-batchindex">
   <!--  $Id$ -->
 
@@ -32,37 +32,36 @@
   <section id="search-batchindex-indexing">
     <title>Indexing</title>
 
-    <para>It is sometimes useful to index an object even if this object is not
-    inserted nor updated to the database. This is especially true when you
-    want to build your index for the first time. You can achieve that goal
-    using the <classname>FullTextSession</classname>.</para>
+    <para>It is sometimes useful to index an entity even if this entity is not
+    inserted into or updated in the database, for example when you want to
+    build your index for the first time.
+    <classname>FullTextSession</classname>.<methodname>index()</methodname>
+    allows you to do so.</para>
 
-    <programlisting>FullTextSession fullTextSession = Search.getFullTextSession(session);
+    <example>
+      <title>Indexing an entity via
+      <methodname>FullTextSession.index()</methodname></title>
+
+      <programlisting>FullTextSession fullTextSession = Search.getFullTextSession(session);
 Transaction tx = fullTextSession.beginTransaction();
 for (Customer customer : customers) {
     <emphasis role="bold">fullTextSession.index(customer);</emphasis>
 }
 tx.commit(); //index are written at commit time    </programlisting>
+    </example>
 
     <para>For maximum efficiency, Hibernate Search batches index operations
-    and executes them at commit time (Note: you don't need to use
-    <classname>org.hibernate.Transaction</classname> in a JTA
-    environment).</para>
+    and executes them at commit time. If you expect to index a lot of data,
+    however, you need to be careful about memory consumption since all
+    documents are kept in a queue until the transaction commit. You can
+    potentially face an <classname>OutOfMemoryError</classname>. To avoid
+    this error, you can use
+    <methodname>fullTextSession.flushToIndexes()</methodname>. Every time
+    <methodname>fullTextSession.flushToIndexes()</methodname> is called (or if
+    the transaction is committed), the batch queue is processed (freeing
+    memory), applying all index changes. Be aware that, once flushed,
+    changes cannot be rolled back.</para>
 
-    <para>If you expect to index a lot of data, you need to be careful about
-    memory consumption: since all documents are kept in a queue until the
-    transaction commit, you can potentially face an
-    <classname>OutOfMemoryException</classname>.</para>
-
-    <para>To avoid that, you can use
-    <methodname>fullTextSession.flushToIndexes()</methodname>: all index
-    operations are queued until
-    <methodname>fullTextSession.flushToIndexes()</methodname> is called. Every
-    time <methodname>fullTextSession.flushToIndexes()</methodname> is called
-    (or if the transaction is committed), the queue is processed (freeing
-    memory) and emptied. Be aware that changes made before a flush cannot be
-    rollbacked. </para>
-
     <note>
       <para><literal>hibernate.search.worker.batch_size</literal> has been
       deprecated in favor of this explicit API which provides better
@@ -70,26 +69,43 @@
     </note>
 
     <para>Other parameters which also can affect indexing time and memory
-    consumption are
-    <literal>hibernate.search.[default|&lt;indexname&gt;].indexwriter.batch.max_buffered_docs</literal>
-    ,
-    <literal>hibernate.search.[default|&lt;indexname&gt;].indexwriter.batch.max_field_length</literal>
-    ,
-    <literal>hibernate.search.[default|&lt;indexname&gt;].indexwriter.batch.max_merge_docs</literal>
-    ,
-    <literal>hibernate.search.[default|&lt;indexname&gt;].indexwriter.batch.merge_factor</literal>
-    ,
-    <literal>hibernate.search.[default|&lt;indexname&gt;].indexwriter.batch.ram_buffer_size</literal>
-    and
-    <literal>hibernate.search.[default|&lt;indexname&gt;].indexwriter.batch.term_index_interval</literal>
-    . These parameters are Lucene specific and Hibernate Search is just
+    consumption are:</para>
+
+    <itemizedlist>
+      <listitem>
+        <literal>hibernate.search.[default|&lt;indexname&gt;].indexwriter.[batch|transaction].max_buffered_docs</literal>
+      </listitem>
+
+      <listitem>
+        <literal>hibernate.search.[default|&lt;indexname&gt;].indexwriter.[batch|transaction].max_field_length</literal>
+      </listitem>
+
+      <listitem>
+        <literal>hibernate.search.[default|&lt;indexname&gt;].indexwriter.[batch|transaction].max_merge_docs</literal>
+      </listitem>
+
+      <listitem>
+        <literal>hibernate.search.[default|&lt;indexname&gt;].indexwriter.[batch|transaction].merge_factor</literal>
+      </listitem>
+
+      <listitem>
+        <literal>hibernate.search.[default|&lt;indexname&gt;].indexwriter.[batch|transaction].ram_buffer_size</literal>
+      </listitem>
+
+      <listitem>
+        <literal>hibernate.search.[default|&lt;indexname&gt;].indexwriter.[batch|transaction].term_index_interval</literal>
+      </listitem>
+    </itemizedlist>
+
+    <para>These parameters are Lucene specific and Hibernate Search is just
     passing these parameters through - see <xref
     linkend="lucene-indexing-performance" /> for more details.</para>
 
-    <para>Here is an especially efficient way to index a given class (useful
-    for index (re)initialization):</para>
+    <example>
+      <title>Efficiently indexing a given class (useful for index
+      (re)initialization)</title>
 
-    <programlisting>fullTextSession.setFlushMode(FlushMode.MANUAL);
+      <programlisting>fullTextSession.setFlushMode(FlushMode.MANUAL);
 fullTextSession.setCacheMode(CacheMode.IGNORE);
 transaction = fullTextSession.beginTransaction();
 //Scrollable results will avoid loading too many objects in memory
@@ -106,9 +122,10 @@
     }
 }
 transaction.commit();</programlisting>
+    </example>
 
-    <para>Try to use a batch size that guaranty that your application will not
-    run out of memory.</para>
+    <para>Try to use a batch size that guarantees that your application will
+    not run out of memory.</para>
   </section>
 
   <section>
@@ -116,29 +133,38 @@
 
     <para>It is equally possible to remove an entity or all entities of a
     given type from a Lucene index without the need to physically remove them
-    from the database. This operation is named purging and is done through the
-    <classname>FullTextSession</classname>.</para>
+    from the database. This operation is named purging and is also done
+    through the <classname>FullTextSession</classname>.</para>
 
-    <programlisting>FullTextSession fullTextSession = Search.getFullTextSession(session);
+    <example>
+      <title>Purging a specific instance of an entity from the index</title>
+
+      <programlisting>FullTextSession fullTextSession = Search.getFullTextSession(session);
 Transaction tx = fullTextSession.beginTransaction();
 for (Customer customer : customers) {
     <emphasis role="bold">fullTextSession.purge( Customer.class, customer.getId() );</emphasis>
 }
 tx.commit(); //index are written at commit time    </programlisting>
+    </example>
 
     <para>Purging will remove the entity with the given id from the Lucene
     index but will not touch the database.</para>
 
     <para>If you need to remove all entities of a given type, you can use the
-    <methodname>purgeAll</methodname> method. This operation remove all entities of the type passed
-        as a parameter as well as all its subtypes.</para>
+    <methodname>purgeAll</methodname> method. This operation removes all
+    entities of the type passed as a parameter as well as all its
+    subtypes.</para>
 
-    <programlisting>FullTextSession fullTextSession = Search.getFullTextSession(session);
+    <example>
+      <title>Purging all instances of an entity from the index</title>
+
+      <programlisting>FullTextSession fullTextSession = Search.getFullTextSession(session);
 Transaction tx = fullTextSession.beginTransaction();
 <emphasis role="bold">fullTextSession.purgeAll( Customer.class );</emphasis>
 //optionally optimize the index
 //fullTextSession.getSearchFactory().optimize( Customer.class );
 tx.commit(); //index are written at commit time    </programlisting>
+    </example>
 
     <para>It is recommended to optimize the index after such an
     operation.</para>
@@ -150,4 +176,4 @@
       well.</para>
     </note>
   </section>
-</chapter>
\ No newline at end of file
+</chapter>

Modified: search/trunk/doc/reference/en/modules/lucene-native.xml
===================================================================
--- search/trunk/doc/reference/en/modules/lucene-native.xml	2008-12-02 15:11:04 UTC (rev 15641)
+++ search/trunk/doc/reference/en/modules/lucene-native.xml	2008-12-02 16:33:47 UTC (rev 15642)
@@ -22,8 +22,8 @@
   ~ 51 Franklin Street, Fifth Floor
   ~ Boston, MA  02110-1301  USA
   -->
-
-<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"> 
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
+"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
 <chapter id="search-lucene-native">
   <!--  $Id$ -->
 
@@ -37,8 +37,12 @@
     way to access Lucene natively. The <classname>SearchFactory</classname>
     can be accessed from a <classname>FullTextSession</classname>:</para>
 
-    <programlisting>FullTextSession fullTextSession = Search.getFullTextSession(regularSession);
+    <example>
+      <title>Accessing the <classname>SearchFactory</classname></title>
+
+      <programlisting>FullTextSession fullTextSession = Search.getFullTextSession(regularSession);
 SearchFactory searchFactory = fullTextSession.getSearchFactory();</programlisting>
+    </example>
   </section>
 
   <section>
@@ -51,12 +55,16 @@
     <classname>DirectoryProvider</classname>s per indexed class. One directory
     provider can be shared amongst several indexed classes if the classes
     share the same underlying index directory. While usually not the case, a
-    given entity can have several <classname>DirectoryProvider</classname>s is
+    given entity can have several <classname>DirectoryProvider</classname>s if
     the index is sharded (see <xref
     linkend="search-configuration-directory-sharding" />).</para>
 
-    <programlisting>DirectoryProvider[] provider = searchFactory.getDirectoryProviders(Order.class);
+    <example>
+      <title>Accessing the Lucene <classname>Directory</classname></title>
+
+      <programlisting>DirectoryProvider[] provider = searchFactory.getDirectoryProviders(Order.class);
 org.apache.lucene.store.Directory directory = provider[0].getDirectory();</programlisting>
+    </example>
 
     <para>In this example, directory points to the lucene index storing
     <classname>Order</classname>s information. Note that the obtained Lucene
@@ -68,11 +76,14 @@
     <title>Using an IndexReader</title>
 
     <para>Queries in Lucene are executed on an <literal>IndexReader</literal>.
-    Hibernate Search caches such index readers to maximize performances. Your
-    code can access such cached / shared resources. You will just have to
-    follow some "good citizen" rules.</para>
+    Hibernate Search caches all index readers to maximize performance. Your
+    code can access these cached resources, but you have to follow some "good
+    citizen" rules.</para>
 
-    <programlisting>DirectoryProvider orderProvider = searchFactory.getDirectoryProviders(Order.class)[0];
+    <example>
+      <title>Accessing an <classname>IndexReader</classname></title>
+
+      <programlisting>DirectoryProvider orderProvider = searchFactory.getDirectoryProviders(Order.class)[0];
 DirectoryProvider clientProvider = searchFactory.getDirectoryProviders(Client.class)[0];
 
 ReaderProvider readerProvider = searchFactory.getReaderProvider();
@@ -84,24 +95,26 @@
 finally {
     readerProvider.closeReader(reader);
 }</programlisting>
+    </example>
 
     <para>The ReaderProvider (described in <xref
     linkend="search-architecture-readerstrategy" />), will open an IndexReader
-    on top of the index(es) referenced by the directory providers. This
-    IndexReader being shared amongst several clients, you must adhere to the
-    following rules:</para>
+    on top of the index(es) referenced by the directory providers. Because
+    this <classname>IndexReader</classname> is shared amongst several clients,
+    you must adhere to the following rules:</para>
 
     <itemizedlist>
       <listitem>
         <para>Never call indexReader.close(), but always call
-        readerProvider.closeReader(reader); (a finally block is the best
-        area).</para>
+        readerProvider.closeReader(reader), preferably in a finally
+        block.</para>
       </listitem>
 
       <listitem>
-        <para>This indexReader can't be used for modification operations
-        (you would get an exception). If you want to use a read/write index reader,
-        open one from the Lucene Directory object.</para>
+        <para>Don't use this <classname>IndexReader</classname> for
+        modification operations (you would get an exception). If you want to
+        use a read/write index reader, open one from the Lucene Directory
+        object.</para>
       </listitem>
     </itemizedlist>
 
@@ -156,10 +169,10 @@
             </row>
 
             <row>
-              <entry align="left">queryNorm(q) </entry>
+              <entry align="left">queryNorm(q)</entry>
 
               <entry>Normalizing factor used to make scores between queries
-              comparable. </entry>
+              comparable.</entry>
             </row>
 
             <row>
@@ -178,7 +191,7 @@
         </tgroup>
       </informaltable>It is beyond the scope of this manual to explain this
     formula in more detail. Please refer to
-    <classname>Similarity</classname>'s Javadocs for more information. </para>
+    <classname>Similarity</classname>'s Javadocs for more information.</para>
 
     <para>Hibernate Search provides two ways to modify Lucene's similarity
     calculation. First you can set the default similarity by specifying the
@@ -196,6 +209,6 @@
     term appears in a document. Documents with a single occurrence of the term
     should be scored the same as documents with multiple occurrences. In this
     case your custom implementation of the method <methodname>tf(float
-    freq)</methodname> should return 1.0. </para>
+    freq)</methodname> should return 1.0.</para>
   </section>
-</chapter>
\ No newline at end of file
+</chapter>

Modified: search/trunk/doc/reference/en/modules/optimize.xml
===================================================================
--- search/trunk/doc/reference/en/modules/optimize.xml	2008-12-02 15:11:04 UTC (rev 15641)
+++ search/trunk/doc/reference/en/modules/optimize.xml	2008-12-02 16:33:47 UTC (rev 15642)
@@ -22,23 +22,23 @@
   ~ 51 Franklin Street, Fifth Floor
   ~ Boston, MA  02110-1301  USA
   -->
-
-<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"> 
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
+"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
 <chapter id="search-optimize">
   <!--  $Id$ -->
 
   <title>Index Optimization</title>
 
   <para>From time to time, the Lucene index needs to be optimized. The process
-  is essentially a defragmentation: until the optimization occurs deleted
-  documents are just marked as such, no physical deletion is applied; the
-  optimization can also adjust the number of files in the Lucene
-  Directory.</para>
+  is essentially a defragmentation. Until an optimization is triggered, Lucene
+  only marks deleted documents as such; no physical deletions are applied.
+  During the optimization process the deletions are applied, which also
+  affects the number of files in the Lucene Directory.</para>
 
-  <para>The optimization speeds up searches but in no way speeds up indexation
-  (update). During an optimization, searches can be performed (but will most
-  likely be slowed down), and all index updates will be stopped. Prefer
-  optimizing:</para>
+  <para>Optimizing the Lucene index speeds up searches but has no effect on
+  indexing (update) performance. During an optimization, searches can be
+  performed, but will most likely be slowed down. All index updates will be
+  stopped. It is recommended to schedule optimization:</para>
 
   <itemizedlist>
     <listitem>
@@ -46,40 +46,42 @@
     </listitem>
 
     <listitem>
-      <para>after a lot of index modifications (doing so before will not speed
-      up the indexation process)</para>
+      <para>after a lot of index modifications</para>
     </listitem>
   </itemizedlist>
 
   <section>
     <title>Automatic optimization</title>
 
-    <para>Hibernate Search can optimize automatically an index after:</para>
+    <para>Hibernate Search can automatically optimize an index after:</para>
 
     <itemizedlist>
       <listitem>
-        <para>a certain amount of operations have been applied (insertion,
-        deletion)</para>
+        <para>a certain amount of operations (insertion, deletion)</para>
       </listitem>
 
       <listitem>
-        <para>or a certain amout of transactions have been applied</para>
+        <para>or a certain amount of transactions</para>
       </listitem>
     </itemizedlist>
 
-    <para>The configuration can be global or defined at the index
-    level:</para>
+    <para>The configuration for automatic index optimization can be defined on
+    a global level or per index:</para>
 
-    <programlisting>hibernate.search.default.optimizer.operation_limit.max = 1000
+    <example>
+      <title>Defining automatic optimization parameters</title>
+
+      <programlisting>hibernate.search.default.optimizer.operation_limit.max = 1000
 hibernate.search.default.optimizer.transaction_limit.max = 100
 hibernate.search.Animal.optimizer.transaction_limit.max = 50</programlisting>
+    </example>
 
     <para>An optimization will be triggered to the <literal>Animal</literal>
     index as soon as either:</para>
 
     <itemizedlist>
       <listitem>
-        <para>the number of addition and deletion reaches 1000</para>
+        <para>the number of additions and deletions reaches 1000</para>
       </listitem>
 
       <listitem>
@@ -100,22 +102,25 @@
     <para>You can programmatically optimize (defragment) a Lucene index from
     Hibernate Search through the <classname>SearchFactory</classname>:</para>
 
-    <programlisting>searchFactory.optimize(Order.class);</programlisting>
+    <example>
+      <title>Programmatic index optimization</title>
 
-    <programlisting>searchFactory.optimize();</programlisting>
+      <programlisting>FullTextSession fullTextSession = Search.getFullTextSession(regularSession);
+SearchFactory searchFactory = fullTextSession.getSearchFactory();
 
+searchFactory.optimize(Order.class);
+// or
+searchFactory.optimize();</programlisting>
+    </example>
+
     <para>The first example optimizes the Lucene index holding
     <classname>Order</classname>s; the second, optimizes all indexes.</para>
 
-    <para>The <classname>SearchFactory</classname> can be accessed from a
-    <classname>FullTextSession</classname>:</para>
-
-    <programlisting>FullTextSession fullTextSession = Search.getFullTextSession(regularSession);
-SearchFactory searchFactory = fullTextSession.getSearchFactory();</programlisting>
-
-    <para>Note that <literal>searchFactory.optimize()</literal> has no effect
-    on a JMS backend. You must apply the optimize operation on the Master
-    node.</para>
+    <note>
+      <para><literal>searchFactory.optimize()</literal> has no effect on a JMS
+      backend. You must apply the optimize operation on the Master
+      node.</para>
+    </note>
   </section>
 
   <section>
@@ -151,4 +156,4 @@
       </itemizedlist> See <xref linkend="lucene-indexing-performance" /> for
     more details.</para>
   </section>
-</chapter>
\ No newline at end of file
+</chapter>