[hibernate-commits] Hibernate SVN: r13993 - search/trunk/doc/reference/en/modules.

Mon Sep 3 23:05:11 EDT 2007

Author: epbernard
Date: 2007-09-03 23:05:10 -0400 (Mon, 03 Sep 2007)
New Revision: 13993

Modified:
   search/trunk/doc/reference/en/modules/batchindex.xml
   search/trunk/doc/reference/en/modules/query.xml
Log:
documentation on purge

Modified: search/trunk/doc/reference/en/modules/batchindex.xml
===================================================================

--- search/trunk/doc/reference/en/modules/batchindex.xml	2007-09-04 02:57:31 UTC (rev 13992)
+++ search/trunk/doc/reference/en/modules/batchindex.xml	2007-09-04 03:05:10 UTC (rev 13993)
@@ -1,66 +1,60 @@
 <?xml version="1.0" encoding="UTF-8"?>
 <!--  $Id$ -->
 <chapter id="search-batchindex">
+  <title>Manual indexing</title>
 
-  <title>Indexing</title>
+  <section id="search-batchindex-indexing">
+    <title>Indexing</title>
 
-  <para>It is sometimes useful to index an object event if this object is not
-  inserted nor updated to the database. This is especially true when you want
-  to build your index the first time. You can achieve that goal using the
-  <classname>FullTextSession</classname> .</para>
+    <para>It is sometimes useful to index an object even if this object is
+    neither inserted into nor updated in the database. This is especially
+    true when you want to build your index for the first time. You can
+    achieve that goal using the <classname>FullTextSession</classname>.</para>
 
-  <programlisting>FullTextSession fullTextSession = Search.createFullTextSession(session);
+    <programlisting>FullTextSession fullTextSession = Search.createFullTextSession(session);
 Transaction tx = fullTextSession.beginTransaction();
 for (Customer customer : customers) {
     <emphasis role="bold">fullTextSession.index(customer);</emphasis>
 }
 tx.commit(); //index are written at commit time    </programlisting>
 
-  <para>For maximum efficiency, Hibernate Search batch index operations which
-  and execute them at commit time (Note: you don't need to use
-  <classname>org.hibernate.Transaction</classname> in a JTA
-  environment).</para>
+    <para>For maximum efficiency, Hibernate Search batches index operations
+    and executes them at commit time (Note: you don't need to use
+    <classname>org.hibernate.Transaction</classname> in a JTA
+    environment).</para>
 
-  <para>If you expect to index a lot of data, you need to be careful about
-  memory consumption: since all documents are kept in a queue until the
-  transaction commit, you can potentially face an OutOfMemoryException.</para>
+    <para>If you expect to index a lot of data, you need to be careful about
+    memory consumption: since all documents are kept in a queue until the
+    transaction commit, you can potentially face an
+    OutOfMemoryError.</para>
 
-  <para>To avoid that, you can set up the
-  <literal>hibernate.search.worker.batch_size</literal> property to a
-  sensitive value: all index operations are queued until
-  <literal>batch_size</literal> is reached. Every time
-  <literal>batch_size</literal> is reached (or if the transaction is
-  committed), the queue is processed (freeing memory) and emptied. Be aware
-  that the changes cannot be rollbacked if the number of index elements goes
-  beyond <literal>batch_size</literal>. Be also aware that the queue limits are
-  also applied on regular transparent indexing (and not only when
-  <literal>session.index()</literal> is used). That's why a sensitive
-  <literal>batch_size</literal> value is expected.</para>
+    <para>To avoid that, you can set the
+    <literal>hibernate.search.worker.batch_size</literal> property to a
+    sensible value: all index operations are queued until
+    <literal>batch_size</literal> is reached. Every time
+    <literal>batch_size</literal> is reached (or if the transaction is
+    committed), the queue is processed (freeing memory) and emptied. Be aware
+    that the changes cannot be rolled back if the number of index elements
+    goes beyond <literal>batch_size</literal>. Also be aware that the queue
+    limits apply to regular transparent indexing too (and not only when
+    <literal>session.index()</literal> is used). That's why a sensible
+    <literal>batch_size</literal> value is expected.</para>
 
-  <para>Other parameters which also can effect indexing time and memory consumption are 
+    <para>Other parameters which can also affect indexing time and memory
+    consumption are
+    <literal>hibernate.search.[default|&lt;indexname&gt;].batch.merge_factor</literal>,
+    <literal>hibernate.search.[default|&lt;indexname&gt;].batch.max_merge_docs</literal>
+    and
+    <literal>hibernate.search.[default|&lt;indexname&gt;].batch.max_buffered_docs</literal>.
+    These parameters are Lucene specific and Hibernate Search simply passes
+    them through; see
+    <xref linkend="lucene-indexing-performance" />
+    for more details.</para>
 
-  <literal>hibernate.search.[default|&lt;indexname&gt;].batch.merge_factor</literal>
+    <para>Here is an especially efficient way to index a given class (useful
+    for index (re)initialization):</para>
 
-  , 
-
-  <literal>hibernate.search.[default|&lt;indexname&gt;].batch.max_merge_docs</literal>
-
-   and 
-
-  <literal>hibernate.search.[default|&lt;indexname&gt;].batch.max_buffered_docs</literal>
-
-  . These parameters are Lucene specific and Hibernate Search is just passing these paramters through - see 
-
-  <xref linkend="lucene-indexing-performance" />
-
-   for more details. 
-</para>
-  <para>Here is an especially efficient way to index a given class (useful for
-  index (re)initialization):</para>
-
-   
-
-  <programlisting>fullTextSession.setFlushMode(FlushMode.MANUAL);
+    <programlisting>fullTextSession.setFlushMode(FlushMode.MANUAL);
 transaction = fullTextSession.beginTransaction();
 //Scrollable results will avoid loading too many objects in memory
 ScrollableResults results = fullTextSession.createCriteria( Email.class ).scroll( ScrollMode.FORWARD_ONLY );
@@ -72,11 +66,40 @@
 }
 transaction.commit();</programlisting>
 
-   
+    <para>It is critical that <literal>batchSize</literal> in the previous
+    example matches the <literal>batch_size</literal> value described
+    previously.</para>
+  </section>
 
-  <para>It is critical that <literal>batchSize</literal> in the previous
-  example matches the <literal>batch_size</literal> value described
-  previously.</para>
+  <section>
+    <title>Purging</title>
 
-   
+    <para>It is equally possible to remove an entity or all entities of a
+    given type from a Lucene index without the need to physically remove them
+    from the database. This operation is named purging and is done through the
+    <classname>FullTextSession</classname>.</para>
+
+    <programlisting>FullTextSession fullTextSession = Search.createFullTextSession(session);
+Transaction tx = fullTextSession.beginTransaction();
+for (Customer customer : customers) {
+    <emphasis role="bold">fullTextSession.purge( Customer.class, customer.getId() );</emphasis>
+}
+tx.commit(); //index changes are written at commit time    </programlisting>
+
+    <para>Purging will remove the entity with the given id from the Lucene
+    index but will not touch the database.</para>
+
+    <para>If you need to remove all entities of a given type, you can use the
+    <methodname>purgeAll</methodname> method.</para>
+
+    <programlisting>FullTextSession fullTextSession = Search.createFullTextSession(session);
+Transaction tx = fullTextSession.beginTransaction();
+<emphasis role="bold">fullTextSession.purgeAll( Customer.class );</emphasis>
+//optionally optimize the index
+//fullTextSession.getSearchFactory().optimize( Customer.class );
+tx.commit(); //index changes are written at commit time    </programlisting>
+
+    <para>It is recommended to optimize the index after such an
+    operation.</para>
+  </section>
 </chapter>
\ No newline at end of file

Modified: search/trunk/doc/reference/en/modules/query.xml
===================================================================
--- search/trunk/doc/reference/en/modules/query.xml	2007-09-04 02:57:31 UTC (rev 13992)
+++ search/trunk/doc/reference/en/modules/query.xml	2007-09-04 03:05:10 UTC (rev 13993)
@@ -85,10 +85,10 @@
         <programlisting>FullTextSession fullTextSession = Search.createFullTextSession( session );
 org.hibernate.Query fullTextQuery = fullTextSession.createFullTextQuery( luceneQuery );</programlisting>
 
-        <para>If not specified otherwise, the query will be executed against all indexed entities,
-        potentially returning all types of indexed classes. It is advised,
-        from a performance point of view, to restrict the returned
-        types:</para>
+        <para>If not specified otherwise, the query will be executed against
+        all indexed entities, potentially returning all types of indexed
+        classes. It is advised, from a performance point of view, to restrict
+        the returned types:</para>
 
         <programlisting>org.hibernate.Query fullTextQuery = fullTextSession.createFullTextQuery( luceneQuery, Customer.class );
 //or
@@ -280,11 +280,18 @@
       Search has to process all Lucene Hits elements (within the pagination)
       when using <methodname>list()</methodname> ,
       <methodname>uniqueResult()</methodname> and
-      <methodname>iterate()</methodname>. If you wish to minimize Lucene
-      document loading, <methodname>scroll()</methodname> is more appropriate,
-      Don't forget to close the <classname>ScrollableResults</classname>
-      object when you're done, since it keeps Lucene resources. Pagination is
-      a preferred method over scrolling though.</para>
+      <methodname>iterate()</methodname>.</para>
+
+      <para>If you wish to minimize Lucene document loading,
+      <methodname>scroll()</methodname> is more appropriate. Don't forget to
+      close the <classname>ScrollableResults</classname> object when you're
+      done, since it holds Lucene resources. If you expect to use
+      <methodname>scroll()</methodname> but wish to load objects in batches,
+      you can use <methodname>query.setFetchSize()</methodname>: when an
+      object is accessed and is not already loaded, Hibernate Search will
+      load the next <literal>fetchSize</literal> objects in one pass.</para>
+
+      <para>Pagination is nevertheless preferred over scrolling.</para>
     </section>
 
     <section>
@@ -445,7 +452,7 @@
 fullTextQuery.enableFullTextFilter("security")<emphasis role="bold">.setParameter( "level", 5 )</emphasis>;</programlisting>
 
     <para>Each parameter name should have an associated setter on either the
-    filter or filter factory of the targeted named filter definition. </para>
+    filter or filter factory of the targeted named filter definition.</para>
 
     <programlisting>public class SecurityFilterFactory {
     private Integer level;
@@ -498,8 +505,8 @@
     implementation to each of the parameters equals and hashcode
     methods.</para>
 
-    <para>Why should filters be cached? There are two area where filter caching
-    shines:</para>
+    <para>Why should filters be cached? There are two areas where filter
+    caching shines:</para>
 
     <itemizedlist>
       <listitem>