Hibernate SVN: r15663 - in search/tags: v3_1_0_GA and 2 other directories.
by hibernate-commits@lists.jboss.org
Author: hardy.ferentschik
Date: 2008-12-04 05:45:13 -0500 (Thu, 04 Dec 2008)
New Revision: 15663
Added:
search/tags/v3_1_0_GA/
search/tags/v3_1_0_GA/changelog.txt
search/tags/v3_1_0_GA/doc/reference/en/master.xml
search/tags/v3_1_0_GA/doc/reference/en/modules/mapping.xml
Removed:
search/tags/v3_1_0_GA/changelog.txt
search/tags/v3_1_0_GA/doc/reference/en/master.xml
search/tags/v3_1_0_GA/doc/reference/en/modules/mapping.xml
Log:
Created tag v3_1_0_GA.
Copied: search/tags/v3_1_0_GA (from rev 15659, search/trunk)
Deleted: search/tags/v3_1_0_GA/changelog.txt
===================================================================
--- search/trunk/changelog.txt 2008-12-04 09:55:05 UTC (rev 15659)
+++ search/tags/v3_1_0_GA/changelog.txt 2008-12-04 10:45:13 UTC (rev 15663)
@@ -1,342 +0,0 @@
-Hibernate Search Changelog
-==========================
-
-3.1.0.GA (4-12-2008)
-------------------------
-
-3.1.0.CR1 (17-10-2008)
-------------------------
-
-** Bug
- * [HSEARCH-250] - In ReaderStrategies, ensure that the reader is current AND that the directory returned by the DirectoryProvider are the same
- * [HSEARCH-293] - AddLuceneWork is not being removed from the queue when DeleteLuceneWork is added for the same entity
- * [HSEARCH-300] - Fix documentation on use_compound_file
-
-** Improvement
- * [HSEARCH-213] - Use FieldSelector and doc(int, fieldSelector) to only select the necessary fields
- * [HSEARCH-224] - Use MultiClassesQueryLoader in ProjectionLoader
- * [HSEARCH-255] - Create an extensive Analyzer testing suite
- * [HSEARCH-266] - Do not switch to the current directory in FSSlaveDirectoryProvider if no file has been copied
- * [HSEARCH-274] - Use Lucene's new readonly IndexReader
- * [HSEARCH-281] - Work should be Work<T>
- * [HSEARCH-283] - Replace deprecated Classes and methods calls to Lucene 2.4
-
-** New Feature
- * [HSEARCH-104] - Make @DocumentId optional and rely on @Id
- * [HSEARCH-290] - Use IndexReader = readonly on Reader strategies (see Lucene 2.4)
- * [HSEARCH-294] - Rename INSTANCE_AND_BITSETRESULTS to INSTANCE_AND_DOCIDSETRESULTS
-
-** Task
- * [HSEARCH-288] - Evaluate changes in Lucene 2.4.0
- * [HSEARCH-289] - Move to new Lucene Filter DocIdSet
- * [HSEARCH-291] - improve documentation about thread safety requirements of Bridges.
-
-
-3.1.0.Beta2 (27-10-2008)
-------------------------
-
-** Bug
- * [HSEARCH-142] - Modifications on objects indexed via @IndexedEmbedded not updated when not annotated @Indexed
- * [HSEARCH-162] - NPE on queries when no entity is marked as @Indexed
- * [HSEARCH-222] - Entities not found during concurrent update
- * [HSEARCH-225] - Avoid using IndexReader.deleteDocument when index is not shared amongst several entity types
- * [HSEARCH-232] - Using SnowballPorterFilterFactory throws NoClassDefFoundError
- * [HSEARCH-237] - IdHashShardingStrategy fails on IDs having negative hashcode
- * [HSEARCH-241] - initialize methods taking Properties cannot list available properties
- * [HSEARCH-247] - Hibernate Search cannot run without apache-solr-analyzer.jar
- * [HSEARCH-253] - Inconsistent detection of EventListeners during autoregistration into Hibernate listeners
- * [HSEARCH-257] - Ignore delete operation when Core does update then delete on the same entity
- * [HSEARCH-259] - Filters were not isolated by name in the cache
- * [HSEARCH-262] - fullTextSession.purgeAll(Class<?>) does not consider subclasses
- * [HSEARCH-263] - Wrong analyzers used in IndexWriter
- * [HSEARCH-267] - Inheritance of annotations and analyzer
- * [HSEARCH-271] - wrong Similarity used when sharing index among entities
- * [HSEARCH-287] - master.xml is mistakenly copied to the distribution
-
-** Deprecation
- * [HSEARCH-279] - deprecate SharedReaderProvider replaced by SharingBufferReaderProvider as default ReaderProvider
-
-** Improvement
- * [HSEARCH-145] - Document a configuration property
- * [HSEARCH-226] - Use Lucene ability to delete by query in IndexWriter
- * [HSEARCH-240] - Generify the IndexShardingStrategy
- * [HSEARCH-245] - Add ReaderStratregy.destroy() method
- * [HSEARCH-256] - Remove CacheBitResults.YES
- * [HSEARCH-260] - Simplify the Filter Caching definition: cache=FilterCacheModeType.[MODE]
- * [HSEARCH-272] - Improve contention on DirectoryProviders in lucene backend
- * [HSEARCH-273] - Make LuceneOptions an interface
- * [HSEARCH-282] - Make the API more Generics friendly
-
-** New Feature
- * [HSEARCH-170] - Support @Boost in @Field
- * [HSEARCH-235] - provide a destroy() method in ReaderProvider
- * [HSEARCH-252] - Document Solr integration
- * [HSEARCH-258] - Add configuration option for Lucene's UseCompoundFile
-
-** Patch
- * [HSEARCH-20] - Lucene extensions
-
-** Task
- * [HSEARCH-231] - Update the getting started guide with Solr analyzers
- * [HSEARCH-236] - Find whether or not indexWriter.optimize() requires an index lock
- * [HSEARCH-244] - Ability to ask SearchFactory for the scoped analyzer of a given class
- * [HSEARCH-254] - Migrate to Solr 1.3
- * [HSEARCH-276] - upgrade to Lucene 2.4
- * [HSEARCH-286] - Align to GA versions of all dependencies
- * [HSEARCH-292] - Document the new Filter caching approach
-
-
-3.1.0.Beta1 (17-07-2008)
-------------------------
-
-** Bug
- * [HSEARCH-166] - documentation error : hibernate.search.worker.batch_size vs hibernate.worker.batch_size
- * [HSEARCH-171] - Do not log missing objects when using QueryLoader
- * [HSEARCH-173] - CachingWrapperFilter loses its WeakReference making filter caching inefficient
- * [HSEARCH-194] - Inconsistent performance between hibernate search and pure lucene access
- * [HSEARCH-196] - ObjectNotFoundException not caught in FullTextSession
- * [HSEARCH-198] - Documentation out of sync with implemented/released features
- * [HSEARCH-203] - Counter of index modification operations not always incremented
- * [HSEARCH-204] - Improper calls to Session during a projection not involving THIS
- * [HSEARCH-205] - Out of Memory on copy of large indexes
- * [HSEARCH-217] - Proper errors on parsing of all numeric configuration parameters
- * [HSEARCH-227] - Criteria based fetching is not used when objects are loaded one by one (iterate())
-
-
-** Improvement
- * [HSEARCH-19] - Do not filter classes on queries when we know that all Directories only contains the targeted classes
- * [HSEARCH-156] - Retrofit FieldBridge.set lucene parameters into a LuceneOptions class
- * [HSEARCH-157] - Make explicit in FAQ and doc that query.list() followed by query.getResultSize() triggers only one query
- * [HSEARCH-163] - Enhance error messages when @FieldBridge is wrongly used (no impl or impl not implementing the right interfaces)
- * [HSEARCH-176] - Permits alignment properties to lucene default (Sanne Grinovero)
- * [HSEARCH-179] - Documentation should be explicit that @FulltextFilter filters every object, regardless which object is annotated
- * [HSEARCH-181] - Better management of file-based index directories (Sanne Grinovero)
- * [HSEARCH-189] - Thread management improvements for Master/Slave DirectoryProviders
- * [HSEARCH-197] - Move to slf4j
- * [HSEARCH-199] - Properly close Search resources on SessionFactory.close()
- * [HSEARCH-202] - Avoid many maps lookup in Workspace
- * [HSEARCH-207] - Make DateBridge TwoWay to facilitate projection
- * [HSEARCH-208] - Raise exception on index and purge when the entity is not an indexed entity
- * [HSEARCH-209] - merge FullTextIndexCollectionEventListener into FullTextIndexEventListener
- * [HSEARCH-215] - Rename Search.createFTS to Search.getFTS deprecating the old method
- * [HSEARCH-223] - Use multiple criteria queries rather than ObjectLoader in most cases
- * [HSEARCH-230] - Ensure initialization safety in a multi-core machine
-
-** New Feature
- * [HSEARCH-133] - Allow overriding DefaultSimilarity for indexing and searching (Nick Vincent)
- * [HSEARCH-141] - Allow term position information to be stored in an index
- * [HSEARCH-153] - Provide the possibility to configure writer.setRAMBufferSizeMB() (Lucene 2.3)
- * [HSEARCH-154] - Provide a facility to access Lucene query explanations
- * [HSEARCH-164] - Built-in bridge to index java.lang.Class
- * [HSEARCH-165] - URI and URL built-in bridges
- * [HSEARCH-174] - Improve transparent filter caching by wrapping filters into our own CachingWrapperFilter
- * [HSEARCH-186] - Enhance analyzer to support the Solr model
- * [HSEARCH-190] - Add pom
- * [HSEARCH-191] - Make build independent of Hibernate Core structure
- * [HSEARCH-192] - Move to Hibernate Core 3.3
- * [HSEARCH-193] - Use dependency on Solr-analyzer JAR rather than the full Solr JAR
- * [HSEARCH-195] - Expose Analyzers instance by name: searchFactory.getAnalyzer(String)
- * [HSEARCH-200] - Expose IndexWriter setting MAX_FIELD_LENGTH via IndexWriterSetting
- * [HSEARCH-212] - Added ReaderProvider strategy reusing unchanged segments (using reader.reopen())
- * [HSEARCH-220] - introduce session.flushToIndexes API and deprecate batch_size
-
-
-** Task
- * [HSEARCH-169] - Migrate to Lucene 2.3.1 (index corruption possibility in 2.3.0)
- * [HSEARCH-187] - Clarify which directories need read-write access, verify readonly behaviour on others.
- * [HSEARCH-214] - Upgrade Lucene to 2.3.2
- * [HSEARCH-229] - Deprecate FullTextQuery.BOOST
-
-
-3.0.1.GA (20-02-2008)
----------------------
-
-** Bug
- * [HSEARCH-56] - Updating a collection does not reindex
- * [HSEARCH-123] - Use mkdirs instead of mkdir to create necessary parent directory in the DirectoryProviderHelper
- * [HSEARCH-128] - Indexing embedded children's child
- * [HSEARCH-136] - CachingWrapperFilter does not cache
- * [HSEARCH-137] - Wrong class name in Exception when a FieldBridge does not implement TwoWayFieldBridge for a document id property
- * [HSEARCH-138] - JNDI Property names have first character cut off
- * [HSEARCH-140] - @IndexedEmbedded default depth is effectively 1 due to integer overflow
- * [HSEARCH-146] - ObjectLoader doesn't catch javax.persistence.EntityNotFoundException
- * [HSEARCH-149] - Default FieldBridge for enums passing wrong class to EnumBridge constructor
-
-
-** Improvement
- * [HSEARCH-125] - Add support for fields declared by interface or unmapped superclass
- * [HSEARCH-127] - Wrong prefix for worker configurations
- * [HSEARCH-129] - IndexedEmbedded for Collections Documentation
- * [HSEARCH-130] - Should provide better log infos (on the indexBase parameter for the FSDirectoryProvider)
- * [HSEARCH-144] - Keep indexer running till finished on VM shutdown
- * [HSEARCH-147] - Allow projection of Lucene DocId
-
-** New Feature
- * [HSEARCH-114] - Introduce ResultTransformer to the query API
- * [HSEARCH-150] - Migrate to Lucene 2.3
-
-** Patch
- * [HSEARCH-126] - Better diagnostic when Search index directory cannot be opened (Ian)
-
-
-3.0.0.GA (23-09-2007)
----------------------
-
-** Bug
- * [HSEARCH-116] - FullTextEntityManager accessing getDelegate() in the constructor leads to NPE in JBoss AS + Seam
- * [HSEARCH-117] - FullTextEntityManagerImpl and others should implement Serializable
-
-** Deprecation
- * [HSEARCH-122] - Remove query.setIndexProjection (replaced by query.setProjection)
-
-** Improvement
- * [HSEARCH-118] - Add ClassBridges (plural) functionality
-
-** New Feature
- * [HSEARCH-81] - Create a @ClassBridge Annotation (John Griffin)
-
-
-** Task
- * [HSEARCH-98] - Add a Getting started section to the reference documentation
-
-
-3.0.0.CR1 (4-09-2007)
----------------------
-
-** Bug
- * [HSEARCH-108] - id of embedded object is not indexed when using @IndexedEmbedded
- * [HSEARCH-109] - Lazy loaded entity could not be indexed
- * [HSEARCH-110] - ScrollableResults does not obey out of bounds rules (John Griffin)
- * [HSEARCH-112] - Unknown @FullTextFilter when attempting to associate a filter
-
-** Deprecation
- * [HSEARCH-113] - Remove @Text, @Keyword and @Unstored (old mapping annotations)
-
-** Improvement
- * [HSEARCH-107] - DirectoryProvider should have a start() method
-
-** New Feature
- * [HSEARCH-14] - introduce fetch_size for Hibernate Search scrollable resultsets (John Griffin)
- * [HSEARCH-69] - Ability to purge an index by class (John Griffin)
- * [HSEARCH-111] - Ability to disable event based indexing (for read only or batch based indexing)
-
-
-3.0.0.Beta4 (1-08-2007)
------------------------
-
-** Bug
- * [HSEARCH-88] - Unable to update 2 entity types in the same transaction if they share the same index
- * [HSEARCH-90] - Use of setFirstResult / setMaxResults can lead to a list with negative capacity (John Griffin)
- * [HSEARCH-92] - NPE for null fields on projection
- * [HSEARCH-99] - Avoid returning non initialized proxies in scroll() and iterate() (loader.load(EntityInfo))
-
-
-** Improvement
- * [HSEARCH-79] - Recommend to use FlushMode.APPLICATION on massive indexing
- * [HSEARCH-84] - Migrate to Lucene 2.2
- * [HSEARCH-91] - Avoid wrapping a Session object if the Session is already FullTextSession
- * [HSEARCH-100] - Rename fullTextSession.setIndexProjection() to fullTextSession.setProjection()
- * [HSEARCH-102] - Default index operation in @Field to TOKENIZED
- * [HSEARCH-106] - Use the shared reader strategy as the default strategy
-
-** New Feature
- * [HSEARCH-6] - Provide access to the Hit.getScore() and potentially the Document on a query
- * [HSEARCH-15] - Notion of Filtered Lucene queries (Hardy Ferentschik)
- * [HSEARCH-41] - Allow fine grained analyzers (Entity, attribute, @Field)
- * [HSEARCH-45] - Support @Fields() for multiple indexing per property (useful for sorting)
- * [HSEARCH-58] - Support named Filters (and caching)
- * [HSEARCH-67] - Expose mergeFactor, maxMergeDocs and minMergeDocs (Hardy Ferentschik)
- * [HSEARCH-73] - IncrementalOptimizerStrategy triggered on transactions or operations limits
- * [HSEARCH-74] - Ability to project Lucene meta information (Score, Boost, Document, Id, This) (John Griffin)
- * [HSEARCH-83] - Introduce OptimizerStrategy
- * [HSEARCH-86] - Index sharding: multiple Lucene indexes per entity type
- * [HSEARCH-89] - FullText wrapper for JPA APIs
- * [HSEARCH-103] - Ability to override the indexName in the FSDirectoryProviders family
-
-
-** Task
- * [HSEARCH-94] - Deprecate ContextHelper
-
-
-3.0.0.Beta3 (6-06-2007)
------------------------
-
-** Bug
- * [HSEARCH-64] - Exception Thrown If Index Directory Does Not Exist
- * [HSEARCH-66] - Some results not returned in some circumstances (Brandon Munroe)
-
-
-** Improvement
- * [HSEARCH-60] - Introduce SearchFactory / SearchFactoryImpl
- * [HSEARCH-68] - Set index copy threads as daemon
- * [HSEARCH-70] - Create the index base directory if it does not exist
-
-** New Feature
- * [HSEARCH-11] - Provide access to IndexWriter.optimize()
- * [HSEARCH-33] - hibernate.search.worker.batch_size to prevent OutOfMemoryException while inserting many objects
- * [HSEARCH-71] - Provide fullTextSession.getSearchFactory()
- * [HSEARCH-72] - searchFactory.optimize() and searchFactory.optimize(Class) (Andrew Hahn)
-
-
-3.0.0.Beta2 (31-05-2007)
-------------------------
-
-** Bug
- * [HSEARCH-37] - Verify that Serializable return type are not resolved by StringBridge built in type
- * [HSEARCH-39] - event listener declaration example is wrong
- * [HSEARCH-44] - Build the Lucene Document in the beforeComplete transaction phase
- * [HSEARCH-50] - Null Booleans lead to NPE
- * [HSEARCH-59] - Unable to index @indexEmbedded object through session.index when object is lazy and field access is used in object
-
-
-** Improvement
- * [HSEARCH-36] - Meaningful exception message when Search Listeners are not initialized
- * [HSEARCH-38] - Make the @IndexedEmbedded documentation example easier to understand
- * [HSEARCH-51] - Optimization: Use a query rather than batch-size to load objects when a single entity (hierarchy) is expected
- * [HSEARCH-63] - rename query.resultSize() to getResultSize()
-
-** New Feature
- * [HSEARCH-4] - Be able to use a Lucene Sort on queries (Hardy Ferentschik)
- * [HSEARCH-13] - Cache IndexReaders per SearchFactory
- * [HSEARCH-40] - Be able to embed collections in lucene index (@IndexedEmbeddable in collections)
- * [HSEARCH-43] - Expose resultSize and do not load object when only resultSize is retrieved
- * [HSEARCH-52] - Ability to load more efficiently an object graph from a lucene query by customizing the fetch modes
- * [HSEARCH-53] - Add support for projection (ie read the data from the index only)
- * [HSEARCH-61] - Move from MultiSearcher to MultiReader
- * [HSEARCH-62] - Support pluggable ReaderProvider strategies
-
-
-** Task
- * [HSEARCH-65] - Update to JBoss Embedded beta2
-
-
-3.0.0.Beta1 (19-03-2007)
-------------------------
-
-Initial release as a standalone product (see Hibernate Annotations changelog for previous information)
-
-
-Release Notes - Hibernate Search - Version 3.0.0.beta1
-
-** Bug
- * [HSEARCH-7] - Ignore object found in the index but no longer present in the database (for out of date indexes)
- * [HSEARCH-21] - NPE in SearchFactory while using different threads
- * [HSEARCH-22] - Enum value Index.UN_TOKENISED is misspelled
- * [HSEARCH-24] - Potential deadlock when using multiple DirectoryProviders in a highly concurrent index update
- * [HSEARCH-25] - Class cast exception in org.hibernate.search.impl.FullTextSessionImpl<init>(FullTextSessionImpl.java:54)
- * [HSEARCH-28] - Wrong indexDir property in Apache Lucene Integration
-
-
-** Improvement
- * [HSEARCH-29] - Share the initialization state across all Search event listeners instance
- * [HSEARCH-30] - @FieldBridge now use o.h.s.a.Parameter rather than o.h.a.Parameter
- * [HSEARCH-31] - Move to Lucene 2.1.0
-
-** New Feature
- * [HSEARCH-1] - Give access to Directory providers
- * [HSEARCH-2] - Default FieldBridge for enums (Sylvain Vieujot)
- * [HSEARCH-3] - Default FieldBridge for booleans (Sylvain Vieujot)
- * [HSEARCH-9] - Introduce a worker factory and its configuration
- * [HSEARCH-16] - Cluster capability through JMS
- * [HSEARCH-23] - Support asynchronous batch worker queue
- * [HSEARCH-27] - Ability to index associated / embedded objects
Copied: search/tags/v3_1_0_GA/changelog.txt (from rev 15662, search/trunk/changelog.txt)
===================================================================
--- search/tags/v3_1_0_GA/changelog.txt (rev 0)
+++ search/tags/v3_1_0_GA/changelog.txt 2008-12-04 10:45:13 UTC (rev 15663)
@@ -0,0 +1,368 @@
+Hibernate Search Changelog
+==========================
+
+3.1.0.GA (4-12-2008)
+------------------------
+
+** Bug
+ * [HSEARCH-233] - EntityNotFoundException during indexing
+ * [HSEARCH-280] - Make FSSlaveAndMasterDPTest pass against postgresql
+ * [HSEARCH-297] - Allow PatternTokenizerFactory to be used
+ * [HSEARCH-309] - PurgeAllLuceneWork duplicates in work queue
+
+** Improvement
+ * [HSEARCH-221] - Get Lucene Analyzer runtime (indexing)
+ * [HSEARCH-265] - Raise warnings when an abstract class is marked @Indexed
+ * [HSEARCH-285] - Refactor DocumentBuilder to support containedIn only and regular Indexed entities
+ * [HSEARCH-298] - Warn for dangerous IndexWriter settings
+ * [HSEARCH-299] - Use of faster Bit operations when possible to chain Filters
+ * [HSEARCH-302] - Utilize pagination settings when retrieving TopDocs from the Lucene query to only retrieve required TopDocs
+ * [HSEARCH-308] - getResultSize() implementation should not load documents
+ * [HSEARCH-311] - Add a close() method to BackendQueueProcessorFactory
+ * [HSEARCH-312] - Rename hibernate.search.filter.cache_bit_results.size to hibernate.search.filter.cache_docidresults.size
+
+** New Feature
+ * [HSEARCH-160] - Truly polymorphic queries
+ * [HSEARCH-268] - Apply changes to different indexes in parallel
+ * [HSEARCH-296] - Expose managed entity class via a Projection constant
+
+** Task
+ * [HSEARCH-303] - Review reference documentation
+
+
+3.1.0.CR1 (17-10-2008)
+------------------------
+
+** Bug
+ * [HSEARCH-250] - In ReaderStrategies, ensure that the reader is current AND that the directory returned by the DirectoryProvider are the same
+ * [HSEARCH-293] - AddLuceneWork is not being removed from the queue when DeleteLuceneWork is added for the same entity
+ * [HSEARCH-300] - Fix documentation on use_compound_file
+
+** Improvement
+ * [HSEARCH-213] - Use FieldSelector and doc(int, fieldSelector) to only select the necessary fields
+ * [HSEARCH-224] - Use MultiClassesQueryLoader in ProjectionLoader
+ * [HSEARCH-255] - Create an extensive Analyzer testing suite
+ * [HSEARCH-266] - Do not switch to the current directory in FSSlaveDirectoryProvider if no file has been copied
+ * [HSEARCH-274] - Use Lucene's new readonly IndexReader
+ * [HSEARCH-281] - Work should be Work<T>
+ * [HSEARCH-283] - Replace deprecated Classes and methods calls to Lucene 2.4
+
+** New Feature
+ * [HSEARCH-104] - Make @DocumentId optional and rely on @Id
+ * [HSEARCH-290] - Use IndexReader = readonly on Reader strategies (see Lucene 2.4)
+ * [HSEARCH-294] - Rename INSTANCE_AND_BITSETRESULTS to INSTANCE_AND_DOCIDSETRESULTS
+
+** Task
+ * [HSEARCH-288] - Evaluate changes in Lucene 2.4.0
+ * [HSEARCH-289] - Move to new Lucene Filter DocIdSet
+ * [HSEARCH-291] - improve documentation about thread safety requirements of Bridges.
+
+
+3.1.0.Beta2 (27-10-2008)
+------------------------
+
+** Bug
+ * [HSEARCH-142] - Modifications on objects indexed via @IndexedEmbedded not updated when not annotated @Indexed
+ * [HSEARCH-162] - NPE on queries when no entity is marked as @Indexed
+ * [HSEARCH-222] - Entities not found during concurrent update
+ * [HSEARCH-225] - Avoid using IndexReader.deleteDocument when index is not shared amongst several entity types
+ * [HSEARCH-232] - Using SnowballPorterFilterFactory throws NoClassDefFoundError
+ * [HSEARCH-237] - IdHashShardingStrategy fails on IDs having negative hashcode
+ * [HSEARCH-241] - initialize methods taking Properties cannot list available properties
+ * [HSEARCH-247] - Hibernate Search cannot run without apache-solr-analyzer.jar
+ * [HSEARCH-253] - Inconsistent detection of EventListeners during autoregistration into Hibernate listeners
+ * [HSEARCH-257] - Ignore delete operation when Core does update then delete on the same entity
+ * [HSEARCH-259] - Filters were not isolated by name in the cache
+ * [HSEARCH-262] - fullTextSession.purgeAll(Class<?>) does not consider subclasses
+ * [HSEARCH-263] - Wrong analyzers used in IndexWriter
+ * [HSEARCH-267] - Inheritance of annotations and analyzer
+ * [HSEARCH-271] - wrong Similarity used when sharing index among entities
+ * [HSEARCH-287] - master.xml is mistakenly copied to the distribution
+
+** Deprecation
+ * [HSEARCH-279] - deprecate SharedReaderProvider replaced by SharingBufferReaderProvider as default ReaderProvider
+
+** Improvement
+ * [HSEARCH-145] - Document a configuration property
+ * [HSEARCH-226] - Use Lucene ability to delete by query in IndexWriter
+ * [HSEARCH-240] - Generify the IndexShardingStrategy
+ * [HSEARCH-245] - Add ReaderStratregy.destroy() method
+ * [HSEARCH-256] - Remove CacheBitResults.YES
+ * [HSEARCH-260] - Simplify the Filter Caching definition: cache=FilterCacheModeType.[MODE]
+ * [HSEARCH-272] - Improve contention on DirectoryProviders in lucene backend
+ * [HSEARCH-273] - Make LuceneOptions an interface
+ * [HSEARCH-282] - Make the API more Generics friendly
+
+** New Feature
+ * [HSEARCH-170] - Support @Boost in @Field
+ * [HSEARCH-235] - provide a destroy() method in ReaderProvider
+ * [HSEARCH-252] - Document Solr integration
+ * [HSEARCH-258] - Add configuration option for Lucene's UseCompoundFile
+
+** Patch
+ * [HSEARCH-20] - Lucene extensions
+
+** Task
+ * [HSEARCH-231] - Update the getting started guide with Solr analyzers
+ * [HSEARCH-236] - Find whether or not indexWriter.optimize() requires an index lock
+ * [HSEARCH-244] - Ability to ask SearchFactory for the scoped analyzer of a given class
+ * [HSEARCH-254] - Migrate to Solr 1.3
+ * [HSEARCH-276] - upgrade to Lucene 2.4
+ * [HSEARCH-286] - Align to GA versions of all dependencies
+ * [HSEARCH-292] - Document the new Filter caching approach
+
+
+3.1.0.Beta1 (17-07-2008)
+------------------------
+
+** Bug
+ * [HSEARCH-166] - documentation error : hibernate.search.worker.batch_size vs hibernate.worker.batch_size
+ * [HSEARCH-171] - Do not log missing objects when using QueryLoader
+ * [HSEARCH-173] - CachingWrapperFilter loses its WeakReference making filter caching inefficient
+ * [HSEARCH-194] - Inconsistent performance between hibernate search and pure lucene access
+ * [HSEARCH-196] - ObjectNotFoundException not caught in FullTextSession
+ * [HSEARCH-198] - Documentation out of sync with implemented/released features
+ * [HSEARCH-203] - Counter of index modification operations not always incremented
+ * [HSEARCH-204] - Improper calls to Session during a projection not involving THIS
+ * [HSEARCH-205] - Out of Memory on copy of large indexes
+ * [HSEARCH-217] - Proper errors on parsing of all numeric configuration parameters
+ * [HSEARCH-227] - Criteria based fetching is not used when objects are loaded one by one (iterate())
+
+
+** Improvement
+ * [HSEARCH-19] - Do not filter classes on queries when we know that all Directories only contains the targeted classes
+ * [HSEARCH-156] - Retrofit FieldBridge.set lucene parameters into a LuceneOptions class
+ * [HSEARCH-157] - Make explicit in FAQ and doc that query.list() followed by query.getResultSize() triggers only one query
+ * [HSEARCH-163] - Enhance error messages when @FieldBridge is wrongly used (no impl or impl not implementing the right interfaces)
+ * [HSEARCH-176] - Permits alignment properties to lucene default (Sanne Grinovero)
+ * [HSEARCH-179] - Documentation should be explicit that @FulltextFilter filters every object, regardless which object is annotated
+ * [HSEARCH-181] - Better management of file-based index directories (Sanne Grinovero)
+ * [HSEARCH-189] - Thread management improvements for Master/Slave DirectoryProviders
+ * [HSEARCH-197] - Move to slf4j
+ * [HSEARCH-199] - Properly close Search resources on SessionFactory.close()
+ * [HSEARCH-202] - Avoid many maps lookup in Workspace
+ * [HSEARCH-207] - Make DateBridge TwoWay to facilitate projection
+ * [HSEARCH-208] - Raise exception on index and purge when the entity is not an indexed entity
+ * [HSEARCH-209] - merge FullTextIndexCollectionEventListener into FullTextIndexEventListener
+ * [HSEARCH-215] - Rename Search.createFTS to Search.getFTS deprecating the old method
+ * [HSEARCH-223] - Use multiple criteria queries rather than ObjectLoader in most cases
+ * [HSEARCH-230] - Ensure initialization safety in a multi-core machine
+
+** New Feature
+ * [HSEARCH-133] - Allow overriding DefaultSimilarity for indexing and searching (Nick Vincent)
+ * [HSEARCH-141] - Allow term position information to be stored in an index
+ * [HSEARCH-153] - Provide the possibility to configure writer.setRAMBufferSizeMB() (Lucene 2.3)
+ * [HSEARCH-154] - Provide a facility to access Lucene query explanations
+ * [HSEARCH-164] - Built-in bridge to index java.lang.Class
+ * [HSEARCH-165] - URI and URL built-in bridges
+ * [HSEARCH-174] - Improve transparent filter caching by wrapping filters into our own CachingWrapperFilter
+ * [HSEARCH-186] - Enhance analyzer to support the Solr model
+ * [HSEARCH-190] - Add pom
+ * [HSEARCH-191] - Make build independent of Hibernate Core structure
+ * [HSEARCH-192] - Move to Hibernate Core 3.3
+ * [HSEARCH-193] - Use dependency on Solr-analyzer JAR rather than the full Solr JAR
+ * [HSEARCH-195] - Expose Analyzers instance by name: searchFactory.getAnalyzer(String)
+ * [HSEARCH-200] - Expose IndexWriter setting MAX_FIELD_LENGTH via IndexWriterSetting
+ * [HSEARCH-212] - Added ReaderProvider strategy reusing unchanged segments (using reader.reopen())
+ * [HSEARCH-220] - introduce session.flushToIndexes API and deprecate batch_size
+
+
+** Task
+ * [HSEARCH-169] - Migrate to Lucene 2.3.1 (index corruption possibility in 2.3.0)
+ * [HSEARCH-187] - Clarify which directories need read-write access, verify readonly behaviour on others.
+ * [HSEARCH-214] - Upgrade Lucene to 2.3.2
+ * [HSEARCH-229] - Deprecate FullTextQuery.BOOST
+
+
+3.0.1.GA (20-02-2008)
+---------------------
+
+** Bug
+ * [HSEARCH-56] - Updating a collection does not reindex
+ * [HSEARCH-123] - Use mkdirs instead of mkdir to create necessary parent directory in the DirectoryProviderHelper
+ * [HSEARCH-128] - Indexing embedded children's child
+ * [HSEARCH-136] - CachingWrapperFilter does not cache
+ * [HSEARCH-137] - Wrong class name in Exception when a FieldBridge does not implement TwoWayFieldBridge for a document id property
+ * [HSEARCH-138] - JNDI Property names have first character cut off
+ * [HSEARCH-140] - @IndexedEmbedded default depth is effectively 1 due to integer overflow
+ * [HSEARCH-146] - ObjectLoader doesn't catch javax.persistence.EntityNotFoundException
+ * [HSEARCH-149] - Default FieldBridge for enums passing wrong class to EnumBridge constructor
+
+
+** Improvement
+ * [HSEARCH-125] - Add support for fields declared by interface or unmapped superclass
+ * [HSEARCH-127] - Wrong prefix for worker configurations
+ * [HSEARCH-129] - IndexedEmbedded for Collections Documentation
+ * [HSEARCH-130] - Should provide better log infos (on the indexBase parameter for the FSDirectoryProvider)
+ * [HSEARCH-144] - Keep indexer running till finished on VM shutdown
+ * [HSEARCH-147] - Allow projection of Lucene DocId
+
+** New Feature
+ * [HSEARCH-114] - Introduce ResultTransformer to the query API
+ * [HSEARCH-150] - Migrate to Lucene 2.3
+
+** Patch
+ * [HSEARCH-126] - Better diagnostic when Search index directory cannot be opened (Ian)
+
+
+3.0.0.GA (23-09-2007)
+---------------------
+
+** Bug
+ * [HSEARCH-116] - FullTextEntityManager accessing getDelegate() in the constructor leads to NPE in JBoss AS + Seam
+ * [HSEARCH-117] - FullTextEntityManagerImpl and others should implement Serializable
+
+** Deprecation
+ * [HSEARCH-122] - Remove query.setIndexProjection (replaced by query.setProjection)
+
+** Improvement
+ * [HSEARCH-118] - Add ClassBridges (plural) functionality
+
+** New Feature
+ * [HSEARCH-81] - Create a @ClassBridge Annotation (John Griffin)
+
+
+** Task
+ * [HSEARCH-98] - Add a Getting started section to the reference documentation
+
+
+3.0.0.CR1 (4-09-2007)
+---------------------
+
+** Bug
+ * [HSEARCH-108] - id of embedded object is not indexed when using @IndexedEmbedded
+ * [HSEARCH-109] - Lazy loaded entity could not be indexed
+ * [HSEARCH-110] - ScrollableResults does not obey out of bounds rules (John Griffin)
+ * [HSEARCH-112] - Unknown @FullTextFilter when attempting to associate a filter
+
+** Deprecation
+ * [HSEARCH-113] - Remove @Text, @Keyword and @Unstored (old mapping annotations)
+
+** Improvement
+ * [HSEARCH-107] - DirectoryProvider should have a start() method
+
+** New Feature
+ * [HSEARCH-14] - introduce fetch_size for Hibernate Search scrollable resultsets (John Griffin)
+ * [HSEARCH-69] - Ability to purge an index by class (John Griffin)
+ * [HSEARCH-111] - Ability to disable event based indexing (for read only or batch based indexing)
+
+
+3.0.0.Beta4 (1-08-2007)
+-----------------------
+
+** Bug
+ * [HSEARCH-88] - Unable to update 2 entity types in the same transaction if they share the same index
+ * [HSEARCH-90] - Use of setFirstResult / setMaxResults can lead to a list with negative capacity (John Griffin)
+ * [HSEARCH-92] - NPE for null fields on projection
+ * [HSEARCH-99] - Avoid returning non initialized proxies in scroll() and iterate() (loader.load(EntityInfo))
+
+
+** Improvement
+ * [HSEARCH-79] - Recommend to use FlushMode.APPLICATION on massive indexing
+ * [HSEARCH-84] - Migrate to Lucene 2.2
+ * [HSEARCH-91] - Avoid wrapping a Session object if the Session is already FullTextSession
+ * [HSEARCH-100] - Rename fullTextSession.setIndexProjection() to fullTextSession.setProjection()
+ * [HSEARCH-102] - Default index operation in @Field to TOKENIZED
+ * [HSEARCH-106] - Use the shared reader strategy as the default strategy
+
+** New Feature
+ * [HSEARCH-6] - Provide access to the Hit.getScore() and potentially the Document on a query
+ * [HSEARCH-15] - Notion of Filtered Lucene queries (Hardy Ferentschik)
+ * [HSEARCH-41] - Allow fine grained analyzers (Entity, attribute, @Field)
+ * [HSEARCH-45] - Support @Fields() for multiple indexing per property (useful for sorting)
+ * [HSEARCH-58] - Support named Filters (and caching)
+ * [HSEARCH-67] - Expose mergeFactor, maxMergeDocs and minMergeDocs (Hardy Ferentschik)
+ * [HSEARCH-73] - IncrementalOptimizerStrategy triggered on transactions or operations limits
+ * [HSEARCH-74] - Ability to project Lucene meta information (Score, Boost, Document, Id, This) (John Griffin)
+ * [HSEARCH-83] - Introduce OptimizerStrategy
+ * [HSEARCH-86] - Index sharding: multiple Lucene indexes per entity type
+ * [HSEARCH-89] - FullText wrapper for JPA APIs
+ * [HSEARCH-103] - Ability to override the indexName in the FSDirectoryProviders family
+
+
+** Task
+ * [HSEARCH-94] - Deprecate ContextHelper
+
+
+3.0.0.Beta3 (6-06-2007)
+-----------------------
+
+** Bug
+ * [HSEARCH-64] - Exception Thrown If Index Directory Does Not Exist
+ * [HSEARCH-66] - Some results not returned in some circumstances (Brandon Munroe)
+
+
+** Improvement
+ * [HSEARCH-60] - Introduce SearchFactory / SearchFactoryImpl
+ * [HSEARCH-68] - Set index copy threads as daemon
+ * [HSEARCH-70] - Create the index base directory if it does not exists
+
+** New Feature
+ * [HSEARCH-11] - Provide access to IndexWriter.optimize()
+ * [HSEARCH-33] - hibernate.search.worker.batch_size to prevent OutOfMemoryException while inserting many objects
+ * [HSEARCH-71] - Provide fullTextSession.getSearchFactory()
+ * [HSEARCH-72] - searchFactory.optimize() and searchFactory.optimize(Class) (Andrew Hahn)
+
+
+3.0.0.Beta2 (31-05-2007)
+------------------------
+
+** Bug
+ * [HSEARCH-37] - Verify that Serializable return type are not resolved by StringBridge built in type
+ * [HSEARCH-39] - event listener declaration example is wrong
+ * [HSEARCH-44] - Build the Lucene Document in the beforeComplete transaction phase
+ * [HSEARCH-50] - Null Booleans lead to NPE
+ * [HSEARCH-59] - Unable to index @indexEmbedded object through session.index when object is lazy and field access is used in object
+
+
+** Improvement
+ * [HSEARCH-36] - Meaningful exception message when Search Listeners are not initialized
+ * [HSEARCH-38] - Make the @IndexedEmbedded documentation example easier to understand
+ * [HSEARCH-51] - Optimization: Use a query rather than batch-size to load objects when a single entity (hierarchy) is expected
+ * [HSEARCH-63] - rename query.resultSize() to getResultSize()
+
+** New Feature
+ * [HSEARCH-4] - Be able to use a Lucene Sort on queries (Hardy Ferentschik)
+ * [HSEARCH-13] - Cache IndexReaders per SearchFactory
+ * [HSEARCH-40] - Be able to embed collections in lucene index (@IndexedEmbeddable in collections)
+ * [HSEARCH-43] - Expose resultSize and do not load object when only resultSize is retrieved
+ * [HSEARCH-52] - Ability to load more efficiently an object graph from a lucene query by customizing the fetch modes
+ * [HSEARCH-53] - Add support for projection (ie read the data from the index only)
+ * [HSEARCH-61] - Move from MultiSearcher to MultiReader
+ * [HSEARCH-62] - Support pluggable ReaderProvider strategies
+
+
+** Task
+ * [HSEARCH-65] - Update to JBoss Embedded beta2
+
+
+3.0.0.Beta1 (19-03-2007)
+------------------------
+
+Initial release as a standalone product (see Hibernate Annotations changelog for previous information)
+
+
+Release Notes - Hibernate Search - Version 3.0.0.beta1
+
+** Bug
+ * [HSEARCH-7] - Ignore object found in the index but no longer present in the database (for out of date indexes)
+ * [HSEARCH-21] - NPE in SearchFactory while using different threads
+ * [HSEARCH-22] - Enum value Index.UN_TOKENISED is misspelled
+ * [HSEARCH-24] - Potential deadlock when using multiple DirectoryProviders in a highly concurrent index update
+ * [HSEARCH-25] - Class cast exception in org.hibernate.search.impl.FullTextSessionImpl<init>(FullTextSessionImpl.java:54)
+ * [HSEARCH-28] - Wrong indexDir property in Apache Lucene Integration
+
+
+** Improvement
+ * [HSEARCH-29] - Share the initialization state across all Search event listeners instance
+ * [HSEARCH-30] - @FieldBridge now use o.h.s.a.Parameter rather than o.h.a.Parameter
+ * [HSEARCH-31] - Move to Lucene 2.1.0
+
+** New Feature
+ * [HSEARCH-1] - Give access to Directory providers
+ * [HSEARCH-2] - Default FieldBridge for enums (Sylvain Vieujot)
+ * [HSEARCH-3] - Default FieldBridge for booleans (Sylvain Vieujot)
+ * [HSEARCH-9] - Introduce a worker factory and its configuration
+ * [HSEARCH-16] - Cluster capability through JMS
+ * [HSEARCH-23] - Support asynchronous batch worker queue
+ * [HSEARCH-27] - Ability to index associated / embedded objects
Deleted: search/tags/v3_1_0_GA/doc/reference/en/master.xml
===================================================================
--- search/trunk/doc/reference/en/master.xml 2008-12-04 09:55:05 UTC (rev 15659)
+++ search/tags/v3_1_0_GA/doc/reference/en/master.xml 2008-12-04 10:45:13 UTC (rev 15663)
@@ -1,92 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!-- $Id$ -->
-<!--
- ~ Hibernate, Relational Persistence for Idiomatic Java
- ~
- ~ Copyright (c) 2008, Red Hat Middleware LLC or third-party contributors as
- ~ indicated by the @author tags or express copyright attribution
- ~ statements applied by the authors. All third-party contributions are
- ~ distributed under license by Red Hat Middleware LLC.
- ~
- ~ This copyrighted material is made available to anyone wishing to use, modify,
- ~ copy, or redistribute it subject to the terms and conditions of the GNU
- ~ Lesser General Public License, as published by the Free Software Foundation.
- ~
- ~ This program is distributed in the hope that it will be useful,
- ~ but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
- ~ or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License
- ~ for more details.
- ~
- ~ You should have received a copy of the GNU Lesser General Public License
- ~ along with this distribution; if not, write to:
- ~ Free Software Foundation, Inc.
- ~ 51 Franklin Street, Fifth Floor
- ~ Boston, MA 02110-1301 USA
- -->
-<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
-"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
-<!ENTITY versionNumber "3.1.0.GA">
-<!ENTITY copyrightYear "2004">
-<!ENTITY copyrightHolder "Red Hat Middleware, LLC.">
-]>
-<book lang="en">
- <bookinfo>
- <title>Hibernate Search</title>
-
- <subtitle>Apache <trademark>Lucene</trademark> Integration</subtitle>
-
- <subtitle>Reference Guide</subtitle>
-
- <releaseinfo>&versionNumber;</releaseinfo>
-
- <mediaobject>
- <imageobject>
- <imagedata fileref="images/hibernate_logo_a.png" format="PNG" />
- </imageobject>
- </mediaobject>
- </bookinfo>
-
- <toc></toc>
-
- <preface id="preface" revision="2">
- <title>Preface</title>
-
- <para>Full text search engines like Apache Lucene are very powerful
- technologies to add efficient free text search capabilities to
- applications. However, they suffer several mismatches when dealing with
- object domain models. Amongst other things indexes have to be kept up to
- date and mismatches between index structure and domain model as well as
- query mismatches have to be avoided.</para>
-
- <para>Hibernate Search indexes your domain model with the help of a few
- annotations, takes care of database/index synchronization and brings back
- regular managed objects from free text queries. To achieve this Hibernate
- Search is combining the power of <ulink
- url="http://www.hibernate.org">Hibernate</ulink> and <ulink
- url="http://lucene.apache.org">Apache Lucene</ulink>.</para>
- </preface>
-
- <xi:include href="modules/getting-started.xml"
- xmlns:xi="http://www.w3.org/2001/XInclude" />
-
- <xi:include href="modules/architecture.xml"
- xmlns:xi="http://www.w3.org/2001/XInclude" />
-
- <xi:include href="modules/configuration.xml"
- xmlns:xi="http://www.w3.org/2001/XInclude" />
-
- <xi:include href="modules/mapping.xml"
- xmlns:xi="http://www.w3.org/2001/XInclude" />
-
- <xi:include href="modules/query.xml"
- xmlns:xi="http://www.w3.org/2001/XInclude" />
-
- <xi:include href="modules/batchindex.xml"
- xmlns:xi="http://www.w3.org/2001/XInclude" />
-
- <xi:include href="modules/optimize.xml"
- xmlns:xi="http://www.w3.org/2001/XInclude" />
-
- <xi:include href="modules/lucene-native.xml"
- xmlns:xi="http://www.w3.org/2001/XInclude" />
-</book>
Copied: search/tags/v3_1_0_GA/doc/reference/en/master.xml (from rev 15660, search/trunk/doc/reference/en/master.xml)
===================================================================
--- search/tags/v3_1_0_GA/doc/reference/en/master.xml (rev 0)
+++ search/tags/v3_1_0_GA/doc/reference/en/master.xml 2008-12-04 10:45:13 UTC (rev 15663)
@@ -0,0 +1,92 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- $Id$ -->
+<!--
+ ~ Hibernate, Relational Persistence for Idiomatic Java
+ ~
+ ~ Copyright (c) 2008, Red Hat Middleware LLC or third-party contributors as
+ ~ indicated by the @author tags or express copyright attribution
+ ~ statements applied by the authors. All third-party contributions are
+ ~ distributed under license by Red Hat Middleware LLC.
+ ~
+ ~ This copyrighted material is made available to anyone wishing to use, modify,
+ ~ copy, or redistribute it subject to the terms and conditions of the GNU
+ ~ Lesser General Public License, as published by the Free Software Foundation.
+ ~
+ ~ This program is distributed in the hope that it will be useful,
+ ~ but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+ ~ or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License
+ ~ for more details.
+ ~
+ ~ You should have received a copy of the GNU Lesser General Public License
+ ~ along with this distribution; if not, write to:
+ ~ Free Software Foundation, Inc.
+ ~ 51 Franklin Street, Fifth Floor
+ ~ Boston, MA 02110-1301 USA
+ -->
+<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
+"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
+<!ENTITY versionNumber "3.1.0.GA">
+<!ENTITY copyrightYear "2004">
+<!ENTITY copyrightHolder "Red Hat Middleware, LLC.">
+]>
+<book lang="en">
+ <bookinfo>
+ <title>Hibernate Search</title>
+
+ <subtitle>Apache <trademark>Lucene</trademark> Integration</subtitle>
+
+ <subtitle>Reference Guide</subtitle>
+
+ <releaseinfo>&versionNumber;</releaseinfo>
+
+ <mediaobject>
+ <imageobject>
+ <imagedata fileref="images/hibernate_logo_a.png" format="PNG" />
+ </imageobject>
+ </mediaobject>
+ </bookinfo>
+
+ <toc></toc>
+
+ <preface id="preface" revision="2">
+ <title>Preface</title>
+
+ <para>Full text search engines like Apache Lucene are very powerful
+ technologies to add efficient free text search capabilities to
+ applications. However, Lucene suffers several mismatches when dealing with
+ an object domain model. Amongst other things, indexes have to be kept up to
+ date and mismatches between index structure and domain model as well as
+ query mismatches have to be avoided.</para>
+
+ <para>Hibernate Search addresses these shortcomings - it indexes your
+ domain model with the help of a few annotations, takes care of
+ database/index synchronization and brings back regular managed objects
+ from free text queries. To achieve this Hibernate Search is combining the
+ power of <ulink url="http://www.hibernate.org">Hibernate</ulink> and
+ <ulink url="http://lucene.apache.org">Apache Lucene</ulink>.</para>
+ </preface>
+
+ <xi:include href="modules/getting-started.xml"
+ xmlns:xi="http://www.w3.org/2001/XInclude" />
+
+ <xi:include href="modules/architecture.xml"
+ xmlns:xi="http://www.w3.org/2001/XInclude" />
+
+ <xi:include href="modules/configuration.xml"
+ xmlns:xi="http://www.w3.org/2001/XInclude" />
+
+ <xi:include href="modules/mapping.xml"
+ xmlns:xi="http://www.w3.org/2001/XInclude" />
+
+ <xi:include href="modules/query.xml"
+ xmlns:xi="http://www.w3.org/2001/XInclude" />
+
+ <xi:include href="modules/batchindex.xml"
+ xmlns:xi="http://www.w3.org/2001/XInclude" />
+
+ <xi:include href="modules/optimize.xml"
+ xmlns:xi="http://www.w3.org/2001/XInclude" />
+
+ <xi:include href="modules/lucene-native.xml"
+ xmlns:xi="http://www.w3.org/2001/XInclude" />
+</book>
Deleted: search/tags/v3_1_0_GA/doc/reference/en/modules/mapping.xml
===================================================================
--- search/trunk/doc/reference/en/modules/mapping.xml 2008-12-04 09:55:05 UTC (rev 15659)
+++ search/tags/v3_1_0_GA/doc/reference/en/modules/mapping.xml 2008-12-04 10:45:13 UTC (rev 15663)
@@ -1,1451 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!--
- ~ Hibernate, Relational Persistence for Idiomatic Java
- ~
- ~ Copyright (c) 2008, Red Hat Middleware LLC or third-party contributors as
- ~ indicated by the @author tags or express copyright attribution
- ~ statements applied by the authors. All third-party contributions are
- ~ distributed under license by Red Hat Middleware LLC.
- ~
- ~ This copyrighted material is made available to anyone wishing to use, modify,
- ~ copy, or redistribute it subject to the terms and conditions of the GNU
- ~ Lesser General Public License, as published by the Free Software Foundation.
- ~
- ~ This program is distributed in the hope that it will be useful,
- ~ but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
- ~ or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License
- ~ for more details.
- ~
- ~ You should have received a copy of the GNU Lesser General Public License
- ~ along with this distribution; if not, write to:
- ~ Free Software Foundation, Inc.
- ~ 51 Franklin Street, Fifth Floor
- ~ Boston, MA 02110-1301 USA
- -->
-<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
-"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
-<chapter id="search-mapping" revision="3">
- <!-- $Id$ -->
-
- <title>Mapping entities to the index structure</title>
-
- <para>All the metadata information needed to index entities is described
- through annotations. There is no need for xml mapping files. In fact there
- is currently no xml configuration option available (see <ulink
- url="http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-210">HSEARCH-210</ulink>).
- You can still use hibernate mapping files for the basic Hibernate
- configuration, but the Search specific configuration has to be expressed via
- annotations.</para>
-
- <section id="search-mapping-entity" revision="3">
- <title>Mapping an entity</title>
-
- <section id="basic-mapping">
- <title>Basic mapping</title>
-
- <para>First, we must declare a persistent class as indexable. This is
- done by annotating the class with <literal>@Indexed</literal> (all
- entities not annotated with <literal>@Indexed</literal> will be ignored
- by the indexing process):</para>
-
- <example>
- <title>Making a class indexable using the
- <classname>@Indexed</classname> annotation</title>
-
- <programlisting>@Entity
-<emphasis role="bold">@Indexed(index="indexes/essays")</emphasis>
-public class Essay {
- ...
-}</programlisting>
- </example>
-
- <para>The <literal>index</literal> attribute tells Hibernate what the
- Lucene directory name is (usually a directory on your file system). It
- is recommended to define a base directory for all Lucene indexes using
- the <literal>hibernate.search.default.indexBase</literal> property in
- your configuration file. Alternatively you can specify a base directory
- per indexed entity by specifying
- <literal>hibernate.search.&lt;index&gt;.indexBase</literal>, where
- <literal>&lt;index&gt;</literal> is the fully qualified classname of the
- indexed entity. Each entity instance will be represented by a Lucene
- <classname>Document</classname> inside the given index (aka
- Directory).</para>
-
- <para>For each property (or attribute) of your entity, you have the
- ability to describe how it will be indexed. The default (no annotation
- present) means that the property is completely ignored by the indexing
- process. <literal>@Field</literal> declares a property as indexed.
- When indexing an element to a Lucene document you can specify how it is
- indexed:</para>
-
- <itemizedlist>
- <listitem>
- <para><literal>name</literal>: describes under which name the
- property should be stored in the Lucene Document. The default value
- is the property name (following the JavaBeans convention).</para>
- </listitem>
-
- <listitem>
- <para><literal>store</literal>: describes whether or not the
- property is stored in the Lucene index. You can store the value
- <literal>Store.YES</literal> (consuming more space in the index but
- allowing projection, see <xref linkend="projections" /> for more
- information), store it in a compressed way
- <literal>Store.COMPRESS</literal> (this does consume more CPU), or
- avoid any storage <literal>Store.NO</literal> (this is the default
- value). When a property is stored, you can retrieve its original
- value from the Lucene Document. This is not related to whether the
- element is indexed or not.</para>
- </listitem>
-
- <listitem>
- <para>index: describes how the element is indexed and the type of
- information stored. The different values are
- <literal>Index.NO</literal> (no indexing, i.e. cannot be found by a
- query), <literal>Index.TOKENIZED</literal> (use an analyzer to
- process the property), <literal>Index.UN_TOKENIZED</literal> (no
- analyzer pre-processing), <literal>Index.NO_NORM</literal> (do not
- store the normalization data). The default value is
- <literal>TOKENIZED</literal>.</para>
- </listitem>
-
- <listitem>
- <para>termVector: describes collections of term-frequency pairs.
- This attribute enables term vectors being stored during indexing so
- they are available within documents. The default value is
- TermVector.NO.</para>
-
- <para>The different values of this attribute are:</para>
-
- <informaltable align="left" width="">
- <tgroup cols="2">
- <thead>
- <row>
- <entry align="center">Value</entry>
-
- <entry align="center">Definition</entry>
- </row>
- </thead>
-
- <tbody>
- <row>
- <entry align="left">TermVector.YES</entry>
-
- <entry>Store the term vectors of each document. This
- produces two synchronized arrays, one contains document
- terms and the other contains the term's frequency.</entry>
- </row>
-
- <row>
- <entry align="left">TermVector.NO</entry>
-
- <entry>Do not store term vectors.</entry>
- </row>
-
- <row>
- <entry align="left">TermVector.WITH_OFFSETS</entry>
-
- <entry>Store the term vector and token offset information.
- This is the same as TermVector.YES plus it contains the
- starting and ending offset position information for the
- terms.</entry>
- </row>
-
- <row>
- <entry align="left">TermVector.WITH_POSITIONS</entry>
-
- <entry>Store the term vector and token position information.
- This is the same as TermVector.YES plus it contains the
- ordinal positions of each occurrence of a term in a
- document.</entry>
- </row>
-
- <row>
- <entry
- align="left">TermVector.WITH_POSITIONS_OFFSETS</entry>
-
- <entry>Store the term vector, token position and offset
- information. This is a combination of the YES, WITH_OFFSETS
- and WITH_POSITIONS.</entry>
- </row>
- </tbody>
- </tgroup>
- </informaltable>
- </listitem>
- </itemizedlist>
-
- <para>Whether or not you want to store the original data in the index
- depends on how you wish to use the index query result. For a regular
- Hibernate Search usage storing is not necessary. However you might want
- to store some fields to subsequently project them (see <xref
- linkend="projections" /> for more information).</para>
-
- <para>Whether or not you want to tokenize a property depends on whether
- you wish to search the element as is, or by the words it contains. It
- makes sense to tokenize a text field, but probably not a date field.
- Note that fields used for sorting must not be
- tokenized.</para>
-
- <para>Finally, the id property of an entity is a special property used
- by Hibernate Search to ensure index uniqueness of a given entity. By
- design, an id has to be stored and must not be tokenized. To mark a
- property as index id, use the <literal>@DocumentId</literal> annotation.
- If you are using Hibernate Annotations and you have specified @Id you
- can omit @DocumentId. The chosen entity id will also be used as document
- id.</para>
-
- <example>
- <title>Adding <classname>@DocumentId</classname> and
- <classname>@Field</classname> annotations to an indexed entity</title>
-
- <programlisting>@Entity
-@Indexed(index="indexes/essays")
-public class Essay {
- ...
-
- @Id
- <emphasis role="bold">@DocumentId</emphasis>
- public Long getId() { return id; }
-
- <emphasis role="bold">@Field(name="Abstract", index=Index.TOKENIZED, store=Store.YES)</emphasis>
- public String getSummary() { return summary; }
-
- @Lob
- <emphasis role="bold">@Field(index=Index.TOKENIZED)</emphasis>
- public String getText() { return text; }
-}</programlisting>
- </example>
-
- <para>The above annotations define an index with three fields:
- <literal>id</literal>, <literal>Abstract</literal> and
- <literal>text</literal>. Note that by default the field name is
- decapitalized, following the JavaBean specification.</para>
- </section>
-
- <section>
- <title>Mapping properties multiple times</title>
-
- <para>Sometimes one has to map a property multiple times per index, with
- slightly different indexing strategies. For example, sorting a query by
- field requires the field to be <literal>UN_TOKENIZED</literal>. If one
- wants to search by words in this property and still sort it, one needs to
- index it twice - once tokenized and once untokenized. @Fields allows you
- to achieve this goal.</para>
-
- <example>
- <title>Using @Fields to map a property multiple times</title>
-
- <programlisting>@Entity
-@Indexed(index = "Book" )
-public class Book {
- <emphasis role="bold">@Fields( {</emphasis>
- @Field(index = Index.TOKENIZED),
- @Field(name = "summary_forSort", index = Index.UN_TOKENIZED, store = Store.YES)
- <emphasis role="bold">} )</emphasis>
- public String getSummary() {
- return summary;
- }
-
- ...
-}</programlisting>
- </example>
-
- <para>The field <literal>summary</literal> is indexed twice, once as
- <literal>summary</literal> in a tokenized way, and once as
- <literal>summary_forSort</literal> in an untokenized way. @Field
- supports 2 attributes useful when @Fields is used:</para>
-
- <itemizedlist>
- <listitem>
- <para>analyzer: defines a @Analyzer annotation per field rather than
- per property</para>
- </listitem>
-
- <listitem>
- <para>bridge: defines a @FieldBridge annotation per field rather
- than per property</para>
- </listitem>
- </itemizedlist>
-
- <para>See below for more information about analyzers and field
- bridges.</para>
- </section>
-
- <section id="search-mapping-associated">
- <title>Embedded and associated objects</title>
-
- <para>Associated objects as well as embedded objects can be indexed as
- part of the root entity index. This is useful if you expect to search a
- given entity based on properties of associated objects. In the following
- example the aim is to return places where the associated city is Atlanta
- (In the Lucene query parser language, it would translate into
- <code>address.city:Atlanta</code>).</para>
-
- <example>
- <title>Using @IndexedEmbedded to index associations</title>
-
- <programlisting>@Entity
-@Indexed
-public class Place {
- @Id
- @GeneratedValue
- @DocumentId
- private Long id;
-
- @Field( index = Index.TOKENIZED )
- private String name;
-
- @OneToOne( cascade = { CascadeType.PERSIST, CascadeType.REMOVE } )
- <emphasis role="bold">@IndexedEmbedded</emphasis>
- private Address address;
- ....
-}
-
-@Entity
-public class Address {
- @Id
- @GeneratedValue
- private Long id;
-
- @Field(index=Index.TOKENIZED)
- private String street;
-
- @Field(index=Index.TOKENIZED)
- private String city;
-
- <emphasis role="bold">@ContainedIn</emphasis>
- @OneToMany(mappedBy="address")
- private Set&lt;Place&gt; places;
- ...
-}</programlisting>
- </example>
-
- <para>In this example, the place fields will be indexed in the
- <literal>Place</literal> index. The <literal>Place</literal> index
- documents will also contain the fields <literal>address.id</literal>,
- <literal>address.street</literal>, and <literal>address.city</literal>
- which you will be able to query. This is enabled by the
- <literal>@IndexedEmbedded</literal> annotation.</para>
-
- <para>Be careful. Because the data is denormalized in the Lucene index
- when using the <classname>@IndexedEmbedded</classname> technique,
- Hibernate Search needs to be aware of any change in the
- <classname>Place</classname> object and any change in the
- <classname>Address</classname> object to keep the index up to date. To
- make sure the <literal><classname>Place</classname></literal> Lucene
- document is updated when its <classname>Address</classname> changes,
- you need to mark the other side of the bidirectional relationship with
- <classname>@ContainedIn</classname>.</para>
-
- <para><literal>@ContainedIn</literal> is only useful on associations
- pointing to entities as opposed to embedded (collection of)
- objects.</para>
-
- <para>Let's make our example a bit more complex:</para>
-
- <example>
- <title>Nested usage of <classname>@IndexedEmbedded</classname> and
- <classname>@ContainedIn</classname></title>
-
- <programlisting>@Entity
-@Indexed
-public class Place {
- @Id
- @GeneratedValue
- @DocumentId
- private Long id;
-
- @Field( index = Index.TOKENIZED )
- private String name;
-
- @OneToOne( cascade = { CascadeType.PERSIST, CascadeType.REMOVE } )
- <emphasis role="bold">@IndexedEmbedded</emphasis>
- private Address address;
- ....
-}
-
-@Entity
-public class Address {
- @Id
- @GeneratedValue
- private Long id;
-
- @Field(index=Index.TOKENIZED)
- private String street;
-
- @Field(index=Index.TOKENIZED)
- private String city;
-
- <emphasis role="bold">@IndexedEmbedded(depth = 1, prefix = "ownedBy_")</emphasis>
- private Owner ownedBy;
-
- <emphasis role="bold">@ContainedIn</emphasis>
- @OneToMany(mappedBy="address")
- private Set&lt;Place&gt; places;
- ...
-}
-
-@Embeddable
-public class Owner {
- @Field(index = Index.TOKENIZED)
- private String name;
- ...
-}</programlisting>
- </example>
-
- <para>Any <literal>@*ToMany, @*ToOne</literal> and
- <literal>@Embedded</literal> attribute can be annotated with
- <literal>@IndexedEmbedded</literal>. The attributes of the associated
- class will then be added to the main entity index. In the previous
- example, the index will contain the following fields:</para>
-
- <itemizedlist>
- <listitem>
- <para>id</para>
- </listitem>
-
- <listitem>
- <para>name</para>
- </listitem>
-
- <listitem>
- <para>address.street</para>
- </listitem>
-
- <listitem>
- <para>address.city</para>
- </listitem>
-
- <listitem>
- <para>address.ownedBy_name</para>
- </listitem>
- </itemizedlist>
-
- <para>The default prefix is <literal>propertyName.</literal>, following
- the traditional object navigation convention. You can override it using
- the <literal>prefix</literal> attribute as it is shown on the
- <literal>ownedBy</literal> property.</para>
-
- <note>
- <para>The prefix cannot be set to the empty string. </para>
- </note>
-
- <para>The <literal>depth</literal> property is necessary when the object
- graph contains a cyclic dependency of classes (not instances), for
- example if <classname>Owner</classname> points to
- <classname>Place</classname>. Hibernate Search will stop including
- indexed embedded attributes after reaching the expected depth (or when
- the object graph boundaries are reached). A class having a self reference is
- an example of cyclic dependency. In our example, because
- <literal>depth</literal> is set to 1, any
- <literal>@IndexedEmbedded</literal> attribute in Owner (if any) will be
- ignored. </para>
-
- <para>Using <literal>@IndexedEmbedded</literal> for object associations
- allows you to express queries such as:</para>
-
- <itemizedlist>
- <listitem>
- <para>Return places where name contains JBoss and where address city
- is Atlanta. In Lucene query this would be</para>
-
- <programlisting>+name:jboss +address.city:atlanta </programlisting>
- </listitem>
-
- <listitem>
- <para>Return places where name contains JBoss and where owner's name
- contains Joe. In Lucene query this would be</para>
-
- <programlisting>+name:jboss +address.ownedBy_name:joe </programlisting>
- </listitem>
- </itemizedlist>
-
- <para>In a way it mimics the relational join operation in a more
- efficient way (at the cost of data duplication). Remember that, out of
- the box, Lucene indexes have no notion of association, the join
- operation is simply non-existent. It might help to keep the relational
- model normalized while benefiting from the full text index speed and
- feature richness.</para>
-
- <para><note>
- <para>An associated object can itself (but does not have to) be
- <literal>@Indexed</literal></para>
- </note></para>
-
- <para>When @IndexedEmbedded points to an entity, the association has to
- be bidirectional and the other side has to be annotated with
- <literal>@ContainedIn</literal> (as seen in the previous example). If
- not, Hibernate Search has no way to update the root index when the
- associated entity is updated (in our example, a <literal>Place</literal>
- index document has to be updated when the associated
- <classname>Address</classname> instance is updated).</para>
-
- <para>Sometimes, the object type annotated by
- <classname>@IndexedEmbedded</classname> is not the object type targeted
- by Hibernate and Hibernate Search. This is especially the case when
- interfaces are used in lieu of their implementation. For this reason you
- can override the object type targeted by Hibernate Search using the
- <methodname>targetElement</methodname> parameter.</para>
-
- <example>
- <title>Using the <literal>targetElement</literal> property of
- <classname>@IndexedEmbedded</classname></title>
-
- <programlisting>@Entity
-@Indexed
-public class Address {
- @Id
- @GeneratedValue
- @DocumentId
- private Long id;
-
- @Field(index= Index.TOKENIZED)
- private String street;
-
- @IndexedEmbedded(depth = 1, prefix = "ownedBy_", <emphasis role="bold">targetElement = Owner.class</emphasis>)
- @Target(Owner.class)
- private Person ownedBy;
-
-
- ...
-}
-
-@Embeddable
-public class Owner implements Person { ... }</programlisting>
- </example>
- </section>
-
- <section>
- <title>Boost factor</title>
-
- <para>Lucene has the notion of <emphasis>boost factor</emphasis>. It's a
- way to give more weight to a field or to an indexed element over others
- during the indexing process. You can use <literal>@Boost</literal> at
- the @Field, method or class level.</para>
-
- <example>
- <title>Using different ways of increasing the weight of an indexed
- element using a boost factor</title>
-
- <programlisting>@Entity
-@Indexed(index="indexes/essays")
-<emphasis role="bold">@Boost(1.7f)</emphasis>
-public class Essay {
- ...
-
- @Id
- @DocumentId
- public Long getId() { return id; }
-
- @Field(name="Abstract", index=Index.TOKENIZED, store=Store.YES, boost=<emphasis
- role="bold">@Boost(2f)</emphasis>)
- <emphasis role="bold">@Boost(1.5f)</emphasis>
- public String getSummary() { return summary; }
-
- @Lob
- @Field(index=Index.TOKENIZED, boost=<emphasis role="bold">@Boost(1.2f)</emphasis>)
- public String getText() { return text; }
-
- @Field
- public String getISBN() { return isbn; }
-
-} </programlisting>
- </example>
-
- <para>In our example, <classname>Essay</classname>'s probability to
- reach the top of the search list will be multiplied by 1.7. The
- <methodname>summary</methodname> field will be 3.0 (2 * 1.5 -
- <methodname>@Field.boost</methodname> and <classname>@Boost</classname>
- on a property are cumulative) more important than the
- <methodname>isbn</methodname> field. The <methodname>text</methodname>
- field will be 1.2 times more important than the
- <methodname>isbn</methodname> field. Note that this explanation in
- strictest terms is actually wrong, but it is simple and close enough to
- reality for all practical purposes. Please check the Lucene
- documentation or the excellent <citetitle>Lucene In Action </citetitle>
- by Otis Gospodnetic and Erik Hatcher.</para>
- </section>
-
- <section id="analyzer">
- <title>Analyzer</title>
-
- <para>The default analyzer class used to index tokenized fields is
- configurable through the <literal>hibernate.search.analyzer</literal>
- property. The default value for this property is
- <classname>org.apache.lucene.analysis.standard.StandardAnalyzer</classname>.</para>
-
- <para>You can also define the analyzer class per entity, property and
- even per @Field (useful when multiple fields are indexed from a single
- property).</para>
-
- <example>
- <title>Different ways of specifying an analyzer</title>
-
- <programlisting>@Entity
-@Indexed
-<emphasis role="bold">@Analyzer(impl = EntityAnalyzer.class)</emphasis>
-public class MyEntity {
- @Id
- @GeneratedValue
- @DocumentId
- private Integer id;
-
- @Field(index = Index.TOKENIZED)
- private String name;
-
- @Field(index = Index.TOKENIZED)
- <emphasis role="bold">@Analyzer(impl = PropertyAnalyzer.class)</emphasis>
- private String summary;
-
- @Field(index = Index.TOKENIZED, <emphasis role="bold">analyzer = @Analyzer(impl = FieldAnalyzer.class)</emphasis>)
- private String body;
-
- ...
-}</programlisting>
- </example>
-
- <para>In this example, <classname>EntityAnalyzer</classname> is used to
- index all tokenized properties (eg. <literal>name</literal>), except
- <literal>summary</literal> and <literal>body</literal> which are indexed
- with <classname>PropertyAnalyzer</classname> and
- <classname>FieldAnalyzer</classname> respectively.</para>
-
- <caution>
- <para>Mixing different analyzers in the same entity is most of the
- time a bad practice. It makes query building more complex and results
- less predictable (for the novice), especially if you are using a
- QueryParser (which uses the same analyzer for the whole query). As a
- rule of thumb, for any given field the same analyzer should be used
- for indexing and querying.</para>
- </caution>
-
- <section>
- <title>Analyzer definitions</title>
-
- <para>Analyzers can become quite complex to deal with, which is why
- Hibernate Search introduces the notion of analyzer definitions. An
- analyzer definition can be reused by many
- <classname>@Analyzer</classname> declarations. An analyzer definition
- is composed of:</para>
-
- <itemizedlist>
- <listitem>
- <para>a name: the unique string used to refer to the
- definition</para>
- </listitem>
-
- <listitem>
- <para>a tokenizer: responsible for tokenizing the input stream
- into individual words</para>
- </listitem>
-
- <listitem>
- <para>a list of filters: each filter is responsible for removing,
- modifying or sometimes even adding words in the stream provided by the
- tokenizer</para>
- </listitem>
- </itemizedlist>
-
- <para>This separation of tasks - a tokenizer followed by a list of
- filters - allows for easy reuse of each individual component and lets
- you build your customized analyzer in a very flexible way (just like
- Lego). Generally speaking the <classname>Tokenizer</classname> starts
- the analysis process by turning the character input into tokens which
- are then further processed by the <classname>TokenFilter</classname>s.
- Hibernate Search supports this infrastructure by utilizing the Solr
- analyzer framework. Make sure to add <filename>solr-core.jar</filename>
- and <filename>solr-common.jar</filename> to your classpath to
- use analyzer definitions. If you also want to use a
- snowball stemmer, include
- <filename>lucene-snowball.jar</filename> as well. Other Solr analyzers might
- depend on more libraries. For example, the
- <classname>PhoneticFilterFactory</classname> depends on <ulink
- url="http://commons.apache.org/codec">commons-codec</ulink>. Your
- distribution of Hibernate Search provides these dependencies in its
- <filename>lib</filename> directory.</para>
-
- <example>
- <title><classname>@AnalyzerDef</classname> and the Solr
- framework</title>
-
- <programlisting>@AnalyzerDef(name="customanalyzer",
- tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
- filters = {
- @TokenFilterDef(factory = ISOLatin1AccentFilterFactory.class),
- @TokenFilterDef(factory = LowerCaseFilterFactory.class),
- @TokenFilterDef(factory = StopFilterFactory.class, params = {
- @Parameter(name="words", value= "org/hibernate/search/test/analyzer/solr/stoplist.properties" ),
- @Parameter(name="ignoreCase", value="true")
- })
-})
-public class Team {
- ...
-}</programlisting>
- </example>
-
- <para>A tokenizer is defined by its factory which is responsible for
- building the tokenizer and using the optional list of parameters. This
- example uses the standard tokenizer. A filter is defined by its factory
- which is responsible for creating the filter instance using the
- optional parameters. In our example, the StopFilter filter is built
- reading the dedicated words property file and is expected to ignore
- case. The list of parameters is dependent on the tokenizer or filter
- factory.</para>
-
- <warning>
- <para>Filters are applied in the order they are defined in the
- <classname>@AnalyzerDef</classname> annotation. Make sure to think
- twice about this order.</para>
- </warning>
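Why the order matters can be seen in a small plain-Java sketch. It does not use the Solr or Lucene classes; the class name, the stop list and the three helper methods are hypothetical stand-ins for a tokenizer, a lowercase filter and a case-sensitive stop filter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Stand-in for "a tokenizer followed by a list of filters".
// None of these names belong to Lucene, Solr or Hibernate Search.
final class FilterOrderSketch {

    // Hypothetical stop list, stored in lowercase.
    static final Set<String> STOP_WORDS =
            new HashSet<String>(Arrays.asList("the", "of"));

    // "Tokenizer": split the input on whitespace.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // "Filter" 1: lowercase every token.
    static List<String> lowerCase(List<String> tokens) {
        List<String> result = new ArrayList<String>();
        for (String token : tokens) {
            result.add(token.toLowerCase(Locale.ROOT));
        }
        return result;
    }

    // "Filter" 2: drop tokens found in the stop list (case-sensitive match).
    static List<String> removeStopWords(List<String> tokens) {
        List<String> result = new ArrayList<String>();
        for (String token : tokens) {
            if (!STOP_WORDS.contains(token)) {
                result.add(token);
            }
        }
        return result;
    }
}
```

Lowercasing first and then removing stop words turns "The Art of War" into art, war. Applying the same two filters in the opposite order leaves the capitalized "The" in the stream, because it does not match the lowercase stop list at the time the stop filter runs.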
-
- <para>Once defined, an analyzer definition can be reused by an
- <classname>@Analyzer</classname> declaration using the definition name
- rather than declaring an implementation class.</para>
-
- <example>
- <title>Referencing an analyzer by name</title>
-
- <programlisting>@Entity
-@Indexed
-@AnalyzerDef(name="customanalyzer", ... )
-public class Team {
- @Id
- @DocumentId
- @GeneratedValue
- private Integer id;
-
- @Field
- private String name;
-
- @Field
- private String location;
-
- @Field <emphasis role="bold">@Analyzer(definition = "customanalyzer")</emphasis>
- private String description;
-}</programlisting>
- </example>
-
- <para>Analyzer instances declared by
- <classname>@AnalyzerDef</classname> are available by their name in the
- <classname>SearchFactory</classname>.</para>
-
- <programlisting>Analyzer analyzer = fullTextSession.getSearchFactory().getAnalyzer("customanalyzer");</programlisting>
-
- <para>This is quite useful when building queries. Fields in queries
- should be analyzed with the same analyzer used to index the field so
- that they speak a common "language": the same tokens are reused
- between the query and the indexing process. This rule has some
- exceptions but is true most of the time. Respect it unless you know
- what you are doing.</para>
- </section>
-
- <section>
- <title>Available analyzers</title>
-
- <para>Solr and Lucene come with a lot of useful default tokenizers and
- filters. You can find a complete list of tokenizer factories and
- filter factories at <ulink
- url="http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters">http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters</ulink>.
- Let's look at a few of them.</para>
-
- <table>
- <title>Some of the tokenizers available</title>
-
- <tgroup cols="3">
- <thead>
- <row>
- <entry align="center">Factory</entry>
-
- <entry align="center">Description</entry>
-
- <entry align="center">Parameters</entry>
- </row>
- </thead>
-
- <tbody>
- <row>
- <entry>StandardTokenizerFactory</entry>
-
- <entry>Use the Lucene StandardTokenizer</entry>
-
- <entry>none</entry>
- </row>
-
- <row>
- <entry>HTMLStripStandardTokenizerFactory</entry>
-
- <entry>Remove HTML tags, keep the text and pass it to a
- StandardTokenizer</entry>
-
- <entry>none</entry>
- </row>
- </tbody>
- </tgroup>
- </table>
-
- <table>
- <title>Some of the filters available</title>
-
- <tgroup cols="3">
- <thead>
- <row>
- <entry align="center">Factory</entry>
-
- <entry align="center">Description</entry>
-
- <entry align="center">Parameters</entry>
- </row>
- </thead>
-
- <tbody>
- <row>
- <entry>StandardFilterFactory</entry>
-
- <entry>Remove dots from acronyms and 's from words</entry>
-
- <entry>none</entry>
- </row>
-
- <row>
- <entry>LowerCaseFilterFactory</entry>
-
- <entry>Lowercase words</entry>
-
- <entry>none</entry>
- </row>
-
- <row>
- <entry>StopFilterFactory</entry>
-
- <entry>remove words (tokens) matching a list of stop
- words</entry>
-
- <entry><para><literal>words</literal>: points to a resource
- file containing the stop words</para><para><literal>ignoreCase</literal>:
- <literal>true</literal> if case should be ignored when comparing stop
- words, <literal>false</literal> otherwise</para></entry>
- </row>
-
- <row>
- <entry>SnowballPorterFilterFactory</entry>
-
- <entry>Reduces a word to its root in a given language (e.g.
- protect, protects, protection share the same root). Using such
- a filter allows searches matching related words.</entry>
-
- <entry><para><literal>language</literal>: Danish, Dutch,
- English, Finnish, French, German, Italian, Norwegian,
- Portuguese, Russian, Spanish, Swedish and a few
- more</para></entry>
- </row>
-
- <row>
- <entry>ISOLatin1AccentFilterFactory</entry>
-
- <entry>remove accents for languages like French</entry>
-
- <entry>none</entry>
- </row>
- </tbody>
- </tgroup>
- </table>
-
- <para>We recommend browsing the implementations of
- <classname>org.apache.solr.analysis.TokenizerFactory</classname> and
- <classname>org.apache.solr.analysis.TokenFilterFactory</classname> in
- your IDE to see what is available.</para>
- </section>
-
- <section>
- <title>Analyzer discriminator (experimental)</title>
-
- <para>So far all the introduced ways to specify an analyzer were
- static. However, there are use cases where it is useful to select an
- analyzer depending on the current state of the entity to be indexed,
- for example in a multilingual application. For a
- <classname>BlogEntry</classname> class, for example, the analyzer could
- depend on the language property of the entry. Depending on this
- property the correct language-specific stemmer should be chosen to
- index the actual text.</para>
-
- <para>To enable this dynamic analyzer selection Hibernate Search
- introduces the <classname>AnalyzerDiscriminator</classname>
- annotation. The following example demonstrates the usage of this
- annotation:</para>
-
- <para><example>
- <title>Usage of @AnalyzerDiscriminator in order to select an
- analyzer depending on the entity state</title>
-
- <programlisting>@Entity
-@Indexed
-@AnalyzerDefs({
- @AnalyzerDef(name = "en",
- tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
- filters = {
- @TokenFilterDef(factory = LowerCaseFilterFactory.class),
- @TokenFilterDef(factory = EnglishPorterFilterFactory.class
- )
- }),
- @AnalyzerDef(name = "de",
- tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
- filters = {
- @TokenFilterDef(factory = LowerCaseFilterFactory.class),
- @TokenFilterDef(factory = GermanStemFilterFactory.class)
- })
-})
-public class BlogEntry {
-
- @Id
- @GeneratedValue
- @DocumentId
- private Integer id;
-
- @Field
- @AnalyzerDiscriminator(impl = LanguageDiscriminator.class)
- private String language;
-
- @Field
- private String text;
-
- private Set<BlogEntry> references;
-
- // standard getter/setter
- ...
-}</programlisting>
-
- <programlisting>public class LanguageDiscriminator implements Discriminator {
-
- public String getAnanyzerDefinitionName(Object value, Object entity, String field) {
- if ( value == null || !( entity instanceof BlogEntry ) ) {
- return null;
- }
- return (String) value;
- }
-}</programlisting>
- </example>The prerequisite for using
- <classname>@AnalyzerDiscriminator</classname> is that all analyzers
- which are going to be used are predefined via
- <classname>@AnalyzerDef</classname> definitions. If this is the case
- one can place the <classname>@AnalyzerDiscriminator</classname>
- annotation either on the class or on a specific property of the entity
- for which to dynamically select an analyzer. Via the
- <literal>impl</literal> parameter of the
- <classname>AnalyzerDiscriminator</classname> you specify a concrete
- implementation of the <classname>Discriminator</classname> interface.
- It is up to you to provide an implementation for this interface. The
- only method you have to implement is
- <classname>getAnanyzerDefinitionName()</classname> which gets called
- for each field added to the Lucene document. The entity which is
- getting indexed is also passed to the interface method. The
- <literal>value</literal> parameter is only set if the
- <classname>AnalyzerDiscriminator</classname> is placed on property
- level instead of class level. In this case the value represents the
- current value of this property.</para>
-
- <para>An implementation of the <classname>Discriminator</classname>
- interface has to return the name of an existing analyzer definition if
- the analyzer should be set dynamically or <classname>null</classname>
- if the default analyzer should not be overridden. The given example
- assumes that the language parameter is either 'de' or 'en' which
- matches the specified names in the
- <classname>@AnalyzerDef</classname>s.</para>
-
- <note>
- <para>The <classname>@AnalyzerDiscriminator</classname> is currently
- still experimental and the API might still change. We are hoping for
- some feedback from the community about the usefulness and usability
- of this feature.</para>
- </note>
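The selection rule described in this section (return a definition name, or null to keep the default analyzer) can be sketched without any Hibernate Search types. The class name, the method name and the set of known names below are hypothetical; the extra check against the known names is a defensive choice of this sketch, not something the Discriminator contract is stated to require.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Plain-Java illustration of "pick an analyzer definition by entity state".
final class LanguageSelectionSketch {

    // Assumed to mirror the names declared through @AnalyzerDef.
    static final Set<String> DEFINED_NAMES =
            new HashSet<String>(Arrays.asList("en", "de"));

    // Returns the definition name to use, or null to keep the default analyzer.
    static String definitionNameFor(Object languageValue) {
        if (languageValue == null) {
            return null;
        }
        String name = languageValue.toString();
        // Unknown languages fall back to the default analyzer.
        return DEFINED_NAMES.contains(name) ? name : null;
    }
}
```

With this rule a value of "de" selects the German definition, while a missing or unknown language leaves the default analyzer in place.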
- </section>
-
- <section id="analyzer-retrievinganalyzer">
- <title>Retrieving an analyzer</title>
-
- <para>During indexing time, Hibernate Search is using analyzers under
- the hood for you. In some situations, retrieving analyzers can be
- handy. If your domain model makes use of multiple analyzers (maybe to
- benefit from stemming, use phonetic approximation and so on), you need
- to make sure to use the same analyzers when you build your
- query.</para>
-
- <note>
- <para>This rule can be broken but you need a good reason for it. If
- you are unsure, use the same analyzers.</para>
- </note>
-
- <para>You can retrieve the scoped analyzer for a given entity used at
- indexing time by Hibernate Search. A scoped analyzer is an analyzer
- which applies the right analyzers depending on the field indexed:
- multiple analyzers can be defined on a given entity, each one working
- on an individual field, and a scoped analyzer unifies all these analyzers
- into a context-aware analyzer. While the theory seems a bit complex,
- using the right analyzer in a query is very easy.</para>
-
- <example>
- <title>Using the scoped analyzer when building a full-text
- query</title>
-
- <programlisting>org.apache.lucene.queryParser.QueryParser parser = new QueryParser(
- "title",
- fullTextSession.getSearchFactory().getAnalyzer( Song.class )
-);
-
-org.apache.lucene.search.Query luceneQuery =
- parser.parse( "title:sky OR title_stemmed:diamond" );
-
-org.hibernate.Query fullTextQuery =
- fullTextSession.createFullTextQuery( luceneQuery, Song.class );
-
-List result = fullTextQuery.list(); //return a list of managed objects </programlisting>
- </example>
-
- <para>In the example above, the song title is indexed in two fields:
- the standard analyzer is used in the field <literal>title</literal>
- and a stemming analyzer is used in the field
- <literal>title_stemmed</literal>. By using the analyzer provided by
- the search factory, the query uses the appropriate analyzer depending
- on the field targeted.</para>
-
- <para>If your query targets more than one entity and you wish to use
- your standard analyzer, make sure to describe it using an analyzer
- definition. You can retrieve analyzers by their definition name using
- <code>searchFactory.getAnalyzer(String)</code>.</para>
- </section>
- </section>
- </section>
-
- <section id="search-mapping-bridge">
- <title>Property/Field Bridge</title>
-
- <para>In Lucene all index fields have to be represented as Strings. For
- this reason all entity properties annotated with <literal>@Field</literal>
- have to be indexed in a String form. For most of your properties,
- Hibernate Search does the translation job for you thanks to a built-in set
- of bridges. In some cases, though, you need finer-grained control over
- the translation process.</para>
-
- <section>
- <title>Built-in bridges</title>
-
- <para>Hibernate Search comes bundled with a set of built-in bridges
- between a Java property type and its full text representation.</para>
-
- <variablelist>
- <varlistentry>
- <term>null</term>
-
- <listitem>
- <para>null elements are not indexed. Lucene does not support null
- elements and this does not make much sense either.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>java.lang.String</term>
-
- <listitem>
- <para>Strings are indexed as is</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>short, Short, int, Integer, long, Long, float, Float,
- double, Double, BigInteger, BigDecimal</term>
-
- <listitem>
- <para>Numbers are converted to their String representation. Note
- that numbers cannot be compared by Lucene (i.e. used in range
- queries) out of the box: they have to be padded <note>
- <para>Using a Range query is debatable and has drawbacks, an
- alternative approach is to use a Filter query which will
- filter the result query to the appropriate range.</para>
-
- <para>Hibernate Search will support a padding mechanism</para>
- </note></para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>java.util.Date</term>
-
- <listitem>
- <para>Dates are stored as yyyyMMddHHmmssSSS in GMT time
- (20061107210300012 for Nov 7th of 2006 4:03PM and 12ms EST). You
- shouldn't really bother with the internal format. What is
- important is that when using a DateRange Query, you should know
- that the dates have to be expressed in GMT time.</para>
-
- <para>Usually, storing the date up to the millisecond is not
- necessary. <literal>@DateBridge</literal> defines the appropriate
- resolution you are willing to store in the index (
- <literal>@DateBridge(resolution=Resolution.DAY)</literal>
- ). The date pattern will then be truncated
- accordingly.</para>
-
- <programlisting>@Entity
-@Indexed
-public class Meeting {
- @Field(index=Index.UN_TOKENIZED)
- <emphasis role="bold">@DateBridge(resolution=Resolution.MINUTE)</emphasis>
- private Date date;
- ... </programlisting>
-
- <warning>
- <para>A Date whose resolution is lower than
- <literal>MILLISECOND</literal> cannot be a
- <literal>@DocumentId</literal></para>
- </warning>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>java.net.URI, java.net.URL</term>
-
- <listitem>
- <para>URI and URL are converted to their string
- representation</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>java.lang.Class</term>
-
- <listitem>
- <para>Classes are converted to their fully qualified class name. The
- thread context classloader is used when the class is
- rehydrated</para>
- </listitem>
- </varlistentry>
- </variablelist>
- </section>
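To make the GMT conversion of dates concrete, the following sketch formats a single instant with java.text.SimpleDateFormat. It is plain JDK code rather than the Hibernate Search date bridge; the class and method names are hypothetical, and the shorter patterns for coarser resolutions are an assumption about what a truncated pattern looks like.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

// Formats a date in GMT using the pattern given in the text above.
final class GmtDateFormatSketch {

    static String format(Date date, String pattern) {
        SimpleDateFormat formatter = new SimpleDateFormat(pattern, Locale.ROOT);
        formatter.setTimeZone(TimeZone.getTimeZone("GMT"));
        return formatter.format(date);
    }

    // Nov 7th 2006, 4:03:00 PM and 12 ms at UTC-5 (US Eastern Standard Time).
    static Date sample() {
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("GMT-05:00"));
        cal.clear();
        cal.set(2006, Calendar.NOVEMBER, 7, 16, 3, 0);
        cal.set(Calendar.MILLISECOND, 12);
        return cal.getTime();
    }
}
```

Formatting the sample with yyyyMMddHHmmssSSS yields 20061107210300012: the local time 16:03 becomes 21:03 in GMT. A day-level pattern such as yyyyMMdd keeps only 20061107, which is why a range query has to be written against GMT values of the chosen resolution.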
-
- <section>
- <title>Custom Bridge</title>
-
- <para>Sometimes, the built-in bridges of Hibernate Search do not cover
- some of your property types, or the String representation used by the
- bridge does not meet your requirements. The following paragraphs
- describe several solutions to this problem.</para>
-
- <section>
- <title>StringBridge</title>
-
- <para>The simplest custom solution is to give Hibernate Search an
- implementation of your expected
- <emphasis><classname>Object</classname> </emphasis>to
- <classname>String</classname> bridge. To do so you need to implement
- the <literal>org.hibernate.search.bridge.StringBridge</literal>
- interface. All implementations have to be thread-safe as they are used
- concurrently.</para>
-
- <example>
- <title>Implementing your own
- <classname>StringBridge</classname></title>
-
- <programlisting>/**
- * Padding Integer bridge.
- * All numbers will be padded with 0 to match 5 digits
- *
- * @author Emmanuel Bernard
- */
-public class PaddedIntegerBridge implements <emphasis role="bold">StringBridge</emphasis> {
-
- private int PADDING = 5;
-
- <emphasis role="bold">public String objectToString(Object object)</emphasis> {
- String rawInteger = ( (Integer) object ).toString();
- if (rawInteger.length() > PADDING)
- throw new IllegalArgumentException( "Try to pad on a number too big" );
- StringBuilder paddedInteger = new StringBuilder( );
- for ( int padIndex = rawInteger.length() ; padIndex < PADDING ; padIndex++ ) {
- paddedInteger.append('0');
- }
- return paddedInteger.append( rawInteger ).toString();
- }
-} </programlisting>
- </example>
-
- <para>Then any property or field can use this bridge thanks to the
- <literal>@FieldBridge</literal> annotation</para>
-
- <programlisting><emphasis role="bold">@FieldBridge(impl = PaddedIntegerBridge.class)</emphasis>
-private Integer length; </programlisting>
-
- <para>Parameters can be passed to the Bridge implementation making it
- more flexible. The Bridge implementation implements a
- <classname>ParameterizedBridge</classname> interface, and the
- parameters are passed through the <literal>@FieldBridge</literal>
- annotation.</para>
-
- <example>
- <title>Passing parameters to your bridge implementation</title>
-
- <programlisting>public class PaddedIntegerBridge implements StringBridge, <emphasis
- role="bold">ParameterizedBridge</emphasis> {
-
- public static String PADDING_PROPERTY = "padding";
- private int padding = 5; //default
-
- <emphasis role="bold">public void setParameterValues(Map parameters)</emphasis> {
- Object padding = parameters.get( PADDING_PROPERTY );
- // @Parameter values are Strings, so parse the value instead of casting it
- if (padding != null) this.padding = Integer.parseInt( padding.toString() );
- }
-
- public String objectToString(Object object) {
- String rawInteger = ( (Integer) object ).toString();
- if (rawInteger.length() > padding)
- throw new IllegalArgumentException( "Try to pad on a number too big" );
- StringBuilder paddedInteger = new StringBuilder( );
- for ( int padIndex = rawInteger.length() ; padIndex < padding ; padIndex++ ) {
- paddedInteger.append('0');
- }
- return paddedInteger.append( rawInteger ).toString();
- }
-}
-
-
-//property
-@FieldBridge(impl = PaddedIntegerBridge.class,
- <emphasis role="bold">params = @Parameter(name="padding", value="10")</emphasis>
- )
-private Integer length; </programlisting>
- </example>
-
- <para>The <classname>ParameterizedBridge</classname> interface can be
- implemented by <classname>StringBridge</classname> ,
- <classname>TwoWayStringBridge</classname> ,
- <classname>FieldBridge</classname> implementations.</para>
-
- <para>All implementations have to be thread-safe, but the parameters
- are set during initialization and no special care is required at this
- stage.</para>
-
- <para>If you expect to use your bridge implementation on an id
- property (ie annotated with <literal>@DocumentId</literal> ), you need
- to use a slightly extended version of <literal>StringBridge</literal>
- named <classname>TwoWayStringBridge</classname>. Hibernate Search
- needs to read the string representation of the identifier and generate
- the object out of it. There is no difference in the way the
- <literal>@FieldBridge</literal> annotation is used.</para>
-
- <example>
- <title>Implementing a TwoWayStringBridge which can for example be
- used for id properties</title>
-
- <programlisting>public class PaddedIntegerBridge implements TwoWayStringBridge, ParameterizedBridge {
-
- public static String PADDING_PROPERTY = "padding";
- private int padding = 5; //default
-
- public void setParameterValues(Map parameters) {
- Object padding = parameters.get( PADDING_PROPERTY );
- // @Parameter values are Strings, so parse the value instead of casting it
- if (padding != null) this.padding = Integer.parseInt( padding.toString() );
- }
-
- public String objectToString(Object object) {
- String rawInteger = ( (Integer) object ).toString();
- if (rawInteger.length() > padding)
- throw new IllegalArgumentException( "Try to pad on a number too big" );
- StringBuilder paddedInteger = new StringBuilder( );
- for ( int padIndex = rawInteger.length() ; padIndex < padding ; padIndex++ ) {
- paddedInteger.append('0');
- }
- return paddedInteger.append( rawInteger ).toString();
- }
-
- <emphasis role="bold">public Object stringToObject(String stringValue)</emphasis> {
- return new Integer(stringValue);
- }
-}
-
-
-//id property
-@DocumentId
-@FieldBridge(impl = PaddedIntegerBridge.class,
- params = @Parameter(name="padding", value="10"))
-private Integer id;
- </programlisting>
- </example>
-
- <para>It is critically important for the two-way process to be
- idempotent (ie object = stringToObject( objectToString( object ) )
- ).</para>
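The round-trip requirement, and the reason for padding in the first place, can be checked with a few lines of plain Java. The sketch below copies the padding logic into static methods so that it runs without Hibernate Search on the classpath; the class name is hypothetical and the code assumes non-negative integers, since a padded negative number such as 000-5 would not parse back.

```java
// Stand-alone version of the padding logic, for illustration only.
final class PaddedRoundTripSketch {

    // Left-pads the decimal form of value with zeros up to the given width.
    static String objectToString(Integer value, int padding) {
        String raw = value.toString();
        if (raw.length() > padding) {
            throw new IllegalArgumentException("Try to pad on a number too big");
        }
        StringBuilder padded = new StringBuilder();
        for (int i = raw.length(); i < padding; i++) {
            padded.append('0');
        }
        return padded.append(raw).toString();
    }

    // Inverse operation: leading zeros are ignored by the parser.
    static Integer stringToObject(String stringValue) {
        return Integer.valueOf(stringValue);
    }
}
```

With a width of 5, the value 42 becomes 00042 and parses back to 42. Padding also restores numeric order for string comparison: "2" sorts after "10", but "00002" sorts before "00010".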
- </section>
-
- <section>
- <title>FieldBridge</title>
-
- <para>Some use cases require more than a simple object to string
- translation when mapping a property to a Lucene index. To give you the
- greatest possible flexibility you can also implement a bridge as a
- <classname>FieldBridge</classname>. This interface gives you a
- property value and lets you map it the way you want in your Lucene
- <classname>Document</classname>. The interface is very similar in
- concept to Hibernate's <classname>UserType</classname>.</para>
-
- <para>You can for example store a given property in two different
- document fields:</para>
-
- <example>
- <title>Implementing the FieldBridge interface in order to store a given
- property in multiple document fields</title>
-
- <programlisting>/**
- * Store the date in 3 different fields - year, month, day - to ease Range Query per
- * year, month or day (eg get all the elements of December for the last 5 years).
- *
- * @author Emmanuel Bernard
- */
-public class DateSplitBridge implements FieldBridge {
- private final static TimeZone GMT = TimeZone.getTimeZone("GMT");
-
- <emphasis role="bold">public void set(String name, Object value, Document document,
- LuceneOptions luceneOptions)</emphasis> {
- Date date = (Date) value;
- Calendar cal = GregorianCalendar.getInstance(GMT);
- cal.setTime(date);
- int year = cal.get(Calendar.YEAR);
- int month = cal.get(Calendar.MONTH) + 1;
- int day = cal.get(Calendar.DAY_OF_MONTH);
-
- // set year
- Field field = new Field(name + ".year", String.valueOf(year),
- luceneOptions.getStore(), luceneOptions.getIndex(),
- luceneOptions.getTermVector());
- field.setBoost(luceneOptions.getBoost());
- document.add(field);
-
- // set month and pad it if needed
- field = new Field(name + ".month", ( month < 10 ? "0" : "" )
- + String.valueOf(month), luceneOptions.getStore(),
- luceneOptions.getIndex(), luceneOptions.getTermVector());
- field.setBoost(luceneOptions.getBoost());
- document.add(field);
-
- // set day and pad it if needed
- field = new Field(name + ".day", ( day < 10 ? "0" : "" )
- + String.valueOf(day), luceneOptions.getStore(),
- luceneOptions.getIndex(), luceneOptions.getTermVector());
- field.setBoost(luceneOptions.getBoost());
- document.add(field);
- }
-}
-
-//property
-<emphasis role="bold">@FieldBridge(impl = DateSplitBridge.class)</emphasis>
-private Date date; </programlisting>
- </example>
- </section>
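The splitting step itself can be tried outside of Lucene. The sketch below computes the three field values for a date and returns them in a map instead of adding Field objects to a Document; the class name, the helper pad2 and the use of a map are assumptions made for illustration. Note the parentheses around the conditional expression: without them the string concatenation binds more tightly than the ?: operator.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TimeZone;

// Computes year, month and day values for a date, in GMT.
final class DateSplitSketch {

    private static final TimeZone GMT = TimeZone.getTimeZone("GMT");

    // Pads a one-digit value with a leading zero.
    static String pad2(int value) {
        return (value < 10 ? "0" : "") + value;
    }

    // Returns field name to field value, in insertion order.
    static Map<String, String> split(String name, Date date) {
        Calendar cal = new GregorianCalendar(GMT);
        cal.setTime(date);
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put(name + ".year", String.valueOf(cal.get(Calendar.YEAR)));
        // Calendar.MONTH is zero-based, hence the + 1.
        fields.put(name + ".month", pad2(cal.get(Calendar.MONTH) + 1));
        fields.put(name + ".day", pad2(cal.get(Calendar.DAY_OF_MONTH)));
        return fields;
    }
}
```

For the epoch (new Date(0L)) and the name date, this produces date.year = 1970, date.month = 01 and date.day = 01.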
-
- <section>
- <title>ClassBridge</title>
-
- <para>It is sometimes useful to combine more than one property of a
- given entity and index this combination in a specific way into the
- Lucene index. The <classname>@ClassBridge</classname> and
- <classname>@ClassBridges</classname> annotations can be defined at the
- class level (as opposed to the property level). In this case the
- custom field bridge implementation receives the entity instance as the
- value parameter instead of a particular property. Though not shown in
- this example, <classname>@ClassBridge</classname> supports the
- <methodname>termVector</methodname> attribute discussed in section
- <xref linkend="basic-mapping" />.</para>
-
- <example>
- <title>Implementing a class bridge</title>
-
- <programlisting>@Entity
-@Indexed
-<emphasis role="bold">@ClassBridge</emphasis>(name="branchnetwork",
- index=Index.TOKENIZED,
- store=Store.YES,
- impl = <emphasis role="bold">CatFieldsClassBridge.class</emphasis>,
- params = @Parameter( name="sepChar", value=" " ) )
-public class Department {
- private int id;
- private String network;
- private String branchHead;
- private String branch;
- private Integer maxEmployees;
- ...
-}
-
-
-public class CatFieldsClassBridge implements FieldBridge, ParameterizedBridge {
- private String sepChar;
-
- public void setParameterValues(Map parameters) {
- this.sepChar = (String) parameters.get( "sepChar" );
- }
-
- <emphasis role="bold">public void set(String name, Object value, Document document, LuceneOptions luceneOptions)</emphasis> {
- // In this particular class the name of the new field was passed
- // from the name field of the ClassBridge Annotation. This is not
- // a requirement. It just works that way in this instance. The
- // actual name could be supplied by hard coding it below.
- Department dep = (Department) value;
- String fieldValue1 = dep.getBranch();
- if ( fieldValue1 == null ) {
- fieldValue1 = "";
- }
- String fieldValue2 = dep.getNetwork();
- if ( fieldValue2 == null ) {
- fieldValue2 = "";
- }
- String fieldValue = fieldValue1 + sepChar + fieldValue2;
- Field field = new Field( name, fieldValue, luceneOptions.getStore(), luceneOptions.getIndex(), luceneOptions.getTermVector() );
- field.setBoost( luceneOptions.getBoost() );
- document.add( field );
- }
-}</programlisting>
- </example>
-
- <para>In this example, the particular
- <classname>CatFieldsClassBridge</classname> is applied to the
- <literal>department</literal> instance; the field bridge then
- concatenates branch and network and indexes the
- concatenation.</para>
- </section>
- </section>
- </section>
-
- <section id="provided-id">
- <title>Providing your own id</title>
-
- <warning>
- <para>This part of the documentation is a work in progress.</para>
- </warning>
-
- <para>You can provide your own id for Hibernate Search if you are
- extending the internals. You will have to generate a unique value so it
- can be given to Lucene to be indexed. This will have to be given to
- Hibernate Search when you create an org.hibernate.search.Work object - the
- document id is required in the constructor.</para>
-
- <section id="ProvidedId">
- <title>The @ProvidedId annotation</title>
-
- <para>Unlike the conventional Hibernate Search API and @DocumentId, this
- annotation is used on the class and not a field. You can also provide
- your own bridge implementation through the bridge() attribute of
- @ProvidedId. Also, if you annotate a
- class with @ProvidedId, your subclasses will also get the annotation -
- but this is not done using java.lang.annotation.Inherited. Be
- sure, however, <emphasis>not</emphasis> to use this annotation together with
- @DocumentId as your system will break.</para>
-
- <example>
- <title>Providing your own id</title>
-
- <programlisting>@ProvidedId (bridge = org.my.own.package.MyCustomBridge)
-@Indexed
-public class MyClass{
- @Field
- String MyString;
- ...
-}</programlisting>
- </example>
- </section>
- </section>
-</chapter>
Copied: search/tags/v3_1_0_GA/doc/reference/en/modules/mapping.xml (from rev 15661, search/trunk/doc/reference/en/modules/mapping.xml)
===================================================================
--- search/tags/v3_1_0_GA/doc/reference/en/modules/mapping.xml (rev 0)
+++ search/tags/v3_1_0_GA/doc/reference/en/modules/mapping.xml 2008-12-04 10:45:13 UTC (rev 15663)
@@ -0,0 +1,1451 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ ~ Hibernate, Relational Persistence for Idiomatic Java
+ ~
+ ~ Copyright (c) 2008, Red Hat Middleware LLC or third-party contributors as
+ ~ indicated by the @author tags or express copyright attribution
+ ~ statements applied by the authors. All third-party contributions are
+ ~ distributed under license by Red Hat Middleware LLC.
+ ~
+ ~ This copyrighted material is made available to anyone wishing to use, modify,
+ ~ copy, or redistribute it subject to the terms and conditions of the GNU
+ ~ Lesser General Public License, as published by the Free Software Foundation.
+ ~
+ ~ This program is distributed in the hope that it will be useful,
+ ~ but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+ ~ or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License
+ ~ for more details.
+ ~
+ ~ You should have received a copy of the GNU Lesser General Public License
+ ~ along with this distribution; if not, write to:
+ ~ Free Software Foundation, Inc.
+ ~ 51 Franklin Street, Fifth Floor
+ ~ Boston, MA 02110-1301 USA
+ -->
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
+"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
+<chapter id="search-mapping" revision="3">
+ <!-- $Id$ -->
+
+ <title>Mapping entities to the index structure</title>
+
+ <para>All the metadata information needed to index entities is described
+ through annotations. There is no need for xml mapping files. In fact there
+ is currently no xml configuration option available (see <ulink
+ url="http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-210">HSEARCH-210</ulink>).
+ You can still use hibernate mapping files for the basic Hibernate
+ configuration, but the Search specific configuration has to be expressed via
+ annotations.</para>
+
+ <section id="search-mapping-entity" revision="3">
+ <title>Mapping an entity</title>
+
+ <section id="basic-mapping">
+ <title>Basic mapping</title>
+
+ <para>First, we must declare a persistent class as indexable. This is
+ done by annotating the class with <literal>@Indexed</literal> (all
+ entities not annotated with <literal>@Indexed</literal> will be ignored
+ by the indexing process):</para>
+
+ <example>
+ <title>Making a class indexable using the
+ <classname>@Indexed</classname> annotation</title>
+
+ <programlisting>@Entity
+<emphasis role="bold">@Indexed(index="indexes/essays")</emphasis>
+public class Essay {
+ ...
+}</programlisting>
+ </example>
+
+ <para>The <literal>index</literal> attribute tells Hibernate Search the
+ name of the Lucene directory (usually a directory on your file system). It
+ is recommended to define a base directory for all Lucene indexes using
+ the <literal>hibernate.search.default.indexBase</literal> property in
+ your configuration file. Alternatively you can specify a base directory
+ per indexed entity by specifying
+ <literal>hibernate.search.&lt;index&gt;.indexBase</literal>, where
+ <literal>&lt;index&gt;</literal> is the fully qualified classname of the
+ indexed entity. Each entity instance will be represented by a Lucene
+ <classname>Document</classname> inside the given index (aka
+ Directory).</para>
+
+ <para>For each property (or attribute) of your entity, you can describe
+ how it will be indexed. By default (no annotation present) the property
+ is completely ignored by the indexing process.
+ <literal>@Field</literal> declares a property as indexed.
+ When indexing an element to a Lucene document you can specify how it is
+ indexed:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para><literal>name</literal>: describes the name under which the
+ property should be stored in the Lucene Document. The default value
+ is the property name (following the JavaBeans convention).</para>
+ </listitem>
+
+ <listitem>
+ <para><literal>store</literal>: describes whether or not the
+ property is stored in the Lucene index. You can store the value
+ <literal>Store.YES</literal> (consuming more space in the index but
+ allowing projection, see <xref linkend="projections" /> for more
+ information), store it in a compressed way
+ <literal>Store.COMPRESS</literal> (this does consume more CPU), or
+ avoid any storage <literal>Store.NO</literal> (this is the default
+ value). When a property is stored, you can retrieve its original
+ value from the Lucene Document. This is not related to whether the
+ element is indexed or not.</para>
+ </listitem>
+
+ <listitem>
+ <para><literal>index</literal>: describes how the element is indexed
+ and the type of information stored. The different values are
+ <literal>Index.NO</literal> (no indexing, i.e. cannot be found by a
+ query), <literal>Index.TOKENIZED</literal> (use an analyzer to
+ process the property), <literal>Index.UN_TOKENIZED</literal> (no
+ analyzer preprocessing), <literal>Index.NO_NORMS</literal> (do not
+ store the normalization data). The default value is
+ <literal>TOKENIZED</literal>.</para>
+ </listitem>
+
+ <listitem>
+ <para><literal>termVector</literal>: describes collections of
+ term-frequency pairs. This attribute enables storing term vectors
+ during indexing so they are available within documents. The default
+ value is <literal>TermVector.NO</literal>.</para>
+
+ <para>The different values of this attribute are:</para>
+
+ <informaltable align="left" width="">
+ <tgroup cols="2">
+ <thead>
+ <row>
+ <entry align="center">Value</entry>
+
+ <entry align="center">Definition</entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry align="left">TermVector.YES</entry>
+
+ <entry>Store the term vectors of each document. This
+ produces two synchronized arrays, one contains document
+ terms and the other contains the term's frequency.</entry>
+ </row>
+
+ <row>
+ <entry align="left">TermVector.NO</entry>
+
+ <entry>Do not store term vectors.</entry>
+ </row>
+
+ <row>
+ <entry align="left">TermVector.WITH_OFFSETS</entry>
+
+ <entry>Store the term vector and token offset information.
+ This is the same as TermVector.YES plus it contains the
+ starting and ending offset position information for the
+ terms.</entry>
+ </row>
+
+ <row>
+ <entry align="left">TermVector.WITH_POSITIONS</entry>
+
+ <entry>Store the term vector and token position information.
+ This is the same as TermVector.YES plus it contains the
+ ordinal positions of each occurrence of a term in a
+ document.</entry>
+ </row>
+
+ <row>
+ <entry
+ align="left">TermVector.WITH_POSITIONS_OFFSETS</entry>
+
+ <entry>Store the term vector, token position and offset
+ information. This is a combination of YES, WITH_OFFSETS
+ and WITH_POSITIONS.</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </informaltable>
+ </listitem>
+ </itemizedlist>
+
+ <para>Whether or not you want to store the original data in the index
+ depends on how you wish to use the index query result. For a regular
+ Hibernate Search usage storing is not necessary. However you might want
+ to store some fields to subsequently project them (see <xref
+ linkend="projections" /> for more information).</para>
+
+ <para>Whether or not you want to tokenize a property depends on whether
+ you wish to search the element as is, or by the words it contains. It
+ makes sense to tokenize a text field, but probably not a date field.
+ Note that fields used for sorting must not be tokenized.</para>
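+ <para>The difference between a tokenized and an untokenized field can be
+ sketched in plain Java. This is only an illustration: the class and method
+ names are hypothetical, and the splitting rule is a simplified stand-in for
+ an analyzer, not the behavior of Lucene's
+ <classname>StandardAnalyzer</classname>.</para>

```java
import java.util.Arrays;
import java.util.Locale;

class TokenizeSketch {
    // Hypothetical stand-in for an analyzer: lowercases the value and
    // splits it on anything that is not a letter or a digit.
    static String[] tokenized(String value) {
        return Arrays.stream(value.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+"))
                .filter(t -> !t.isEmpty())
                .toArray(String[]::new);
    }

    // An untokenized field keeps the whole value as one single term.
    static String[] untokenized(String value) {
        return new String[] { value };
    }

    public static void main(String[] args) {
        // A text value is searchable by each of its words.
        System.out.println(Arrays.toString(tokenized("Hibernate Search in Action")));
        // A date-like value is better kept whole, e.g. for sorting.
        System.out.println(Arrays.toString(untokenized("2008-12-04")));
    }
}
```

+ <para>Under these assumptions the text value yields four separate terms,
+ while the date value stays a single term that can be matched or sorted as
+ a whole.</para>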
+
+ <para>Finally, the id property of an entity is a special property used
+ by Hibernate Search to ensure index uniqueness of a given entity. By
+ design, an id has to be stored and must not be tokenized. To mark a
+ property as index id, use the <literal>@DocumentId</literal> annotation.
+ If you are using Hibernate Annotations and you have specified
+ <literal>@Id</literal> you can omit <literal>@DocumentId</literal>. The
+ chosen entity id will also be used as document id.</para>
+
+ <example>
+ <title>Adding <classname>@DocumentId</classname> and
+ <classname>@Field</classname> annotations to an indexed entity</title>
+
+ <programlisting>@Entity
+@Indexed(index="indexes/essays")
+public class Essay {
+ ...
+
+ @Id
+ <emphasis role="bold">@DocumentId</emphasis>
+ public Long getId() { return id; }
+
+ <emphasis role="bold">@Field(name="Abstract", index=Index.TOKENIZED, store=Store.YES)</emphasis>
+ public String getSummary() { return summary; }
+
+ @Lob
+ <emphasis role="bold">@Field(index=Index.TOKENIZED)</emphasis>
+ public String getText() { return text; }
+}</programlisting>
+ </example>
+
+ <para>The above annotations define an index with three fields:
+ <literal>id</literal>, <literal>Abstract</literal> and
+ <literal>text</literal>. Note that by default the field name is
+ decapitalized, following the JavaBean specification.</para>
+ </section>
+
+ <section>
+ <title>Mapping properties multiple times</title>
+
+ <para>Sometimes one has to map a property multiple times per index, with
+ slightly different indexing strategies. For example, sorting a query by
+ field requires the field to be <literal>UN_TOKENIZED</literal>. If one
+ wants to search by words in this property and still sort it, one needs to
+ index it twice, once tokenized and once untokenized.
+ <literal>@Fields</literal> makes this possible.</para>
+
+ <example>
+ <title>Using @Fields to map a property multiple times</title>
+
+ <programlisting>@Entity
+@Indexed(index = "Book" )
+public class Book {
+ <emphasis role="bold">@Fields( {</emphasis>
+ @Field(index = Index.TOKENIZED),
+ @Field(name = "summary_forSort", index = Index.UN_TOKENIZED, store = Store.YES)
+ <emphasis role="bold">} )</emphasis>
+ public String getSummary() {
+ return summary;
+ }
+
+ ...
+}</programlisting>
+ </example>
+
+ <para>The field <literal>summary</literal> is indexed twice, once as
+ <literal>summary</literal> in a tokenized way, and once as
+ <literal>summary_forSort</literal> in an untokenized way.
+ <literal>@Field</literal> supports two attributes that are useful when
+ <literal>@Fields</literal> is used:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para>analyzer: defines a @Analyzer annotation per field rather than
+ per property</para>
+ </listitem>
+
+ <listitem>
+ <para>bridge: defines a @FieldBridge annotation per field rather
+ than per property</para>
+ </listitem>
+ </itemizedlist>
+
+ <para>See below for more information about analyzers and field
+ bridges.</para>
+ </section>
+
+ <section id="search-mapping-associated">
+ <title>Embedded and associated objects</title>
+
+ <para>Associated objects as well as embedded objects can be indexed as
+ part of the root entity index. This is useful if you expect to search a
+ given entity based on properties of associated objects. In the following
+ example the aim is to return places where the associated city is Atlanta
+ (in the Lucene query parser language, it would translate into
+ <code>address.city:Atlanta</code>).</para>
+
+ <example>
+ <title>Using @IndexedEmbedded to index associations</title>
+
+ <programlisting>@Entity
+@Indexed
+public class Place {
+ @Id
+ @GeneratedValue
+ @DocumentId
+ private Long id;
+
+ @Field( index = Index.TOKENIZED )
+ private String name;
+
+ @OneToOne( cascade = { CascadeType.PERSIST, CascadeType.REMOVE } )
+ <emphasis role="bold">@IndexedEmbedded</emphasis>
+ private Address address;
+ ....
+}
+
+@Entity
+public class Address {
+ @Id
+ @GeneratedValue
+ private Long id;
+
+ @Field(index=Index.TOKENIZED)
+ private String street;
+
+ @Field(index=Index.TOKENIZED)
+ private String city;
+
+ <emphasis role="bold">@ContainedIn</emphasis>
+ @OneToMany(mappedBy="address")
+ private Set&lt;Place&gt; places;
+ ...
+}</programlisting>
+ </example>
+
+ <para>In this example, the place fields will be indexed in the
+ <literal>Place</literal> index. The <literal>Place</literal> index
+ documents will also contain the fields <literal>address.id</literal>,
+ <literal>address.street</literal>, and <literal>address.city</literal>
+ which you will be able to query. This is enabled by the
+ <literal>@IndexedEmbedded</literal> annotation.</para>
+
+ <para>Be careful. Because the data is denormalized in the Lucene index
+ when using the <classname>@IndexedEmbedded</classname> technique,
+ Hibernate Search needs to be aware of any change in the
+ <classname>Place</classname> object and any change in the
+ <classname>Address</classname> object to keep the index up to date. To
+ make sure the <literal><classname>Place</classname></literal> Lucene
+ document is updated when its <classname>Address</classname> changes,
+ you need to mark the other side of the bidirectional relationship with
+ <classname>@ContainedIn</classname>.</para>
+
+ <para><literal>@ContainedIn</literal> is only useful on associations
+ pointing to entities as opposed to embedded (collection of)
+ objects.</para>
+
+ <para>Let's make our example a bit more complex:</para>
+
+ <example>
+ <title>Nested usage of <classname>@IndexedEmbedded</classname> and
+ <classname>@ContainedIn</classname></title>
+
+ <programlisting>@Entity
+@Indexed
+public class Place {
+ @Id
+ @GeneratedValue
+ @DocumentId
+ private Long id;
+
+ @Field( index = Index.TOKENIZED )
+ private String name;
+
+ @OneToOne( cascade = { CascadeType.PERSIST, CascadeType.REMOVE } )
+ <emphasis role="bold">@IndexedEmbedded</emphasis>
+ private Address address;
+ ....
+}
+
+@Entity
+public class Address {
+ @Id
+ @GeneratedValue
+ private Long id;
+
+ @Field(index=Index.TOKENIZED)
+ private String street;
+
+ @Field(index=Index.TOKENIZED)
+ private String city;
+
+ <emphasis role="bold">@IndexedEmbedded(depth = 1, prefix = "ownedBy_")</emphasis>
+ private Owner ownedBy;
+
+ <emphasis role="bold">@ContainedIn</emphasis>
+ @OneToMany(mappedBy="address")
+ private Set&lt;Place&gt; places;
+ ...
+}
+
+@Embeddable
+public class Owner {
+ @Field(index = Index.TOKENIZED)
+ private String name;
+ ...
+}</programlisting>
+ </example>
+
+ <para>Any <literal>@*ToMany, @*ToOne</literal> and
+ <literal>@Embedded</literal> attribute can be annotated with
+ <literal>@IndexedEmbedded</literal>. The attributes of the associated
+ class will then be added to the main entity index. In the previous
+ example, the index will contain the following fields:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para>id</para>
+ </listitem>
+
+ <listitem>
+ <para>name</para>
+ </listitem>
+
+ <listitem>
+ <para>address.street</para>
+ </listitem>
+
+ <listitem>
+ <para>address.city</para>
+ </listitem>
+
+ <listitem>
+ <para>address.ownedBy_name</para>
+ </listitem>
+ </itemizedlist>
+
+ <para>The default prefix is <literal>propertyName.</literal>, following
+ the traditional object navigation convention. You can override it using
+ the <literal>prefix</literal> attribute as it is shown on the
+ <literal>ownedBy</literal> property.</para>
+
+ <note>
+ <para>The prefix cannot be set to the empty string.</para>
+ </note>
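+ <para>The way the field names listed above are composed can be sketched
+ in plain Java. The class and method names below are hypothetical and only
+ mimic the naming rule described here (default prefix
+ <literal>propertyName.</literal>, optionally overridden); they are not part
+ of the Hibernate Search API.</para>

```java
class EmbeddedFieldNameSketch {
    // Returns the prefix contributed by one @IndexedEmbedded property:
    // the explicit prefix if one was given, otherwise "propertyName.".
    static String prefixFor(String propertyName, String explicitPrefix) {
        return explicitPrefix == null ? propertyName + "." : explicitPrefix;
    }

    public static void main(String[] args) {
        // Place.address uses the default prefix.
        String address = prefixFor("address", null);
        // Address.ownedBy overrides the prefix with "ownedBy_".
        String owner = address + prefixFor("ownedBy", "ownedBy_");

        System.out.println(address + "city"); // field of the embedded Address
        System.out.println(owner + "name");   // field of the nested Owner
    }
}
```

+ <para>With these inputs the sketch produces
+ <literal>address.city</literal> and
+ <literal>address.ownedBy_name</literal>, matching the list above.</para>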
+
+ <para>The <literal>depth</literal> property is necessary when the object
+ graph contains a cyclic dependency of classes (not instances), for
+ example if <classname>Owner</classname> points to
+ <classname>Place</classname>. Hibernate Search will stop including
+ indexed embedded attributes after reaching the expected depth (or when
+ the object graph boundaries are reached). A class having a self reference
+ is an example of cyclic dependency. In our example, because
+ <literal>depth</literal> is set to 1, any
+ <literal>@IndexedEmbedded</literal> attribute in Owner (if any) will be
+ ignored.</para>
+
+ <para>Using <literal>@IndexedEmbedded</literal> for object associations
+ allows you to express queries such as:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para>Return places where name contains JBoss and where address city
+ is Atlanta. In Lucene query this would be</para>
+
+ <programlisting>+name:jboss +address.city:atlanta </programlisting>
+ </listitem>
+
+ <listitem>
+ <para>Return places where name contains JBoss and where owner's name
+ contains Joe. In Lucene query this would be</para>
+
+ <programlisting>+name:jboss +address.ownedBy_name:joe </programlisting>
+ </listitem>
+ </itemizedlist>
+
+ <para>In a way it mimics the relational join operation in a more
+ efficient way (at the cost of data duplication). Remember that, out of
+ the box, Lucene indexes have no notion of association; the join
+ operation simply does not exist. This approach helps to keep the
+ relational model normalized while benefiting from the full text index
+ speed and feature richness.</para>
+
+ <para><note>
+ <para>An associated object can itself (but does not have to) be
+ <literal>@Indexed</literal>.</para>
+ </note></para>
+
+ <para>When <literal>@IndexedEmbedded</literal> points to an entity, the
+ association has to be bidirectional and the other side has to be annotated
+ <literal>@ContainedIn</literal> (as seen in the previous example). If
+ not, Hibernate Search has no way to update the root index when the
+ associated entity is updated (in our example, a <literal>Place</literal>
+ index document has to be updated when the associated
+ <classname>Address</classname> instance is updated).</para>
+
+ <para>Sometimes, the object type annotated by
+ <classname>@IndexedEmbedded</classname> is not the object type targeted
+ by Hibernate and Hibernate Search. This is especially the case when
+ interfaces are used in lieu of their implementation. For this reason you
+ can override the object type targeted by Hibernate Search using the
+ <methodname>targetElement</methodname> parameter.</para>
+
+ <example>
+ <title>Using the <literal>targetElement</literal> property of
+ <classname>@IndexedEmbedded</classname></title>
+
+ <programlisting>@Entity
+@Indexed
+public class Address {
+ @Id
+ @GeneratedValue
+ @DocumentId
+ private Long id;
+
+ @Field(index= Index.TOKENIZED)
+ private String street;
+
+ @IndexedEmbedded(depth = 1, prefix = "ownedBy_", <emphasis role="bold">targetElement = Owner.class</emphasis>)
+ @Target(Owner.class)
+ private Person ownedBy;
+
+
+ ...
+}
+
+@Embeddable
+public class Owner implements Person { ... }</programlisting>
+ </example>
+ </section>
+
+ <section>
+ <title>Boost factor</title>
+
+ <para>Lucene has the notion of <emphasis>boost factor</emphasis>. It's a
+ way to give more weight to a field or to an indexed element over others
+ during the indexing process. You can use <literal>@Boost</literal> at
+ the <literal>@Field</literal>, method or class level.</para>
+
+ <example>
+ <title>Using different ways of increasing the weight of an indexed
+ element using a boost factor</title>
+
+ <programlisting>@Entity
+@Indexed(index="indexes/essays")
+<emphasis role="bold">@Boost(1.7f)</emphasis>
+public class Essay {
+ ...
+
+ @Id
+ @DocumentId
+ public Long getId() { return id; }
+
+ @Field(name="Abstract", index=Index.TOKENIZED, store=Store.YES, boost=<emphasis
+ role="bold">@Boost(2f)</emphasis>)
+ <emphasis role="bold">@Boost(1.5f)</emphasis>
+ public String getSummary() { return summary; }
+
+ @Lob
+ @Field(index=Index.TOKENIZED, boost=<emphasis role="bold">@Boost(1.2f)</emphasis>)
+ public String getText() { return text; }
+
+ @Field
+ public String getISBN() { return isbn; }
+
+} </programlisting>
+ </example>
+
+ <para>In our example, <classname>Essay</classname>'s probability to
+ reach the top of the search list will be multiplied by 1.7. The
+ <methodname>summary</methodname> field will be 3.0 times (2 * 1.5, since
+ <methodname>@Field.boost</methodname> and <classname>@Boost</classname>
+ on a property are cumulative) more important than the
+ <methodname>isbn</methodname> field. The <methodname>text</methodname>
+ field will be 1.2 times more important than the
+ <methodname>isbn</methodname> field. Note that this explanation is,
+ strictly speaking, wrong, but it is simple and close enough to
+ reality for all practical purposes. For details, please check the Lucene
+ documentation or the excellent <citetitle>Lucene In Action</citetitle>
+ by Otis Gospodnetic and Erik Hatcher.</para>
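+ <para>The cumulative arithmetic for the boost values in this example can
+ be written out as a small Java sketch. The class and method names are
+ hypothetical; the code only reproduces the multiplication described above
+ and says nothing about how Lucene actually computes scores.</para>

```java
class BoostSketch {
    // A boost given in @Field(boost = ...) and a @Boost on the same
    // property are multiplied; a missing boost counts as 1.0.
    static float propertyBoost(float fieldBoost, float propertyLevelBoost) {
        return fieldBoost * propertyLevelBoost;
    }

    public static void main(String[] args) {
        float summary = propertyBoost(2f, 1.5f); // @Field boost 2, @Boost 1.5
        float text = propertyBoost(1.2f, 1f);    // only the @Field boost
        float isbn = propertyBoost(1f, 1f);      // no boost at all

        System.out.println("summary relative to isbn: " + summary / isbn);
        System.out.println("text relative to isbn: " + text / isbn);
    }
}
```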
+ </section>
+
+ <section id="analyzer">
+ <title>Analyzer</title>
+
+ <para>The default analyzer class used to index tokenized fields is
+ configurable through the <literal>hibernate.search.analyzer</literal>
+ property. The default value for this property is
+ <classname>org.apache.lucene.analysis.standard.StandardAnalyzer</classname>.</para>
+
+ <para>You can also define the analyzer class per entity, property and
+ even per @Field (useful when multiple fields are indexed from a single
+ property).</para>
+
+ <example>
+ <title>Different ways of specifying an analyzer</title>
+
+ <programlisting>@Entity
+@Indexed
+<emphasis role="bold">@Analyzer(impl = EntityAnalyzer.class)</emphasis>
+public class MyEntity {
+ @Id
+ @GeneratedValue
+ @DocumentId
+ private Integer id;
+
+ @Field(index = Index.TOKENIZED)
+ private String name;
+
+ @Field(index = Index.TOKENIZED)
+ <emphasis role="bold">@Analyzer(impl = PropertyAnalyzer.class)</emphasis>
+ private String summary;
+
+ @Field(index = Index.TOKENIZED, <emphasis role="bold">analyzer = @Analyzer(impl = FieldAnalyzer.class)</emphasis>)
+ private String body;
+
+ ...
+}</programlisting>
+ </example>
+
+ <para>In this example, <classname>EntityAnalyzer</classname> is used to
+ index all tokenized properties (eg. <literal>name</literal>), except
+ <literal>summary</literal> and <literal>body</literal> which are indexed
+ with <classname>PropertyAnalyzer</classname> and
+ <classname>FieldAnalyzer</classname> respectively.</para>
+
+ <caution>
+ <para>Mixing different analyzers in the same entity is most of the
+ time a bad practice. It makes query building more complex and results
+ less predictable (for the novice), especially if you are using a
+ QueryParser (which uses the same analyzer for the whole query). As a
+ rule of thumb, for any given field the same analyzer should be used
+ for indexing and querying.</para>
+ </caution>
+
+ <section>
+ <title>Analyzer definitions</title>
+
+ <para>Analyzers can become quite complex to deal with, which is why
+ Hibernate Search introduces the notion of analyzer definitions. An
+ analyzer definition can be reused by many
+ <classname>@Analyzer</classname> declarations. An analyzer definition
+ is composed of:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para>a name: the unique string used to refer to the
+ definition</para>
+ </listitem>
+
+ <listitem>
+ <para>a tokenizer: responsible for tokenizing the input stream
+ into individual words</para>
+ </listitem>
+
+ <listitem>
+ <para>a list of filters: each filter is responsible for removing,
+ modifying or sometimes even adding words in the stream provided by
+ the tokenizer</para>
+ </listitem>
+ </itemizedlist>
+
+ <para>This separation of tasks - a tokenizer followed by a list of
+ filters - allows for easy reuse of each individual component and lets
+ you build your customized analyzer in a very flexible way (just like
+ Lego). Generally speaking the <classname>Tokenizer</classname> starts
+ the analysis process by turning the character input into tokens which
+ are then further processed by the <classname>TokenFilter</classname>s.
+ Hibernate Search supports this infrastructure by utilizing the Solr
+ analyzer framework. Make sure to add <filename>solr-core.jar</filename>
+ and <filename>solr-common.jar</filename> to your classpath to
+ use analyzer definitions. If you also want to use a
+ snowball stemmer, also include
+ <filename>lucene-snowball.jar</filename>. Other Solr analyzers might
+ depend on more libraries. For example, the
+ <classname>PhoneticFilterFactory</classname> depends on <ulink
+ url="http://commons.apache.org/codec">commons-codec</ulink>. Your
+ distribution of Hibernate Search provides these dependencies in its
+ <filename>lib</filename> directory.</para>
+
+ <example>
+ <title><classname>@AnalyzerDef</classname> and the Solr
+ framework</title>
+
+ <programlisting>@AnalyzerDef(name="customanalyzer",
+ tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
+ filters = {
+ @TokenFilterDef(factory = ISOLatin1AccentFilterFactory.class),
+ @TokenFilterDef(factory = LowerCaseFilterFactory.class),
+ @TokenFilterDef(factory = StopFilterFactory.class, params = {
+ @Parameter(name="words", value= "org/hibernate/search/test/analyzer/solr/stoplist.properties" ),
+ @Parameter(name="ignoreCase", value="true")
+ })
+})
+public class Team {
+ ...
+}</programlisting>
+ </example>
+
+ <para>A tokenizer is defined by its factory which is responsible for
+ building the tokenizer and using the optional list of parameters. This
+ example uses the standard tokenizer. A filter is defined by its factory
+ which is responsible for creating the filter instance using the
+ optional parameters. In our example, the StopFilter filter is built
+ by reading the dedicated words property file and is expected to ignore
+ case. The list of parameters is dependent on the tokenizer or filter
+ factory.</para>
+
+ <warning>
+ <para>Filters are applied in the order they are defined in the
+ <classname>@AnalyzerDef</classname> annotation. Make sure to think
+ twice about this order.</para>
+ </warning>
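+ <para>Why the order matters can be shown with a plain Java sketch. The two
+ methods are hypothetical stand-ins for a lowercasing filter and a stop
+ filter, not the Solr classes, and the stop filter here is assumed to be
+ case sensitive (unlike the <literal>ignoreCase="true"</literal> setting in
+ the example above).</para>

```java
import java.util.Arrays;
import java.util.Locale;

class FilterOrderSketch {
    // Stand-in for a lowercasing filter.
    static String[] lowerCase(String[] tokens) {
        return Arrays.stream(tokens)
                .map(t -> t.toLowerCase(Locale.ROOT))
                .toArray(String[]::new);
    }

    // Stand-in for a case-sensitive stop filter with a single stop word.
    static String[] removeStopWord(String[] tokens, String stopWord) {
        return Arrays.stream(tokens)
                .filter(t -> !t.equals(stopWord))
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String[] input = { "The", "Team" };
        // Lowercase first: "The" becomes "the" and is then removed.
        System.out.println(Arrays.toString(removeStopWord(lowerCase(input), "the")));
        // Stop filter first: "The" does not match "the" and is kept.
        System.out.println(Arrays.toString(lowerCase(removeStopWord(input, "the"))));
    }
}
```

+ <para>The same two filters leave one token in the first ordering and two
+ tokens in the second, which is the kind of difference the warning above
+ refers to.</para>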
+
+ <para>Once defined, an analyzer definition can be reused by an
+ <classname>@Analyzer</classname> declaration using the definition name
+ rather than declaring an implementation class.</para>
+
+ <example>
+ <title>Referencing an analyzer by name</title>
+
+ <programlisting>@Entity
+@Indexed
+@AnalyzerDef(name="customanalyzer", ... )
+public class Team {
+ @Id
+ @DocumentId
+ @GeneratedValue
+ private Integer id;
+
+ @Field
+ private String name;
+
+ @Field
+ private String location;
+
+ @Field <emphasis role="bold">@Analyzer(definition = "customanalyzer")</emphasis>
+ private String description;
+}</programlisting>
+ </example>
+
+ <para>Analyzer instances declared by
+ <classname>@AnalyzerDef</classname> are available by their name in the
+ <classname>SearchFactory</classname>.</para>
+
+ <programlisting>Analyzer analyzer = fullTextSession.getSearchFactory().getAnalyzer("customanalyzer");</programlisting>
+
+ <para>This is quite useful when building queries. Fields in queries
+ should be analyzed with the same analyzer used to index the field so
+ that they speak a common "language": the same tokens are reused
+ between the query and the indexing process. This rule has some
+ exceptions but is true most of the time. Respect it unless you know
+ what you are doing.</para>
+ </section>
+
+ <section>
+ <title>Available analyzers</title>
+
+ <para>Solr and Lucene come with a lot of useful default tokenizers and
+ filters. You can find a complete list of tokenizer factories and
+ filter factories at <ulink
+ url="http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters">http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters</ulink>.
+ Let us look at a few of them.</para>
+
+ <table>
+ <title>Some of the tokenizers available</title>
+
+ <tgroup cols="3">
+ <thead>
+ <row>
+ <entry align="center">Factory</entry>
+
+ <entry align="center">Description</entry>
+
+ <entry align="center">Parameters</entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry>StandardTokenizerFactory</entry>
+
+ <entry>Use the Lucene StandardTokenizer</entry>
+
+ <entry>none</entry>
+ </row>
+
+ <row>
+ <entry>HTMLStripStandardTokenizerFactory</entry>
+
+ <entry>Remove HTML tags, keep the text and pass it to a
+ StandardTokenizer</entry>
+
+ <entry>none</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ <table>
+ <title>Some of the filters available</title>
+
+ <tgroup cols="3">
+ <thead>
+ <row>
+ <entry align="center">Factory</entry>
+
+ <entry align="center">Description</entry>
+
+ <entry align="center">Parameters</entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry>StandardFilterFactory</entry>
+
+ <entry>Remove dots from acronyms and 's from words</entry>
+
+ <entry>none</entry>
+ </row>
+
+ <row>
+ <entry>LowerCaseFilterFactory</entry>
+
+ <entry>Lowercase words</entry>
+
+ <entry>none</entry>
+ </row>
+
+ <row>
+ <entry>StopFilterFactory</entry>
+
+ <entry>Remove words (tokens) matching a list of stop
+ words</entry>
+
+ <entry><para><literal>words</literal>: points to a resource
+ file containing the stop words</para><para><literal>ignoreCase</literal>:
+ <literal>true</literal> if case should be ignored when comparing stop
+ words, <literal>false</literal> otherwise</para></entry>
+ </row>
+
+ <row>
+ <entry>SnowballPorterFilterFactory</entry>
+
+ <entry>Reduces a word to its root in a given language (e.g.
+ protect, protects, protection share the same root). Using such
+ a filter allows searches matching related words.</entry>
+
+ <entry><para><literal>language</literal>: Danish, Dutch,
+ English, Finnish, French, German, Italian, Norwegian,
+ Portuguese, Russian, Spanish, Swedish and a few
+ more</para></entry>
+ </row>
+
+ <row>
+ <entry>ISOLatin1AccentFilterFactory</entry>
+
+ <entry>Remove accents for languages like French</entry>
+
+ <entry>none</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ <para>We recommend checking all the implementations of
+ <classname>org.apache.solr.analysis.TokenizerFactory</classname> and
+ <classname>org.apache.solr.analysis.TokenFilterFactory</classname> in
+ your IDE to see what is available.</para>
+ </section>
+
+ <section>
+ <title>Analyzer discriminator (experimental)</title>
+
+ <para>So far all the ways introduced to specify an analyzer were
+ static. However, there are use cases where it is useful to select an
+ analyzer depending on the current state of the entity to be indexed,
+ for example in a multilingual application. For a
+ <classname>BlogEntry</classname> class, for example, the analyzer could
+ depend on the language property of the entry. Depending on this
+ property the correct language specific stemmer should be chosen to
+ index the actual text.</para>
+
+ <para>To enable this dynamic analyzer selection Hibernate Search
+ introduces the <classname>@AnalyzerDiscriminator</classname>
+ annotation. The following example demonstrates the usage of this
+ annotation:</para>
+
+ <para><example>
+ <title>Usage of @AnalyzerDiscriminator in order to select an
+ analyzer depending on the entity state</title>
+
+ <programlisting>@Entity
+@Indexed
+@AnalyzerDefs({
+ @AnalyzerDef(name = "en",
+ tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
+ filters = {
+ @TokenFilterDef(factory = LowerCaseFilterFactory.class),
+ @TokenFilterDef(factory = EnglishPorterFilterFactory.class
+ )
+ }),
+ @AnalyzerDef(name = "de",
+ tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
+ filters = {
+ @TokenFilterDef(factory = LowerCaseFilterFactory.class),
+ @TokenFilterDef(factory = GermanStemFilterFactory.class)
+ })
+})
+public class BlogEntry {
+
+ @Id
+ @GeneratedValue
+ @DocumentId
+ private Integer id;
+
+ @Field
+ @AnalyzerDiscriminator(impl = LanguageDiscriminator.class)
+ private String language;
+
+ @Field
+ private String text;
+
+ private Set&lt;BlogEntry&gt; references;
+
+ // standard getter/setter
+ ...
+}</programlisting>
+
+ <programlisting>public class LanguageDiscriminator implements Discriminator {
+
+ public String getAnanyzerDefinitionName(Object value, Object entity, String field) {
+ if ( value == null || !( entity instanceof BlogEntry ) ) {
+ return null;
+ }
+ return (String) value;
+ }
+}</programlisting>
+ </example>The prerequisite for using
+ <classname>@AnalyzerDiscriminator</classname> is that all analyzers
+ which are going to be used are predefined via
+ <classname>@AnalyzerDef</classname> definitions. If this is the case
+ one can place the <classname>@AnalyzerDiscriminator</classname>
+ annotation either on the class or on a specific property of the entity
+ for which to dynamically select an analyzer. Via the
+ <literal>impl</literal> parameter of the
+ <classname>@AnalyzerDiscriminator</classname> you specify a concrete
+ implementation of the <classname>Discriminator</classname> interface.
+ It is up to you to provide an implementation for this interface. The
+ only method you have to implement is
+ <classname>getAnanyzerDefinitionName()</classname> which gets called
+ for each field added to the Lucene document. The entity which is
+ getting indexed is also passed to the interface method. The
+ <literal>value</literal> parameter is only set if the
+ <classname>@AnalyzerDiscriminator</classname> is placed on property
+ level instead of class level. In this case the value represents the
+ current value of this property.</para>
+
+ <para>An implementation of the <classname>Discriminator</classname>
+ interface has to return the name of an existing analyzer definition if
+ the analyzer should be set dynamically or <classname>null</classname>
+ if the default analyzer should not be overridden. The given example
+ assumes that the language parameter is either 'de' or 'en' which
+ matches the specified names in the
+ <classname>@AnalyzerDef</classname>s.</para>
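The contract described above (return a known definition name, or null to keep the default analyzer) can be sketched without Hibernate Search on the classpath. This is only an illustration: the Discriminator interface below is a local stand-in that copies the single method named in the text, and SafeLanguageDiscriminator and its KNOWN set are hypothetical names, assuming the two definitions are called "en" and "de".

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Local stand-in for the Discriminator interface described above, so the
// sketch compiles on its own (assumption: same single-method shape).
interface Discriminator {
    String getAnanyzerDefinitionName(Object value, Object entity, String field);
}

// Hypothetical discriminator that only ever returns names of known definitions.
class SafeLanguageDiscriminator implements Discriminator {
    // Assumed to match the names used in the @AnalyzerDef declarations.
    private static final Set<String> KNOWN =
            new HashSet<String>(Arrays.asList("en", "de"));

    public String getAnanyzerDefinitionName(Object value, Object entity, String field) {
        if (!(value instanceof String)) {
            return null; // also covers null: keep the default analyzer
        }
        String language = ((String) value).toLowerCase(Locale.ROOT);
        // Unknown languages fall back to the default analyzer.
        return KNOWN.contains(language) ? language : null;
    }
}

class DiscriminatorSketch {
    public static void main(String[] args) {
        Discriminator d = new SafeLanguageDiscriminator();
        System.out.println(d.getAnanyzerDefinitionName("de", new Object(), "text"));
        System.out.println(d.getAnanyzerDefinitionName("fr", new Object(), "text"));
    }
}
```

Returning null for an unknown language is a design choice of this sketch, not something the text prescribes; it avoids handing back a name for which no analyzer definition exists.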
+
+ <note>
+ <para>The <classname>@AnalyzerDiscriminator</classname> is currently
+ still experimental and the API might still change. We are hoping for
+ some feedback from the community about the usefulness and usability
+ of this feature.</para>
+ </note>
+ </section>
+
+ <section id="analyzer-retrievinganalyzer">
+ <title>Retrieving an analyzer</title>
+
+ <para>During indexing time, Hibernate Search is using analyzers under
+ the hood for you. In some situations, retrieving analyzers can be
+ handy. If your domain model makes use of multiple analyzers (maybe to
+ benefit from stemming, use phonetic approximation and so on), you need
+ to make sure to use the same analyzers when you build your
+ query.</para>
+
+ <note>
+ <para>This rule can be broken but you need a good reason for it. If
+ you are unsure, use the same analyzers.</para>
+ </note>
+
+ <para>You can retrieve the scoped analyzer for a given entity used at
+ indexing time by Hibernate Search. A scoped analyzer is an analyzer
+ which applies the right analyzers depending on the field indexed:
+ multiple analyzers can be defined on a given entity, each one working
+ on an individual field, and a scoped analyzer unifies all these
+ analyzers into a context-aware analyzer. While the theory seems a bit
+ complex, using the right analyzer in a query is very easy.</para>
+
+ <example>
+ <title>Using the scoped analyzer when building a full-text
+ query</title>
+
+ <programlisting>org.apache.lucene.queryParser.QueryParser parser = new QueryParser(
+ "title",
+ fullTextSession.getSearchFactory().getAnalyzer( Song.class )
+);
+
+org.apache.lucene.search.Query luceneQuery =
+ parser.parse( "title:sky OR title_stemmed:diamond" );
+
+org.hibernate.Query fullTextQuery =
+ fullTextSession.createFullTextQuery( luceneQuery, Song.class );
+
+List result = fullTextQuery.list(); //return a list of managed objects </programlisting>
+ </example>
+
+ <para>In the example above, the song title is indexed in two fields:
+ the standard analyzer is used in the field <literal>title</literal>
+ and a stemming analyzer is used in the field
+ <literal>title_stemmed</literal>. By using the analyzer provided by
+ the search factory, the query uses the appropriate analyzer depending
+ on the field targeted.</para>
+
+ <para>If your query targets more than one query and you wish to use
+ your standard analyzer, make sure to describe it using an analyzer
+ definition. You can retrieve analyzers by their definition name using
+ <code>searchFactory.getAnalyzer(String)</code>.</para>
+ </section>
+ </section>
+ </section>
+
+ <section id="search-mapping-bridge">
+ <title>Property/Field Bridge</title>
+
+ <para>In Lucene all index fields have to be represented as Strings. For
+ this reason all entity properties annotated with <literal>@Field</literal>
+ have to be indexed in a String form. For most of your properties,
+ Hibernate Search does the translation job for you thanks to a built-in set
+ of bridges. In some cases, though, you need more fine-grained control over
+ the translation process.</para>
+
+ <section>
+ <title>Built-in bridges</title>
+
+ <para>Hibernate Search comes bundled with a set of built-in bridges
+ between a Java property type and its full text representation.</para>
+
+ <variablelist>
+ <varlistentry>
+ <term>null</term>
+
+ <listitem>
+ <para>null elements are not indexed. Lucene does not support null
+ elements and this does not make much sense either.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>java.lang.String</term>
+
+ <listitem>
+ <para>Strings are indexed as is.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>short, Short, int, Integer, long, Long, float, Float,
+ double, Double, BigInteger, BigDecimal</term>
+
+ <listitem>
+ <para>Numbers are converted into their String representation. Note
+ that numbers cannot be compared by Lucene (i.e. used in range
+ queries) out of the box: they have to be padded <note>
+ <para>Using a Range query is debatable and has drawbacks. An
+ alternative approach is to use a Filter query which will
+ filter the result query to the appropriate range.</para>
+
+ <para>Hibernate Search will support a padding mechanism.</para>
+ </note></para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>java.util.Date</term>
+
+ <listitem>
+ <para>Dates are stored as yyyyMMddHHmmssSSS in GMT time
+ (20061107210300012 for Nov 7th of 2006 4:03PM and 12ms EST). You
+ shouldn't really bother with the internal format. What is
+ important is that when using a DateRange Query, you should know
+ that the dates have to be expressed in GMT time.</para>
+
+ <para>Usually, storing the date up to the millisecond is not
+ necessary. <literal>@DateBridge</literal> defines the appropriate
+ resolution you are willing to store in the index (
+ <literal>@DateBridge(resolution=Resolution.DAY)</literal>
+ ). The date pattern will then be truncated
+ accordingly.</para>
+
+ <programlisting>@Entity
+@Indexed
+public class Meeting {
+ @Field(index=Index.UN_TOKENIZED)
+ <emphasis role="bold">@DateBridge(resolution=Resolution.MINUTE)</emphasis>
+ private Date date;
+ ... </programlisting>
+
+ <warning>
+ <para>A Date whose resolution is lower than
+ <literal>MILLISECOND</literal> cannot be a
+ <literal>@DocumentId</literal></para>
+ </warning>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>java.net.URI, java.net.URL</term>
+
+ <listitem>
+ <para>URI and URL are converted to their string
+ representation</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>java.lang.Class</term>
+
+ <listitem>
+ <para>Classes are converted to their fully qualified class name. The
+ thread context classloader is used when the class is
+ rehydrated</para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
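Two points from the list above lend themselves to a small standalone illustration: why numbers must be padded before their String forms can be compared, and what the yyyyMMddHHmmssSSS pattern produces once a local time is expressed in GMT. The sketch below uses only the JDK; BridgeFormatSketch, pad and toGmtString are hypothetical names, and the code does not claim to be what the built-in bridges do internally.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

class BridgeFormatSketch {

    // Left-pads a non-negative number with zeros so that String order
    // matches numeric order for values of up to `width` digits.
    static String pad(int number, int width) {
        String raw = Integer.toString(number);
        StringBuilder padded = new StringBuilder();
        for (int i = raw.length(); i < width; i++) {
            padded.append('0');
        }
        return padded.append(raw).toString();
    }

    // Formats a date with the yyyyMMddHHmmssSSS pattern, expressed in GMT.
    static String toGmtString(Date date) {
        SimpleDateFormat format = new SimpleDateFormat("yyyyMMddHHmmssSSS", Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("GMT"));
        return format.format(date);
    }

    public static void main(String[] args) {
        // Unpadded Strings compare character by character, so "10" sorts before "9".
        System.out.println("10".compareTo("9") < 0);
        // Padded Strings keep the numeric order: "00009" sorts before "00010".
        System.out.println(pad(9, 5).compareTo(pad(10, 5)) < 0);

        // Nov 7th 2006, 4:03:00.012 PM in a fixed UTC-5 zone (EST).
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("GMT-05:00"));
        cal.clear();
        cal.set(2006, Calendar.NOVEMBER, 7, 16, 3, 0);
        cal.set(Calendar.MILLISECOND, 12);
        System.out.println(toGmtString(cal.getTime()));
    }
}
```

The same GMT conversion applies to the bounds of a date range query: a bound written in local time has to be shifted before it is compared against the indexed value.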
+ </section>
+
+ <section>
+ <title>Custom Bridge</title>
+
+ <para>Sometimes, the built-in bridges of Hibernate Search do not cover
+ some of your property types, or the String representation used by the
+ bridge does not meet your requirements. The following paragraphs
+ describe several solutions to this problem.</para>
+
+ <section>
+ <title>StringBridge</title>
+
+ <para>The simplest custom solution is to give Hibernate Search an
+ implementation of your expected
+ <emphasis><classname>Object</classname> </emphasis>to
+ <classname>String</classname> bridge. To do so you need to implement
+ the <literal>org.hibernate.search.bridge.StringBridge</literal>
+ interface. All implementations have to be thread-safe as they are used
+ concurrently.</para>
+
+ <example>
+ <title>Implementing your own
+ <classname>StringBridge</classname></title>
+
+ <programlisting>/**
+ * Padding Integer bridge.
+ * All numbers will be padded with 0 to match 5 digits
+ *
+ * @author Emmanuel Bernard
+ */
+public class PaddedIntegerBridge implements <emphasis role="bold">StringBridge</emphasis> {
+
+ private static final int PADDING = 5;
+
+ <emphasis role="bold">public String objectToString(Object object)</emphasis> {
+ String rawInteger = ( (Integer) object ).toString();
+ if (rawInteger.length() > PADDING)
+ throw new IllegalArgumentException( "Try to pad on a number too big" );
+ StringBuilder paddedInteger = new StringBuilder( );
+ for ( int padIndex = rawInteger.length() ; padIndex < PADDING ; padIndex++ ) {
+ paddedInteger.append('0');
+ }
+ return paddedInteger.append( rawInteger ).toString();
+ }
+} </programlisting>
+ </example>
+
+ <para>Then any property or field can use this bridge thanks to the
+ <literal>@FieldBridge</literal> annotation</para>
+
+ <programlisting><emphasis role="bold">@FieldBridge(impl = PaddedIntegerBridge.class)</emphasis>
+private Integer length; </programlisting>
+
+ <para>Parameters can be passed to the Bridge implementation making it
+ more flexible. The Bridge implementation implements a
+ <classname>ParameterizedBridge</classname> interface, and the
+ parameters are passed through the <literal>@FieldBridge</literal>
+ annotation.</para>
+
+ <example>
+ <title>Passing parameters to your bridge implementation</title>
+
+ <programlisting>public class PaddedIntegerBridge implements StringBridge, <emphasis
+ role="bold">ParameterizedBridge</emphasis> {
+
+ public static String PADDING_PROPERTY = "padding";
+ private int padding = 5; //default
+
+ <emphasis role="bold">public void setParameterValues(Map parameters)</emphasis> {
+ Object padding = parameters.get( PADDING_PROPERTY );
+ if (padding != null) this.padding = Integer.parseInt( padding.toString() );
+ }
+
+ public String objectToString(Object object) {
+ String rawInteger = ( (Integer) object ).toString();
+ if (rawInteger.length() > padding)
+ throw new IllegalArgumentException( "Try to pad on a number too big" );
+ StringBuilder paddedInteger = new StringBuilder( );
+ for ( int padIndex = rawInteger.length() ; padIndex < padding ; padIndex++ ) {
+ paddedInteger.append('0');
+ }
+ return paddedInteger.append( rawInteger ).toString();
+ }
+}
+
+
+//property
+@FieldBridge(impl = PaddedIntegerBridge.class,
+ <emphasis role="bold">params = @Parameter(name="padding", value="10")</emphasis>
+ )
+private Integer length; </programlisting>
+ </example>
+
+ <para>The <classname>ParameterizedBridge</classname> interface can be
+ implemented by <classname>StringBridge</classname>,
+ <classname>TwoWayStringBridge</classname> and
+ <classname>FieldBridge</classname> implementations.</para>
+
+ <para>All implementations have to be thread-safe, but the parameters
+ are set during initialization and no special care is required at this
+ stage.</para>
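Because the value in the annotation is written as a String (value="10"), a bridge has to convert it before use. A minimal sketch of that conversion follows, assuming the parameter values reach the bridge as Strings; PaddingParameterSketch and readPadding are hypothetical helpers, not part of the Hibernate Search API.

```java
import java.util.HashMap;
import java.util.Map;

class PaddingParameterSketch {
    static final String PADDING_PROPERTY = "padding";
    static final int DEFAULT_PADDING = 5;

    // Reads the padding parameter, falling back to the default when absent.
    // Going through toString() works whether the value arrives as a String
    // or as an Integer.
    static int readPadding(Map<String, ?> parameters) {
        Object raw = parameters.get(PADDING_PROPERTY);
        if (raw == null) {
            return DEFAULT_PADDING;
        }
        return Integer.parseInt(raw.toString().trim());
    }

    public static void main(String[] args) {
        Map<String, String> parameters = new HashMap<String, String>();
        System.out.println(readPadding(parameters)); // no parameter given: default
        parameters.put(PADDING_PROPERTY, "10");
        System.out.println(readPadding(parameters)); // parsed from the String value
    }
}
```

Since the parameters are set once during initialization, the parsed value can be stored in a field without further synchronization, in line with the thread-safety remark above.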
+
+ <para>If you expect to use your bridge implementation on an id
+ property (i.e. annotated with <literal>@DocumentId</literal> ), you need
+ to use a slightly extended version of <literal>StringBridge</literal>
+ named <classname>TwoWayStringBridge</classname>. Hibernate Search
+ needs to read the string representation of the identifier and generate
+ the object out of it. There is no difference in the way the
+ <literal>@FieldBridge</literal> annotation is used.</para>
+
+ <example>
+ <title>Implementing a TwoWayStringBridge which can for example be
+ used for id properties</title>
+
+ <programlisting>public class PaddedIntegerBridge implements TwoWayStringBridge, ParameterizedBridge {
+
+ public static String PADDING_PROPERTY = "padding";
+ private int padding = 5; //default
+
+ public void setParameterValues(Map parameters) {
+ Object padding = parameters.get( PADDING_PROPERTY );
+ if (padding != null) this.padding = Integer.parseInt( padding.toString() );
+ }
+
+ public String objectToString(Object object) {
+ String rawInteger = ( (Integer) object ).toString();
+ if (rawInteger.length() > padding)
+ throw new IllegalArgumentException( "Try to pad on a number too big" );
+ StringBuilder paddedInteger = new StringBuilder( );
+ for ( int padIndex = rawInteger.length() ; padIndex < padding ; padIndex++ ) {
+ paddedInteger.append('0');
+ }
+ return paddedInteger.append( rawInteger ).toString();
+ }
+
+ <emphasis role="bold">public Object stringToObject(String stringValue)</emphasis> {
+ return new Integer(stringValue);
+ }
+}
+
+
+//id property
+@DocumentId
+@FieldBridge(impl = PaddedIntegerBridge.class,
+ params = @Parameter(name="padding", value="10") )
+private Integer id;
+ </programlisting>
+ </example>
+
+ <para>It is critically important for the two-way process to be
+ idempotent (ie object = stringToObject( objectToString( object ) )
+ ).</para>
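The round-trip property stated above is easy to check in isolation. The sketch below re-declares a local stand-in for the two bridge methods so it compiles without Hibernate Search; PaddedIntegerBridgeSketch and RoundTripCheck are hypothetical names, and the bridge simply repeats the padding logic from the example.

```java
// Local stand-in for the two methods of TwoWayStringBridge used above.
interface TwoWayStringBridge {
    String objectToString(Object object);
    Object stringToObject(String stringValue);
}

class PaddedIntegerBridgeSketch implements TwoWayStringBridge {
    private final int padding;

    PaddedIntegerBridgeSketch(int padding) {
        this.padding = padding;
    }

    public String objectToString(Object object) {
        String raw = ((Integer) object).toString();
        if (raw.length() > padding) {
            throw new IllegalArgumentException("Try to pad on a number too big");
        }
        StringBuilder padded = new StringBuilder();
        for (int i = raw.length(); i < padding; i++) {
            padded.append('0');
        }
        return padded.append(raw).toString();
    }

    public Object stringToObject(String stringValue) {
        return Integer.valueOf(stringValue);
    }
}

class RoundTripCheck {
    // True when object equals stringToObject( objectToString( object ) ).
    static boolean roundTrips(TwoWayStringBridge bridge, Object value) {
        try {
            return value.equals(bridge.stringToObject(bridge.objectToString(value)));
        } catch (RuntimeException e) {
            return false; // a failed conversion is treated as a broken round trip
        }
    }

    public static void main(String[] args) {
        TwoWayStringBridge bridge = new PaddedIntegerBridgeSketch(10);
        System.out.println(roundTrips(bridge, 42));
        // Zeros end up in front of the minus sign, which cannot be parsed back.
        System.out.println(roundTrips(bridge, -5));
    }
}
```

With this padding scheme the check holds for non-negative integers but not for negative ones, so a bridge like this is only suitable for ids that are never negative.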
+ </section>
+
+ <section>
+ <title>FieldBridge</title>
+
+ <para>Some use cases require more than a simple object to string
+ translation when mapping a property to a Lucene index. To give you the
+ greatest possible flexibility you can also implement a bridge as a
+ <classname>FieldBridge</classname>. This interface gives you a
+ property value and lets you map it the way you want in your Lucene
+ <classname>Document</classname>. The interface is very similar in its
+ concept to the Hibernate <classname>UserType</classname>s.</para>
+
+ <para>You can for example store a given property in two different
+ document fields:</para>
+
+ <example>
+ <title>Implementing the FieldBridge interface in order to store a given
+ property in multiple document fields</title>
+
+ <programlisting>/**
+ * Store the date in 3 different fields - year, month, day - to ease Range Query per
+ * year, month or day (eg get all the elements of December for the last 5 years).
+ *
+ * @author Emmanuel Bernard
+ */
+public class DateSplitBridge implements FieldBridge {
+ private final static TimeZone GMT = TimeZone.getTimeZone("GMT");
+
+ <emphasis role="bold">public void set(String name, Object value, Document document,
+ LuceneOptions luceneOptions)</emphasis> {
+ Date date = (Date) value;
+ Calendar cal = GregorianCalendar.getInstance(GMT);
+ cal.setTime(date);
+ int year = cal.get(Calendar.YEAR);
+ int month = cal.get(Calendar.MONTH) + 1;
+ int day = cal.get(Calendar.DAY_OF_MONTH);
+
+ // set year
+ Field field = new Field(name + ".year", String.valueOf(year),
+ luceneOptions.getStore(), luceneOptions.getIndex(),
+ luceneOptions.getTermVector());
+ field.setBoost(luceneOptions.getBoost());
+ document.add(field);
+
+ // set month and pad it if needed
+ field = new Field(name + ".month", ( month < 10 ? "0" : "" )
+ + String.valueOf(month), luceneOptions.getStore(),
+ luceneOptions.getIndex(), luceneOptions.getTermVector());
+ field.setBoost(luceneOptions.getBoost());
+ document.add(field);
+
+ // set day and pad it if needed
+ field = new Field(name + ".day", ( day < 10 ? "0" : "" )
+ + String.valueOf(day), luceneOptions.getStore(),
+ luceneOptions.getIndex(), luceneOptions.getTermVector());
+ field.setBoost(luceneOptions.getBoost());
+ document.add(field);
+ }
+}
+
+//property
+<emphasis role="bold">@FieldBridge(impl = DateSplitBridge.class)</emphasis>
+private Date date; </programlisting>
+ </example>
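The date-splitting idea can be tried without Lucene by collecting the field names and values in a map instead of adding them to a Document. The sketch below is only an illustration under that assumption; DateSplitSketch, twoDigits and split are hypothetical names.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TimeZone;

class DateSplitSketch {
    private static final TimeZone GMT = TimeZone.getTimeZone("GMT");

    // Pads a value below 10 with a leading zero. The parentheses matter:
    // + binds tighter than ?:, so without them the number would be
    // dropped whenever the "0" branch is taken.
    static String twoDigits(int value) {
        return (value < 10 ? "0" : "") + value;
    }

    // Returns the year, month and day parts keyed by the derived field names.
    static Map<String, String> split(String name, Date date) {
        Calendar cal = new GregorianCalendar(GMT);
        cal.setTime(date);
        Map<String, String> fields = new LinkedHashMap<String, String>();
        fields.put(name + ".year", String.valueOf(cal.get(Calendar.YEAR)));
        fields.put(name + ".month", twoDigits(cal.get(Calendar.MONTH) + 1));
        fields.put(name + ".day", twoDigits(cal.get(Calendar.DAY_OF_MONTH)));
        return fields;
    }

    public static void main(String[] args) {
        Calendar cal = new GregorianCalendar(GMT);
        cal.clear();
        cal.set(2008, Calendar.MARCH, 5);
        System.out.println(split("date", cal.getTime()));
    }
}
```

Keeping month and day at a fixed width of two characters is what makes range queries on these fields behave like numeric ranges.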
+ </section>
+
+ <section>
+ <title>ClassBridge</title>
+
+ <para>It is sometimes useful to combine more than one property of a
+ given entity and index this combination in a specific way into the
+ Lucene index. The <classname>@ClassBridge</classname> and
+ <classname>@ClassBridges</classname> annotations can be defined at the
+ class level (as opposed to the property level). In this case the
+ custom field bridge implementation receives the entity instance as the
+ value parameter instead of a particular property. Though not shown in
+ this example, <classname>@ClassBridge</classname> supports the
+ <methodname>termVector</methodname> attribute discussed in section
+ <xref linkend="basic-mapping" />.</para>
+
+ <example>
+ <title>Implementing a class bridge</title>
+
+ <programlisting>@Entity
+@Indexed
+<emphasis role="bold">@ClassBridge</emphasis>(name="branchnetwork",
+ index=Index.TOKENIZED,
+ store=Store.YES,
+ impl = <emphasis role="bold">CatFieldsClassBridge.class</emphasis>,
+ params = @Parameter( name="sepChar", value=" " ) )
+public class Department {
+ private int id;
+ private String network;
+ private String branchHead;
+ private String branch;
+ private Integer maxEmployees;
+ ...
+}
+
+
+public class CatFieldsClassBridge implements FieldBridge, ParameterizedBridge {
+ private String sepChar;
+
+ public void setParameterValues(Map parameters) {
+ this.sepChar = (String) parameters.get( "sepChar" );
+ }
+
+ <emphasis role="bold">public void set(String name, Object value, Document document, LuceneOptions luceneOptions)</emphasis> {
+ // In this particular class the name of the new field was passed
+ // from the name field of the ClassBridge Annotation. This is not
+ // a requirement. It just works that way in this instance. The
+ // actual name could be supplied by hard coding it below.
+ Department dep = (Department) value;
+ String fieldValue1 = dep.getBranch();
+ if ( fieldValue1 == null ) {
+ fieldValue1 = "";
+ }
+ String fieldValue2 = dep.getNetwork();
+ if ( fieldValue2 == null ) {
+ fieldValue2 = "";
+ }
+ String fieldValue = fieldValue1 + sepChar + fieldValue2;
+ Field field = new Field( name, fieldValue, luceneOptions.getStore(), luceneOptions.getIndex(), luceneOptions.getTermVector() );
+ field.setBoost( luceneOptions.getBoost() );
+ document.add( field );
+ }
+}</programlisting>
+ </example>
+
+ <para>In this example, the particular
+ <classname>CatFieldsClassBridge</classname> is applied to the
+ <literal>department</literal> instance; the field bridge then
+ concatenates both branch and network and indexes the
+ concatenation.</para>
+ </section>
+ </section>
+ </section>
+
+ <section id="provided-id">
+ <title>Providing your own id</title>
+
+ <warning>
+ <para>This part of the documentation is a work in progress.</para>
+ </warning>
+
+ <para>You can provide your own id for Hibernate Search if you are
+ extending the internals. You will have to generate a unique value so it
+ can be given to Lucene to be indexed. This will have to be given to
+ Hibernate Search when you create an org.hibernate.search.Work object - the
+ document id is required in the constructor.</para>
+
+ <section id="ProvidedId">
+ <title>The ProvidedId annotation</title>
+
+ <para>Unlike conventional Hibernate Search API and @DocumentId, this
+ annotation is used on the class and not a field. You can also provide
+ your own bridge implementation through the <literal>bridge</literal>
+ attribute of @ProvidedId. Also, if you annotate a
+ class with @ProvidedId, your subclasses will also get the annotation -
+ but it is not done by using java.lang.annotation.Inherited. Be
+ sure, however, to <emphasis>not</emphasis> use this annotation with
+ @DocumentId as your system will break.</para>
+
+ <example>
+ <title>Providing your own id</title>
+
+ <programlisting>@ProvidedId(bridge = @FieldBridge(impl = MyCustomBridge.class))
+@Indexed
+public class MyClass{
+ @Field
+ String MyString;
+ ...
+}</programlisting>
+ </example>
+ </section>
+ </section>
+</chapter>
Hibernate SVN: r15662 - search/trunk.
by hibernate-commits@lists.jboss.org
Author: hardy.ferentschik
Date: 2008-12-04 05:42:07 -0500 (Thu, 04 Dec 2008)
New Revision: 15662
Modified:
search/trunk/changelog.txt
Log:
updated changelog
Modified: search/trunk/changelog.txt
===================================================================
--- search/trunk/changelog.txt 2008-12-04 10:25:17 UTC (rev 15661)
+++ search/trunk/changelog.txt 2008-12-04 10:42:07 UTC (rev 15662)
@@ -4,6 +4,32 @@
3.1.0.GA (4-12-2008)
------------------------
+** Bug
+ * [HSEARCH-233] - EntityNotFoundException during indexing
+ * [HSEARCH-280] - Make FSSlaveAndMasterDPTest pass against postgresql
+ * [HSEARCH-297] - Allow PatternTokenizerFactory to be used
+ * [HSEARCH-309] - PurgeAllLuceneWork duplicates in work queue
+
+** Improvement
+ * [HSEARCH-221] - Get Lucene Analyzer runtime (indexing)
+ * [HSEARCH-265] - Raise warnings when an abstract class is marked @Indexed
+ * [HSEARCH-285] - Refactor DocumentBuilder to support containedIn only and regular Indexed entities
+ * [HSEARCH-298] - Warn for dangerous IndexWriter settings
+ * [HSEARCH-299] - Use of faster Bit operations when possible to chain Filters
+ * [HSEARCH-302] - Utilize pagination settings when retrieving TopDocs from the Lucene query to only retrieve required TopDocs
+ * [HSEARCH-308] - getResultSize() implementation should not load documents
+ * [HSEARCH-311] - Add a close() method to BackendQueueProcessorFactory
+ * [HSEARCH-312] - Rename hibernate.search.filter.cache_bit_results.size to hibernate.search.filter.cache_docidresults.size
+
+** New Feature
+ * [HSEARCH-160] - Truly polymorphic queries
+ * [HSEARCH-268] - Apply changes to different indexes in parallel
+ * [HSEARCH-296] - Expose managed entity class via a Projection constant
+
+** Task
+ * [HSEARCH-303] - Review reference documentation
+
+
3.1.0.CR1 (17-10-2008)
------------------------
Hibernate SVN: r15661 - search/trunk/doc/reference/en/modules.
by hibernate-commits@lists.jboss.org
Author: hardy.ferentschik
Date: 2008-12-04 05:25:17 -0500 (Thu, 04 Dec 2008)
New Revision: 15661
Modified:
search/trunk/doc/reference/en/modules/mapping.xml
Log:
HSEARCH-303
Modified: search/trunk/doc/reference/en/modules/mapping.xml
===================================================================
--- search/trunk/doc/reference/en/modules/mapping.xml 2008-12-04 10:17:39 UTC (rev 15660)
+++ search/trunk/doc/reference/en/modules/mapping.xml 2008-12-04 10:25:17 UTC (rev 15661)
@@ -427,7 +427,7 @@
<literal>ownedBy</literal> property.</para>
<note>
- <para>The prefix cannot be set to the empty string. </para>
+ <para>The prefix cannot be set to the empty string.</para>
</note>
<para>The<literal> depth</literal> property is necessary when the object
@@ -439,7 +439,7 @@
an example of cyclic dependency. In our example, because
<literal>depth</literal> is set to 1, any
<literal>@IndexedEmbedded</literal> attribute in Owner (if any) will be
- ignored. </para>
+ ignored.</para>
<para>Using <literal>@IndexedEmbedded</literal> for object associations
allows you to express queries such as:</para>
@@ -866,7 +866,7 @@
<classname>BlogEntry</classname> class for example the analyzer could
depend on the language property of the entry. Depending on this
property the correct language specific stemmer should be chosen to
- index the actual text. </para>
+ index the actual text.</para>
<para>To enable this dynamic analyzer selection Hibernate Search
introduces the <classname>AnalyzerDiscriminator</classname>
@@ -1424,7 +1424,7 @@
document id is required in the constructor.</para>
<section id="ProvidedId">
- <title>The @ProvidedId annotation</title>
+ <title>The ProvidedId annotation</title>
<para>Unlike conventional Hibernate Search API and @DocumentId, this
annotation is used on the class and not a field. You also can provide
Hibernate SVN: r15660 - search/trunk/doc/reference/en.
by hibernate-commits@lists.jboss.org
Author: hardy.ferentschik
Date: 2008-12-04 05:17:39 -0500 (Thu, 04 Dec 2008)
New Revision: 15660
Modified:
search/trunk/doc/reference/en/master.xml
Log:
HSEARCH-303 - preface changes
Modified: search/trunk/doc/reference/en/master.xml
===================================================================
--- search/trunk/doc/reference/en/master.xml 2008-12-04 09:55:05 UTC (rev 15659)
+++ search/trunk/doc/reference/en/master.xml 2008-12-04 10:17:39 UTC (rev 15660)
@@ -53,17 +53,17 @@
<para>Full text search engines like Apache Lucene are very powerful
technologies to add efficient free text search capabilities to
- applications. However, they suffer several mismatches when dealing with
- object domain models. Amongst other things indexes have to be kept up to
+ applications. However, Lucene suffers from several mismatches when dealing with
+ an object domain model. Amongst other things indexes have to be kept up to
date and mismatches between index structure and domain model as well as
query mismatches have to be avoided.</para>
- <para>Hibernate Search indexes your domain model with the help of a few
- annotations, takes care of database/index synchronization and brings back
- regular managed objects from free text queries. To achieve this Hibernate
- Search is combining the power of <ulink
- url="http://www.hibernate.org">Hibernate</ulink> and <ulink
- url="http://lucene.apache.org">Apache Lucene</ulink>.</para>
+ <para>Hibernate Search addresses these shortcomings - it indexes your
+ domain model with the help of a few annotations, takes care of
+ database/index synchronization and brings back regular managed objects
+ from free text queries. To achieve this Hibernate Search is combining the
+ power of <ulink url="http://www.hibernate.org">Hibernate</ulink> and
+ <ulink url="http://lucene.apache.org">Apache Lucene</ulink>.</para>
</preface>
<xi:include href="modules/getting-started.xml"
Hibernate SVN: r15659 - search/trunk/src/test/org/hibernate/search/test/directoryProvider.
by hibernate-commits@lists.jboss.org
Author: jcosta(a)redhat.com
Date: 2008-12-04 04:55:05 -0500 (Thu, 04 Dec 2008)
New Revision: 15659
Modified:
search/trunk/src/test/org/hibernate/search/test/directoryProvider/SnowStorm.java
Log:
HSEARCH-319 - Added a @Column to the property date, to use a non-keyword name
Modified: search/trunk/src/test/org/hibernate/search/test/directoryProvider/SnowStorm.java
===================================================================
--- search/trunk/src/test/org/hibernate/search/test/directoryProvider/SnowStorm.java 2008-12-04 09:54:44 UTC (rev 15658)
+++ search/trunk/src/test/org/hibernate/search/test/directoryProvider/SnowStorm.java 2008-12-04 09:55:05 UTC (rev 15659)
@@ -2,15 +2,17 @@
package org.hibernate.search.test.directoryProvider;
import java.util.Date;
+
+import javax.persistence.Column;
+import javax.persistence.Entity;
+import javax.persistence.GeneratedValue;
import javax.persistence.Id;
-import javax.persistence.GeneratedValue;
-import javax.persistence.Entity;
-import org.hibernate.search.annotations.Indexed;
+import org.hibernate.search.annotations.DateBridge;
import org.hibernate.search.annotations.DocumentId;
import org.hibernate.search.annotations.Field;
import org.hibernate.search.annotations.Index;
-import org.hibernate.search.annotations.DateBridge;
+import org.hibernate.search.annotations.Indexed;
import org.hibernate.search.annotations.Resolution;
/**
@@ -26,6 +28,7 @@
@Field(index = Index.UN_TOKENIZED)
@DateBridge( resolution = Resolution.DAY )
+ @Column(name="xdate")
private Date date;
@Field(index = Index.TOKENIZED)
Hibernate SVN: r15658 - search/branches/Branch_3_0_1_GA_CP/src/test/org/hibernate/search/test/directoryProvider.
by hibernate-commits@lists.jboss.org
Author: jcosta(a)redhat.com
Date: 2008-12-04 04:54:44 -0500 (Thu, 04 Dec 2008)
New Revision: 15658
Modified:
search/branches/Branch_3_0_1_GA_CP/src/test/org/hibernate/search/test/directoryProvider/SnowStorm.java
Log:
HSEARCH-319 - Added a @Column to the property date, to use a non-keyword name
Modified: search/branches/Branch_3_0_1_GA_CP/src/test/org/hibernate/search/test/directoryProvider/SnowStorm.java
===================================================================
--- search/branches/Branch_3_0_1_GA_CP/src/test/org/hibernate/search/test/directoryProvider/SnowStorm.java 2008-12-03 20:10:03 UTC (rev 15657)
+++ search/branches/Branch_3_0_1_GA_CP/src/test/org/hibernate/search/test/directoryProvider/SnowStorm.java 2008-12-04 09:54:44 UTC (rev 15658)
@@ -2,15 +2,17 @@
package org.hibernate.search.test.directoryProvider;
import java.util.Date;
+
+import javax.persistence.Column;
+import javax.persistence.Entity;
+import javax.persistence.GeneratedValue;
import javax.persistence.Id;
-import javax.persistence.GeneratedValue;
-import javax.persistence.Entity;
-import org.hibernate.search.annotations.Indexed;
+import org.hibernate.search.annotations.DateBridge;
import org.hibernate.search.annotations.DocumentId;
import org.hibernate.search.annotations.Field;
import org.hibernate.search.annotations.Index;
-import org.hibernate.search.annotations.DateBridge;
+import org.hibernate.search.annotations.Indexed;
import org.hibernate.search.annotations.Resolution;
/**
@@ -26,6 +28,7 @@
@Field(index = Index.UN_TOKENIZED)
@DateBridge( resolution = Resolution.DAY )
+ @Column(name="xdate")
private Date date;
@Field(index = Index.TOKENIZED)
Hibernate SVN: r15657 - search/trunk/doc/reference/en/modules.
by hibernate-commits@lists.jboss.org
Author: hardy.ferentschik
Date: 2008-12-03 15:10:03 -0500 (Wed, 03 Dec 2008)
New Revision: 15657
Modified:
search/trunk/doc/reference/en/modules/mapping.xml
Log:
doc update
Modified: search/trunk/doc/reference/en/modules/mapping.xml
===================================================================
--- search/trunk/doc/reference/en/modules/mapping.xml 2008-12-03 16:41:15 UTC (rev 15656)
+++ search/trunk/doc/reference/en/modules/mapping.xml 2008-12-03 20:10:03 UTC (rev 15657)
@@ -48,11 +48,16 @@
entities not annotated with <literal>@Indexed</literal> will be ignored
by the indexing process):</para>
- <programlisting>@Entity
+ <example>
+ <title>Making a class indexable using the
+ <classname>@Indexed</classname> annotation</title>
+
+ <programlisting>@Entity
<emphasis role="bold">@Indexed(index="indexes/essays")</emphasis>
public class Essay {
...
}</programlisting>
+ </example>
<para>The <literal>index</literal> attribute tells Hibernate what the
Lucene directory name is (usually a directory on your file system). It
@@ -189,7 +194,11 @@
can omit @DocumentId. The chosen entity id will also be used as document
id.</para>
- <programlisting>@Entity
+ <example>
+ <title>Adding <classname>@DocumentId</classname> and
+ <classname>@Field</classname> annotations to an indexed entity</title>
+
+ <programlisting>@Entity
@Indexed(index="indexes/essays")
public class Essay {
...
@@ -205,6 +214,7 @@
<emphasis role="bold">@Field(index=Index.TOKENIZED)</emphasis>
public String getText() { return text; }
}</programlisting>
+ </example>
<para>The above annotations define an index with three fields:
<literal>id</literal> , <literal>Abstract</literal> and
@@ -222,7 +232,10 @@
index it twice - once tokenized and once untokenized. @Fields allows to
achieve this goal.</para>
- <programlisting>@Entity
+ <example>
+ <title>Using @Fields to map a property multiple times</title>
+
+ <programlisting>@Entity
@Indexed(index = "Book" )
public class Book {
<emphasis role="bold">@Fields( {</emphasis>
@@ -235,6 +248,7 @@
...
}</programlisting>
+ </example>
<para>The field <literal>summary</literal> is indexed twice, once as
<literal>summary</literal> in a tokenized way, and once as
@@ -267,7 +281,10 @@
(In the Lucene query parser language, it would translate into
<code>address.city:Atlanta</code>).</para>
- <programlisting>@Entity
+ <example>
+ <title>Using @IndexedEmbedded to index associations</title>
+
+ <programlisting>@Entity
@Indexed
public class Place {
@Id
@@ -301,6 +318,7 @@
private Set<Place> places;
...
}</programlisting>
+ </example>
<para>In this example, the place fields will be indexed in the
<literal>Place</literal> index. The <literal>Place</literal> index
@@ -325,7 +343,11 @@
<para>Let's make our example a bit more complex:</para>
- <programlisting>@Entity
+ <example>
+ <title>Nested usage of <classname>@IndexedEmbedded</classname> and
+ <classname>@ContainedIn</classname></title>
+
+ <programlisting>@Entity
@Indexed
public class Place {
@Id
@@ -369,6 +391,7 @@
private String name;
...
}</programlisting>
+ </example>
<para>Any <literal>@*ToMany, @*ToOne</literal> and
<literal>@Embedded</literal> attribute can be annotated with
@@ -464,7 +487,11 @@
can override the object type targeted by Hibernate Search using the
<methodname>targetElement</methodname> parameter.</para>
- <programlisting>@Entity
+ <example>
+ <title>Using the <literal>targetElement</literal> property of
+ <classname>@IndexedEmbedded</classname></title>
+
+ <programlisting>@Entity
@Indexed
public class Address {
@Id
@@ -485,6 +512,7 @@
@Embeddable
public class Owner implements Person { ... }</programlisting>
+ </example>
</section>
<section>
@@ -495,7 +523,11 @@
during the indexation process. You can use <literal>@Boost</literal> at
the @Field, method or class level.</para>
- <programlisting>@Entity
+ <example>
+ <title>Using different ways of increasing the weight of an indexed
+ element using a boost factor</title>
+
+ <programlisting>@Entity
@Indexed(index="indexes/essays")
<emphasis role="bold">@Boost(1.7f)</emphasis>
public class Essay {
@@ -506,7 +538,7 @@
public Long getId() { return id; }
@Field(name="Abstract", index=Index.TOKENIZED, store=Store.YES, boost=<emphasis
- role="bold">@Boost(2f)</emphasis>)
+ role="bold">@Boost(2f)</emphasis>)
<emphasis role="bold">@Boost(1.5f)</emphasis>
public String getSummary() { return summary; }
@@ -518,6 +550,7 @@
public String getISBN() { return isbn; }
} </programlisting>
+ </example>
<para>In our example, <classname>Essay</classname>'s probability to
reach the top of the search list will be multiplied by 1.7. The
@@ -545,7 +578,10 @@
even per @Field (useful when multiple fields are indexed from a single
property).</para>
- <programlisting>@Entity
+ <example>
+ <title>Different ways of specifying an analyzer</title>
+
+ <programlisting>@Entity
@Indexed
<emphasis role="bold">@Analyzer(impl = EntityAnalyzer.class)</emphasis>
public class MyEntity {
@@ -566,6 +602,7 @@
...
}</programlisting>
+ </example>
<para>In this example, <classname>EntityAnalyzer</classname> is used to
index all tokenized properties (eg. <literal>name</literal>), except
@@ -627,7 +664,11 @@
distribution of Hibernate Search provides these dependencies in its
<filename>lib</filename> directory.</para>
- <programlisting>@AnalyzerDef(name="customanalyzer",
+ <example>
+ <title><classname>@AnalyzerDef</classname> and the Solr
+ framework</title>
+
+ <programlisting>@AnalyzerDef(name="customanalyzer",
tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
filters = {
@TokenFilterDef(factory = ISOLatin1AccentFilterFactory.class),
@@ -640,6 +681,7 @@
public class Team {
...
}</programlisting>
+ </example>
<para>A tokenizer is defined by its factory which is responsible for
building the tokenizer and using the optional list of parameters. This
@@ -660,7 +702,10 @@
<classname>@Analyzer</classname> declaration using the definition name
rather than declaring an implementation class.</para>
- <programlisting>@Entity
+ <example>
+ <title>Referencing an analyzer by name</title>
+
+ <programlisting>@Entity
@Indexed
@AnalyzerDef(name="customanalyzer", ... )
public class Team {
@@ -678,6 +723,7 @@
@Field <emphasis role="bold">@Analyzer(definition = "customanalyzer")</emphasis>
private String description;
}</programlisting>
+ </example>
<para>Analyzer instances declared by
<classname>@AnalyzerDef</classname> are available by their name in the
@@ -1093,7 +1139,11 @@
interface. All implementations have to be thread-safe as they are used
concurrently.</para>
- <programlisting>/**
+ <example>
+ <title>Implementing your own
+ <classname>StringBridge</classname></title>
+
+ <programlisting>/**
* Padding Integer bridge.
* All numbers will be padded with 0 to match 5 digits
*
@@ -1114,6 +1164,7 @@
return paddedInteger.append( rawInteger ).toString();
}
} </programlisting>
+ </example>
<para>Then any property or field can use this bridge thanks to the
<literal>@FieldBridge</literal> annotation</para>
@@ -1127,9 +1178,12 @@
parameters are passed through the <literal>@FieldBridge</literal>
annotation.</para>
- <programlisting>public class PaddedIntegerBridge implements StringBridge, <emphasis
- role="bold">ParameterizedBridge</emphasis> {
+ <example>
+ <title>Passing parameters to your bridge implementation</title>
+ <programlisting>public class PaddedIntegerBridge implements StringBridge, <emphasis
+ role="bold">ParameterizedBridge</emphasis> {
+
public static String PADDING_PROPERTY = "padding";
private int padding = 5; //default
@@ -1156,6 +1210,7 @@
<emphasis role="bold">params = @Parameter(name="padding", value="10")</emphasis>
)
private Integer length; </programlisting>
+ </example>
<para>The <classname>ParameterizedBridge</classname> interface can be
implemented by <classname>StringBridge</classname> ,
@@ -1174,8 +1229,12 @@
the object out of it. There is no difference in the way the
<literal>@FieldBridge</literal> annotation is used.</para>
- <programlisting>public class PaddedIntegerBridge implements TwoWayStringBridge, ParameterizedBridge {
+ <example>
+ <title>Implementing a TwoWayStringBridge which can for example be
+ used for id properties</title>
+ <programlisting>public class PaddedIntegerBridge implements TwoWayStringBridge, ParameterizedBridge {
+
public static String PADDING_PROPERTY = "padding";
private int padding = 5; //default
@@ -1207,6 +1266,7 @@
params = @Parameter(name="padding", value="10")
private Integer id;
</programlisting>
+ </example>
<para>It is critically important for the two-way process to be
idempotent (ie object = stringToObject( objectToString( object ) )
@@ -1227,7 +1287,11 @@
<para>You can for example store a given property in two different
document fields:</para>
- <programlisting>/**
+ <example>
+ <title>Implementing the FieldBridge interface in order to store a given
+ property into multiple document fields</title>
+
+ <programlisting>/**
* Store the date in 3 different fields - year, month, day - to ease Range Query per
* year, month or day (eg get all the elements of December for the last 5 years).
*
@@ -1271,6 +1335,7 @@
//property
<emphasis role="bold">@FieldBridge(impl = DateSplitBridge.class)</emphasis>
private Date date; </programlisting>
+ </example>
</section>
<section>
@@ -1287,7 +1352,10 @@
<methodname>termVector</methodname> attribute discussed in section
<xref linkend="basic-mapping" />.</para>
- <programlisting>@Entity
+ <example>
+ <title>Implementing a class bridge</title>
+
+ <programlisting>@Entity
@Indexed
<emphasis role="bold">@ClassBridge</emphasis>(name="branchnetwork",
index=Index.TOKENIZED,
@@ -1331,6 +1399,7 @@
document.add( field );
}
}</programlisting>
+ </example>
<para>In this example, the particular
<classname>CatFieldsClassBridge</classname> is applied to the
@@ -1366,13 +1435,17 @@
sure however, to <emphasis>not</emphasis> use this annotation with
@DocumentId as your system will break.</para>
- <programlisting>@ProvidedId (bridge = org.my.own.package.MyCustomBridge)
+ <example>
+ <title>Providing your own id</title>
+
+ <programlisting>@ProvidedId (bridge = org.my.own.package.MyCustomBridge)
@Indexed
public class MyClass{
@Field
String MyString;
...
}</programlisting>
+ </example>
</section>
</section>
</chapter>
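The bridge examples touched by the mapping.xml diff above rely on two properties: padded numbers have a fixed width, and a two-way bridge must give back the original object. The following is a minimal plain-Java sketch of both. The class name PaddedIntegerSketch is hypothetical, and it deliberately does not implement the real TwoWayStringBridge or ParameterizedBridge interfaces, so it compiles without Hibernate Search on the classpath.

```java
// Plain-Java sketch only: mirrors the idea of the PaddedIntegerBridge example,
// but is NOT the Hibernate Search interface implementation.
final class PaddedIntegerSketch {

    // Hypothetical stand-in for the "padding" parameter (total number of digits).
    private final int padding;

    PaddedIntegerSketch(int padding) {
        this.padding = padding;
    }

    /** Pads a non-negative integer with leading zeros up to the configured width. */
    String objectToString(Integer value) {
        String raw = value.toString();
        if (value < 0 || raw.length() > padding) {
            throw new IllegalArgumentException(
                "cannot pad " + raw + " to " + padding + " digits");
        }
        StringBuilder padded = new StringBuilder();
        for (int i = raw.length(); i < padding; i++) {
            padded.append('0');
        }
        return padded.append(raw).toString();
    }

    /** Reverses objectToString; leading zeros are ignored when parsing. */
    Integer stringToObject(String value) {
        return Integer.valueOf(value);
    }

    public static void main(String[] args) {
        PaddedIntegerSketch bridge = new PaddedIntegerSketch(5);
        String indexed = bridge.objectToString(42);
        System.out.println(indexed);
        // Round trip: object = stringToObject( objectToString( object ) )
        System.out.println(bridge.stringToObject(indexed));
    }
}
```

With a width of 5, the value 42 becomes the string 00042 and parses back to 42. Fixed-width strings also compare lexicographically in the same order as the numbers they encode, which is presumably why the documented bridge pads at all.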
Hibernate SVN: r15656 - search/trunk/doc/reference/en/modules.
by hibernate-commits@lists.jboss.org
Author: hardy.ferentschik
Date: 2008-12-03 11:41:15 -0500 (Wed, 03 Dec 2008)
New Revision: 15656
Modified:
search/trunk/doc/reference/en/modules/architecture.xml
Log:
HSEARCH-303
Modified: search/trunk/doc/reference/en/modules/architecture.xml
===================================================================
--- search/trunk/doc/reference/en/modules/architecture.xml 2008-12-03 16:24:24 UTC (rev 15655)
+++ search/trunk/doc/reference/en/modules/architecture.xml 2008-12-03 16:41:15 UTC (rev 15656)
@@ -47,30 +47,27 @@
configure directory providers to adjust the directory target (see <xref
linkend="search-configuration-directory" />).</para>
- <para>Hibernate Search can also use the Lucene index to search an entity
- and return a list of managed entities saving you the tedious object to
- Lucene document mapping. The same persistence context is shared between
- Hibernate and Hibernate Search; as a matter of fact, the Search Session is
- built on top of the Hibernate Session. The application code use the
- unified <classname>org.hibernate.Query</classname> or
+ <para>Hibernate Search uses the Lucene index to search an entity and
+ return a list of managed entities saving you the tedious object to Lucene
+ document mapping. The same persistence context is shared between Hibernate
+ and Hibernate Search. As a matter of fact, the
+ <classname>FullTextSession</classname> is built on top of the Hibernate
+ Session, so that the application code can use the unified
+ <classname>org.hibernate.Query</classname> or
<classname>javax.persistence.Query</classname> APIs exactly the way a HQL,
JPA-QL or native queries would do.</para>
<para>To be more efficient, Hibernate Search batches the write
interactions with the Lucene index. There are currently two types of
- batching depending on the expected scope.</para>
+ batching depending on the expected scope. Outside a transaction, the index
+ update operation is executed right after the actual database operation.
+ This scope is really a no scoping setup and no batching is performed.
+ However, it is recommended - for both your database and Hibernate Search -
+ to execute your operation in a transaction be it JDBC or JTA. When in a
+ transaction, the index update operation is scheduled for the transaction
+ commit phase and discarded in case of transaction rollback. The batching
+ scope is the transaction. There are two immediate benefits:</para>
- <para>Outside a transaction, the index update operation is executed right
- after the actual database operation. This scope is really a no scoping
- setup and no batching is performed.</para>
-
- <para>It is however recommended, for both your database and Hibernate
- Search, to execute your operation in a transaction be it JDBC or JTA. When
- in a transaction, the index update operation is scheduled for the
- transaction commit phase and discarded in case of transaction rollback.
- The batching scope is the transaction. There are two immediate
- benefits:</para>
-
<itemizedlist>
<listitem>
<para>Performance: Lucene indexing works better when operations are
@@ -80,20 +77,16 @@
<listitem>
<para>ACIDity: The work executed has the same scoping as the one
executed by the database transaction and is executed if and only if
- the transaction is committed.</para>
-
- <note>
- <para>Disclaimer, the work in not ACID in the strict sense of it,
- but ACID behavior is rarely useful for full text search indexes
- since they can be rebuilt from the source at any time.</para>
- </note>
+ the transaction is committed. This is not ACID in the strict sense of
+ it, but ACID behavior is rarely useful for full text search indexes
+ since they can be rebuilt from the source at any time.</para>
</listitem>
</itemizedlist>
<para>You can think of those two scopes (no scope vs transactional) as the
equivalent of the (infamous) autocommit vs transactional behavior. From a
performance perspective, the <emphasis>in transaction</emphasis> mode is
- recommended. The scoping choice is made transparently: Hibernate Search
+ recommended. The scoping choice is made transparently. Hibernate Search
detects the presence of a transaction and adjusts the scoping.</para>
<note>
@@ -154,12 +147,12 @@
<para>All index update operations applied on a given node are sent to
a JMS queue. A unique reader will then process the queue and update
- the master Lucene index. The master index is then replicated on a
- regular basis to the slave copies. This is known as the master /
- slaves pattern. The master is the sole responsible for updating the
- Lucene index. The slaves can accept read as well as write operations.
- However, they only process the read operation on their local index
- copy and delegate the update operations to the master.</para>
+ the master index. The master index is then replicated on a regular
+ basis to the slave copies. This is known as the master/slaves pattern.
+ The master is solely responsible for updating the Lucene index. The
+ slaves can accept read as well as write operations. However, they only
+ process the read operation on their local index copy and delegate the
+ update operations to the master.</para>
<mediaobject>
<imageobject role="html">
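The two scopes described in the architecture.xml diff above (no scope versus transactional) can be sketched in plain Java. This is an illustrative model only: the class IndexWorkQueueSketch and its methods are hypothetical and are not part of the Hibernate Search API. It assumes index work can be represented as simple strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of transaction-scoped batching; not Hibernate Search code.
final class IndexWorkQueueSketch {

    private final List<String> pending = new ArrayList<>(); // work queued in a transaction
    private final List<String> index = new ArrayList<>();   // work actually applied
    private boolean inTransaction;

    void begin() {
        inTransaction = true;
    }

    /** Outside a transaction the work is applied at once; inside, it is queued. */
    void update(String work) {
        if (inTransaction) {
            pending.add(work);
        } else {
            index.add(work);
        }
    }

    /** Commit applies the whole batch in one go. */
    void commit() {
        index.addAll(pending);
        pending.clear();
        inTransaction = false;
    }

    /** Rollback discards the queued work, so the index never sees it. */
    void rollback() {
        pending.clear();
        inTransaction = false;
    }

    List<String> applied() {
        return new ArrayList<>(index);
    }

    public static void main(String[] args) {
        IndexWorkQueueSketch queue = new IndexWorkQueueSketch();
        queue.update("add book 1");      // no transaction: applied immediately
        queue.begin();
        queue.update("add book 2");      // queued until commit
        queue.rollback();                // discarded
        queue.begin();
        queue.update("add book 3");
        queue.commit();                  // applied as one batch
        System.out.println(queue.applied());
    }
}
```

The sketch shows why the transactional mode gives the "executed if and only if the transaction is committed" behavior: rolled-back work is dropped before it reaches the index.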
Hibernate SVN: r15655 - search/trunk/doc/reference/en/modules.
by hibernate-commits@lists.jboss.org
Author: hardy.ferentschik
Date: 2008-12-03 11:24:24 -0500 (Wed, 03 Dec 2008)
New Revision: 15655
Modified:
search/trunk/doc/reference/en/modules/getting-started.xml
search/trunk/doc/reference/en/modules/mapping.xml
Log:
HSEARCH-303
Modified: search/trunk/doc/reference/en/modules/getting-started.xml
===================================================================
--- search/trunk/doc/reference/en/modules/getting-started.xml 2008-12-03 13:59:43 UTC (rev 15654)
+++ search/trunk/doc/reference/en/modules/getting-started.xml 2008-12-03 16:24:24 UTC (rev 15655)
@@ -54,18 +54,20 @@
<row>
<entry>Hibernate Search</entry>
- <entry><literal>hibernate-search.jar</literal> and all the
+ <entry><literal>hibernate-search.jar</literal> and all runtime
dependencies from the <literal>lib</literal> directory of the
- Hibernate Search distribution, especially lucene.</entry>
+ Hibernate Search distribution. Please refer to
+ <filename>README.txt</filename> in the lib directory to understand
+ which dependencies are required.</entry>
</row>
<row>
<entry>Hibernate Core</entry>
<entry>These instructions have been tested against Hibernate 3.3.x.
- Next to the main <literal>hibernate3.jar</literal> you will need
- all required libaries from the <literal>lib</literal> directory of
- the distribution. Refer to <literal>README.txt</literal> in the
+ You will need <literal>hibernate-core.jar</literal> and its
+ transitive dependencies from the <literal>lib</literal> directory
+ of the distribution. Refer to <literal>README.txt</literal> in the
<literal>lib</literal> directory of the distribution to determine
the minimum runtime requirements.</entry>
</row>
@@ -95,7 +97,7 @@
</section>
<section>
- <title>Maven</title>
+ <title>Using Maven</title>
<para>Instead of managing all dependencies manually, maven users have the
possibility to use the <ulink
@@ -104,7 +106,11 @@
section of your <filename>pom.xml</filename> or
<filename>settings.xml</filename>:</para>
- <programlisting>
+ <example>
+ <title>Adding the JBoss maven repository to
+ <filename>settings.xml</filename></title>
+
+ <programlisting>
<repository>
<id>repository.jboss.org</id>
<name>JBoss Maven Repository</name>
@@ -112,10 +118,14 @@
<layout>default</layout>
</repository>
</programlisting>
+ </example>
<para>Then add the following dependencies to your pom.xml:</para>
- <programlisting>
+ <example>
+ <title>Maven dependencies for Hibernate Search</title>
+
+ <programlisting>
<dependency>
<groupId>org.hibernate</groupId>
<artifactId>hibernate-search</artifactId>
@@ -147,6 +157,7 @@
<version>2.4.0</version>
</dependency>
</programlisting>
+ </example>
<para>Not all dependencies are required. Only the
<emphasis>hibernate-search</emphasis> dependency is mandatory. This
@@ -159,16 +170,17 @@
with the hibernate-search jar file, to configure your Lucene index.
Currently there is no XML configuration available for Hibernate Search.
<emphasis>hibernate-entitymanager</emphasis> is required if you want to
- use Hibernate Search in conjunction with JPA. Finally, the Solr
- dependencies are needed if you want to utilize Solr's analyzer framework.
- More about this later.</para>
+ use Hibernate Search in conjunction with JPA. The Solr dependencies are
+ needed if you want to utilize Solr's analyzer framework. More about this
+ later. And finally, the <literal>lucene-snowball</literal> dependency is
+ needed if you want to utilize Lucene's snowball stemmer.</para>
</section>
<section>
<title>Configuration</title>
<para>Once you have downloaded and added all required dependencies to your
- application you have to add a few properties to your hibernate
+ application you have to add a couple of properties to your hibernate
configuration file. If you are using Hibernate directly this can be done
in <literal>hibernate.properties</literal> or
<literal>hibernate.cfg.xml</literal>. If you are using Hibernate via JPA
@@ -177,14 +189,23 @@
default. An example <filename>persistence.xml</filename> configuration
could look like this:</para>
- <para><programlisting>
+ <example>
+ <title>Basic configuration options to be added to
+ <literal><filename>hibernate.properties</filename></literal>,
+ <literal><filename>hibernate.cfg.xml</filename></literal> or
+ <filename>persistence.xml</filename></title>
+
+ <programlisting>
...
<property name="hibernate.search.default.directory_provider"
value="org.hibernate.search.store.FSDirectoryProvider"/>
<property name="hibernate.search.default.indexBase" value="/var/lucene/indexes"/>
...
- </programlisting>First you have to tell Hibernate Search which
+ </programlisting>
+ </example>
+
+ <para>First you have to tell Hibernate Search which
<classname>DirectoryProvider</classname> to use. This can be achieved by
setting the <literal>hibernate.search.default.directory_provider</literal>
property. Apache Lucene has the notion of a <literal>Directory</literal>
@@ -207,7 +228,11 @@
capabilities to your application in order to search the books contained in
your database.</para>
- <programlisting>
+ <example>
+ <title>Example entities Book and Author before adding Hibernate Search
+ specific annotations</title>
+
+ <programlisting>
package example;
...
@Entity
@@ -234,7 +259,7 @@
}
</programlisting>
- <para><programlisting>
+ <programlisting>
package example;
...
@Entity
@@ -253,7 +278,8 @@
...
}
-</programlisting></para>
+</programlisting>
+ </example>
<para>To achieve this you have to add a few annotations to the
<classname>Book</classname> and <classname>Author</classname> class. The
@@ -262,7 +288,7 @@
to store an untokenized id in the index to ensure index unicity for a
given entity. <literal>@DocumentId</literal> marks the property to use for
this purpose and is in most cases the same as the database primary key. In
- fact since the latest release of Hibernate Search
+ fact since the 3.1.0 release of Hibernate Search
<literal>@DocumentId</literal> is optional in the case where an
<classname>@Id</classname> annotation exists.</para>
@@ -276,16 +302,20 @@
talk more about analyzers a little later on. The second parameter we
specify within <literal>@Field</literal>,<literal>
store=Store.NO</literal>, ensures that the actual data will not be stored
- in the index. This is the default setting and probably a good choice
- unless you want to avoid database roundtrips and retrieve the indexed data
- via projections (<xref linkend="projections" />). Without projections,
- Hibernate Search will per default execute the Lucene query in order to
- find the database identifiers of the entities matching the query critera
- and use these identifiers to retrieve managed objects from the database.
- The decision for or against projection has to be made on a case to case
- basis. The default behaviour is recommended since it returns managed
- objects whereas projections only returns object arrays. </para>
+ in the index. Whether this data is stored in the index or not has nothing
+ to do with the ability to search for it. From Lucene's perspective it is
+ not necessary to keep the data once the index is created. The benefit of
+ storing it is the ability to retrieve it via projections (<xref
+ linkend="projections" />). </para>
+ <para>Without projections, Hibernate Search will per default execute a
+ Lucene query in order to find the database identifiers of the entities
+ matching the query criteria and use these identifiers to retrieve managed
+ objects from the database. The decision for or against projection has to
+ be made on a case to case basis. The default behaviour -
+ <literal>Store.NO</literal> - is recommended since it returns managed
+ objects whereas projections only return object arrays.</para>
+
<para>After this short look under the hood let's go back to annotating the
<classname>Book</classname> class. Another annotation we have not yet
discussed is <literal>@DateBridge</literal>. This annotation is one of the
@@ -302,7 +332,7 @@
(<literal>@ManyToMany</literal>, <literal>@*ToOne</literal> and
<literal>@Embedded</literal>) as part of the owning entity. This is needed
since a Lucene index document is a flat data structure which does not know
- anything about object relations. To ensure that the author's name wil be
+ anything about object relations. To ensure that the authors' name will be
searchable you have to make sure that the names are indexed as part of the
book itself. On top of <literal>@IndexedEmbedded</literal> you will also
have to mark all fields of the associated entity you want to have included
@@ -312,7 +342,11 @@
<para>These settings should be sufficient for now. For more details on
entity mapping refer to <xref linkend="search-mapping-entity" />.</para>
- <programlisting>
+ <example>
+ <title>Example entities after adding Hibernate Search
+ annotations</title>
+
+ <programlisting>
package example;
...
@Entity
@@ -346,7 +380,7 @@
}
</programlisting>
- <programlisting>
+ <programlisting>
package example;
...
@Entity
@@ -366,6 +400,7 @@
...
}
</programlisting>
+ </example>
</section>
<section>
@@ -379,28 +414,41 @@
achieve this by using one of the following code snippets (see also <xref
linkend="search-batchindex" />):</para>
- <para>Example using Hibernate Session:</para>
+ <example>
+ <title>Using Hibernate Session to index data</title>
- <programlisting>
+ <programlisting>
FullTextSession fullTextSession = Search.getFullTextSession(session);
Transaction tx = fullTextSession.beginTransaction();
+
List<Book> books = session.createQuery("from Book as book").list();
for (Book book : books) {
- fullTextSession.index(book);
+ <emphasis role="bold">fullTextSession.index(book);</emphasis>
}
+
tx.commit(); //index is written at commit time
</programlisting>
+ </example>
- <para>Example using JPA:</para>
+ <example>
+ <title>Using JPA to index data</title>
- <programlisting>
+ <programlisting>
EntityManager em = entityManagerFactory.createEntityManager();
FullTextEntityManager fullTextEntityManager = Search.getFullTextEntityManager(em);
+em.getTransaction().begin();
+
List<Book> books = em.createQuery("select book from Book as book").getResultList();
for (Book book : books) {
- fullTextEntityManager.index(book);
+ <emphasis role="bold">fullTextEntityManager.index(book);</emphasis>
}
+
+em.getTransaction().commit();
+em.close();
+
+
</programlisting>
+ </example>
<para>After executing the above code, you should be able to see a Lucene
index under <literal>/var/lucene/indexes/example.Book</literal>. Go ahead
@@ -412,40 +460,62 @@
<section>
<title>Searching</title>
- <para>Now it is time to execute a first search. The following code will
- prepare a query against the indexed fields, execute it and return a list
- of <classname>Book</classname>s:</para>
+ <para>Now it is time to execute a first search. The general approach is to
+ create a native Lucene query and then wrap this query into a
+ org.hibernate.Query in order to get all the functionality one is used to
+ from the Hibernate API. The following code will prepare a query against
+ the indexed fields, execute it and return a list of
+ <classname>Book</classname>s. </para>
- <para>Example using Hibernate Session:</para>
+ <example>
+ <title>Using Hibernate Session to create and execute a search</title>
- <programlisting>
+ <programlisting>
FullTextSession fullTextSession = Search.getFullTextSession(session);
-
Transaction tx = fullTextSession.beginTransaction();
+// create native Lucene query
String[] fields = new String[]{"title", "subtitle", "authors.name", "publicationDate"};
MultiFieldQueryParser parser = new MultiFieldQueryParser(fields, new StandardAnalyzer());
Query query = parser.parse( "Java rocks!" );
+
+// wrap Lucene query in a org.hibernate.Query
org.hibernate.Query hibQuery = fullTextSession.createFullTextQuery(query, Book.class);
+
+// execute search
List result = hibQuery.list();
tx.commit();
session.close();
</programlisting>
+ </example>
- <para>Example using JPA:</para>
+ <example>
+ <title>Using JPA to create and execute a search</title>
- <programlisting>
+ <programlisting>
EntityManager em = entityManagerFactory.createEntityManager();
-
FullTextEntityManager fullTextEntityManager =
org.hibernate.search.jpa.Search.getFullTextEntityManager(em);
+em.getTransaction().begin();
+
+// create native Lucene query
String[] fields = new String[]{"title", "subtitle", "authors.name", "publicationDate"};
MultiFieldQueryParser parser = new MultiFieldQueryParser(fields, new StandardAnalyzer());
Query query = parser.parse( "Java rocks!" );
+
+// wrap Lucene query in a javax.persistence.Query
javax.persistence.Query persistenceQuery = fullTextEntityManager.createFullTextQuery(query, Book.class);
+
+// execute search
List result = persistenceQuery.getResultList();
+
+em.getTransaction().commit();
+em.close();
+
+
</programlisting>
+ </example>
</section>
<section>
@@ -456,9 +526,9 @@
Design of Existing Code" and you want to get hits for all of the following
queries: "refactor", "refactors", "refactored" and "refactoring". In
Lucene this can be achieved by choosing an analyzer class which applies
- word stemming during the indexing <emphasis role="bold">and</emphasis>
- search process. Hibernate Search offers several ways to configure the
- analyzer to use (see <xref linkend="analyzer" />):</para>
+ word stemming during the indexing <emphasis role="bold">as well
+ as</emphasis> search process. Hibernate Search offers several ways to
+ configure the analyzer to use (see <xref linkend="analyzer" />):</para>
<itemizedlist>
<listitem>
@@ -497,15 +567,19 @@
<classname>SnowballPorterFilterFactory</classname>. The standard tokenizer
splits words at punctuation characters and hyphens while keeping email
addresses and internet hostnames intact. It is a good general purpose
- tokenizer. The lowercase filter lowercases then the letters in each token
- whereas the snowball filter finally applies the actual language
+ tokenizer. The lowercase filter lowercases the letters in each token
+ whereas the snowball filter finally applies language specific
stemming.</para>
<para>Generally, when using the Solr framework you have to start with a
tokenizer followed by an arbitrary number of filters.</para>
- <programlisting>
+ <example>
+ <title>Using <classname>@AnalyzerDef</classname> and the Solr framework
+ to define and use an analyzer</title>
+ <programlisting>
+
package example;
...
@Entity
@@ -549,6 +623,7 @@
}
</programlisting>
+ </example>
</section>
<section>
@@ -559,20 +634,25 @@
command you can create an initial runnable maven project structure
populated with the example code of this tutorial.</para>
- <para><programlisting>mvn archetype:create \
+ <example>
+ <title>Using the maven archetype to create tutorial sources</title>
+
+ <programlisting>mvn archetype:create \
-DarchetypeGroupId=org.hibernate \
-DarchetypeArtifactId=hibernate-search-quickstart \
-DarchetypeVersion=3.1.0.GA \
- -DgroupId=my.company -DartifactId=quickstart</programlisting>Using the
- maven project you can execute the examples, inspect the file system based
- index and search and retrieve a list of managed objects. Just run
- <emphasis>mvn package</emphasis> to compile the sources and run the unit
- tests.</para>
+ -DgroupId=my.company -DartifactId=quickstart</programlisting>
+ </example>
+ <para>Using the maven project you can execute the examples, inspect the
+ file system based index and search and retrieve a list of managed objects.
+ Just run <emphasis>mvn package</emphasis> to compile the sources and run
+ the unit tests.</para>
+
<para>The next step after this tutorial is to get more familiar with the
overall architecture of Hibernate Search (<xref
linkend="search-architecture" />) and explore the basic features in more
- detail. Two topics which where only briefly touched in this tutorial were
+ detail. Two topics which were only briefly touched in this tutorial were
analyzer configuration (<xref linkend="analyzer" />) and field bridges
(<xref linkend="search-mapping-bridge" />), both important features
required for more fine-grained indexing. More advanced topics cover
Modified: search/trunk/doc/reference/en/modules/mapping.xml
===================================================================
--- search/trunk/doc/reference/en/modules/mapping.xml 2008-12-03 13:59:43 UTC (rev 15654)
+++ search/trunk/doc/reference/en/modules/mapping.xml 2008-12-03 16:24:24 UTC (rev 15655)
@@ -1274,7 +1274,7 @@
</section>
<section>
- <title>@ClassBridge</title>
+ <title>ClassBridge</title>
<para>It is sometimes useful to combine more than one property of a
given entity and index this combination in a specific way into the
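The getting-started.xml diff above describes an analyzer as a tokenizer followed by an arbitrary number of filters, here a lowercase filter and a stemming filter. The plain-Java sketch below illustrates only that chaining idea. The class AnalyzerChainSketch is hypothetical, and its stem method is a toy suffix stripper, not the Snowball algorithm and not the Solr factories named in the documentation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Toy illustration of "tokenizer, then filters"; not Lucene or Solr code.
final class AnalyzerChainSketch {

    /** Tokenizer step: split on anything that is not a letter or digit. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    /** First filter: lowercase every token. */
    static String lowercase(String token) {
        return token.toLowerCase(Locale.ROOT);
    }

    /** Second filter: naive suffix stripping (an assumption, far cruder than Snowball). */
    static String stem(String token) {
        if (token.length() > 5 && token.endsWith("ing")) {
            return token.substring(0, token.length() - 3);
        }
        if (token.length() > 4 && token.endsWith("ed")) {
            return token.substring(0, token.length() - 2);
        }
        if (token.length() > 3 && token.endsWith("s")) {
            return token.substring(0, token.length() - 1);
        }
        return token;
    }

    /** The chain: tokenizer first, then each filter in order. */
    static List<String> analyze(String text) {
        List<String> result = new ArrayList<>();
        for (String token : tokenize(text)) {
            result.add(stem(lowercase(token)));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(analyze("refactor Refactors refactored REFACTORING"));
    }
}
```

Because the same chain is applied during indexing and during search, all four query words from the tutorial reduce to the same term, which is what makes them match the indexed title.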
Hibernate SVN: r15654 - search/branches/Branch_3_0_1_GA_CP/src/test/org/hibernate/search/test/bridge.
by hibernate-commits@lists.jboss.org
Author: jcosta(a)redhat.com
Date: 2008-12-03 08:59:43 -0500 (Wed, 03 Dec 2008)
New Revision: 15654
Modified:
search/branches/Branch_3_0_1_GA_CP/src/test/org/hibernate/search/test/bridge/Cloud.java
Log:
HSEARCH-316 - Database keywords causes tests to fail
Modified: search/branches/Branch_3_0_1_GA_CP/src/test/org/hibernate/search/test/bridge/Cloud.java
===================================================================
--- search/branches/Branch_3_0_1_GA_CP/src/test/org/hibernate/search/test/bridge/Cloud.java 2008-12-03 13:58:53 UTC (rev 15653)
+++ search/branches/Branch_3_0_1_GA_CP/src/test/org/hibernate/search/test/bridge/Cloud.java 2008-12-03 13:59:43 UTC (rev 15654)
@@ -2,6 +2,8 @@
package org.hibernate.search.test.bridge;
import java.util.Date;
+
+import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
@@ -94,6 +96,7 @@
}
@Field(index=Index.UN_TOKENIZED, store=Store.YES)
+ @Column(name="int1x")
public Integer getInt1() {
return int1;
}
@@ -103,6 +106,7 @@
}
@Field(index=Index.UN_TOKENIZED, store=Store.YES)
+ @Column(name="int2x")
public int getInt2() {
return int2;
}
@@ -157,6 +161,7 @@
}
@Field(index=Index.UN_TOKENIZED, store=Store.YES)
+ @Column(name="xdate")
public Date getDate() {
return date;
}