[hibernate-commits] Hibernate SVN: r15663 - in search/tags: v3_1_0_GA and 2 other directories.
hibernate-commits at lists.jboss.org
Thu Dec 4 05:45:13 EST 2008
Author: hardy.ferentschik
Date: 2008-12-04 05:45:13 -0500 (Thu, 04 Dec 2008)
New Revision: 15663
Added:
search/tags/v3_1_0_GA/
search/tags/v3_1_0_GA/changelog.txt
search/tags/v3_1_0_GA/doc/reference/en/master.xml
search/tags/v3_1_0_GA/doc/reference/en/modules/mapping.xml
Removed:
search/tags/v3_1_0_GA/changelog.txt
search/tags/v3_1_0_GA/doc/reference/en/master.xml
search/tags/v3_1_0_GA/doc/reference/en/modules/mapping.xml
Log:
Created tag v3_1_0_GA.
Copied: search/tags/v3_1_0_GA (from rev 15659, search/trunk)
Deleted: search/tags/v3_1_0_GA/changelog.txt
===================================================================
--- search/trunk/changelog.txt 2008-12-04 09:55:05 UTC (rev 15659)
+++ search/tags/v3_1_0_GA/changelog.txt 2008-12-04 10:45:13 UTC (rev 15663)
@@ -1,342 +0,0 @@
-Hibernate Search Changelog
-==========================
-
-3.1.0.GA (4-12-2008)
-------------------------
-
-3.1.0.CR1 (17-10-2008)
-------------------------
-
-** Bug
- * [HSEARCH-250] - In ReaderStrategies, ensure that the reader is current AND that the directory returned by the DirectoryProvider are the same
- * [HSEARCH-293] - AddLuceneWork is not being removed from the queue when DeleteLuceneWork is added for the same entity
- * [HSEARCH-300] - Fix documentation on use_compound_file
-
-** Improvement
- * [HSEARCH-213] - Use FieldSelector and doc(int, fieldSelector) to only select the necessary fields
- * [HSEARCH-224] - Use MultiClassesQueryLoader in ProjectionLoader
- * [HSEARCH-255] - Create an extensive Analyzer testing suite
- * [HSEARCH-266] - Do not switch to the current directory in FSSlaveDirectoryProvider if no file has been copied
- * [HSEARCH-274] - Use Lucene's new readonly IndexReader
- * [HSEARCH-281] - Work should be Work<T>
- * [HSEARCH-283] - Replace deprecated Classes and methods calls to Lucene 2.4
-
-** New Feature
- * [HSEARCH-104] - Make @DocumentId optional and rely on @Id
- * [HSEARCH-290] - Use IndexReader = readonly on Reader strategies (see Lucene 2.4)
- * [HSEARCH-294] - Rename INSTANCE_AND_BITSETRESULTS to INSTANCE_AND_DOCIDSETRESULTS
-
-** Task
- * [HSEARCH-288] - Evaluate changes in Lucene 2.4.0
- * [HSEARCH-289] - Move to new Lucene Filter DocIdSet
- * [HSEARCH-291] - improve documentation about thread safety requirements of Bridges.
-
-
-3.1.0.Beta2 (27-10-2008)
-------------------------
-
-** Bug
- * [HSEARCH-142] - Modifications on objects indexed via @IndexedEmbedded not updated when not annotated @Indexed
- * [HSEARCH-162] - NPE on queries when no entity is marked as @Indexed
- * [HSEARCH-222] - Entities not found during concurrent update
- * [HSEARCH-225] - Avoid using IndexReader.deleteDocument when index is not shared amongst several entity types
- * [HSEARCH-232] - Using SnowballPorterFilterFactory throws NoClassDefFoundError
- * [HSEARCH-237] - IdHashShardingStrategy fails on IDs having negative hashcode
- * [HSEARCH-241] - initialize methods taking Properties cannot list available properties
- * [HSEARCH-247] - Hibernate Search cannot run without apache-solr-analyzer.jar
- * [HSEARCH-253] - Inconsistent detection of EventListeners during autoregistration into Hibernate listeners
- * [HSEARCH-257] - Ignore delete operation when Core does update then delete on the same entity
- * [HSEARCH-259] - Filters were not isolated by name in the cache
- * [HSEARCH-262] - fullTextSession.purgeAll(Class<?>) does not consider subclasses
- * [HSEARCH-263] - Wrong analyzers used in IndexWriter
- * [HSEARCH-267] - Inheritance of annotations and analyzer
- * [HSEARCH-271] - wrong Similarity used when sharing index among entities
- * [HSEARCH-287] - master.xml is mistakenly copied to the distribution
-
-** Deprecation
- * [HSEARCH-279] - deprecate SharedReaderProvider replaced by SharingBufferReaderProvider as default ReaderProvider
-
-** Improvement
- * [HSEARCH-145] - Document a configuration property
- * [HSEARCH-226] - Use Lucene ability to delete by query in IndexWriter
- * [HSEARCH-240] - Generify the IndexShardingStrategy
- * [HSEARCH-245] - Add ReaderStrategy.destroy() method
- * [HSEARCH-256] - Remove CacheBitResults.YES
- * [HSEARCH-260] - Simplify the Filter Caching definition: cache=FilterCacheModeType.[MODE]
- * [HSEARCH-272] - Improve contention on DirectoryProviders in lucene backend
- * [HSEARCH-273] - Make LuceneOptions an interface
- * [HSEARCH-282] - Make the API more Generics friendly
-
-** New Feature
- * [HSEARCH-170] - Support @Boost in @Field
- * [HSEARCH-235] - provide a destroy() method in ReaderProvider
- * [HSEARCH-252] - Document Solr integration
- * [HSEARCH-258] - Add configuration option for Lucene's UseCompoundFile
-
-** Patch
- * [HSEARCH-20] - Lucene extensions
-
-** Task
- * [HSEARCH-231] - Update the getting started guide with Solr analyzers
- * [HSEARCH-236] - Find whether or not indexWriter.optimize() requires an index lock
- * [HSEARCH-244] - Ability to ask SearchFactory for the scoped analyzer of a given class
- * [HSEARCH-254] - Migrate to Solr 1.3
- * [HSEARCH-276] - upgrade to Lucene 2.4
- * [HSEARCH-286] - Align to GA versions of all dependencies
- * [HSEARCH-292] - Document the new Filter caching approach
-
-
-3.1.0.Beta1 (17-07-2008)
-------------------------
-
-** Bug
- * [HSEARCH-166] - documentation error : hibernate.search.worker.batch_size vs hibernate.worker.batch_size
- * [HSEARCH-171] - Do not log missing objects when using QueryLoader
- * [HSEARCH-173] - CachingWrapperFilter loses its WeakReference making filter caching inefficient
- * [HSEARCH-194] - Inconsistent performance between hibernate search and pure lucene access
- * [HSEARCH-196] - ObjectNotFoundException not caught in FullTextSession
- * [HSEARCH-198] - Documentation out of sync with implemented/released features
- * [HSEARCH-203] - Counter of index modification operations not always incremented
- * [HSEARCH-204] - Improper calls to Session during a projection not involving THIS
- * [HSEARCH-205] - Out of Memory on copy of large indexes
- * [HSEARCH-217] - Proper errors on parsing of all numeric configuration parameters
- * [HSEARCH-227] - Criteria based fetching is not used when objects are loaded one by one (iterate())
-
-
-** Improvement
- * [HSEARCH-19] - Do not filter classes on queries when we know that all Directories only contain the targeted classes
- * [HSEARCH-156] - Retrofit FieldBridge.set lucene parameters into a LuceneOptions class
- * [HSEARCH-157] - Make explicit in FAQ and doc that query.list() followed by query.getResultSize() triggers only one query
- * [HSEARCH-163] - Enhance error messages when @FieldBridge is wrongly used (no impl or impl not implementing the right interfaces)
- * [HSEARCH-176] - Permits alignment properties to lucene default (Sanne Grinovero)
- * [HSEARCH-179] - Documentation should be explicit that @FulltextFilter filters every object, regardless which object is annotated
- * [HSEARCH-181] - Better management of file-based index directories (Sanne Grinovero)
- * [HSEARCH-189] - Thread management improvements for Master/Slave DirectoryProviders
- * [HSEARCH-197] - Move to slf4j
- * [HSEARCH-199] - Properly close Search resources on SessionFactory.close()
- * [HSEARCH-202] - Avoid many maps lookup in Workspace
- * [HSEARCH-207] - Make DateBridge TwoWay to facilitate projection
- * [HSEARCH-208] - Raise exception on index and purge when the entity is not an indexed entity
- * [HSEARCH-209] - merge FullTextIndexCollectionEventListener into FullTextIndexEventListener
- * [HSEARCH-215] - Rename Search.createFTS to Search.getFTS deprecating the old method
- * [HSEARCH-223] - Use multiple criteria queries rather than ObjectLoader in most cases
- * [HSEARCH-230] - Ensure initialization safety in a multi-core machine
-
-** New Feature
- * [HSEARCH-133] - Allow overriding DefaultSimilarity for indexing and searching (Nick Vincent)
- * [HSEARCH-141] - Allow term position information to be stored in an index
- * [HSEARCH-153] - Provide the possibility to configure writer.setRAMBufferSizeMB() (Lucene 2.3)
- * [HSEARCH-154] - Provide a facility to access Lucene query explanations
- * [HSEARCH-164] - Built-in bridge to index java.lang.Class
- * [HSEARCH-165] - URI and URL built-in bridges
- * [HSEARCH-174] - Improve transparent filter caching by wrapping filters into our own CachingWrapperFilter
- * [HSEARCH-186] - Enhance analyzer to support the Solr model
- * [HSEARCH-190] - Add pom
- * [HSEARCH-191] - Make build independent of Hibernate Core structure
- * [HSEARCH-192] - Move to Hibernate Core 3.3
- * [HSEARCH-193] - Use dependency on Solr-analyzer JAR rather than the full Solr JAR
- * [HSEARCH-195] - Expose Analyzers instance by name: searchFactory.getAnalyzer(String)
- * [HSEARCH-200] - Expose IndexWriter setting MAX_FIELD_LENGTH via IndexWriterSetting
- * [HSEARCH-212] - Added ReaderProvider strategy reusing unchanged segments (using reader.reopen())
- * [HSEARCH-220] - introduce session.flushToIndexes API and deprecate batch_size
-
-
-** Task
- * [HSEARCH-169] - Migrate to Lucene 2.3.1 (index corruption possibility in 2.3.0)
- * [HSEARCH-187] - Clarify which directories need read-write access, verify readonly behaviour on others.
- * [HSEARCH-214] - Upgrade Lucene to 2.3.2
- * [HSEARCH-229] - Deprecate FullTextQuery.BOOST
-
-
-3.0.1.GA (20-02-2008)
----------------------
-
-** Bug
- * [HSEARCH-56] - Updating a collection does not reindex
- * [HSEARCH-123] - Use mkdirs instead of mkdir to create necessary parent directory in the DirectoryProviderHelper
- * [HSEARCH-128] - Indexing embedded children's child
- * [HSEARCH-136] - CachingWrapperFilter does not cache
- * [HSEARCH-137] - Wrong class name in Exception when a FieldBridge does not implement TwoWayFieldBridge for a document id property
- * [HSEARCH-138] - JNDI Property names have first character cut off
- * [HSEARCH-140] - @IndexedEmbedded default depth is effectively 1 due to integer overflow
- * [HSEARCH-146] - ObjectLoader doesn't catch javax.persistence.EntityNotFoundException
- * [HSEARCH-149] - Default FieldBridge for enums passing wrong class to EnumBridge constructor
-
-
-** Improvement
- * [HSEARCH-125] - Add support for fields declared by interface or unmapped superclass
- * [HSEARCH-127] - Wrong prefix for worker configurations
- * [HSEARCH-129] - IndexedEmbedded for Collections Documentation
- * [HSEARCH-130] - Should provide better log infos (on the indexBase parameter for the FSDirectoryProvider)
- * [HSEARCH-144] - Keep indexer running till finished on VM shutdown
- * [HSEARCH-147] - Allow projection of Lucene DocId
-
-** New Feature
- * [HSEARCH-114] - Introduce ResultTransformer to the query API
- * [HSEARCH-150] - Migrate to Lucene 2.3
-
-** Patch
- * [HSEARCH-126] - Better diagnostic when Search index directory cannot be opened (Ian)
-
-
-3.0.0.GA (23-09-2007)
----------------------
-
-** Bug
- * [HSEARCH-116] - FullTextEntityManager accessing getDelegate() in the constructor leads to NPE in JBoss AS + Seam
- * [HSEARCH-117] - FullTextEntityManagerImpl and others should implement Serializable
-
-** Deprecation
- * [HSEARCH-122] - Remove query.setIndexProjection (replaced by query.setProjection)
-
-** Improvement
- * [HSEARCH-118] - Add ClassBridges (plural) functionality
-
-** New Feature
- * [HSEARCH-81] - Create a @ClassBridge Annotation (John Griffin)
-
-
-** Task
- * [HSEARCH-98] - Add a Getting started section to the reference documentation
-
-
-3.0.0.CR1 (4-09-2007)
----------------------
-
-** Bug
- * [HSEARCH-108] - id of embedded object is not indexed when using @IndexedEmbedded
- * [HSEARCH-109] - Lazy loaded entity could not be indexed
- * [HSEARCH-110] - ScrollableResults does not obey out of bounds rules (John Griffin)
- * [HSEARCH-112] - Unknown @FullTextFilter when attempting to associate a filter
-
-** Deprecation
- * [HSEARCH-113] - Remove @Text, @Keyword and @Unstored (old mapping annotations)
-
-** Improvement
- * [HSEARCH-107] - DirectoryProvider should have a start() method
-
-** New Feature
- * [HSEARCH-14] - introduce fetch_size for Hibernate Search scrollable resultsets (John Griffin)
- * [HSEARCH-69] - Ability to purge an index by class (John Griffin)
- * [HSEARCH-111] - Ability to disable event based indexing (for read only or batch based indexing)
-
-
-3.0.0.Beta4 (1-08-2007)
------------------------
-
-** Bug
- * [HSEARCH-88] - Unable to update 2 entity types in the same transaction if they share the same index
- * [HSEARCH-90] - Use of setFirstResult / setMaxResults can lead to a list with negative capacity (John Griffin)
- * [HSEARCH-92] - NPE for null fields on projection
- * [HSEARCH-99] - Avoid returning non initialized proxies in scroll() and iterate() (loader.load(EntityInfo))
-
-
-** Improvement
- * [HSEARCH-79] - Recommend to use FlushMode.APPLICATION on massive indexing
- * [HSEARCH-84] - Migrate to Lucene 2.2
- * [HSEARCH-91] - Avoid wrapping a Session object if the Session is already FullTextSession
- * [HSEARCH-100] - Rename fullTextSession.setIndexProjection() to fullTextSession.setProjection()
- * [HSEARCH-102] - Default index operation in @Field to TOKENIZED
- * [HSEARCH-106] - Use the shared reader strategy as the default strategy
-
-** New Feature
- * [HSEARCH-6] - Provide access to the Hit.getScore() and potentially the Document on a query
- * [HSEARCH-15] - Notion of Filtered Lucene queries (Hardy Ferentschik)
- * [HSEARCH-41] - Allow fine grained analyzers (Entity, attribute, @Field)
- * [HSEARCH-45] - Support @Fields() for multiple indexing per property (useful for sorting)
- * [HSEARCH-58] - Support named Filters (and caching)
- * [HSEARCH-67] - Expose mergeFactor, maxMergeDocs and minMergeDocs (Hardy Ferentschik)
- * [HSEARCH-73] - IncrementalOptimizerStrategy triggered on transactions or operations limits
- * [HSEARCH-74] - Ability to project Lucene meta information (Score, Boost, Document, Id, This) (John Griffin)
- * [HSEARCH-83] - Introduce OptimizerStrategy
- * [HSEARCH-86] - Index sharding: multiple Lucene indexes per entity type
- * [HSEARCH-89] - FullText wrapper for JPA APIs
- * [HSEARCH-103] - Ability to override the indexName in the FSDirectoryProviders family
-
-
-** Task
- * [HSEARCH-94] - Deprecate ContextHelper
-
-
-3.0.0.Beta3 (6-06-2007)
------------------------
-
-** Bug
- * [HSEARCH-64] - Exception Thrown If Index Directory Does Not Exist
- * [HSEARCH-66] - Some results not returned in some circumstances (Brandon Munroe)
-
-
-** Improvement
- * [HSEARCH-60] - Introduce SearchFactory / SearchFactoryImpl
- * [HSEARCH-68] - Set index copy threads as daemon
- * [HSEARCH-70] - Create the index base directory if it does not exist
-
-** New Feature
- * [HSEARCH-11] - Provide access to IndexWriter.optimize()
- * [HSEARCH-33] - hibernate.search.worker.batch_size to prevent OutOfMemoryException while inserting many objects
- * [HSEARCH-71] - Provide fullTextSession.getSearchFactory()
- * [HSEARCH-72] - searchFactory.optimize() and searchFactory.optimize(Class) (Andrew Hahn)
-
-
-3.0.0.Beta2 (31-05-2007)
-------------------------
-
-** Bug
- * [HSEARCH-37] - Verify that Serializable return type are not resolved by StringBridge built in type
- * [HSEARCH-39] - event listener declaration example is wrong
- * [HSEARCH-44] - Build the Lucene Document in the beforeComplete transaction phase
- * [HSEARCH-50] - Null Booleans lead to NPE
- * [HSEARCH-59] - Unable to index @indexEmbedded object through session.index when object is lazy and field access is used in object
-
-
-** Improvement
- * [HSEARCH-36] - Meaningful exception message when Search Listeners are not initialized
- * [HSEARCH-38] - Make the @IndexedEmbedded documentation example easier to understand
- * [HSEARCH-51] - Optimization: Use a query rather than batch-size to load objects when a single entity (hierarchy) is expected
- * [HSEARCH-63] - rename query.resultSize() to getResultSize()
-
-** New Feature
- * [HSEARCH-4] - Be able to use a Lucene Sort on queries (Hardy Ferentschik)
- * [HSEARCH-13] - Cache IndexReaders per SearchFactory
- * [HSEARCH-40] - Be able to embed collections in lucene index (@IndexedEmbeddable in collections)
- * [HSEARCH-43] - Expose resultSize and do not load object when only resultSize is retrieved
- * [HSEARCH-52] - Ability to load more efficiently an object graph from a lucene query by customizing the fetch modes
- * [HSEARCH-53] - Add support for projection (ie read the data from the index only)
- * [HSEARCH-61] - Move from MultiSearcher to MultiReader
- * [HSEARCH-62] - Support pluggable ReaderProvider strategies
-
-
-** Task
- * [HSEARCH-65] - Update to JBoss Embedded beta2
-
-
-3.0.0.Beta1 (19-03-2007)
-------------------------
-
-Initial release as a standalone product (see Hibernate Annotations changelog for previous information)
-
-
-Release Notes - Hibernate Search - Version 3.0.0.beta1
-
-** Bug
- * [HSEARCH-7] - Ignore object found in the index but no longer present in the database (for out of date indexes)
- * [HSEARCH-21] - NPE in SearchFactory while using different threads
- * [HSEARCH-22] - Enum value Index.UN_TOKENISED is misspelled
- * [HSEARCH-24] - Potential deadlock when using multiple DirectoryProviders in a highly concurrent index update
- * [HSEARCH-25] - Class cast exception in org.hibernate.search.impl.FullTextSessionImpl<init>(FullTextSessionImpl.java:54)
- * [HSEARCH-28] - Wrong indexDir property in Apache Lucene Integration
-
-
-** Improvement
- * [HSEARCH-29] - Share the initialization state across all Search event listeners instance
- * [HSEARCH-30] - @FieldBridge now use o.h.s.a.Parameter rather than o.h.a.Parameter
- * [HSEARCH-31] - Move to Lucene 2.1.0
-
-** New Feature
- * [HSEARCH-1] - Give access to Directory providers
- * [HSEARCH-2] - Default FieldBridge for enums (Sylvain Vieujot)
- * [HSEARCH-3] - Default FieldBridge for booleans (Sylvain Vieujot)
- * [HSEARCH-9] - Introduce a worker factory and its configuration
- * [HSEARCH-16] - Cluster capability through JMS
- * [HSEARCH-23] - Support asynchronous batch worker queue
- * [HSEARCH-27] - Ability to index associated / embedded objects
Copied: search/tags/v3_1_0_GA/changelog.txt (from rev 15662, search/trunk/changelog.txt)
===================================================================
--- search/tags/v3_1_0_GA/changelog.txt (rev 0)
+++ search/tags/v3_1_0_GA/changelog.txt 2008-12-04 10:45:13 UTC (rev 15663)
@@ -0,0 +1,368 @@
+Hibernate Search Changelog
+==========================
+
+3.1.0.GA (4-12-2008)
+------------------------
+
+** Bug
+ * [HSEARCH-233] - EntityNotFoundException during indexing
+ * [HSEARCH-280] - Make FSSlaveAndMasterDPTest pass against postgresql
+ * [HSEARCH-297] - Allow PatternTokenizerFactory to be used
+ * [HSEARCH-309] - PurgeAllLuceneWork duplicates in work queue
+
+** Improvement
+ * [HSEARCH-221] - Get Lucene Analyzer runtime (indexing)
+ * [HSEARCH-265] - Raise warnings when an abstract class is marked @Indexed
+ * [HSEARCH-285] - Refactor DocumentBuilder to support containedIn only and regular Indexed entities
+ * [HSEARCH-298] - Warn for dangerous IndexWriter settings
+ * [HSEARCH-299] - Use of faster Bit operations when possible to chain Filters
+ * [HSEARCH-302] - Utilize pagination settings when retrieving TopDocs from the Lucene query to only retrieve required TopDocs
+ * [HSEARCH-308] - getResultSize() implementation should not load documents
+ * [HSEARCH-311] - Add a close() method to BackendQueueProcessorFactory
+ * [HSEARCH-312] - Rename hibernate.search.filter.cache_bit_results.size to hibernate.search.filter.cache_docidresults.size
+
+** New Feature
+ * [HSEARCH-160] - Truly polymorphic queries
+ * [HSEARCH-268] - Apply changes to different indexes in parallel
+ * [HSEARCH-296] - Expose managed entity class via a Projection constant
+
+** Task
+ * [HSEARCH-303] - Review reference documentation
+
+
+3.1.0.CR1 (17-10-2008)
+------------------------
+
+** Bug
+ * [HSEARCH-250] - In ReaderStrategies, ensure that the reader is current AND that the directory returned by the DirectoryProvider are the same
+ * [HSEARCH-293] - AddLuceneWork is not being removed from the queue when DeleteLuceneWork is added for the same entity
+ * [HSEARCH-300] - Fix documentation on use_compound_file
+
+** Improvement
+ * [HSEARCH-213] - Use FieldSelector and doc(int, fieldSelector) to only select the necessary fields
+ * [HSEARCH-224] - Use MultiClassesQueryLoader in ProjectionLoader
+ * [HSEARCH-255] - Create an extensive Analyzer testing suite
+ * [HSEARCH-266] - Do not switch to the current directory in FSSlaveDirectoryProvider if no file has been copied
+ * [HSEARCH-274] - Use Lucene's new readonly IndexReader
+ * [HSEARCH-281] - Work should be Work<T>
+ * [HSEARCH-283] - Replace deprecated Classes and methods calls to Lucene 2.4
+
+** New Feature
+ * [HSEARCH-104] - Make @DocumentId optional and rely on @Id
+ * [HSEARCH-290] - Use IndexReader = readonly on Reader strategies (see Lucene 2.4)
+ * [HSEARCH-294] - Rename INSTANCE_AND_BITSETRESULTS to INSTANCE_AND_DOCIDSETRESULTS
+
+** Task
+ * [HSEARCH-288] - Evaluate changes in Lucene 2.4.0
+ * [HSEARCH-289] - Move to new Lucene Filter DocIdSet
+ * [HSEARCH-291] - improve documentation about thread safety requirements of Bridges.
+
+
+3.1.0.Beta2 (27-10-2008)
+------------------------
+
+** Bug
+ * [HSEARCH-142] - Modifications on objects indexed via @IndexedEmbedded not updated when not annotated @Indexed
+ * [HSEARCH-162] - NPE on queries when no entity is marked as @Indexed
+ * [HSEARCH-222] - Entities not found during concurrent update
+ * [HSEARCH-225] - Avoid using IndexReader.deleteDocument when index is not shared amongst several entity types
+ * [HSEARCH-232] - Using SnowballPorterFilterFactory throws NoClassDefFoundError
+ * [HSEARCH-237] - IdHashShardingStrategy fails on IDs having negative hashcode
+ * [HSEARCH-241] - initialize methods taking Properties cannot list available properties
+ * [HSEARCH-247] - Hibernate Search cannot run without apache-solr-analyzer.jar
+ * [HSEARCH-253] - Inconsistent detection of EventListeners during autoregistration into Hibernate listeners
+ * [HSEARCH-257] - Ignore delete operation when Core does update then delete on the same entity
+ * [HSEARCH-259] - Filters were not isolated by name in the cache
+ * [HSEARCH-262] - fullTextSession.purgeAll(Class<?>) does not consider subclasses
+ * [HSEARCH-263] - Wrong analyzers used in IndexWriter
+ * [HSEARCH-267] - Inheritance of annotations and analyzer
+ * [HSEARCH-271] - wrong Similarity used when sharing index among entities
+ * [HSEARCH-287] - master.xml is mistakenly copied to the distribution
+
+** Deprecation
+ * [HSEARCH-279] - deprecate SharedReaderProvider replaced by SharingBufferReaderProvider as default ReaderProvider
+
+** Improvement
+ * [HSEARCH-145] - Document a configuration property
+ * [HSEARCH-226] - Use Lucene ability to delete by query in IndexWriter
+ * [HSEARCH-240] - Generify the IndexShardingStrategy
+ * [HSEARCH-245] - Add ReaderStrategy.destroy() method
+ * [HSEARCH-256] - Remove CacheBitResults.YES
+ * [HSEARCH-260] - Simplify the Filter Caching definition: cache=FilterCacheModeType.[MODE]
+ * [HSEARCH-272] - Improve contention on DirectoryProviders in lucene backend
+ * [HSEARCH-273] - Make LuceneOptions an interface
+ * [HSEARCH-282] - Make the API more Generics friendly
+
+** New Feature
+ * [HSEARCH-170] - Support @Boost in @Field
+ * [HSEARCH-235] - provide a destroy() method in ReaderProvider
+ * [HSEARCH-252] - Document Solr integration
+ * [HSEARCH-258] - Add configuration option for Lucene's UseCompoundFile
+
+** Patch
+ * [HSEARCH-20] - Lucene extensions
+
+** Task
+ * [HSEARCH-231] - Update the getting started guide with Solr analyzers
+ * [HSEARCH-236] - Find whether or not indexWriter.optimize() requires an index lock
+ * [HSEARCH-244] - Ability to ask SearchFactory for the scoped analyzer of a given class
+ * [HSEARCH-254] - Migrate to Solr 1.3
+ * [HSEARCH-276] - upgrade to Lucene 2.4
+ * [HSEARCH-286] - Align to GA versions of all dependencies
+ * [HSEARCH-292] - Document the new Filter caching approach
+
+
+3.1.0.Beta1 (17-07-2008)
+------------------------
+
+** Bug
+ * [HSEARCH-166] - documentation error : hibernate.search.worker.batch_size vs hibernate.worker.batch_size
+ * [HSEARCH-171] - Do not log missing objects when using QueryLoader
+ * [HSEARCH-173] - CachingWrapperFilter loses its WeakReference making filter caching inefficient
+ * [HSEARCH-194] - Inconsistent performance between hibernate search and pure lucene access
+ * [HSEARCH-196] - ObjectNotFoundException not caught in FullTextSession
+ * [HSEARCH-198] - Documentation out of sync with implemented/released features
+ * [HSEARCH-203] - Counter of index modification operations not always incremented
+ * [HSEARCH-204] - Improper calls to Session during a projection not involving THIS
+ * [HSEARCH-205] - Out of Memory on copy of large indexes
+ * [HSEARCH-217] - Proper errors on parsing of all numeric configuration parameters
+ * [HSEARCH-227] - Criteria based fetching is not used when objects are loaded one by one (iterate())
+
+
+** Improvement
+ * [HSEARCH-19] - Do not filter classes on queries when we know that all Directories only contain the targeted classes
+ * [HSEARCH-156] - Retrofit FieldBridge.set lucene parameters into a LuceneOptions class
+ * [HSEARCH-157] - Make explicit in FAQ and doc that query.list() followed by query.getResultSize() triggers only one query
+ * [HSEARCH-163] - Enhance error messages when @FieldBridge is wrongly used (no impl or impl not implementing the right interfaces)
+ * [HSEARCH-176] - Permits alignment properties to lucene default (Sanne Grinovero)
+ * [HSEARCH-179] - Documentation should be explicit that @FulltextFilter filters every object, regardless which object is annotated
+ * [HSEARCH-181] - Better management of file-based index directories (Sanne Grinovero)
+ * [HSEARCH-189] - Thread management improvements for Master/Slave DirectoryProviders
+ * [HSEARCH-197] - Move to slf4j
+ * [HSEARCH-199] - Properly close Search resources on SessionFactory.close()
+ * [HSEARCH-202] - Avoid many maps lookup in Workspace
+ * [HSEARCH-207] - Make DateBridge TwoWay to facilitate projection
+ * [HSEARCH-208] - Raise exception on index and purge when the entity is not an indexed entity
+ * [HSEARCH-209] - merge FullTextIndexCollectionEventListener into FullTextIndexEventListener
+ * [HSEARCH-215] - Rename Search.createFTS to Search.getFTS deprecating the old method
+ * [HSEARCH-223] - Use multiple criteria queries rather than ObjectLoader in most cases
+ * [HSEARCH-230] - Ensure initialization safety in a multi-core machine
+
+** New Feature
+ * [HSEARCH-133] - Allow overriding DefaultSimilarity for indexing and searching (Nick Vincent)
+ * [HSEARCH-141] - Allow term position information to be stored in an index
+ * [HSEARCH-153] - Provide the possibility to configure writer.setRAMBufferSizeMB() (Lucene 2.3)
+ * [HSEARCH-154] - Provide a facility to access Lucene query explanations
+ * [HSEARCH-164] - Built-in bridge to index java.lang.Class
+ * [HSEARCH-165] - URI and URL built-in bridges
+ * [HSEARCH-174] - Improve transparent filter caching by wrapping filters into our own CachingWrapperFilter
+ * [HSEARCH-186] - Enhance analyzer to support the Solr model
+ * [HSEARCH-190] - Add pom
+ * [HSEARCH-191] - Make build independent of Hibernate Core structure
+ * [HSEARCH-192] - Move to Hibernate Core 3.3
+ * [HSEARCH-193] - Use dependency on Solr-analyzer JAR rather than the full Solr JAR
+ * [HSEARCH-195] - Expose Analyzers instance by name: searchFactory.getAnalyzer(String)
+ * [HSEARCH-200] - Expose IndexWriter setting MAX_FIELD_LENGTH via IndexWriterSetting
+ * [HSEARCH-212] - Added ReaderProvider strategy reusing unchanged segments (using reader.reopen())
+ * [HSEARCH-220] - introduce session.flushToIndexes API and deprecate batch_size
+
+
+** Task
+ * [HSEARCH-169] - Migrate to Lucene 2.3.1 (index corruption possibility in 2.3.0)
+ * [HSEARCH-187] - Clarify which directories need read-write access, verify readonly behaviour on others.
+ * [HSEARCH-214] - Upgrade Lucene to 2.3.2
+ * [HSEARCH-229] - Deprecate FullTextQuery.BOOST
+
+
+3.0.1.GA (20-02-2008)
+---------------------
+
+** Bug
+ * [HSEARCH-56] - Updating a collection does not reindex
+ * [HSEARCH-123] - Use mkdirs instead of mkdir to create necessary parent directory in the DirectoryProviderHelper
+ * [HSEARCH-128] - Indexing embedded children's child
+ * [HSEARCH-136] - CachingWrapperFilter does not cache
+ * [HSEARCH-137] - Wrong class name in Exception when a FieldBridge does not implement TwoWayFieldBridge for a document id property
+ * [HSEARCH-138] - JNDI Property names have first character cut off
+ * [HSEARCH-140] - @IndexedEmbedded default depth is effectively 1 due to integer overflow
+ * [HSEARCH-146] - ObjectLoader doesn't catch javax.persistence.EntityNotFoundException
+ * [HSEARCH-149] - Default FieldBridge for enums passing wrong class to EnumBridge constructor
+
+
+** Improvement
+ * [HSEARCH-125] - Add support for fields declared by interface or unmapped superclass
+ * [HSEARCH-127] - Wrong prefix for worker configurations
+ * [HSEARCH-129] - IndexedEmbedded for Collections Documentation
+ * [HSEARCH-130] - Should provide better log infos (on the indexBase parameter for the FSDirectoryProvider)
+ * [HSEARCH-144] - Keep indexer running till finished on VM shutdown
+ * [HSEARCH-147] - Allow projection of Lucene DocId
+
+** New Feature
+ * [HSEARCH-114] - Introduce ResultTransformer to the query API
+ * [HSEARCH-150] - Migrate to Lucene 2.3
+
+** Patch
+ * [HSEARCH-126] - Better diagnostic when Search index directory cannot be opened (Ian)
+
+
+3.0.0.GA (23-09-2007)
+---------------------
+
+** Bug
+ * [HSEARCH-116] - FullTextEntityManager accessing getDelegate() in the constructor leads to NPE in JBoss AS + Seam
+ * [HSEARCH-117] - FullTextEntityManagerImpl and others should implement Serializable
+
+** Deprecation
+ * [HSEARCH-122] - Remove query.setIndexProjection (replaced by query.setProjection)
+
+** Improvement
+ * [HSEARCH-118] - Add ClassBridges (plural) functionality
+
+** New Feature
+ * [HSEARCH-81] - Create a @ClassBridge Annotation (John Griffin)
+
+
+** Task
+ * [HSEARCH-98] - Add a Getting started section to the reference documentation
+
+
+3.0.0.CR1 (4-09-2007)
+---------------------
+
+** Bug
+ * [HSEARCH-108] - id of embedded object is not indexed when using @IndexedEmbedded
+ * [HSEARCH-109] - Lazy loaded entity could not be indexed
+ * [HSEARCH-110] - ScrollableResults does not obey out of bounds rules (John Griffin)
+ * [HSEARCH-112] - Unknown @FullTextFilter when attempting to associate a filter
+
+** Deprecation
+ * [HSEARCH-113] - Remove @Text, @Keyword and @Unstored (old mapping annotations)
+
+** Improvement
+ * [HSEARCH-107] - DirectoryProvider should have a start() method
+
+** New Feature
+ * [HSEARCH-14] - introduce fetch_size for Hibernate Search scrollable resultsets (John Griffin)
+ * [HSEARCH-69] - Ability to purge an index by class (John Griffin)
+ * [HSEARCH-111] - Ability to disable event based indexing (for read only or batch based indexing)
+
+
+3.0.0.Beta4 (1-08-2007)
+-----------------------
+
+** Bug
+ * [HSEARCH-88] - Unable to update 2 entity types in the same transaction if they share the same index
+ * [HSEARCH-90] - Use of setFirstResult / setMaxResults can lead to a list with negative capacity (John Griffin)
+ * [HSEARCH-92] - NPE for null fields on projection
+ * [HSEARCH-99] - Avoid returning non initialized proxies in scroll() and iterate() (loader.load(EntityInfo))
+
+
+** Improvement
+ * [HSEARCH-79] - Recommend to use FlushMode.APPLICATION on massive indexing
+ * [HSEARCH-84] - Migrate to Lucene 2.2
+ * [HSEARCH-91] - Avoid wrapping a Session object if the Session is already FullTextSession
+ * [HSEARCH-100] - Rename fullTextSession.setIndexProjection() to fullTextSession.setProjection()
+ * [HSEARCH-102] - Default index operation in @Field to TOKENIZED
+ * [HSEARCH-106] - Use the shared reader strategy as the default strategy
+
+** New Feature
+ * [HSEARCH-6] - Provide access to the Hit.getScore() and potentially the Document on a query
+ * [HSEARCH-15] - Notion of Filtered Lucene queries (Hardy Ferentschik)
+ * [HSEARCH-41] - Allow fine grained analyzers (Entity, attribute, @Field)
+ * [HSEARCH-45] - Support @Fields() for multiple indexing per property (useful for sorting)
+ * [HSEARCH-58] - Support named Filters (and caching)
+ * [HSEARCH-67] - Expose mergeFactor, maxMergeDocs and minMergeDocs (Hardy Ferentschik)
+ * [HSEARCH-73] - IncrementalOptimizerStrategy triggered on transactions or operations limits
+ * [HSEARCH-74] - Ability to project Lucene meta information (Score, Boost, Document, Id, This) (John Griffin)
+ * [HSEARCH-83] - Introduce OptimizerStrategy
+ * [HSEARCH-86] - Index sharding: multiple Lucene indexes per entity type
+ * [HSEARCH-89] - FullText wrapper for JPA APIs
+ * [HSEARCH-103] - Ability to override the indexName in the FSDirectoryProviders family
+
+
+** Task
+ * [HSEARCH-94] - Deprecate ContextHelper
+
+
+3.0.0.Beta3 (6-06-2007)
+-----------------------
+
+** Bug
+ * [HSEARCH-64] - Exception Thrown If Index Directory Does Not Exist
+ * [HSEARCH-66] - Some results not returned in some circumstances (Brandon Munroe)
+
+
+** Improvement
+ * [HSEARCH-60] - Introduce SearchFactory / SearchFactoryImpl
+ * [HSEARCH-68] - Set index copy threads as daemon
+ * [HSEARCH-70] - Create the index base directory if it does not exist
+
+** New Feature
+ * [HSEARCH-11] - Provide access to IndexWriter.optimize()
+ * [HSEARCH-33] - hibernate.search.worker.batch_size to prevent OutOfMemoryException while inserting many objects
+ * [HSEARCH-71] - Provide fullTextSession.getSearchFactory()
+ * [HSEARCH-72] - searchFactory.optimize() and searchFactory.optimize(Class) (Andrew Hahn)
+
+
+3.0.0.Beta2 (31-05-2007)
+------------------------
+
+** Bug
+ * [HSEARCH-37] - Verify that Serializable return type are not resolved by StringBridge built in type
+ * [HSEARCH-39] - event listener declaration example is wrong
+ * [HSEARCH-44] - Build the Lucene Document in the beforeComplete transaction phase
+ * [HSEARCH-50] - Null Booleans lead to NPE
+ * [HSEARCH-59] - Unable to index @IndexedEmbedded object through session.index when object is lazy and field access is used in object
+
+
+** Improvement
+ * [HSEARCH-36] - Meaningful exception message when Search Listeners are not initialized
+ * [HSEARCH-38] - Make the @IndexedEmbedded documentation example easier to understand
+ * [HSEARCH-51] - Optimization: Use a query rather than batch-size to load objects when a single entity (hierarchy) is expected
+ * [HSEARCH-63] - rename query.resultSize() to getResultSize()
+
+** New Feature
+ * [HSEARCH-4] - Be able to use a Lucene Sort on queries (Hardy Ferentschik)
+ * [HSEARCH-13] - Cache IndexReaders per SearchFactory
+ * [HSEARCH-40] - Be able to embed collections in lucene index (@IndexedEmbeddable in collections)
+ * [HSEARCH-43] - Expose resultSize and do not load object when only resultSize is retrieved
+ * [HSEARCH-52] - Ability to load more efficiently an object graph from a lucene query by customizing the fetch modes
+ * [HSEARCH-53] - Add support for projection (ie read the data from the index only)
+ * [HSEARCH-61] - Move from MultiSearcher to MultiReader
+ * [HSEARCH-62] - Support pluggable ReaderProvider strategies
+
+
+** Task
+ * [HSEARCH-65] - Update to JBoss Embedded beta2
+
+
+3.0.0.Beta1 (19-03-2007)
+------------------------
+
+Initial release as a standalone product (see Hibernate Annotations changelog for previous information)
+
+
+Release Notes - Hibernate Search - Version 3.0.0.beta1
+
+** Bug
+ * [HSEARCH-7] - Ignore object found in the index but no longer present in the database (for out of date indexes)
+ * [HSEARCH-21] - NPE in SearchFactory while using different threads
+ * [HSEARCH-22] - Enum value Index.UN_TOKENISED is misspelled
+ * [HSEARCH-24] - Potential deadlock when using multiple DirectoryProviders in a highly concurrent index update
+ * [HSEARCH-25] - Class cast exception in org.hibernate.search.impl.FullTextSessionImpl<init>(FullTextSessionImpl.java:54)
+ * [HSEARCH-28] - Wrong indexDir property in Apache Lucene Integration
+
+
+** Improvement
+ * [HSEARCH-29] - Share the initialization state across all Search event listeners instance
+ * [HSEARCH-30] - @FieldBridge now use o.h.s.a.Parameter rather than o.h.a.Parameter
+ * [HSEARCH-31] - Move to Lucene 2.1.0
+
+** New Feature
+ * [HSEARCH-1] - Give access to Directory providers
+ * [HSEARCH-2] - Default FieldBridge for enums (Sylvain Vieujot)
+ * [HSEARCH-3] - Default FieldBridge for booleans (Sylvain Vieujot)
+ * [HSEARCH-9] - Introduce a worker factory and its configuration
+ * [HSEARCH-16] - Cluster capability through JMS
+ * [HSEARCH-23] - Support asynchronous batch worker queue
+ * [HSEARCH-27] - Ability to index associated / embedded objects
Deleted: search/tags/v3_1_0_GA/doc/reference/en/master.xml
===================================================================
--- search/trunk/doc/reference/en/master.xml 2008-12-04 09:55:05 UTC (rev 15659)
+++ search/tags/v3_1_0_GA/doc/reference/en/master.xml 2008-12-04 10:45:13 UTC (rev 15663)
@@ -1,92 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!-- $Id$ -->
-<!--
- ~ Hibernate, Relational Persistence for Idiomatic Java
- ~
- ~ Copyright (c) 2008, Red Hat Middleware LLC or third-party contributors as
- ~ indicated by the @author tags or express copyright attribution
- ~ statements applied by the authors. All third-party contributions are
- ~ distributed under license by Red Hat Middleware LLC.
- ~
- ~ This copyrighted material is made available to anyone wishing to use, modify,
- ~ copy, or redistribute it subject to the terms and conditions of the GNU
- ~ Lesser General Public License, as published by the Free Software Foundation.
- ~
- ~ This program is distributed in the hope that it will be useful,
- ~ but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
- ~ or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License
- ~ for more details.
- ~
- ~ You should have received a copy of the GNU Lesser General Public License
- ~ along with this distribution; if not, write to:
- ~ Free Software Foundation, Inc.
- ~ 51 Franklin Street, Fifth Floor
- ~ Boston, MA 02110-1301 USA
- -->
-<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
-"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
-<!ENTITY versionNumber "3.1.0.GA">
-<!ENTITY copyrightYear "2004">
-<!ENTITY copyrightHolder "Red Hat Middleware, LLC.">
-]>
-<book lang="en">
- <bookinfo>
- <title>Hibernate Search</title>
-
- <subtitle>Apache <trademark>Lucene</trademark> Integration</subtitle>
-
- <subtitle>Reference Guide</subtitle>
-
- <releaseinfo>&versionNumber;</releaseinfo>
-
- <mediaobject>
- <imageobject>
- <imagedata fileref="images/hibernate_logo_a.png" format="PNG" />
- </imageobject>
- </mediaobject>
- </bookinfo>
-
- <toc></toc>
-
- <preface id="preface" revision="2">
- <title>Preface</title>
-
- <para>Full text search engines like Apache Lucene are very powerful
- technologies to add efficient free text search capabilities to
- applications. However, they suffer from several mismatches when dealing with
- object domain models. Amongst other things indexes have to be kept up to
- date and mismatches between index structure and domain model as well as
- query mismatches have to be avoided.</para>
-
- <para>Hibernate Search indexes your domain model with the help of a few
- annotations, takes care of database/index synchronization and brings back
- regular managed objects from free text queries. To achieve this Hibernate
- Search is combining the power of <ulink
- url="http://www.hibernate.org">Hibernate</ulink> and <ulink
- url="http://lucene.apache.org">Apache Lucene</ulink>.</para>
- </preface>
-
- <xi:include href="modules/getting-started.xml"
- xmlns:xi="http://www.w3.org/2001/XInclude" />
-
- <xi:include href="modules/architecture.xml"
- xmlns:xi="http://www.w3.org/2001/XInclude" />
-
- <xi:include href="modules/configuration.xml"
- xmlns:xi="http://www.w3.org/2001/XInclude" />
-
- <xi:include href="modules/mapping.xml"
- xmlns:xi="http://www.w3.org/2001/XInclude" />
-
- <xi:include href="modules/query.xml"
- xmlns:xi="http://www.w3.org/2001/XInclude" />
-
- <xi:include href="modules/batchindex.xml"
- xmlns:xi="http://www.w3.org/2001/XInclude" />
-
- <xi:include href="modules/optimize.xml"
- xmlns:xi="http://www.w3.org/2001/XInclude" />
-
- <xi:include href="modules/lucene-native.xml"
- xmlns:xi="http://www.w3.org/2001/XInclude" />
-</book>
Copied: search/tags/v3_1_0_GA/doc/reference/en/master.xml (from rev 15660, search/trunk/doc/reference/en/master.xml)
===================================================================
--- search/tags/v3_1_0_GA/doc/reference/en/master.xml (rev 0)
+++ search/tags/v3_1_0_GA/doc/reference/en/master.xml 2008-12-04 10:45:13 UTC (rev 15663)
@@ -0,0 +1,92 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- $Id$ -->
+<!--
+ ~ Hibernate, Relational Persistence for Idiomatic Java
+ ~
+ ~ Copyright (c) 2008, Red Hat Middleware LLC or third-party contributors as
+ ~ indicated by the @author tags or express copyright attribution
+ ~ statements applied by the authors. All third-party contributions are
+ ~ distributed under license by Red Hat Middleware LLC.
+ ~
+ ~ This copyrighted material is made available to anyone wishing to use, modify,
+ ~ copy, or redistribute it subject to the terms and conditions of the GNU
+ ~ Lesser General Public License, as published by the Free Software Foundation.
+ ~
+ ~ This program is distributed in the hope that it will be useful,
+ ~ but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+ ~ or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License
+ ~ for more details.
+ ~
+ ~ You should have received a copy of the GNU Lesser General Public License
+ ~ along with this distribution; if not, write to:
+ ~ Free Software Foundation, Inc.
+ ~ 51 Franklin Street, Fifth Floor
+ ~ Boston, MA 02110-1301 USA
+ -->
+<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
+"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
+<!ENTITY versionNumber "3.1.0.GA">
+<!ENTITY copyrightYear "2004">
+<!ENTITY copyrightHolder "Red Hat Middleware, LLC.">
+]>
+<book lang="en">
+ <bookinfo>
+ <title>Hibernate Search</title>
+
+ <subtitle>Apache <trademark>Lucene</trademark> Integration</subtitle>
+
+ <subtitle>Reference Guide</subtitle>
+
+ <releaseinfo>&versionNumber;</releaseinfo>
+
+ <mediaobject>
+ <imageobject>
+ <imagedata fileref="images/hibernate_logo_a.png" format="PNG" />
+ </imageobject>
+ </mediaobject>
+ </bookinfo>
+
+ <toc></toc>
+
+ <preface id="preface" revision="2">
+ <title>Preface</title>
+
+ <para>Full text search engines like Apache Lucene are very powerful
+ technologies to add efficient free text search capabilities to
+ applications. However, Lucene suffers from several mismatches when dealing with
+ an object domain model. Amongst other things, indexes have to be kept up to
+ date and mismatches between index structure and domain model as well as
+ query mismatches have to be avoided.</para>
+
+ <para>Hibernate Search addresses these shortcomings - it indexes your
+ domain model with the help of a few annotations, takes care of
+ database/index synchronization and brings back regular managed objects
+ from free text queries. To achieve this Hibernate Search is combining the
+ power of <ulink url="http://www.hibernate.org">Hibernate</ulink> and
+ <ulink url="http://lucene.apache.org">Apache Lucene</ulink>.</para>
+ </preface>
+
+ <xi:include href="modules/getting-started.xml"
+ xmlns:xi="http://www.w3.org/2001/XInclude" />
+
+ <xi:include href="modules/architecture.xml"
+ xmlns:xi="http://www.w3.org/2001/XInclude" />
+
+ <xi:include href="modules/configuration.xml"
+ xmlns:xi="http://www.w3.org/2001/XInclude" />
+
+ <xi:include href="modules/mapping.xml"
+ xmlns:xi="http://www.w3.org/2001/XInclude" />
+
+ <xi:include href="modules/query.xml"
+ xmlns:xi="http://www.w3.org/2001/XInclude" />
+
+ <xi:include href="modules/batchindex.xml"
+ xmlns:xi="http://www.w3.org/2001/XInclude" />
+
+ <xi:include href="modules/optimize.xml"
+ xmlns:xi="http://www.w3.org/2001/XInclude" />
+
+ <xi:include href="modules/lucene-native.xml"
+ xmlns:xi="http://www.w3.org/2001/XInclude" />
+</book>
Deleted: search/tags/v3_1_0_GA/doc/reference/en/modules/mapping.xml
===================================================================
--- search/trunk/doc/reference/en/modules/mapping.xml 2008-12-04 09:55:05 UTC (rev 15659)
+++ search/tags/v3_1_0_GA/doc/reference/en/modules/mapping.xml 2008-12-04 10:45:13 UTC (rev 15663)
@@ -1,1451 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!--
- ~ Hibernate, Relational Persistence for Idiomatic Java
- ~
- ~ Copyright (c) 2008, Red Hat Middleware LLC or third-party contributors as
- ~ indicated by the @author tags or express copyright attribution
- ~ statements applied by the authors. All third-party contributions are
- ~ distributed under license by Red Hat Middleware LLC.
- ~
- ~ This copyrighted material is made available to anyone wishing to use, modify,
- ~ copy, or redistribute it subject to the terms and conditions of the GNU
- ~ Lesser General Public License, as published by the Free Software Foundation.
- ~
- ~ This program is distributed in the hope that it will be useful,
- ~ but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
- ~ or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License
- ~ for more details.
- ~
- ~ You should have received a copy of the GNU Lesser General Public License
- ~ along with this distribution; if not, write to:
- ~ Free Software Foundation, Inc.
- ~ 51 Franklin Street, Fifth Floor
- ~ Boston, MA 02110-1301 USA
- -->
-<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
-"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
-<chapter id="search-mapping" revision="3">
- <!-- $Id$ -->
-
- <title>Mapping entities to the index structure</title>
-
- <para>All the metadata information needed to index entities is described
- through annotations. There is no need for xml mapping files. In fact there
- is currently no xml configuration option available (see <ulink
- url="http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-210">HSEARCH-210</ulink>).
- You can still use hibernate mapping files for the basic Hibernate
- configuration, but the Search specific configuration has to be expressed via
- annotations.</para>
-
- <section id="search-mapping-entity" revision="3">
- <title>Mapping an entity</title>
-
- <section id="basic-mapping">
- <title>Basic mapping</title>
-
- <para>First, we must declare a persistent class as indexable. This is
- done by annotating the class with <literal>@Indexed</literal> (all
- entities not annotated with <literal>@Indexed</literal> will be ignored
- by the indexing process):</para>
-
- <example>
- <title>Making a class indexable using the
- <classname>@Indexed</classname> annotation</title>
-
- <programlisting>@Entity
-<emphasis role="bold">@Indexed(index="indexes/essays")</emphasis>
-public class Essay {
- ...
-}</programlisting>
- </example>
-
- <para>The <literal>index</literal> attribute tells Hibernate what the
- Lucene directory name is (usually a directory on your file system). It
- is recommended to define a base directory for all Lucene indexes using
- the <literal>hibernate.search.default.indexBase</literal> property in
- your configuration file. Alternatively you can specify a base directory
- per indexed entity by specifying
- <literal>hibernate.search.&lt;index&gt;.indexBase</literal>, where
- <literal>&lt;index&gt;</literal> is the fully qualified classname of the
- indexed entity. Each entity instance will be represented by a Lucene
- <classname>Document</classname> inside the given index (aka
- Directory).</para>
-
- <para>For each property (or attribute) of your entity, you have the
- ability to describe how it will be indexed. The default (no annotation
- present) means that the property is completely ignored by the indexing
- process. <literal>@Field</literal> declares a property as indexed.
- When indexing an element to a Lucene document you can specify how it is
- indexed:</para>
-
- <itemizedlist>
- <listitem>
- <para><literal>name</literal> : describe under which name, the
- property should be stored in the Lucene Document. The default value
- is the property name (following the JavaBeans convention)</para>
- </listitem>
-
- <listitem>
- <para><literal>store</literal> : describe whether or not the
- property is stored in the Lucene index. You can store the value
- <literal>Store.YES</literal> (consuming more space in the index but
- allowing projection, see <xref linkend="projections" /> for more
- information), store it in a compressed way
- <literal>Store.COMPRESS</literal> (this does consume more CPU), or
- avoid any storage <literal>Store.NO</literal> (this is the default
- value). When a property is stored, you can retrieve its original
- value from the Lucene Document. This is not related to whether the
- element is indexed or not.</para>
- </listitem>
-
- <listitem>
- <para>index: describe how the element is indexed and the type of
- information stored. The different values are
- <literal>Index.NO</literal> (no indexing, i.e. cannot be found by a
- query), <literal>Index.TOKENIZED</literal> (use an analyzer to
- process the property), <literal>Index.UN_TOKENIZED</literal> (no
- analyzer pre-processing), <literal>Index.NO_NORMS</literal> (do not
- store the normalization data). The default value is
- <literal>TOKENIZED</literal>.</para>
- </listitem>
-
- <listitem>
- <para>termVector: describes collections of term-frequency pairs.
- This attribute enables term vectors being stored during indexing so
- they are available within documents. The default value is
- TermVector.NO.</para>
-
- <para>The different values of this attribute are:</para>
-
- <informaltable align="left" width="">
- <tgroup cols="2">
- <thead>
- <row>
- <entry align="center">Value</entry>
-
- <entry align="center">Definition</entry>
- </row>
- </thead>
-
- <tbody>
- <row>
- <entry align="left">TermVector.YES</entry>
-
- <entry>Store the term vectors of each document. This
- produces two synchronized arrays, one contains document
- terms and the other contains the term's frequency.</entry>
- </row>
-
- <row>
- <entry align="left">TermVector.NO</entry>
-
- <entry>Do not store term vectors.</entry>
- </row>
-
- <row>
- <entry align="left">TermVector.WITH_OFFSETS</entry>
-
- <entry>Store the term vector and token offset information.
- This is the same as TermVector.YES plus it contains the
- starting and ending offset position information for the
- terms.</entry>
- </row>
-
- <row>
- <entry align="left">TermVector.WITH_POSITIONS</entry>
-
- <entry>Store the term vector and token position information.
- This is the same as TermVector.YES plus it contains the
- ordinal positions of each occurrence of a term in a
- document.</entry>
- </row>
-
- <row>
- <entry
- align="left">TermVector.WITH_POSITIONS_OFFSETS</entry>
-
- <entry>Store the term vector, token position and offset
- information. This is a combination of the YES, WITH_OFFSETS
- and WITH_POSITIONS.</entry>
- </row>
- </tbody>
- </tgroup>
- </informaltable>
- </listitem>
- </itemizedlist>
-
- <para>Whether or not you want to store the original data in the index
- depends on how you wish to use the index query result. For a regular
- Hibernate Search usage storing is not necessary. However you might want
- to store some fields to subsequently project them (see <xref
- linkend="projections" /> for more information).</para>
-
- <para>Whether or not you want to tokenize a property depends on whether
- you wish to search the element as is, or by the words it contains. It
- makes sense to tokenize a text field, but probably not a date field.
- Note that fields used for sorting must not be
- tokenized.</para>
-
- <para>Finally, the id property of an entity is a special property used
- by Hibernate Search to ensure index uniqueness of a given entity. By
- design, an id has to be stored and must not be tokenized. To mark a
- property as index id, use the <literal>@DocumentId</literal> annotation.
- If you are using Hibernate Annotations and you have specified @Id you
- can omit @DocumentId. The chosen entity id will also be used as document
- id.</para>
-
- <example>
- <title>Adding <classname>@DocumentId</classname> and
- <classname>@Field</classname> annotations to an indexed entity</title>
-
- <programlisting>@Entity
-@Indexed(index="indexes/essays")
-public class Essay {
- ...
-
- @Id
- <emphasis role="bold">@DocumentId</emphasis>
- public Long getId() { return id; }
-
- <emphasis role="bold">@Field(name="Abstract", index=Index.TOKENIZED, store=Store.YES)</emphasis>
- public String getSummary() { return summary; }
-
- @Lob
- <emphasis role="bold">@Field(index=Index.TOKENIZED)</emphasis>
- public String getText() { return text; }
-}</programlisting>
- </example>
-
- <para>The above annotations define an index with three fields:
- <literal>id</literal> , <literal>Abstract</literal> and
- <literal>text</literal> . Note that by default the field name is
- decapitalized, following the JavaBean specification.</para>
- </section>
-
- <section>
- <title>Mapping properties multiple times</title>
-
- <para>Sometimes one has to map a property multiple times per index, with
- slightly different indexing strategies. For example, sorting a query by
- field requires the field to be <literal>UN_TOKENIZED</literal>. If one
- wants to search by words in this property and still sort it, one needs to
- index it twice, once tokenized and once untokenized. @Fields allows you to
- achieve this goal.</para>
-
- <example>
- <title>Using @Fields to map a property multiple times</title>
-
- <programlisting>@Entity
-@Indexed(index = "Book" )
-public class Book {
- <emphasis role="bold">@Fields( {</emphasis>
- @Field(index = Index.TOKENIZED),
- @Field(name = "summary_forSort", index = Index.UN_TOKENIZED, store = Store.YES)
- <emphasis role="bold">} )</emphasis>
- public String getSummary() {
- return summary;
- }
-
- ...
-}</programlisting>
- </example>
-
- <para>The field <literal>summary</literal> is indexed twice, once as
- <literal>summary</literal> in a tokenized way, and once as
- <literal>summary_forSort</literal> in an untokenized way. @Field
- supports 2 attributes useful when @Fields is used:</para>
-
- <itemizedlist>
- <listitem>
- <para>analyzer: defines a @Analyzer annotation per field rather than
- per property</para>
- </listitem>
-
- <listitem>
- <para>bridge: defines a @FieldBridge annotation per field rather
- than per property</para>
- </listitem>
- </itemizedlist>
-
- <para>See below for more information about analyzers and field
- bridges.</para>
- </section>
-
- <section id="search-mapping-associated">
- <title>Embedded and associated objects</title>
-
- <para>Associated objects as well as embedded objects can be indexed as
- part of the root entity index. This is useful if you expect to search a
- given entity based on properties of associated objects. In the following
- example the aim is to return places where the associated city is Atlanta
- (in the Lucene query parser language, it would translate into
- <code>address.city:Atlanta</code>).</para>
-
- <example>
- <title>Using @IndexedEmbedded to index associations</title>
-
- <programlisting>@Entity
-@Indexed
-public class Place {
- @Id
- @GeneratedValue
- @DocumentId
- private Long id;
-
- @Field( index = Index.TOKENIZED )
- private String name;
-
- @OneToOne( cascade = { CascadeType.PERSIST, CascadeType.REMOVE } )
- <emphasis role="bold">@IndexedEmbedded</emphasis>
- private Address address;
- ....
-}
-
-@Entity
-public class Address {
- @Id
- @GeneratedValue
- private Long id;
-
- @Field(index=Index.TOKENIZED)
- private String street;
-
- @Field(index=Index.TOKENIZED)
- private String city;
-
- <emphasis role="bold">@ContainedIn</emphasis>
- @OneToMany(mappedBy="address")
- private Set<Place> places;
- ...
-}</programlisting>
- </example>
-
- <para>In this example, the place fields will be indexed in the
- <literal>Place</literal> index. The <literal>Place</literal> index
- documents will also contain the fields <literal>address.id</literal>,
- <literal>address.street</literal>, and <literal>address.city</literal>
- which you will be able to query. This is enabled by the
- <literal>@IndexedEmbedded</literal> annotation.</para>
-
- <para>Be careful. Because the data is denormalized in the Lucene index
- when using the <classname>@IndexedEmbedded</classname> technique,
- Hibernate Search needs to be aware of any change in the
- <classname>Place</classname> object and any change in the
- <classname>Address</classname> object to keep the index up to date. To
- make sure the <literal><classname>Place</classname></literal> Lucene
- document is updated when its <classname>Address</classname> changes,
- you need to mark the other side of the bidirectional relationship with
- <classname>@ContainedIn</classname>.</para>
-
- <para><literal>@ContainedIn</literal> is only useful on associations
- pointing to entities as opposed to embedded (collection of)
- objects.</para>
-
- <para>Let's make our example a bit more complex:</para>
-
- <example>
- <title>Nested usage of <classname>@IndexedEmbedded</classname> and
- <classname>@ContainedIn</classname></title>
-
- <programlisting>@Entity
-@Indexed
-public class Place {
- @Id
- @GeneratedValue
- @DocumentId
- private Long id;
-
- @Field( index = Index.TOKENIZED )
- private String name;
-
- @OneToOne( cascade = { CascadeType.PERSIST, CascadeType.REMOVE } )
- <emphasis role="bold">@IndexedEmbedded</emphasis>
- private Address address;
- ....
-}
-
-@Entity
-public class Address {
- @Id
- @GeneratedValue
- private Long id;
-
- @Field(index=Index.TOKENIZED)
- private String street;
-
- @Field(index=Index.TOKENIZED)
- private String city;
-
- <emphasis role="bold">@IndexedEmbedded(depth = 1, prefix = "ownedBy_")</emphasis>
- private Owner ownedBy;
-
- <emphasis role="bold">@ContainedIn</emphasis>
- @OneToMany(mappedBy="address")
- private Set<Place> places;
- ...
-}
-
-@Embeddable
-public class Owner {
- @Field(index = Index.TOKENIZED)
- private String name;
- ...
-}</programlisting>
- </example>
-
- <para>Any <literal>@*ToMany, @*ToOne</literal> and
- <literal>@Embedded</literal> attribute can be annotated with
- <literal>@IndexedEmbedded</literal>. The attributes of the associated
- class will then be added to the main entity index. In the previous
- example, the index will contain the following fields:</para>
-
- <itemizedlist>
- <listitem>
- <para>id</para>
- </listitem>
-
- <listitem>
- <para>name</para>
- </listitem>
-
- <listitem>
- <para>address.street</para>
- </listitem>
-
- <listitem>
- <para>address.city</para>
- </listitem>
-
- <listitem>
- <para>address.ownedBy_name</para>
- </listitem>
- </itemizedlist>
-
- <para>The default prefix is <literal>propertyName.</literal>, following
- the traditional object navigation convention. You can override it using
- the <literal>prefix</literal> attribute as it is shown on the
- <literal>ownedBy</literal> property.</para>
-
- <note>
- <para>The prefix cannot be set to the empty string. </para>
- </note>
-
- <para>The <literal>depth</literal> property is necessary when the object
- graph contains a cyclic dependency of classes (not instances), for
- example if <classname>Owner</classname> points to
- <classname>Place</classname>. Hibernate Search will stop including
- indexed embedded attributes after reaching the expected depth (or when the
- object graph boundaries are reached). A class having a self reference is
- an example of cyclic dependency. In our example, because
- <literal>depth</literal> is set to 1, any
- <literal>@IndexedEmbedded</literal> attribute in Owner (if any) will be
- ignored. </para>
-
- <para>Using <literal>@IndexedEmbedded</literal> for object associations
- allows you to express queries such as:</para>
-
- <itemizedlist>
- <listitem>
- <para>Return places where name contains JBoss and where address city
- is Atlanta. In Lucene query this would be</para>
-
- <programlisting>+name:jboss +address.city:atlanta </programlisting>
- </listitem>
-
- <listitem>
- <para>Return places where name contains JBoss and where owner's name
- contains Joe. In Lucene query this would be</para>
-
- <programlisting>+name:jboss +address.ownedBy_name:joe </programlisting>
- </listitem>
- </itemizedlist>
-
- <para>In a way it mimics the relational join operation in a more
- efficient way (at the cost of data duplication). Remember that, out of
- the box, Lucene indexes have no notion of association, the join
- operation is simply non-existent. It might help to keep the relational
- model normalized while benefiting from the full text index speed and
- feature richness.</para>
-
- <para><note>
- <para>An associated object can itself (but does not have to) be
- <literal>@Indexed</literal></para>
- </note></para>
-
- <para>When @IndexedEmbedded points to an entity, the association has to
- be bidirectional and the other side has to be annotated with
- <literal>@ContainedIn</literal> (as seen in the previous example). If
- not, Hibernate Search has no way to update the root index when the
- associated entity is updated (in our example, a <literal>Place</literal>
- index document has to be updated when the associated
- <classname>Address</classname> instance is updated).</para>
-
- <para>Sometimes, the object type annotated by
- <classname>@IndexedEmbedded</classname> is not the object type targeted
- by Hibernate and Hibernate Search. This is especially the case when
- interfaces are used in lieu of their implementation. For this reason you
- can override the object type targeted by Hibernate Search using the
- <methodname>targetElement</methodname> parameter.</para>
-
- <example>
- <title>Using the <literal>targetElement</literal> property of
- <classname>@IndexedEmbedded</classname></title>
-
- <programlisting>@Entity
-@Indexed
-public class Address {
- @Id
- @GeneratedValue
- @DocumentId
- private Long id;
-
- @Field(index= Index.TOKENIZED)
- private String street;
-
- @IndexedEmbedded(depth = 1, prefix = "ownedBy_", <emphasis role="bold">targetElement = Owner.class</emphasis>)
- @Target(Owner.class)
- private Person ownedBy;
-
-
- ...
-}
-
-@Embeddable
-public class Owner implements Person { ... }</programlisting>
- </example>
- </section>
-
- <section>
- <title>Boost factor</title>
-
- <para>Lucene has the notion of <emphasis>boost factor</emphasis>. It's a
- way to give more weight to a field or to an indexed element over others
- during the indexing process. You can use <literal>@Boost</literal> at
- the @Field, method or class level.</para>
-
- <example>
- <title>Using different ways of increasing the weight of an indexed
- element using a boost factor</title>
-
- <programlisting>@Entity
-@Indexed(index="indexes/essays")
-<emphasis role="bold">@Boost(1.7f)</emphasis>
-public class Essay {
- ...
-
- @Id
- @DocumentId
- public Long getId() { return id; }
-
- @Field(name="Abstract", index=Index.TOKENIZED, store=Store.YES, boost=<emphasis
- role="bold">@Boost(2f)</emphasis>)
- <emphasis role="bold">@Boost(1.5f)</emphasis>
- public String getSummary() { return summary; }
-
- @Lob
- @Field(index=Index.TOKENIZED, boost=<emphasis role="bold">@Boost(1.2f)</emphasis>)
- public String getText() { return text; }
-
- @Field
- public String getISBN() { return isbn; }
-
-} </programlisting>
- </example>
-
- <para>In our example, <classname>Essay</classname>'s probability to
- reach the top of the search list will be multiplied by 1.7. The
- <methodname>summary</methodname> field will be 3.0 times (2 * 1.5, since
- <methodname>@Field.boost</methodname> and <classname>@Boost</classname>
- on a property are cumulative) more important than the
- <methodname>isbn</methodname> field. The <methodname>text</methodname>
- field will be 1.2 times more important than the
- <methodname>isbn</methodname> field. Note that this explanation is,
- strictly speaking, wrong, but it is simple and close enough to
- reality for all practical purposes. For details, check the Lucene
- documentation or the excellent <citetitle>Lucene In Action</citetitle>
- by Otis Gospodnetic and Erik Hatcher.</para>
- </section>
-
- <section id="analyzer">
- <title>Analyzer</title>
-
- <para>The default analyzer class used to index tokenized fields is
- configurable through the <literal>hibernate.search.analyzer</literal>
- property. The default value for this property is
- <classname>org.apache.lucene.analysis.standard.StandardAnalyzer</classname>.</para>
-
- <para>You can also define the analyzer class per entity, property and
- even per @Field (useful when multiple fields are indexed from a single
- property).</para>
-
- <example>
- <title>Different ways of specifying an analyzer</title>
-
- <programlisting>@Entity
-@Indexed
-<emphasis role="bold">@Analyzer(impl = EntityAnalyzer.class)</emphasis>
-public class MyEntity {
- @Id
- @GeneratedValue
- @DocumentId
- private Integer id;
-
- @Field(index = Index.TOKENIZED)
- private String name;
-
- @Field(index = Index.TOKENIZED)
- <emphasis role="bold">@Analyzer(impl = PropertyAnalyzer.class)</emphasis>
- private String summary;
-
- @Field(index = Index.TOKENIZED, <emphasis role="bold">analyzer = @Analyzer(impl = FieldAnalyzer.class)</emphasis>)
- private String body;
-
- ...
-}</programlisting>
- </example>
-
- <para>In this example, <classname>EntityAnalyzer</classname> is used to
- index all tokenized properties (eg. <literal>name</literal>), except
- <literal>summary</literal> and <literal>body</literal> which are indexed
- with <classname>PropertyAnalyzer</classname> and
- <classname>FieldAnalyzer</classname> respectively.</para>
-
- <caution>
- <para>Mixing different analyzers in the same entity is usually bad
- practice. It makes query building more complex and results
- less predictable (for the novice), especially if you are using a
- QueryParser (which uses the same analyzer for the whole query). As a
- rule of thumb, for any given field the same analyzer should be used
- for indexing and querying.</para>
- </caution>
-
- <section>
- <title>Analyzer definitions</title>
-
- <para>Analyzers can become quite complex to deal with, which is why
- Hibernate Search introduces the notion of analyzer definitions. An
- analyzer definition can be reused by many
- <classname>@Analyzer</classname> declarations. An analyzer definition
- is composed of:</para>
-
- <itemizedlist>
- <listitem>
- <para>a name: the unique string used to refer to the
- definition</para>
- </listitem>
-
- <listitem>
- <para>a tokenizer: responsible for tokenizing the input stream
- into individual words</para>
- </listitem>
-
- <listitem>
- <para>a list of filters: each filter is responsible for removing,
- modifying or sometimes even adding words to the stream provided by the
- tokenizer</para>
- </listitem>
- </itemizedlist>
-
- <para>This separation of tasks - a tokenizer followed by a list of
- filters - allows for easy reuse of each individual component and lets
- you build your customized analyzer in a very flexible way (just like
- Lego). Generally speaking the <classname>Tokenizer</classname> starts
- the analysis process by turning the character input into tokens which
- are then further processed by the <classname>TokenFilter</classname>s.
- Hibernate Search supports this infrastructure by utilizing the Solr
- analyzer framework. Make sure to add <filename>solr-core.jar</filename>
- and <filename>solr-common.jar</filename> to your classpath to
- use analyzer definitions. If you also want to use a
- snowball stemmer, include
- <filename>lucene-snowball.jar</filename> as well. Other Solr analyzers might
- depend on more libraries. For example, the
- <classname>PhoneticFilterFactory</classname> depends on <ulink
- url="http://commons.apache.org/codec">commons-codec</ulink>. Your
- distribution of Hibernate Search provides these dependencies in its
- <filename>lib</filename> directory.</para>
-
- <example>
- <title><classname>@AnalyzerDef</classname> and the Solr
- framework</title>
-
- <programlisting>@AnalyzerDef(name="customanalyzer",
- tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
- filters = {
- @TokenFilterDef(factory = ISOLatin1AccentFilterFactory.class),
- @TokenFilterDef(factory = LowerCaseFilterFactory.class),
- @TokenFilterDef(factory = StopFilterFactory.class, params = {
- @Parameter(name="words", value= "org/hibernate/search/test/analyzer/solr/stoplist.properties" ),
- @Parameter(name="ignoreCase", value="true")
- })
-})
-public class Team {
- ...
-}</programlisting>
- </example>
-
- <para>A tokenizer is defined by its factory, which is responsible for
- building the tokenizer using the optional list of parameters. This
- example uses the standard tokenizer. A filter is defined by its factory,
- which is responsible for creating the filter instance using the
- optional parameters. In our example, the StopFilter filter is built
- by reading the dedicated words property file and is expected to ignore
- case. The list of parameters depends on the tokenizer or filter
- factory.</para>
-
- <warning>
- <para>Filters are applied in the order they are defined in the
- <classname>@AnalyzerDef</classname> annotation. Make sure to think
- twice about this order.</para>
- </warning>
-
- <para>Once defined, an analyzer definition can be reused by an
- <classname>@Analyzer</classname> declaration using the definition name
- rather than declaring an implementation class.</para>
-
- <example>
- <title>Referencing an analyzer by name</title>
-
- <programlisting>@Entity
-@Indexed
-@AnalyzerDef(name="customanalyzer", ... )
-public class Team {
- @Id
- @DocumentId
- @GeneratedValue
- private Integer id;
-
- @Field
- private String name;
-
- @Field
- private String location;
-
- @Field <emphasis role="bold">@Analyzer(definition = "customanalyzer")</emphasis>
- private String description;
-}</programlisting>
- </example>
-
- <para>Analyzer instances declared by
- <classname>@AnalyzerDef</classname> are available by their name in the
- <classname>SearchFactory</classname>.</para>
-
- <programlisting>Analyzer analyzer = fullTextSession.getSearchFactory().getAnalyzer("customanalyzer");</programlisting>
-
- <para>This is quite useful when building queries. Fields in queries
- should be analyzed with the same analyzer used to index the field so
- that they speak a common "language": the same tokens are reused
- between the query and the indexing process. This rule has some
- exceptions but is true most of the time. Respect it unless you know
- what you are doing.</para>
- </section>
-
- <section>
- <title>Available analyzers</title>
-
- <para>Solr and Lucene come with a lot of useful default tokenizers and
- filters. You can find a complete list of tokenizer factories and
- filter factories at <ulink
- url="http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters">http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters</ulink>.
- Let's look at a few of them.</para>
-
- <table>
- <title>Some of the tokenizers available</title>
-
- <tgroup cols="3">
- <thead>
- <row>
- <entry align="center">Factory</entry>
-
- <entry align="center">Description</entry>
-
- <entry align="center">parameters</entry>
- </row>
- </thead>
-
- <tbody>
- <row>
- <entry>StandardTokenizerFactory</entry>
-
- <entry>Use the Lucene StandardTokenizer</entry>
-
- <entry>none</entry>
- </row>
-
- <row>
- <entry>HTMLStripStandardTokenizerFactory</entry>
-
- <entry>Remove HTML tags, keep the text and pass it to a
- StandardTokenizer</entry>
-
- <entry>none</entry>
- </row>
- </tbody>
- </tgroup>
- </table>
-
- <table>
- <title>Some of the filters available</title>
-
- <tgroup cols="3">
- <thead>
- <row>
- <entry align="center">Factory</entry>
-
- <entry align="center">Description</entry>
-
- <entry align="center">parameters</entry>
- </row>
- </thead>
-
- <tbody>
- <row>
- <entry>StandardFilterFactory</entry>
-
- <entry>Remove dots from acronyms and 's from words</entry>
-
- <entry>none</entry>
- </row>
-
- <row>
- <entry>LowerCaseFilterFactory</entry>
-
- <entry>Lowercase words</entry>
-
- <entry>none</entry>
- </row>
-
- <row>
- <entry>StopFilterFactory</entry>
-
- <entry>remove words (tokens) matching a list of stop
- words</entry>
-
- <entry><para><literal>words</literal>: points to a resource
- file containing the stop words</para><para><literal>ignoreCase</literal>:
- <literal>true</literal> if case should be ignored when comparing stop
- words, <literal>false</literal> otherwise</para></entry>
- </row>
-
- <row>
- <entry>SnowballPorterFilterFactory</entry>
-
- <entry>Reduces a word to its root in a given language (e.g.
- protect, protects, protection share the same root). Using such
- a filter allows searches matching related words.</entry>
-
- <entry><para><literal>language</literal>: Danish, Dutch,
- English, Finnish, French, German, Italian, Norwegian,
- Portuguese, Russian, Spanish, Swedish and a few
- more</para></entry>
- </row>
-
- <row>
- <entry>ISOLatin1AccentFilterFactory</entry>
-
- <entry>remove accents for languages like French</entry>
-
- <entry>none</entry>
- </row>
- </tbody>
- </tgroup>
- </table>
-
- <para>We recommend browsing the implementations of
- <classname>org.apache.solr.analysis.TokenizerFactory</classname> and
- <classname>org.apache.solr.analysis.TokenFilterFactory</classname> in
- your IDE to see what is available.</para>
- </section>
-
- <section>
- <title>Analyzer discriminator (experimental)</title>
-
- <para>So far all the ways introduced to specify an analyzer were
- static. However, there are use cases where it is useful to select an
- analyzer depending on the current state of the entity to be indexed,
- for example in a multilingual application. For a
- <classname>BlogEntry</classname> class, for example, the analyzer could
- depend on the language property of the entry. Depending on this
- property the correct language-specific stemmer should be chosen to
- index the actual text.</para>
-
- <para>To enable this dynamic analyzer selection Hibernate Search
- introduces the <classname>AnalyzerDiscriminator</classname>
- annotation. The following example demonstrates the usage of this
- annotation:</para>
-
- <para><example>
- <title>Usage of @AnalyzerDiscriminator in order to select an
- analyzer depending on the entity state</title>
-
- <programlisting>@Entity
-@Indexed
-@AnalyzerDefs({
- @AnalyzerDef(name = "en",
- tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
- filters = {
- @TokenFilterDef(factory = LowerCaseFilterFactory.class),
- @TokenFilterDef(factory = EnglishPorterFilterFactory.class
- )
- }),
- @AnalyzerDef(name = "de",
- tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
- filters = {
- @TokenFilterDef(factory = LowerCaseFilterFactory.class),
- @TokenFilterDef(factory = GermanStemFilterFactory.class)
- })
-})
-public class BlogEntry {
-
- @Id
- @GeneratedValue
- @DocumentId
- private Integer id;
-
- @Field
- @AnalyzerDiscriminator(impl = LanguageDiscriminator.class)
- private String language;
-
- @Field
- private String text;
-
- private Set<BlogEntry> references;
-
- // standard getter/setter
- ...
-}</programlisting>
-
- <programlisting>public class LanguageDiscriminator implements Discriminator {
-
- public String getAnanyzerDefinitionName(Object value, Object entity, String field) {
- if ( value == null || !( entity instanceof BlogEntry ) ) {
- return null;
- }
- return (String) value;
- }
-}</programlisting>
- </example>The prerequisite for using
- <classname>@AnalyzerDiscriminator</classname> is that all analyzers
- which are going to be used are predefined via
- <classname>@AnalyzerDef</classname> definitions. If this is the case
- one can place the <classname>@AnalyzerDiscriminator</classname>
- annotation either on the class or on a specific property of the entity
- for which to dynamically select an analyzer. Via the
- <literal>impl</literal> parameter of the
- <classname>AnalyzerDiscriminator</classname> you specify a concrete
- implementation of the <classname>Discriminator</classname> interface.
- It is up to you to provide an implementation for this interface. The
- only method you have to implement is
- <classname>getAnanyzerDefinitionName()</classname> which gets called
- for each field added to the Lucene document. The entity which is
- getting indexed is also passed to the interface method. The
- <literal>value</literal> parameter is only set if the
- <classname>AnalyzerDiscriminator</classname> is placed on property
- level instead of class level. In this case the value represents the
- current value of this property.</para>
-
- <para>An implementation of the <classname>Discriminator</classname>
- interface has to return the name of an existing analyzer definition if
- the analyzer should be set dynamically or <classname>null</classname>
- if the default analyzer should not be overridden. The given example
- assumes that the language property is either 'de' or 'en', which
- matches the specified names in the
- <classname>@AnalyzerDef</classname>s.</para>
-
- <note>
- <para>The <classname>@AnalyzerDiscriminator</classname> is currently
- still experimental and the API might still change. We are hoping for
- some feedback from the community about the usefulness and usability
- of this feature.</para>
- </note>
- </section>
-
- <section id="analyzer-retrievinganalyzer">
- <title>Retrieving an analyzer</title>
-
- <para>During indexing time, Hibernate Search is using analyzers under
- the hood for you. In some situations, retrieving analyzers can be
- handy. If your domain model makes use of multiple analyzers (maybe to
- benefit from stemming, use phonetic approximation and so on), you need
- to make sure to use the same analyzers when you build your
- query.</para>
-
- <note>
- <para>This rule can be broken but you need a good reason for it. If
- you are unsure, use the same analyzers.</para>
- </note>
-
- <para>You can retrieve the scoped analyzer for a given entity used at
- indexing time by Hibernate Search. A scoped analyzer is an analyzer
- which applies the right analyzers depending on the field indexed:
- multiple analyzers can be defined on a given entity, each one working
- on an individual field, and a scoped analyzer unifies all these analyzers
- into a context-aware analyzer. While the theory seems a bit complex,
- using the right analyzer in a query is very easy.</para>
-
- <example>
- <title>Using the scoped analyzer when building a full-text
- query</title>
-
- <programlisting>org.apache.lucene.queryParser.QueryParser parser = new QueryParser(
- "title",
- fullTextSession.getSearchFactory().getAnalyzer( Song.class )
-);
-
-org.apache.lucene.search.Query luceneQuery =
- parser.parse( "title:sky OR title_stemmed:diamond" );
-
-org.hibernate.Query fullTextQuery =
- fullTextSession.createFullTextQuery( luceneQuery, Song.class );
-
-List result = fullTextQuery.list(); //return a list of managed objects </programlisting>
- </example>
-
- <para>In the example above, the song title is indexed in two fields:
- the standard analyzer is used in the field <literal>title</literal>
- and a stemming analyzer is used in the field
- <literal>title_stemmed</literal>. By using the analyzer provided by
- the search factory, the query uses the appropriate analyzer depending
- on the field targeted.</para>
-
- <para>If your query targets more than one query and you wish to use
- your standard analyzer, make sure to describe it using an analyzer
- definition. You can retrieve analyzers by their definition name using
- <code>searchFactory.getAnalyzer(String)</code>.</para>
- </section>
- </section>
- </section>
-
- <section id="search-mapping-bridge">
- <title>Property/Field Bridge</title>
-
- <para>In Lucene all index fields have to be represented as Strings. For
- this reason all entity properties annotated with <literal>@Field</literal>
- have to be indexed in a String form. For most of your properties,
- Hibernate Search does the translation job for you thanks to a built-in set
- of bridges. In some cases, though, you need more fine-grained control over
- the translation process.</para>
-
- <section>
- <title>Built-in bridges</title>
-
- <para>Hibernate Search comes bundled with a set of built-in bridges
- between a Java property type and its full text representation.</para>
-
- <variablelist>
- <varlistentry>
- <term>null</term>
-
- <listitem>
- <para>null elements are not indexed. Lucene does not support null
- elements and this does not make much sense either.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>java.lang.String</term>
-
- <listitem>
- <para>Strings are indexed as is.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>short, Short, int, Integer, long, Long, float, Float,
- double, Double, BigInteger, BigDecimal</term>
-
- <listitem>
- <para>Numbers are converted to their String representation. Note
- that numbers cannot be compared by Lucene (i.e. used in range
- queries) out of the box: they have to be padded. <note>
- <para>Using a Range query is debatable and has drawbacks. An
- alternative approach is to use a Filter query which will
- filter the result query to the appropriate range.</para>
-
- <para>Hibernate Search will support a padding mechanism.</para>
- </note></para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>java.util.Date</term>
-
- <listitem>
- <para>Dates are stored as yyyyMMddHHmmssSSS in GMT time
- (20061107210300012 for Nov 7th of 2006 4:03PM and 12ms EST). You
- don't really need to worry about the internal format. What is
- important is that when using a DateRange Query, you should know
- that the dates have to be expressed in GMT time.</para>
-
- <para>Usually, storing the date up to the millisecond is not
- necessary. <literal>@DateBridge</literal> defines the appropriate
- resolution you are willing to store in the index (
- <literal>@DateBridge(resolution=Resolution.DAY)</literal>
- ). The date pattern will then be truncated
- accordingly.</para>
-
- <programlisting>@Entity
-@Indexed
-public class Meeting {
- @Field(index=Index.UN_TOKENIZED)
- <emphasis role="bold">@DateBridge(resolution=Resolution.MINUTE)</emphasis>
- private Date date;
- ... </programlisting>
-
- <warning>
- <para>A Date whose resolution is lower than
- <literal>MILLISECOND</literal> cannot be a
- <literal>@DocumentId</literal></para>
- </warning>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>java.net.URI, java.net.URL</term>
-
- <listitem>
- <para>URI and URL are converted to their string
- representation</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>java.lang.Class</term>
-
- <listitem>
- <para>Classes are converted to their fully qualified class name. The
- thread context classloader is used when the class is
- rehydrated.</para>
- </listitem>
- </varlistentry>
- </variablelist>
- </section>
-
- <section>
- <title>Custom Bridge</title>
-
- <para>Sometimes, the built-in bridges of Hibernate Search do not cover
- some of your property types, or the String representation used by the
- bridge does not meet your requirements. The following paragraphs
- describe several solutions to this problem.</para>
-
- <section>
- <title>StringBridge</title>
-
- <para>The simplest custom solution is to give Hibernate Search an
- implementation of your expected
- <emphasis><classname>Object</classname></emphasis> to
- <classname>String</classname> bridge. To do so you need to implement
- the <literal>org.hibernate.search.bridge.StringBridge</literal>
- interface. All implementations have to be thread-safe as they are used
- concurrently.</para>
-
- <example>
- <title>Implementing your own
- <classname>StringBridge</classname></title>
-
- <programlisting>/**
- * Padding Integer bridge.
- * All numbers will be padded with 0 to match 5 digits
- *
- * @author Emmanuel Bernard
- */
-public class PaddedIntegerBridge implements <emphasis role="bold">StringBridge</emphasis> {
-
- private int PADDING = 5;
-
- <emphasis role="bold">public String objectToString(Object object)</emphasis> {
- String rawInteger = ( (Integer) object ).toString();
- if (rawInteger.length() > PADDING)
- throw new IllegalArgumentException( "Try to pad on a number too big" );
- StringBuilder paddedInteger = new StringBuilder( );
- for ( int padIndex = rawInteger.length() ; padIndex < PADDING ; padIndex++ ) {
- paddedInteger.append('0');
- }
- return paddedInteger.append( rawInteger ).toString();
- }
-} </programlisting>
- </example>
-
- <para>Then any property or field can use this bridge thanks to the
- <literal>@FieldBridge</literal> annotation</para>
-
- <programlisting><emphasis role="bold">@FieldBridge(impl = PaddedIntegerBridge.class)</emphasis>
-private Integer length; </programlisting>
-
- <para>Parameters can be passed to the Bridge implementation making it
- more flexible. The Bridge implementation implements a
- <classname>ParameterizedBridge</classname> interface, and the
- parameters are passed through the <literal>@FieldBridge</literal>
- annotation.</para>
-
- <example>
- <title>Passing parameters to your bridge implementation</title>
-
- <programlisting>public class PaddedIntegerBridge implements StringBridge, <emphasis
- role="bold">ParameterizedBridge</emphasis> {
-
- public static String PADDING_PROPERTY = "padding";
- private int padding = 5; //default
-
- <emphasis role="bold">public void setParameterValues(Map parameters)</emphasis> {
- Object padding = parameters.get( PADDING_PROPERTY );
- if (padding != null) this.padding = Integer.parseInt( (String) padding ); // @Parameter values are Strings
- }
-
- public String objectToString(Object object) {
- String rawInteger = ( (Integer) object ).toString();
- if (rawInteger.length() > padding)
- throw new IllegalArgumentException( "Try to pad on a number too big" );
- StringBuilder paddedInteger = new StringBuilder( );
- for ( int padIndex = rawInteger.length() ; padIndex < padding ; padIndex++ ) {
- paddedInteger.append('0');
- }
- return paddedInteger.append( rawInteger ).toString();
- }
-}
-
-
-//property
-@FieldBridge(impl = PaddedIntegerBridge.class,
- <emphasis role="bold">params = @Parameter(name="padding", value="10")</emphasis>
- )
-private Integer length; </programlisting>
- </example>
-
- <para>The <classname>ParameterizedBridge</classname> interface can be
- implemented by <classname>StringBridge</classname>,
- <classname>TwoWayStringBridge</classname> and
- <classname>FieldBridge</classname> implementations.</para>
-
- <para>All implementations have to be thread-safe, but the parameters
- are set during initialization and no special care is required at this
- stage.</para>
-
- <para>If you expect to use your bridge implementation on an id
- property (i.e. annotated with <literal>@DocumentId</literal>), you need
- to use a slightly extended version of <literal>StringBridge</literal>
- named <classname>TwoWayStringBridge</classname>. Hibernate Search
- needs to read the string representation of the identifier and generate
- the object out of it. There is no difference in the way the
- <literal>@FieldBridge</literal> annotation is used.</para>
-
- <example>
- <title>Implementing a TwoWayStringBridge which can for example be
- used for id properties</title>
-
- <programlisting>public class PaddedIntegerBridge implements TwoWayStringBridge, ParameterizedBridge {
-
- public static String PADDING_PROPERTY = "padding";
- private int padding = 5; //default
-
- public void setParameterValues(Map parameters) {
- Object padding = parameters.get( PADDING_PROPERTY );
- if (padding != null) this.padding = Integer.parseInt( (String) padding ); // @Parameter values are Strings
- }
-
- public String objectToString(Object object) {
- String rawInteger = ( (Integer) object ).toString();
- if (rawInteger.length() > padding)
- throw new IllegalArgumentException( "Try to pad on a number too big" );
- StringBuilder paddedInteger = new StringBuilder( );
- for ( int padIndex = rawInteger.length() ; padIndex < padding ; padIndex++ ) {
- paddedInteger.append('0');
- }
- return paddedInteger.append( rawInteger ).toString();
- }
-
- <emphasis role="bold">public Object stringToObject(String stringValue)</emphasis> {
- return new Integer(stringValue);
- }
-}
-
-
-//id property
-@DocumentId
-@FieldBridge(impl = PaddedIntegerBridge.class,
- params = @Parameter(name="padding", value="10") )
-private Integer id;
- </programlisting>
- </example>
-
- <para>It is critically important for the two-way process to be
- idempotent (ie object = stringToObject( objectToString( object ) )
- ).</para>
- </section>
-
- <section>
- <title>FieldBridge</title>
-
- <para>Some use cases require more than a simple object to string
- translation when mapping a property to a Lucene index. To give you the
- greatest possible flexibility you can also implement a bridge as a
- <classname>FieldBridge</classname>. This interface gives you a
- property value and lets you map it the way you want in your Lucene
- <classname>Document</classname>. The interface is very similar in
- concept to the Hibernate <classname>UserType</classname>s.</para>
-
- <para>You can for example store a given property in two different
- document fields:</para>
-
- <example>
- <title>Implementing the FieldBridge interface in order to store a given
- property in multiple document fields</title>
-
- <programlisting>/**
- * Store the date in 3 different fields - year, month, day - to ease Range Query per
- * year, month or day (eg get all the elements of December for the last 5 years).
- *
- * @author Emmanuel Bernard
- */
-public class DateSplitBridge implements FieldBridge {
- private final static TimeZone GMT = TimeZone.getTimeZone("GMT");
-
- <emphasis role="bold">public void set(String name, Object value, Document document,
- LuceneOptions luceneOptions)</emphasis> {
- Date date = (Date) value;
- Calendar cal = GregorianCalendar.getInstance(GMT);
- cal.setTime(date);
- int year = cal.get(Calendar.YEAR);
- int month = cal.get(Calendar.MONTH) + 1;
- int day = cal.get(Calendar.DAY_OF_MONTH);
-
- // set year
- Field field = new Field(name + ".year", String.valueOf(year),
- luceneOptions.getStore(), luceneOptions.getIndex(),
- luceneOptions.getTermVector());
- field.setBoost(luceneOptions.getBoost());
- document.add(field);
-
- // set month and pad it if needed
- field = new Field(name + ".month", ( month < 10 ? "0" : "" )
- + String.valueOf(month), luceneOptions.getStore(),
- luceneOptions.getIndex(), luceneOptions.getTermVector());
- field.setBoost(luceneOptions.getBoost());
- document.add(field);
-
- // set day and pad it if needed
- field = new Field(name + ".day", ( day < 10 ? "0" : "" )
- + String.valueOf(day), luceneOptions.getStore(),
- luceneOptions.getIndex(), luceneOptions.getTermVector());
- field.setBoost(luceneOptions.getBoost());
- document.add(field);
- }
-}
-
-//property
-<emphasis role="bold">@FieldBridge(impl = DateSplitBridge.class)</emphasis>
-private Date date; </programlisting>
- </example>
- </section>
-
- <section>
- <title>ClassBridge</title>
-
- <para>It is sometimes useful to combine more than one property of a
- given entity and index this combination in a specific way into the
- Lucene index. The <classname>@ClassBridge</classname> and
- <classname>@ClassBridges</classname> annotations can be defined at the
- class level (as opposed to the property level). In this case the
- custom field bridge implementation receives the entity instance as the
- value parameter instead of a particular property. Though not shown in
- this example, <classname>@ClassBridge</classname> supports the
- <methodname>termVector</methodname> attribute discussed in section
- <xref linkend="basic-mapping" />.</para>
-
- <example>
- <title>Implementing a class bridge</title>
-
- <programlisting>@Entity
-@Indexed
-<emphasis role="bold">@ClassBridge</emphasis>(name="branchnetwork",
- index=Index.TOKENIZED,
- store=Store.YES,
- impl = <emphasis role="bold">CatFieldsClassBridge.class</emphasis>,
- params = @Parameter( name="sepChar", value=" " ) )
-public class Department {
- private int id;
- private String network;
- private String branchHead;
- private String branch;
- private Integer maxEmployees;
- ...
-}
-
-
-public class CatFieldsClassBridge implements FieldBridge, ParameterizedBridge {
- private String sepChar;
-
- public void setParameterValues(Map parameters) {
- this.sepChar = (String) parameters.get( "sepChar" );
- }
-
- <emphasis role="bold">public void set(String name, Object value, Document document, LuceneOptions luceneOptions)</emphasis> {
- // In this particular class the name of the new field was passed
- // from the name field of the ClassBridge Annotation. This is not
- // a requirement. It just works that way in this instance. The
- // actual name could be supplied by hard coding it below.
- Department dep = (Department) value;
- String fieldValue1 = dep.getBranch();
- if ( fieldValue1 == null ) {
- fieldValue1 = "";
- }
- String fieldValue2 = dep.getNetwork();
- if ( fieldValue2 == null ) {
- fieldValue2 = "";
- }
- String fieldValue = fieldValue1 + sepChar + fieldValue2;
- Field field = new Field( name, fieldValue, luceneOptions.getStore(), luceneOptions.getIndex(), luceneOptions.getTermVector() );
- field.setBoost( luceneOptions.getBoost() );
- document.add( field );
- }
-}</programlisting>
- </example>
-
- <para>In this example, the particular
- <classname>CatFieldsClassBridge</classname> is applied to the
- <literal>department</literal> instance. The field bridge then
- concatenates both branch and network and indexes the
- concatenation.</para>
- </section>
- </section>
- </section>
-
- <section id="provided-id">
- <title>Providing your own id</title>
-
- <warning>
- <para>This part of the documentation is a work in progress.</para>
- </warning>
-
- <para>You can provide your own id for Hibernate Search if you are
- extending the internals. You will have to generate a unique value so it
- can be given to Lucene to be indexed. This will have to be given to
- Hibernate Search when you create an org.hibernate.search.Work object - the
- document id is required in the constructor.</para>
-
- <section id="ProvidedId">
- <title>The @ProvidedId annotation</title>
-
- <para>Unlike conventional Hibernate Search API and @DocumentId, this
- annotation is used on the class and not a field. You can also provide
- your own bridge implementation through the bridge attribute of
- @ProvidedId. Also, if you annotate a class with @ProvidedId, its
- subclasses will also get the annotation, although this is not done
- through java.lang.annotation.Inherited. Be sure, however,
- <emphasis>not</emphasis> to use this annotation together with
- @DocumentId, as your system will break.</para>
-
- <example>
- <title>Providing your own id</title>
-
- <programlisting>@ProvidedId (bridge = org.my.own.package.MyCustomBridge)
-@Indexed
-public class MyClass{
- @Field
- String MyString;
- ...
-}</programlisting>
- </example>
- </section>
- </section>
-</chapter>
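
Note on the StringBridge examples in the removed mapping.xml above: the padding and the two-way conversion can be tried without Hibernate Search on the classpath. The following is only a minimal sketch under stated assumptions. The class name PaddedIntegerRoundTrip and the explicit padding argument are hypothetical and are not part of the Hibernate Search API, and negative numbers are rejected because the documented bridge does not address them. It illustrates why padded strings sort like numbers and that stringToObject( objectToString( x ) ) gives back x.

```java
import java.util.Objects;

/** Standalone sketch of the padding round trip; it does not implement any Hibernate Search interface. */
public class PaddedIntegerRoundTrip {

    /** Left-pads a non-negative integer with zeros up to the given width. */
    public static String objectToString(Integer value, int padding) {
        Objects.requireNonNull(value, "value");
        if (value < 0) {
            // Assumption of this sketch: only non-negative numbers are indexed.
            throw new IllegalArgumentException("Negative numbers are not handled by this sketch");
        }
        String raw = value.toString();
        if (raw.length() > padding) {
            throw new IllegalArgumentException("Number too big to pad to " + padding + " digits");
        }
        StringBuilder padded = new StringBuilder();
        for (int i = raw.length(); i < padding; i++) {
            padded.append('0');
        }
        return padded.append(raw).toString();
    }

    /** Reverses objectToString; leading zeros are ignored by the decimal parser. */
    public static Integer stringToObject(String stringValue) {
        return Integer.valueOf(stringValue);
    }

    public static void main(String[] args) {
        String indexed = objectToString(42, 5);
        System.out.println(indexed);                          // 00042
        System.out.println(stringToObject(indexed));          // 42
        // Unpadded strings compare character by character, so "42" sorts after "100".
        System.out.println("42".compareTo("100") > 0);        // true
        // Padded strings compare in numeric order.
        System.out.println(indexed.compareTo(objectToString(100, 5)) < 0); // true
    }
}
```

The width is passed as an argument here only to keep the sketch free of state; in the documented bridge it comes from the "padding" parameter.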
Copied: search/tags/v3_1_0_GA/doc/reference/en/modules/mapping.xml (from rev 15661, search/trunk/doc/reference/en/modules/mapping.xml)
===================================================================
--- search/tags/v3_1_0_GA/doc/reference/en/modules/mapping.xml (rev 0)
+++ search/tags/v3_1_0_GA/doc/reference/en/modules/mapping.xml 2008-12-04 10:45:13 UTC (rev 15663)
@@ -0,0 +1,1451 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ ~ Hibernate, Relational Persistence for Idiomatic Java
+ ~
+ ~ Copyright (c) 2008, Red Hat Middleware LLC or third-party contributors as
+ ~ indicated by the @author tags or express copyright attribution
+ ~ statements applied by the authors. All third-party contributions are
+ ~ distributed under license by Red Hat Middleware LLC.
+ ~
+ ~ This copyrighted material is made available to anyone wishing to use, modify,
+ ~ copy, or redistribute it subject to the terms and conditions of the GNU
+ ~ Lesser General Public License, as published by the Free Software Foundation.
+ ~
+ ~ This program is distributed in the hope that it will be useful,
+ ~ but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+ ~ or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License
+ ~ for more details.
+ ~
+ ~ You should have received a copy of the GNU Lesser General Public License
+ ~ along with this distribution; if not, write to:
+ ~ Free Software Foundation, Inc.
+ ~ 51 Franklin Street, Fifth Floor
+ ~ Boston, MA 02110-1301 USA
+ -->
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
+"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
+<chapter id="search-mapping" revision="3">
+ <!-- $Id$ -->
+
+ <title>Mapping entities to the index structure</title>
+
+ <para>All the metadata information needed to index entities is described
+ through annotations. There is no need for xml mapping files. In fact there
+ is currently no xml configuration option available (see <ulink
+ url="http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-210">HSEARCH-210</ulink>).
+ You can still use hibernate mapping files for the basic Hibernate
+ configuration, but the Search specific configuration has to be expressed via
+ annotations.</para>
+
+ <section id="search-mapping-entity" revision="3">
+ <title>Mapping an entity</title>
+
+ <section id="basic-mapping">
+ <title>Basic mapping</title>
+
+ <para>First, we must declare a persistent class as indexable. This is
+ done by annotating the class with <literal>@Indexed</literal> (all
+ entities not annotated with <literal>@Indexed</literal> will be ignored
+ by the indexing process):</para>
+
+ <example>
+ <title>Making a class indexable using the
+ <classname>@Indexed</classname> annotation</title>
+
+ <programlisting>@Entity
+<emphasis role="bold">@Indexed(index="indexes/essays")</emphasis>
+public class Essay {
+ ...
+}</programlisting>
+ </example>
+
+ <para>The <literal>index</literal> attribute tells Hibernate what the
+ Lucene directory name is (usually a directory on your file system). It
+ is recommended to define a base directory for all Lucene indexes using
+ the <literal>hibernate.search.default.indexBase</literal> property in
+ your configuration file. Alternatively you can specify a base directory
+ per indexed entity by specifying
+ <literal>hibernate.search.<index>.indexBase</literal>, where
+ <literal><index></literal> is the fully qualified classname of the
+ indexed entity. Each entity instance will be represented by a Lucene
+ <classname>Document</classname> inside the given index (aka
+ Directory).</para>
+
+ <para>For each property (or attribute) of your entity, you have the
+ ability to describe how it will be indexed. The default (no annotation
+ present) means that the property is completely ignored by the indexing
+ process. <literal>@Field</literal> declares a property as indexed.
+ When indexing an element to a Lucene document you can specify how it is
+ indexed:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para><literal>name</literal>: describes the name under which the
+ property is stored in the Lucene Document. The default value
+ is the property name (following the JavaBeans convention).</para>
+ </listitem>
+
+ <listitem>
+ <para><literal>store</literal>: describes whether or not the
+ property is stored in the Lucene index. You can store the value
+ <literal>Store.YES</literal> (consuming more space in the index but
+ allowing projection, see <xref linkend="projections" /> for more
+ information), store it in a compressed way
+ <literal>Store.COMPRESS</literal> (this does consume more CPU), or
+ avoid any storage <literal>Store.NO</literal> (this is the default
+ value). When a property is stored, you can retrieve its original
+ value from the Lucene Document. This is not related to whether the
+ element is indexed or not.</para>
+ </listitem>
+
+ <listitem>
+ <para>index: describes how the element is indexed and the type of
+ information stored. The different values are
+ <literal>Index.NO</literal> (no indexing, i.e. cannot be found by a
+ query), <literal>Index.TOKENIZED</literal> (use an analyzer to
+ process the property), <literal>Index.UN_TOKENIZED</literal> (no
+ analyzer pre-processing), <literal>Index.NO_NORMS</literal> (do not
+ store the normalization data). The default value is
+ <literal>TOKENIZED</literal>.</para>
+ </listitem>
+
+ <listitem>
+ <para>termVector: describes collections of term-frequency pairs.
+ This attribute enables term vectors being stored during indexing so
+ they are available within documents. The default value is
+ TermVector.NO.</para>
+
+ <para>The different values of this attribute are:</para>
+
+ <informaltable align="left" width="">
+ <tgroup cols="2">
+ <thead>
+ <row>
+ <entry align="center">Value</entry>
+
+ <entry align="center">Definition</entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry align="left">TermVector.YES</entry>
+
+ <entry>Store the term vectors of each document. This
+ produces two synchronized arrays, one contains document
+ terms and the other contains the term's frequency.</entry>
+ </row>
+
+ <row>
+ <entry align="left">TermVector.NO</entry>
+
+ <entry>Do not store term vectors.</entry>
+ </row>
+
+ <row>
+ <entry align="left">TermVector.WITH_OFFSETS</entry>
+
+ <entry>Store the term vector and token offset information.
+ This is the same as TermVector.YES plus it contains the
+ starting and ending offset position information for the
+ terms.</entry>
+ </row>
+
+ <row>
+ <entry align="left">TermVector.WITH_POSITIONS</entry>
+
+ <entry>Store the term vector and token position information.
+ This is the same as TermVector.YES plus it contains the
+ ordinal positions of each occurrence of a term in a
+ document.</entry>
+ </row>
+
+ <row>
+ <entry
+ align="left">TermVector.WITH_POSITIONS_OFFSETS</entry>
+
+ <entry>Store the term vector, token position and offset
+ information. This is a combination of the YES, WITH_OFFSETS
+ and WITH_POSITIONS.</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </informaltable>
+ </listitem>
+ </itemizedlist>
+
+ <para>Whether or not you want to store the original data in the index
+ depends on how you wish to use the index query result. For a regular
+ Hibernate Search usage storing is not necessary. However you might want
+ to store some fields to subsequently project them (see <xref
+ linkend="projections" /> for more information).</para>
+
+ <para>Whether or not you want to tokenize a property depends on whether
+ you wish to search the element as is, or by the words it contains. It
+ makes sense to tokenize a text field, but it probably does not for a date
+ field. Note that fields used for sorting must not be
+ tokenized.</para>
+
+ <para>Finally, the id property of an entity is a special property used
+ by Hibernate Search to ensure index uniqueness of a given entity. By
+ design, an id has to be stored and must not be tokenized. To mark a
+ property as index id, use the <literal>@DocumentId</literal> annotation.
+ If you are using Hibernate Annotations and you have specified @Id you
+ can omit @DocumentId. The chosen entity id will also be used as document
+ id.</para>
+
+ <example>
+ <title>Adding <classname>@DocumentId</classname> and
+ <classname>@Field</classname> annotations to an indexed entity</title>
+
+ <programlisting>@Entity
+@Indexed(index="indexes/essays")
+public class Essay {
+ ...
+
+ @Id
+ <emphasis role="bold">@DocumentId</emphasis>
+ public Long getId() { return id; }
+
+ <emphasis role="bold">@Field(name="Abstract", index=Index.TOKENIZED, store=Store.YES)</emphasis>
+ public String getSummary() { return summary; }
+
+ @Lob
+ <emphasis role="bold">@Field(index=Index.TOKENIZED)</emphasis>
+ public String getText() { return text; }
+}</programlisting>
+ </example>
+
+ <para>The above annotations define an index with three fields:
+ <literal>id</literal> , <literal>Abstract</literal> and
+ <literal>text</literal> . Note that by default the field name is
+ decapitalized, following the JavaBean specification.</para>
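+
+ <para>As a rough illustration of the default naming rule, the sketch
+ below derives a field name from a getter name using the JDK's
+ <classname>java.beans.Introspector</classname>. The class
+ <classname>FieldNameSketch</classname> and its method are hypothetical
+ and not part of Hibernate Search; that Hibernate Search resolves
+ default names exactly this way is an assumption.</para>
+

```java
import java.beans.Introspector;

// Hypothetical helper, not part of Hibernate Search: it only illustrates
// the JavaBeans decapitalization rule mentioned in the text above.
final class FieldNameSketch {

    // Strips a leading "get"/"is" and decapitalizes the remainder.
    static String defaultFieldName(String getterName) {
        String base = getterName;
        if (getterName.startsWith("get")) {
            base = getterName.substring(3);
        } else if (getterName.startsWith("is")) {
            base = getterName.substring(2);
        }
        return Introspector.decapitalize(base);
    }

    public static void main(String[] args) {
        System.out.println(defaultFieldName("getId"));      // prints "id"
        System.out.println(defaultFieldName("getText"));    // prints "text"
        System.out.println(defaultFieldName("getSummary")); // prints "summary"
    }
}
```

+
+ <para>In the example above, <literal>getSummary()</literal> would default
+ to <literal>summary</literal>, but the explicit
+ <literal>name="Abstract"</literal> attribute takes precedence.</para>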
+ </section>
+
+ <section>
+ <title>Mapping properties multiple times</title>
+
+ <para>Sometimes one has to map a property multiple times per index, with
+ slightly different indexing strategies. For example, sorting a query by
+ field requires the field to be <literal>UN_TOKENIZED</literal>. If one
+ wants to search by words in this property and still sort it, one needs to
+ index it twice - once tokenized and once untokenized. @Fields makes
+ this possible.</para>
+
+ <example>
+ <title>Using @Fields to map a property multiple times</title>
+
+ <programlisting>@Entity
+@Indexed(index = "Book" )
+public class Book {
+ <emphasis role="bold">@Fields( {</emphasis>
+ @Field(index = Index.TOKENIZED),
+ @Field(name = "summary_forSort", index = Index.UN_TOKENIZED, store = Store.YES)
+ <emphasis role="bold">} )</emphasis>
+ public String getSummary() {
+ return summary;
+ }
+
+ ...
+}</programlisting>
+ </example>
+
+ <para>The field <literal>summary</literal> is indexed twice, once as
+ <literal>summary</literal> in a tokenized way, and once as
+ <literal>summary_forSort</literal> in an untokenized way. @Field
+ supports 2 attributes useful when @Fields is used:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para>analyzer: defines a @Analyzer annotation per field rather than
+ per property</para>
+ </listitem>
+
+ <listitem>
+ <para>bridge: defines a @FieldBridge annotation per field rather
+ than per property</para>
+ </listitem>
+ </itemizedlist>
+
+ <para>See below for more information about analyzers and field
+ bridges.</para>
+ </section>
+
+ <section id="search-mapping-associated">
+ <title>Embedded and associated objects</title>
+
+ <para>Associated objects as well as embedded objects can be indexed as
+ part of the root entity index. This is useful if you expect to search a
+ given entity based on properties of associated objects. In the following
+ example the aim is to return places where the associated city is Atlanta
+ (In the Lucene query parser language, it would translate into
+ <code>address.city:Atlanta</code>).</para>
+
+ <example>
+ <title>Using @IndexedEmbedded to index associations</title>
+
+ <programlisting>@Entity
+@Indexed
+public class Place {
+ @Id
+ @GeneratedValue
+ @DocumentId
+ private Long id;
+
+ @Field( index = Index.TOKENIZED )
+ private String name;
+
+ @OneToOne( cascade = { CascadeType.PERSIST, CascadeType.REMOVE } )
+ <emphasis role="bold">@IndexedEmbedded</emphasis>
+ private Address address;
+ ....
+}
+
+@Entity
+public class Address {
+ @Id
+ @GeneratedValue
+ private Long id;
+
+ @Field(index=Index.TOKENIZED)
+ private String street;
+
+ @Field(index=Index.TOKENIZED)
+ private String city;
+
+ <emphasis role="bold">@ContainedIn</emphasis>
+ @OneToMany(mappedBy="address")
+ private Set<Place> places;
+ ...
+}</programlisting>
+ </example>
+
+ <para>In this example, the place fields will be indexed in the
+ <literal>Place</literal> index. The <literal>Place</literal> index
+ documents will also contain the fields <literal>address.id</literal>,
+ <literal>address.street</literal>, and <literal>address.city</literal>
+ which you will be able to query. This is enabled by the
+ <literal>@IndexedEmbedded</literal> annotation.</para>
+
+ <para>Be careful. Because the data is denormalized in the Lucene index
+ when using the <classname>@IndexedEmbedded</classname> technique,
+ Hibernate Search needs to be aware of any change in the
+ <classname>Place</classname> object and any change in the
+ <classname>Address</classname> object to keep the index up to date. To
+ make sure the <literal><classname>Place</classname></literal> Lucene
+ document is updated when its <classname>Address</classname> changes,
+ you need to mark the other side of the bidirectional relationship with
+ <classname>@ContainedIn</classname>.</para>
+
+ <para><literal>@ContainedIn</literal> is only useful on associations
+ pointing to entities as opposed to embedded (collection of)
+ objects.</para>
+
+ <para>Let's make our example a bit more complex:</para>
+
+ <example>
+ <title>Nested usage of <classname>@IndexedEmbedded</classname> and
+ <classname>@ContainedIn</classname></title>
+
+ <programlisting>@Entity
+@Indexed
+public class Place {
+ @Id
+ @GeneratedValue
+ @DocumentId
+ private Long id;
+
+ @Field( index = Index.TOKENIZED )
+ private String name;
+
+ @OneToOne( cascade = { CascadeType.PERSIST, CascadeType.REMOVE } )
+ <emphasis role="bold">@IndexedEmbedded</emphasis>
+ private Address address;
+ ....
+}
+
+@Entity
+public class Address {
+ @Id
+ @GeneratedValue
+ private Long id;
+
+ @Field(index=Index.TOKENIZED)
+ private String street;
+
+ @Field(index=Index.TOKENIZED)
+ private String city;
+
+ <emphasis role="bold">@IndexedEmbedded(depth = 1, prefix = "ownedBy_")</emphasis>
+ private Owner ownedBy;
+
+ <emphasis role="bold">@ContainedIn</emphasis>
+ @OneToMany(mappedBy="address")
+ private Set<Place> places;
+ ...
+}
+
+@Embeddable
+public class Owner {
+ @Field(index = Index.TOKENIZED)
+ private String name;
+ ...
+}</programlisting>
+ </example>
+
+ <para>Any <literal>@*ToMany, @*ToOne</literal> and
+ <literal>@Embedded</literal> attribute can be annotated with
+ <literal>@IndexedEmbedded</literal>. The attributes of the associated
+ class will then be added to the main entity index. In the previous
+ example, the index will contain the following fields:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para>id</para>
+ </listitem>
+
+ <listitem>
+ <para>name</para>
+ </listitem>
+
+ <listitem>
+ <para>address.street</para>
+ </listitem>
+
+ <listitem>
+ <para>address.city</para>
+ </listitem>
+
+ <listitem>
+ <para>address.ownedBy_name</para>
+ </listitem>
+ </itemizedlist>
+
+ <para>The default prefix is <literal>propertyName.</literal>, following
+ the traditional object navigation convention. You can override it using
+ the <literal>prefix</literal> attribute as it is shown on the
+ <literal>ownedBy</literal> property.</para>
+
+ <note>
+ <para>The prefix cannot be set to the empty string.</para>
+ </note>
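+
+ <para>The way prefixes compose into index field names can be sketched in
+ plain Java. The class <classname>EmbeddedFieldNameSketch</classname> and
+ its methods are hypothetical helpers written for this illustration; they
+ are not Hibernate Search API and only mirror the naming rule described
+ above.</para>
+

```java
// Hypothetical helper, not Hibernate Search API: shows how the default
// prefix ("propertyName.") and an overridden prefix ("ownedBy_") combine.
final class EmbeddedFieldNameSketch {

    // Default prefix used when @IndexedEmbedded declares no prefix.
    static String defaultPrefix(String propertyName) {
        return propertyName + ".";
    }

    // Field name = prefix inherited from the parent + own prefix + field.
    static String fieldName(String parentPrefix, String prefix, String field) {
        return parentPrefix + prefix + field;
    }

    public static void main(String[] args) {
        String address = defaultPrefix("address"); // "address."
        System.out.println(fieldName("", address, "city"));        // prints "address.city"
        System.out.println(fieldName(address, "ownedBy_", "name")); // prints "address.ownedBy_name"
    }
}
```
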
+
+ <para>The <literal>depth</literal> property is necessary when the object
+ graph contains a cyclic dependency of classes (not instances), for
+ example if <classname>Owner</classname> points to
+ <classname>Place</classname>. Hibernate Search will stop including
+ indexed embedded attributes after reaching the expected depth (or the
+ object graph boundaries are reached). A class having a self reference is
+ an example of cyclic dependency. In our example, because
+ <literal>depth</literal> is set to 1, any
+ <literal>@IndexedEmbedded</literal> attribute in Owner (if any) will be
+ ignored.</para>
+
+ <para>Using <literal>@IndexedEmbedded</literal> for object associations
+ allows you to express queries such as:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para>Return places where name contains JBoss and where address city
+ is Atlanta. In Lucene query this would be</para>
+
+ <programlisting>+name:jboss +address.city:atlanta </programlisting>
+ </listitem>
+
+ <listitem>
+ <para>Return places where name contains JBoss and where owner's name
+ contains Joe. In Lucene query this would be</para>
+
+ <programlisting>+name:jboss +address.ownedBy_name:joe </programlisting>
+ </listitem>
+ </itemizedlist>
+
+ <para>In a way it mimics the relational join operation in a more
+ efficient way (at the cost of data duplication). Remember that, out of
+ the box, Lucene indexes have no notion of association, the join
+ operation is simply non-existent. It might help to keep the relational
+ model normalized while benefiting from the full text index speed and
+ feature richness.</para>
+
+ <para><note>
+ <para>An associated object can itself (but does not have to) be
+ <literal>@Indexed</literal></para>
+ </note></para>
+
+ <para>When @IndexedEmbedded points to an entity, the association has to
+ be bidirectional and the other side has to be annotated
+ <literal>@ContainedIn</literal> (as seen in the previous example). If
+ not, Hibernate Search has no way to update the root index when the
+ associated entity is updated (in our example, a <literal>Place</literal>
+ index document has to be updated when the associated
+ <classname>Address</classname> instance is updated).</para>
+
+ <para>Sometimes, the object type annotated by
+ <classname>@IndexedEmbedded</classname> is not the object type targeted
+ by Hibernate and Hibernate Search. This is especially the case when
+ interfaces are used in lieu of their implementation. For this reason you
+ can override the object type targeted by Hibernate Search using the
+ <methodname>targetElement</methodname> parameter.</para>
+
+ <example>
+ <title>Using the <literal>targetElement</literal> property of
+ <classname>@IndexedEmbedded</classname></title>
+
+ <programlisting>@Entity
+@Indexed
+public class Address {
+ @Id
+ @GeneratedValue
+ @DocumentId
+ private Long id;
+
+ @Field(index= Index.TOKENIZED)
+ private String street;
+
+ @IndexedEmbedded(depth = 1, prefix = "ownedBy_", <emphasis role="bold">targetElement = Owner.class</emphasis>)
+ @Target(Owner.class)
+ private Person ownedBy;
+
+
+ ...
+}
+
+@Embeddable
+public class Owner implements Person { ... }</programlisting>
+ </example>
+ </section>
+
+ <section>
+ <title>Boost factor</title>
+
+ <para>Lucene has the notion of <emphasis>boost factor</emphasis>. It's a
+ way to give more weight to a field or to an indexed element over others
+ during the indexing process. You can use <literal>@Boost</literal> at
+ the @Field, method or class level.</para>
+
+ <example>
+ <title>Using different ways of increasing the weight of an indexed
+ element using a boost factor</title>
+
+ <programlisting>@Entity
+@Indexed(index="indexes/essays")
+<emphasis role="bold">@Boost(1.7f)</emphasis>
+public class Essay {
+ ...
+
+ @Id
+ @DocumentId
+ public Long getId() { return id; }
+
+ @Field(name="Abstract", index=Index.TOKENIZED, store=Store.YES, boost=<emphasis
+ role="bold">@Boost(2f)</emphasis>)
+ <emphasis role="bold">@Boost(1.5f)</emphasis>
+ public String getSummary() { return summary; }
+
+ @Lob
+ @Field(index=Index.TOKENIZED, boost=<emphasis role="bold">@Boost(1.2f)</emphasis>)
+ public String getText() { return text; }
+
+ @Field
+ public String getISBN() { return isbn; }
+
+} </programlisting>
+ </example>
+
+ <para>In our example, <classname>Essay</classname>'s probability to
+ reach the top of the search list will be multiplied by 1.7. The
+ <methodname>summary</methodname> field will be 3.0 times (2 * 1.5 -
+ <methodname>@Field.boost</methodname> and <classname>@Boost</classname>
+ on a property are cumulative) more important than the
+ <methodname>isbn</methodname> field. The <methodname>text</methodname>
+ field will be 1.2 times more important than the
+ <methodname>isbn</methodname> field. Note that this explanation in
+ strictest terms is actually wrong, but it is simple and close enough to
+ reality for all practical purposes. Please check the Lucene
+ documentation or the excellent <citetitle>Lucene In Action </citetitle>
+ from Otis Gospodnetic and Erik Hatcher.</para>
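+
+ <para>The arithmetic behind the cumulative boosts can be checked with a
+ few lines of Java. <classname>BoostSketch</classname> is a hypothetical
+ helper, not Hibernate Search or Lucene API, and it only reproduces the
+ simplified multiplication described above; actual Lucene scoring involves
+ further factors.</para>
+

```java
// Hypothetical helper, not Hibernate Search or Lucene API: multiplies the
// @Field.boost value with the property-level @Boost value.
final class BoostSketch {

    // A missing boost is treated as the neutral factor 1.0f.
    static float effectiveBoost(float fieldBoost, float propertyBoost) {
        return fieldBoost * propertyBoost;
    }

    public static void main(String[] args) {
        System.out.println(effectiveBoost(2f, 1.5f)); // summary: prints 3.0
        System.out.println(effectiveBoost(1.2f, 1f)); // text: prints 1.2
        System.out.println(effectiveBoost(1f, 1f));   // isbn: prints 1.0
    }
}
```
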
+ </section>
+
+ <section id="analyzer">
+ <title>Analyzer</title>
+
+ <para>The default analyzer class used to index tokenized fields is
+ configurable through the <literal>hibernate.search.analyzer</literal>
+ property. The default value for this property is
+ <classname>org.apache.lucene.analysis.standard.StandardAnalyzer</classname>.</para>
+
+ <para>You can also define the analyzer class per entity, property and
+ even per @Field (useful when multiple fields are indexed from a single
+ property).</para>
+
+ <example>
+ <title>Different ways of specifying an analyzer</title>
+
+ <programlisting>@Entity
+@Indexed
+<emphasis role="bold">@Analyzer(impl = EntityAnalyzer.class)</emphasis>
+public class MyEntity {
+ @Id
+ @GeneratedValue
+ @DocumentId
+ private Integer id;
+
+ @Field(index = Index.TOKENIZED)
+ private String name;
+
+ @Field(index = Index.TOKENIZED)
+ <emphasis role="bold">@Analyzer(impl = PropertyAnalyzer.class)</emphasis>
+ private String summary;
+
+ @Field(index = Index.TOKENIZED, <emphasis role="bold">analyzer = @Analyzer(impl = FieldAnalyzer.class)</emphasis>)
+ private String body;
+
+ ...
+}</programlisting>
+ </example>
+
+ <para>In this example, <classname>EntityAnalyzer</classname> is used to
+ index all tokenized properties (e.g. <literal>name</literal>), except
+ <literal>summary</literal> and <literal>body</literal> which are indexed
+ with <classname>PropertyAnalyzer</classname> and
+ <classname>FieldAnalyzer</classname> respectively.</para>
+
+ <caution>
+ <para>Mixing different analyzers in the same entity is most of the
+ time a bad practice. It makes query building more complex and results
+ less predictable (for the novice), especially if you are using a
+ QueryParser (which uses the same analyzer for the whole query). As a
+ rule of thumb, for any given field the same analyzer should be used
+ for indexing and querying.</para>
+ </caution>
+
+ <section>
+ <title>Analyzer definitions</title>
+
+ <para>Analyzers can become quite complex to deal with, which is why
+ Hibernate Search introduces the notion of analyzer definitions. An
+ analyzer definition can be reused by many
+ <classname>@Analyzer</classname> declarations. An analyzer definition
+ is composed of:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para>a name: the unique string used to refer to the
+ definition</para>
+ </listitem>
+
+ <listitem>
+ <para>a tokenizer: responsible for tokenizing the input stream
+ into individual words</para>
+ </listitem>
+
+ <listitem>
+ <para>a list of filters: each filter is responsible for removing,
+ modifying or sometimes even adding words to the stream provided by the
+ tokenizer</para>
+ </listitem>
+ </itemizedlist>
+
+ <para>This separation of tasks - a tokenizer followed by a list of
+ filters - allows for easy reuse of each individual component and lets
+ you build your customized analyzer in a very flexible way (just like
+ lego). Generally speaking the <classname>Tokenizer</classname> starts
+ the analysis process by turning the character input into tokens which
+ are then further processed by the <classname>TokenFilter</classname>s.
+ Hibernate Search supports this infrastructure by utilizing the Solr
+ analyzer framework. Make sure to add <filename>solr-core.jar</filename>
+ and <filename>solr-common.jar</filename> to your classpath to
+ use analyzer definitions. If you also want to use a
+ snowball stemmer, also include
+ <filename>lucene-snowball.jar</filename>. Other Solr analyzers might
+ depend on more libraries. For example, the
+ <classname>PhoneticFilterFactory</classname> depends on <ulink
+ url="http://commons.apache.org/codec">commons-codec</ulink>. Your
+ distribution of Hibernate Search provides these dependencies in its
+ <filename>lib</filename> directory.</para>
+
+ <example>
+ <title><classname>@AnalyzerDef</classname> and the Solr
+ framework</title>
+
+ <programlisting>@AnalyzerDef(name="customanalyzer",
+ tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
+ filters = {
+ @TokenFilterDef(factory = ISOLatin1AccentFilterFactory.class),
+ @TokenFilterDef(factory = LowerCaseFilterFactory.class),
+ @TokenFilterDef(factory = StopFilterFactory.class, params = {
+ @Parameter(name="words", value= "org/hibernate/search/test/analyzer/solr/stoplist.properties" ),
+ @Parameter(name="ignoreCase", value="true")
+ })
+})
+public class Team {
+ ...
+}</programlisting>
+ </example>
+
+ <para>A tokenizer is defined by its factory which is responsible for
+ building the tokenizer and using the optional list of parameters. This
+ example uses the standard tokenizer. A filter is defined by its factory
+ which is responsible for creating the filter instance using the
+ optional parameters. In our example, the StopFilter filter is built
+ reading the dedicated words property file and is expected to ignore
+ case. The list of parameters is dependent on the tokenizer or filter
+ factory.</para>
+
+ <warning>
+ <para>Filters are applied in the order they are defined in the
+ <classname>@AnalyzerDef</classname> annotation. Make sure to think
+ twice about this order.</para>
+ </warning>
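+
+ <para>Why the order matters can be seen with two toy filters in plain
+ Java. <classname>FilterOrderSketch</classname> is a hypothetical
+ illustration that does not use the Solr or Lucene classes, and it assumes
+ a case-sensitive stop word filter (that is, without
+ <literal>ignoreCase</literal> set to <literal>true</literal>).</para>
+

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration, not Solr/Lucene API: two toy token filters
// applied in different orders give different token streams.
final class FilterOrderSketch {

    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("the", "a"));

    // Toy lowercase filter.
    static List<String> lowerCase(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            out.add(token.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    // Toy stop filter; the comparison is case-sensitive by assumption.
    static List<String> removeStopWords(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            if (!STOP_WORDS.contains(token)) {
                out.add(token);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("The", "Quick", "Fox");
        // Lowercase first: "The" becomes "the" and is then removed.
        System.out.println(removeStopWords(lowerCase(tokens))); // prints [quick, fox]
        // Stop filter first: "The" does not match "the" and survives.
        System.out.println(lowerCase(removeStopWords(tokens))); // prints [the, quick, fox]
    }
}
```
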
+
+ <para>Once defined, an analyzer definition can be reused by an
+ <classname>@Analyzer</classname> declaration using the definition name
+ rather than declaring an implementation class.</para>
+
+ <example>
+ <title>Referencing an analyzer by name</title>
+
+ <programlisting>@Entity
+@Indexed
+@AnalyzerDef(name="customanalyzer", ... )
+public class Team {
+ @Id
+ @DocumentId
+ @GeneratedValue
+ private Integer id;
+
+ @Field
+ private String name;
+
+ @Field
+ private String location;
+
+ @Field <emphasis role="bold">@Analyzer(definition = "customanalyzer")</emphasis>
+ private String description;
+}</programlisting>
+ </example>
+
+ <para>Analyzer instances declared by
+ <classname>@AnalyzerDef</classname> are available by their name in the
+ <classname>SearchFactory</classname>.</para>
+
+ <programlisting>Analyzer analyzer = fullTextSession.getSearchFactory().getAnalyzer("customanalyzer");</programlisting>
+
+ <para>This is quite useful when building queries. Fields in queries
+ should be analyzed with the same analyzer used to index the field so
+ that they speak a common "language": the same tokens are reused
+ between the query and the indexing process. This rule has some
+ exceptions but is true most of the time. Respect it unless you know
+ what you are doing.</para>
+ </section>
+
+ <section>
+ <title>Available analyzers</title>
+
+ <para>Solr and Lucene come with a lot of useful default tokenizers and
+ filters. You can find a complete list of tokenizer factories and
+ filter factories at <ulink
+ url="http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters">http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters</ulink>.
+ Let's check a few of them.</para>
+
+ <table>
+ <title>Some of the tokenizers available</title>
+
+ <tgroup cols="3">
+ <thead>
+ <row>
+ <entry align="center">Factory</entry>
+
+ <entry align="center">Description</entry>
+
+ <entry align="center">Parameters</entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry>StandardTokenizerFactory</entry>
+
+ <entry>Use the Lucene StandardTokenizer</entry>
+
+ <entry>none</entry>
+ </row>
+
+ <row>
+ <entry>HTMLStripStandardTokenizerFactory</entry>
+
+ <entry>Remove HTML tags, keep the text and pass it to a
+ StandardTokenizer</entry>
+
+ <entry>none</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ <table>
+ <title>Some of the filters available</title>
+
+ <tgroup cols="3">
+ <thead>
+ <row>
+ <entry align="center">Factory</entry>
+
+ <entry align="center">Description</entry>
+
+ <entry align="center">Parameters</entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry>StandardFilterFactory</entry>
+
+ <entry>Remove dots from acronyms and 's from words</entry>
+
+ <entry>none</entry>
+ </row>
+
+ <row>
+ <entry>LowerCaseFilterFactory</entry>
+
+ <entry>Lowercase words</entry>
+
+ <entry>none</entry>
+ </row>
+
+ <row>
+ <entry>StopFilterFactory</entry>
+
+ <entry>Remove words (tokens) matching a list of stop
+ words</entry>
+
+ <entry><para><literal>words</literal>: points to a resource
+ file containing the stop words</para><para><literal>ignoreCase</literal>: <literal>true</literal> if
+ case should be ignored when comparing stop
+ words, <literal>false</literal> otherwise</para></entry>
+ </row>
+
+ <row>
+ <entry>SnowballPorterFilterFactory</entry>
+
+ <entry>Reduces a word to its root in a given language (e.g.
+ protect, protects, protection share the same root). Using such
+ a filter allows searches matching related words.</entry>
+
+ <entry><para><literal>language</literal>: Danish, Dutch,
+ English, Finnish, French, German, Italian, Norwegian,
+ Portuguese, Russian, Spanish, Swedish and a few
+ more</para></entry>
+ </row>
+
+ <row>
+ <entry>ISOLatin1AccentFilterFactory</entry>
+
+ <entry>Remove accents for languages like French</entry>
+
+ <entry>none</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ <para>We recommend to check all the implementations of
+ <classname>org.apache.solr.analysis.TokenizerFactory</classname> and
+ <classname>org.apache.solr.analysis.TokenFilterFactory</classname> in
+ your IDE to see the implementations available.</para>
+ </section>
+
+ <section>
+ <title>Analyzer discriminator (experimental)</title>
+
+ <para>So far all the introduced ways to specify an analyzer were
+ static. However, there are use cases where it is useful to select an
+ analyzer depending on the current state of the entity to be indexed,
+ for example in a multilingual application. For a
+ <classname>BlogEntry</classname> class for example the analyzer could
+ depend on the language property of the entry. Depending on this
+ property the correct language specific stemmer should be chosen to
+ index the actual text.</para>
+
+ <para>To enable this dynamic analyzer selection Hibernate Search
+ introduces the <classname>@AnalyzerDiscriminator</classname>
+ annotation. The following example demonstrates the usage of this
+ annotation:</para>
+
+ <para><example>
+ <title>Usage of @AnalyzerDiscriminator in order to select an
+ analyzer depending on the entity state</title>
+
+ <programlisting>@Entity
+@Indexed
+@AnalyzerDefs({
+ @AnalyzerDef(name = "en",
+ tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
+ filters = {
+ @TokenFilterDef(factory = LowerCaseFilterFactory.class),
+ @TokenFilterDef(factory = EnglishPorterFilterFactory.class
+ )
+ }),
+ @AnalyzerDef(name = "de",
+ tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
+ filters = {
+ @TokenFilterDef(factory = LowerCaseFilterFactory.class),
+ @TokenFilterDef(factory = GermanStemFilterFactory.class)
+ })
+})
+public class BlogEntry {
+
+ @Id
+ @GeneratedValue
+ @DocumentId
+ private Integer id;
+
+ @Field
+ @AnalyzerDiscriminator(impl = LanguageDiscriminator.class)
+ private String language;
+
+ @Field
+ private String text;
+
+ private Set<BlogEntry> references;
+
+ // standard getter/setter
+ ...
+}</programlisting>
+
+ <programlisting>public class LanguageDiscriminator implements Discriminator {
+
+ public String getAnanyzerDefinitionName(Object value, Object entity, String field) {
+ if ( value == null || !( entity instanceof BlogEntry ) ) {
+ return null;
+ }
+ return (String) value;
+ }
+}</programlisting>
+ </example>The prerequisite for using
+ <classname>@AnalyzerDiscriminator</classname> is that all analyzers
+ which are going to be used are predefined via
+ <classname>@AnalyzerDef</classname> definitions. If this is the case
+ one can place the <classname>@AnalyzerDiscriminator</classname>
+ annotation either on the class or on a specific property of the entity
+ for which to dynamically select an analyzer. Via the
+ <literal>impl</literal> parameter of the
+ <classname>AnalyzerDiscriminator</classname> you specify a concrete
+ implementation of the <classname>Discriminator</classname> interface.
+ It is up to you to provide an implementation for this interface. The
+ only method you have to implement is
+ <classname>getAnanyzerDefinitionName()</classname> which gets called
+ for each field added to the Lucene document. The entity which is
+ getting indexed is also passed to the interface method. The
+ <literal>value</literal> parameter is only set if the
+ <classname>AnalyzerDiscriminator</classname> is placed on property
+ level instead of class level. In this case the value represents the
+ current value of this property.</para>
+
+ <para>An implementation of the <classname>Discriminator</classname>
+ interface has to return the name of an existing analyzer definition if
+ the analyzer should be set dynamically or <classname>null</classname>
+ if the default analyzer should not be overridden. The given example
+ assumes that the language parameter is either 'de' or 'en' which
+ matches the specified names in the
+ <classname>@AnalyzerDef</classname>s.</para>
+
+ <note>
+ <para>The <classname>@AnalyzerDiscriminator</classname> is currently
+ still experimental and the API might still change. We are hoping for
+ some feedback from the community about the usefulness and usability
+ of this feature.</para>
+ </note>
+ </section>
+
+ <section id="analyzer-retrievinganalyzer">
+ <title>Retrieving an analyzer</title>
+
+ <para>At indexing time, Hibernate Search uses analyzers under
+ the hood for you. In some situations, retrieving analyzers can be
+ handy. If your domain model makes use of multiple analyzers (maybe to
+ benefit from stemming, use phonetic approximation and so on), you need
+ to make sure to use the same analyzers when you build your
+ query.</para>
+
+ <note>
+ <para>This rule can be broken but you need a good reason for it. If
+ you are unsure, use the same analyzers.</para>
+ </note>
+
+ <para>You can retrieve the scoped analyzer for a given entity used at
+ indexing time by Hibernate Search. A scoped analyzer is an analyzer
+ which applies the right analyzers depending on the field indexed:
+ multiple analyzers can be defined on a given entity, each one working
+ on an individual field, and a scoped analyzer unifies all these
+ analyzers into a context-aware analyzer. While the theory seems a bit
+ complex, using the right analyzer in a query is very easy.</para>
+
+ <example>
+ <title>Using the scoped analyzer when building a full-text
+ query</title>
+
+ <programlisting>org.apache.lucene.queryParser.QueryParser parser = new QueryParser(
+ "title",
+ fullTextSession.getSearchFactory().getAnalyzer( Song.class )
+);
+
+org.apache.lucene.search.Query luceneQuery =
+ parser.parse( "title:sky OR title_stemmed:diamond" );
+
+org.hibernate.Query fullTextQuery =
+ fullTextSession.createFullTextQuery( luceneQuery, Song.class );
+
+List result = fullTextQuery.list(); //return a list of managed objects </programlisting>
+ </example>
+
+ <para>In the example above, the song title is indexed in two fields:
+ the standard analyzer is used in the field <literal>title</literal>
+ and a stemming analyzer is used in the field
+ <literal>title_stemmed</literal>. By using the analyzer provided by
+ the search factory, the query uses the appropriate analyzer depending
+ on the field targeted.</para>
+
+ <para>If your query targets more than one query and you wish to use
+ your standard analyzer, make sure to describe it using an analyzer
+ definition. You can retrieve analyzers by their definition name using
+ <code>searchFactory.getAnalyzer(String)</code>.</para>
+ </section>
+ </section>
+ </section>
+
+ <section id="search-mapping-bridge">
+ <title>Property/Field Bridge</title>
+
+ <para>In Lucene all index fields have to be represented as Strings. For
+ this reason all entity properties annotated with <literal>@Field</literal>
+ have to be indexed in a String form. For most of your properties,
+ Hibernate Search does the translation job for you thanks to a built-in set
+ of bridges. In some cases, though, you need more fine-grained control
+ over the translation process.</para>
+
+ <section>
+ <title>Built-in bridges</title>
+
+ <para>Hibernate Search comes bundled with a set of built-in bridges
+ between a Java property type and its full text representation.</para>
+
+ <variablelist>
+ <varlistentry>
+ <term>null</term>
+
+ <listitem>
+ <para>null elements are not indexed. Lucene does not support null
+ elements and this does not make much sense either.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>java.lang.String</term>
+
+ <listitem>
+ <para>Strings are indexed as is.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>short, Short, int, Integer, long, Long, float, Float,
+ double, Double, BigInteger, BigDecimal</term>
+
+ <listitem>
+ <para>Numbers are converted into their String representation. Note
+ that numbers cannot be compared by Lucene (i.e. used in range
+ queries) out of the box: they have to be padded. <note>
+ <para>Using a Range query is debatable and has drawbacks. An
+ alternative approach is to use a Filter query which will
+ filter the result query to the appropriate range.</para>
+
+ <para>Hibernate Search will support a padding mechanism.</para>
+ </note></para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>java.util.Date</term>
+
+ <listitem>
+ <para>Dates are stored as yyyyMMddHHmmssSSS in GMT time
+ (20061107210300012 for Nov 7th of 2006 4:03PM and 12ms EST). You
+ do not need to care about the internal format. What is
+ important is that when using a DateRange Query, the dates have
+ to be expressed in GMT time.</para>
+
+ <para>Usually, storing the date up to the millisecond is not
+ necessary. <literal>@DateBridge</literal> defines the appropriate
+ resolution you are willing to store in the index (
+ <literal>@DateBridge(resolution=Resolution.DAY)</literal>
+ ). The date pattern will then be truncated
+ accordingly.</para>
+
+ <programlisting>@Entity
+@Indexed
+public class Meeting {
+ @Field(index=Index.UN_TOKENIZED)
+ <emphasis role="bold">@DateBridge(resolution=Resolution.MINUTE)</emphasis>
+ private Date date;
+ ... </programlisting>
+
+ <warning>
+ <para>A Date whose resolution is lower than
+ <literal>MILLISECOND</literal> cannot be a
+ <literal>@DocumentId</literal></para>
+ </warning>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>java.net.URI, java.net.URL</term>
+
+ <listitem>
+ <para>URI and URL are converted to their string
+ representation</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>java.lang.Class</term>
+
+ <listitem>
+ <para>Classes are converted to their fully qualified class name. The
+ thread context classloader is used when the class is
+ rehydrated.</para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
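+
+ <para>The date format described above can be reproduced with plain
+ Java. The following standalone sketch does not use Hibernate Search and
+ is not its actual implementation; the class and method names are made up
+ for illustration. It formats Nov 7th of 2006 4:03PM and 12ms at a fixed
+ GMT-5 offset with the yyyyMMddHHmmssSSS pattern in GMT and prints
+ 20061107210300012.</para>
+
+ <programlisting>import java.text.SimpleDateFormat;
+import java.util.Calendar;
+import java.util.Date;
+import java.util.GregorianCalendar;
+import java.util.Locale;
+import java.util.TimeZone;
+
+// Hypothetical helper, only meant to illustrate the index format.
+class GmtDateFormatSketch {
+
+    // Formats a date with the yyyyMMddHHmmssSSS pattern, expressed in GMT.
+    static String toIndexString(Date date) {
+        SimpleDateFormat format = new SimpleDateFormat( "yyyyMMddHHmmssSSS", Locale.ROOT );
+        format.setTimeZone( TimeZone.getTimeZone( "GMT" ) );
+        return format.format( date );
+    }
+
+    public static void main(String[] args) {
+        // Nov 7th 2006, 4:03PM and 12ms at a fixed offset of GMT-5 (EST)
+        Calendar cal = new GregorianCalendar( TimeZone.getTimeZone( "GMT-05:00" ), Locale.ROOT );
+        cal.clear();
+        cal.set( 2006, Calendar.NOVEMBER, 7, 16, 3, 0 );
+        cal.set( Calendar.MILLISECOND, 12 );
+        System.out.println( toIndexString( cal.getTime() ) );
+    }
+}</programlisting>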
+ </section>
+
+ <section>
+ <title>Custom Bridge</title>
+
+ <para>Sometimes, the built-in bridges of Hibernate Search do not cover
+ some of your property types, or the String representation used by the
+ bridge does not meet your requirements. The following paragraphs
+ describe several solutions to this problem.</para>
+
+ <section>
+ <title>StringBridge</title>
+
+ <para>The simplest custom solution is to give Hibernate Search an
+ implementation of your expected
+ <emphasis><classname>Object</classname> </emphasis>to
+ <classname>String</classname> bridge. To do so you need to implement
+ the <literal>org.hibernate.search.bridge.StringBridge</literal>
+ interface. All implementations have to be thread-safe as they are used
+ concurrently.</para>
+
+ <example>
+ <title>Implementing your own
+ <classname>StringBridge</classname></title>
+
+ <programlisting>/**
+ * Padding Integer bridge.
+ * All numbers will be padded with 0 to match 5 digits
+ *
+ * @author Emmanuel Bernard
+ */
+public class PaddedIntegerBridge implements <emphasis role="bold">StringBridge</emphasis> {
+
+ private int PADDING = 5;
+
+ <emphasis role="bold">public String objectToString(Object object)</emphasis> {
+ String rawInteger = ( (Integer) object ).toString();
+ if (rawInteger.length() > PADDING)
+ throw new IllegalArgumentException( "Try to pad on a number too big" );
+ StringBuilder paddedInteger = new StringBuilder( );
+ for ( int padIndex = rawInteger.length() ; padIndex < PADDING ; padIndex++ ) {
+ paddedInteger.append('0');
+ }
+ return paddedInteger.append( rawInteger ).toString();
+ }
+} </programlisting>
+ </example>
+
+ <para>Then any property or field can use this bridge thanks to the
+ <literal>@FieldBridge</literal> annotation:</para>
+
+ <programlisting><emphasis role="bold">@FieldBridge(impl = PaddedIntegerBridge.class)</emphasis>
+private Integer length; </programlisting>
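+
+ <para>To see why padding matters, recall that index terms are compared
+ as strings. The following standalone sketch (plain Java, independent of
+ Hibernate Search; the class and method names are hypothetical) shows
+ that unpadded numbers sort incorrectly while zero-padded ones keep their
+ numeric order. Both lines print <literal>true</literal>.</para>
+
+ <programlisting>// Hypothetical demo class, not part of Hibernate Search.
+class PaddingOrderSketch {
+
+    // Left-pads a non-negative integer with zeros up to the given width.
+    static String pad(int value, int width) {
+        StringBuilder padded = new StringBuilder( Integer.toString( value ) );
+        while ( padded.length() < width ) {
+            padded.insert( 0, '0' );
+        }
+        return padded.toString();
+    }
+
+    public static void main(String[] args) {
+        // Unpadded: "10" sorts before "9" although 10 is the larger number
+        System.out.println( "10".compareTo( "9" ) < 0 );
+        // Padded: "00009" sorts before "00010", matching the numeric order
+        System.out.println( pad( 9, 5 ).compareTo( pad( 10, 5 ) ) < 0 );
+    }
+}</programlisting>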
+
+ <para>Parameters can be passed to the Bridge implementation making it
+ more flexible. The Bridge implementation implements a
+ <classname>ParameterizedBridge</classname> interface, and the
+ parameters are passed through the <literal>@FieldBridge</literal>
+ annotation.</para>
+
+ <example>
+ <title>Passing parameters to your bridge implementation</title>
+
+ <programlisting>public class PaddedIntegerBridge implements StringBridge, <emphasis
+ role="bold">ParameterizedBridge</emphasis> {
+
+ public static String PADDING_PROPERTY = "padding";
+ private int padding = 5; //default
+
+ <emphasis role="bold">public void setParameterValues(Map parameters)</emphasis> {
+ Object padding = parameters.get( PADDING_PROPERTY );
+ if (padding != null) this.padding = Integer.parseInt( padding.toString() ); // @Parameter values are Strings
+ }
+
+ public String objectToString(Object object) {
+ String rawInteger = ( (Integer) object ).toString();
+ if (rawInteger.length() > padding)
+ throw new IllegalArgumentException( "Try to pad on a number too big" );
+ StringBuilder paddedInteger = new StringBuilder( );
+ for ( int padIndex = rawInteger.length() ; padIndex < padding ; padIndex++ ) {
+ paddedInteger.append('0');
+ }
+ return paddedInteger.append( rawInteger ).toString();
+ }
+}
+
+
+//property
+@FieldBridge(impl = PaddedIntegerBridge.class,
+ <emphasis role="bold">params = @Parameter(name="padding", value="10")</emphasis>
+ )
+private Integer length; </programlisting>
+ </example>
+
+ <para>The <classname>ParameterizedBridge</classname> interface can be
+ implemented by <classname>StringBridge</classname>,
+ <classname>TwoWayStringBridge</classname> and
+ <classname>FieldBridge</classname> implementations.</para>
+
+ <para>All implementations have to be thread-safe, but the parameters
+ are set during initialization and no special care is required at this
+ stage.</para>
+
+ <para>If you expect to use your bridge implementation on an id
+ property (i.e. annotated with <literal>@DocumentId</literal>), you need
+ to use a slightly extended version of <literal>StringBridge</literal>
+ named <classname>TwoWayStringBridge</classname>. Hibernate Search
+ needs to read the string representation of the identifier and generate
+ the object out of it. There is no difference in the way the
+ <literal>@FieldBridge</literal> annotation is used.</para>
+
+ <example>
+ <title>Implementing a TwoWayStringBridge which can for example be
+ used for id properties</title>
+
+ <programlisting>public class PaddedIntegerBridge implements TwoWayStringBridge, ParameterizedBridge {
+
+ public static String PADDING_PROPERTY = "padding";
+ private int padding = 5; //default
+
+ public void setParameterValues(Map parameters) {
+ Object padding = parameters.get( PADDING_PROPERTY );
+ if (padding != null) this.padding = Integer.parseInt( padding.toString() ); // @Parameter values are Strings
+ }
+
+ public String objectToString(Object object) {
+ String rawInteger = ( (Integer) object ).toString();
+ if (rawInteger.length() > padding)
+ throw new IllegalArgumentException( "Try to pad on a number too big" );
+ StringBuilder paddedInteger = new StringBuilder( );
+ for ( int padIndex = rawInteger.length() ; padIndex < padding ; padIndex++ ) {
+ paddedInteger.append('0');
+ }
+ return paddedInteger.append( rawInteger ).toString();
+ }
+
+ <emphasis role="bold">public Object stringToObject(String stringValue)</emphasis> {
+ return new Integer(stringValue);
+ }
+}
+
+
+//id property
+@DocumentId
+@FieldBridge(impl = PaddedIntegerBridge.class,
+ params = @Parameter(name="padding", value="10") )
+private Integer id;
+ </programlisting>
+ </example>
+
+ <para>It is critically important for the two-way process to be
+ idempotent, i.e. converting an object to a string and back must
+ yield an equal object (object = stringToObject( objectToString( object ) )
+ ).</para>
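+
+ <para>A simple way to gain confidence in a two-way bridge is to run a
+ few values through both conversions. The sketch below copies the padding
+ logic into a plain class without the Hibernate Search interfaces (the
+ class name is hypothetical) and checks the round trip for some
+ non-negative values. Note that this naive padding would not round-trip
+ negative numbers, since a string such as "000-7" cannot be parsed
+ back.</para>
+
+ <programlisting>// Hypothetical stand-alone copy of the bridge logic, for illustration only.
+class RoundTripSketch {
+
+    // Pads the decimal representation with leading zeros up to 'padding' characters.
+    static String objectToString(Integer value, int padding) {
+        String raw = value.toString();
+        if ( raw.length() > padding ) {
+            throw new IllegalArgumentException( "Try to pad on a number too big" );
+        }
+        StringBuilder padded = new StringBuilder();
+        for ( int i = raw.length(); i < padding; i++ ) {
+            padded.append( '0' );
+        }
+        return padded.append( raw ).toString();
+    }
+
+    // Parses the padded string back; leading zeros are ignored by Integer.valueOf.
+    static Integer stringToObject(String stringValue) {
+        return Integer.valueOf( stringValue );
+    }
+
+    public static void main(String[] args) {
+        int[] samples = { 0, 7, 42, 99999 };
+        for ( int sample : samples ) {
+            Integer original = Integer.valueOf( sample );
+            Integer restored = stringToObject( objectToString( original, 5 ) );
+            System.out.println( original + " round-trips: " + original.equals( restored ) );
+        }
+    }
+}</programlisting>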
+ </section>
+
+ <section>
+ <title>FieldBridge</title>
+
+ <para>Some use cases require more than a simple object to string
+ translation when mapping a property to a Lucene index. To give you the
+ greatest possible flexibility you can also implement a bridge as a
+ <classname>FieldBridge</classname>. This interface gives you a
+ property value and lets you map it the way you want in your Lucene
+ <classname>Document</classname>. The interface is very similar in
+ concept to Hibernate's <classname>UserType</classname>.</para>
+
+ <para>You can for example store a given property in two different
+ document fields:</para>
+
+ <example>
+ <title>Implementing the FieldBridge interface in order to store a given
+ property in multiple document fields</title>
+
+ <programlisting>/**
+ * Store the date in 3 different fields - year, month, day - to ease Range Query per
+ * year, month or day (eg get all the elements of December for the last 5 years).
+ *
+ * @author Emmanuel Bernard
+ */
+public class DateSplitBridge implements FieldBridge {
+ private final static TimeZone GMT = TimeZone.getTimeZone("GMT");
+
+ <emphasis role="bold">public void set(String name, Object value, Document document,
+ LuceneOptions luceneOptions)</emphasis> {
+ Date date = (Date) value;
+ Calendar cal = GregorianCalendar.getInstance(GMT);
+ cal.setTime(date);
+ int year = cal.get(Calendar.YEAR);
+ int month = cal.get(Calendar.MONTH) + 1;
+ int day = cal.get(Calendar.DAY_OF_MONTH);
+
+ // set year
+ Field field = new Field(name + ".year", String.valueOf(year),
+ luceneOptions.getStore(), luceneOptions.getIndex(),
+ luceneOptions.getTermVector());
+ field.setBoost(luceneOptions.getBoost());
+ document.add(field);
+
+ // set month and pad it if needed
+ field = new Field(name + ".month", ( month < 10 ? "0" : "" )
+ + String.valueOf(month), luceneOptions.getStore(),
+ luceneOptions.getIndex(), luceneOptions.getTermVector());
+ field.setBoost(luceneOptions.getBoost());
+ document.add(field);
+
+ // set day and pad it if needed
+ field = new Field(name + ".day", ( day < 10 ? "0" : "" )
+ + String.valueOf(day), luceneOptions.getStore(),
+ luceneOptions.getIndex(), luceneOptions.getTermVector());
+ field.setBoost(luceneOptions.getBoost());
+ document.add(field);
+ }
+}
+
+//property
+<emphasis role="bold">@FieldBridge(impl = DateSplitBridge.class)</emphasis>
+private Date date; </programlisting>
+ </example>
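+
+ <para>The values written to the three fields can be checked without
+ Lucene. The following sketch (plain Java; the class and method names are
+ hypothetical) computes the year, month and day strings in GMT. It pads
+ with <literal>String.format</literal>, which sidesteps a common pitfall:
+ the conditional operator binds less tightly than <literal>+</literal>,
+ so a padding expression built with it needs parentheses around the
+ conditional part.</para>
+
+ <programlisting>import java.util.Calendar;
+import java.util.Date;
+import java.util.GregorianCalendar;
+import java.util.Locale;
+import java.util.TimeZone;
+
+// Hypothetical helper showing only the value computation, not the Lucene part.
+class DateSplitSketch {
+
+    // Returns { year, month, day } in GMT, month and day zero-padded to two digits.
+    static String[] split(Date date) {
+        Calendar cal = new GregorianCalendar( TimeZone.getTimeZone( "GMT" ), Locale.ROOT );
+        cal.setTime( date );
+        int year = cal.get( Calendar.YEAR );
+        int month = cal.get( Calendar.MONTH ) + 1; // Calendar months are zero-based
+        int day = cal.get( Calendar.DAY_OF_MONTH );
+        return new String[] {
+            String.valueOf( year ),
+            String.format( Locale.ROOT, "%02d", month ),
+            String.format( Locale.ROOT, "%02d", day )
+        };
+    }
+
+    public static void main(String[] args) {
+        // The epoch, i.e. Jan 1st of 1970 at midnight GMT
+        String[] parts = split( new Date( 0L ) );
+        System.out.println( parts[0] + " " + parts[1] + " " + parts[2] ); // prints 1970 01 01
+    }
+}</programlisting>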
+ </section>
+
+ <section>
+ <title>ClassBridge</title>
+
+ <para>It is sometimes useful to combine more than one property of a
+ given entity and index this combination in a specific way into the
+ Lucene index. The <classname>@ClassBridge</classname> and
+ <classname>@ClassBridges</classname> annotations can be defined at the
+ class level (as opposed to the property level). In this case the
+ custom field bridge implementation receives the entity instance as the
+ value parameter instead of a particular property. Though not shown in
+ this example, <classname>@ClassBridge</classname> supports the
+ <methodname>termVector</methodname> attribute discussed in section
+ <xref linkend="basic-mapping" />.</para>
+
+ <example>
+ <title>Implementing a class bridge</title>
+
+ <programlisting>@Entity
+@Indexed
+<emphasis role="bold">@ClassBridge</emphasis>(name="branchnetwork",
+ index=Index.TOKENIZED,
+ store=Store.YES,
+ impl = <emphasis role="bold">CatFieldsClassBridge.class</emphasis>,
+ params = @Parameter( name="sepChar", value=" " ) )
+public class Department {
+ private int id;
+ private String network;
+ private String branchHead;
+ private String branch;
+ private Integer maxEmployees;
+ ...
+}
+
+
+public class CatFieldsClassBridge implements FieldBridge, ParameterizedBridge {
+ private String sepChar;
+
+ public void setParameterValues(Map parameters) {
+ this.sepChar = (String) parameters.get( "sepChar" );
+ }
+
+ <emphasis role="bold">public void set(String name, Object value, Document document, LuceneOptions luceneOptions)</emphasis> {
+ // In this particular class the name of the new field was passed
+ // from the name field of the ClassBridge Annotation. This is not
+ // a requirement. It just works that way in this instance. The
+ // actual name could be supplied by hard coding it below.
+ Department dep = (Department) value;
+ String fieldValue1 = dep.getBranch();
+ if ( fieldValue1 == null ) {
+ fieldValue1 = "";
+ }
+ String fieldValue2 = dep.getNetwork();
+ if ( fieldValue2 == null ) {
+ fieldValue2 = "";
+ }
+ String fieldValue = fieldValue1 + sepChar + fieldValue2;
+ Field field = new Field( name, fieldValue, luceneOptions.getStore(), luceneOptions.getIndex(), luceneOptions.getTermVector() );
+ field.setBoost( luceneOptions.getBoost() );
+ document.add( field );
+ }
+}</programlisting>
+ </example>
+
+ <para>In this example, the particular
+ <classname>CatFieldsClassBridge</classname> is applied to the
+ <literal>department</literal> instance; the field bridge then
+ concatenates both branch and network and indexes the
+ concatenation.</para>
+ </section>
+ </section>
+ </section>
+
+ <section id="provided-id">
+ <title>Providing your own id</title>
+
+ <warning>
+ <para>This part of the documentation is a work in progress.</para>
+ </warning>
+
+ <para>You can provide your own id for Hibernate Search if you are
+ extending the internals. You will have to generate a unique value so it
+ can be given to Lucene to be indexed. This will have to be given to
+ Hibernate Search when you create an org.hibernate.search.Work object - the
+ document id is required in the constructor.</para>
+
+ <section id="ProvidedId">
+ <title>The ProvidedId annotation</title>
+
+ <para>Unlike conventional Hibernate Search API and @DocumentId, this
+ annotation is used on the class and not a field. You can also provide
+ your own bridge implementation through the bridge() element of
+ @ProvidedId. Also, if you annotate a
+ class with @ProvidedId, your subclasses will also get the annotation -
+ but it is not done by using java.lang.annotation.Inherited. Be
+ sure, however, <emphasis>not</emphasis> to use this annotation with
+ @DocumentId as your system will break.</para>
+
+ <example>
+ <title>Providing your own id</title>
+
+ <programlisting>@ProvidedId (bridge = org.my.own.package.MyCustomBridge)
+@Indexed
+public class MyClass{
+ @Field
+ String MyString;
+ ...
+}</programlisting>
+ </example>
+ </section>
+ </section>
+</chapter>