[hibernate-dev] Hibernate Lucene massive rework
Emmanuel Bernard
emmanuel at hibernate.org
Thu Nov 2 18:23:15 EST 2006
Hi all,
I have had time to work on Hibernate Lucene recently and have finished the
work I wanted to do. This is a major rework that breaks both the API and
existing indexes, but for the better. It implements the core ideas that
have been floating around for a while.
*What's new*
o Index querying
Keeping the index up to date is nice, but querying it is even better.
A LuceneSession has been introduced (a wrapper around a Hibernate session).
You can now create a Lucene query and get managed objects back.
luceneSession.createLuceneQuery(luceneQuery).list(); //get all matching entities
list() initializes all entities one by one: setting a sensible batch-size is
critical
future evolution: use of the Hibernate Core fetch profile once it is
exposed to the user
luceneSession.createLuceneQuery(luceneQuery, Book.class, Clock.class).iterate();
//get all matching entities of the given types
iterate() reads the whole Lucene index but initializes the entities on demand
luceneSession.createLuceneQuery(luceneQuery).scroll(); //use a scrollable
//result set to maximize performance when only a subset is needed
scroll() keeps the index open (Hits) and lets you walk through it;
this method is the most IO/memory efficient
do not forget to close the scrollable result set
All three query methods support setFirstResult/setMaxResults. As a matter
of fact, the LuceneQuery returned is an implementation of
org.hibernate.Query, so your code is unaware of Lucene.
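
To picture what "initialize the entities on demand" means for iterate(), here is a small stand-alone sketch. Nothing in it is Hibernate Lucene API: LazyEntityIterator and the loader function are hypothetical names, and the list of ids stands in for the identifiers read from the index hits.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;

// Hypothetical illustration only: not part of the Hibernate Lucene API.
// All matching ids are known up front (the index has been read), but each
// entity is loaded only when the caller actually asks for it.
final class LazyEntityIterator<T> implements Iterator<T> {
    private final List<Long> ids;            // ids collected from the index hits
    private final Function<Long, T> loader;  // stands in for loading through the session
    private int position = 0;

    LazyEntityIterator(List<Long> ids, Function<Long, T> loader) {
        this.ids = new ArrayList<>(ids);
        this.loader = loader;
    }

    @Override
    public boolean hasNext() {
        return position < ids.size();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        // The entity is initialized here, one call at a time.
        return loader.apply(ids.get(position++));
    }
}
```

A list()-style method would instead call the loader for every id before returning, which is why the batch-size matters there.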
o Object (re)indexing
You can now index an object even if you have not changed it
(luceneSession.index()).
The index operation is batched to maximize speed: if no tx is in progress,
the indexing is done immediately; otherwise the operations are batched and
done right after the transaction commit.
o FieldBridge
Like the Hibernate UserType, a FieldBridge is an interface aiming to do
the translation work between a property and its indexed (i.e. String)
representation. This interface is very flexible and even allows you to
map a property into several index fields.
For the simple cases, i.e. most cases, a StringBridge has been introduced:
it converts your property into a String to be indexed. The API is much
simpler to implement, so I expect most custom bridges to use this
approach.
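
As a rough illustration of the StringBridge idea, here is a stand-alone sketch. The interface is redeclared locally with an assumed single method (objectToString); the real interface is in the org.hibernate.lucene packages and its signature may differ, so check the javadoc snapshot. PaddedIntegerBridge is a hypothetical custom bridge, not one shipped with the project.

```java
import java.util.Locale;

// Assumed shape of the bridge contract, redeclared here so the sketch compiles
// on its own; the real interface may have a different signature.
interface StringBridge {
    String objectToString(Object value);
}

// Hypothetical custom bridge: left-pads non-negative integers with zeros so
// that the lexicographic order of the indexed terms matches numeric order.
final class PaddedIntegerBridge implements StringBridge {
    private static final int WIDTH = 10; // enough digits for any non-negative int

    @Override
    public String objectToString(Object value) {
        if (value == null) {
            return null; // nothing to index
        }
        int number = (Integer) value;
        if (number < 0) {
            throw new IllegalArgumentException("negative values are not handled by this sketch");
        }
        // Locale.ROOT keeps the digits ASCII whatever the default locale is.
        return String.format(Locale.ROOT, "%0" + WIDTH + "d", number);
    }
}
```

With this bridge, 9 is indexed as "0000000009" and 10 as "0000000010", so a term range over the indexed strings behaves like a numeric range.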
o Built-in bridges
There is built-in support for Date (with resolution), numbers (i.e.
java.lang.Number and its subclasses), and String.
I'm willing to expand the support; please tell me what you need.
o New event listener / lucene interaction
The event listener has been reimplemented. It is now thread-safe (i.e. it
does not depend on the underlying Directory locking mechanism, within a
single VM).
It fixes a flaw in the previous implementation, which indexed entities
even when the transaction was rolled back (yuck!). You should no longer
use the post-commit-* events but the post-* events.
If no tx is in progress, the indexing is done immediately; otherwise the
operations are done right after the transaction commit.
This reimplementation opens the doors to:
- a better batching system (needs some adjustments in Hibernate Core)
- the ability to delegate the actual indexing to a remote machine
(through JMS or any other messaging mechanism).
I say 'opens the doors' because the actual implementation is not there
yet (but it should not be hard, I think).
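
The "immediately without a tx, after commit otherwise, never on rollback" behaviour described above can be sketched in a few lines. This is only an illustration of the idea, not the actual listener code: IndexWorkQueue and its method names are hypothetical, and the sketch is single-threaded, unlike the real implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: not the actual event listener code.
// Index work runs immediately outside a transaction; inside one it is
// queued, then executed after commit or dropped on rollback.
final class IndexWorkQueue {
    private final List<Runnable> pending = new ArrayList<>();
    private boolean inTransaction = false;

    void begin() {
        inTransaction = true;
    }

    void add(Runnable indexWork) {
        if (inTransaction) {
            pending.add(indexWork);   // defer until the outcome is known
        } else {
            indexWork.run();          // no tx in progress: index right away
        }
    }

    void afterCommit() {
        inTransaction = false;
        List<Runnable> batch = new ArrayList<>(pending);
        pending.clear();
        for (Runnable work : batch) {
            work.run();               // the whole batch is applied after commit
        }
    }

    void afterRollback() {
        inTransaction = false;
        pending.clear();              // rolled-back changes never reach the index
    }
}
```

Because the queued work is just a list of operations, the same structure could hand the batch to a messaging system instead of running it locally.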
o New annotations
First of all, the project has been repackaged: the annotations are now all
under org.hibernate.lucene.annotations.
Second, the previous annotations have been deprecated to align with the
Lucene 2.0 APIs: @DocumentId and @Field() are now to be used. The old
ones are still there but will be removed in a future release.
You can also annotate a property to use a custom field bridge
(@FieldBridge) and inject parameters.
@DateBridge allows you to define the resolution (YEAR, MONTH, ...) of a
Date to be indexed.
@Boost can be defined on an entity and on a property
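
To make the notion of date resolution concrete, here is a stand-alone sketch of what truncating a date to a given resolution could look like. The DateResolution enum, its constants beyond YEAR and MONTH, and the string patterns are all assumptions made for illustration; they are not the library's own types or its index format.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical illustration only: this enum is not the library's own type.
// Each resolution keeps just the leading part of the timestamp, so coarser
// resolutions produce fewer distinct terms in the index.
enum DateResolution {
    YEAR("yyyy"),
    MONTH("yyyyMM"),
    DAY("yyyyMMdd"),
    MINUTE("yyyyMMddHHmm");

    private final DateTimeFormatter formatter;

    DateResolution(String pattern) {
        // Locale.ROOT keeps the output independent of the default locale.
        this.formatter = DateTimeFormatter.ofPattern(pattern, Locale.ROOT);
    }

    // Returns the string that would be indexed for the given date.
    String toIndexedString(LocalDateTime date) {
        return formatter.format(date);
    }
}
```

Two dates in the same month map to the same string at MONTH resolution, which is what makes a query on "November 2006" a simple term match.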
o Support for annotated fields
Previously only annotated properties were supported; you can now annotate
fields as well.
o Indexing
As described earlier, the interaction with Lucene has been reworked to
allow better efficiency.
Several entities per index, as well as class hierarchy indexing, are now
supported.
o DirectoryProvider
The pluggable directory provider is still present, with 2 default
implementations (memory and file system).
*When do we get it, and feedback?*
I expect to release all this right after Hibernate Core 3.2.1 and as
soon as I update the documentation.
Please have a look at the API and the feature set. Nothing is cast in
stone yet. I'll follow up with a What's next email.
Can I have a preview? Yes, you can get the code from
http://anonsvn.jboss.org/repos/hibernate/branches/Lucene_Integration/
(you need to get Hibernate Core from trunk or branch_3_2).
I have also uploaded a snapshot version of the javadoc:
http://www.hibernate.org/~emmanuel/lucenesnapshot20061102/doc/api/ (check
the org.hibernate.lucene.* packages).
Question:
Should it be part of the Hibernate Annotations 3.2.x series?
Hibernate Lucene is considered experimental (i.e. still evolving). The rework
does break applications using Hibernate Lucene right now, but the
migration will bring a big plus.
If I release it as part of the 3.3.x series, that binds me to a Core
release and will delay adoption.
Maybe it is time for a separate package (even if I don't think this
will really solve the problem).