[hibernate-dev] Hibernate search(lucene) update question & opinion.
Emmanuel Bernard
emmanuel at hibernate.org
Fri Nov 24 11:14:50 EST 2006
The problem is not solved "automatically" yet. But here is the reason why.
First of all, the elements are properly deleted, and the index file
reflects that as soon as IndexReader.close() is called (which is done by
Hibernate Search). However, the files containing the document data are
not cleaned up.
Basically, to physically delete the elements, you need to optimize()
your index. This operation typically takes more time than a delete, so
doing it for every single delete is not appropriate.
I see 4 solutions:
1. let the user access the Directory and call the
indexWriter.optimize() method. It works today but it sucks.
2. add a FullTextSession.optimize(Class) method, and the application
is responsible for triggering it. This is the easy solution but it puts
more work on the user. And I'm not happy to push a maintenance API to
the Session, especially since this is more of a SessionFactory-like API.
2bis. publish this API through JMX
3. optimize every N operations. It is trivial to add a counter per
DirectoryProvider and trigger the optimization. N should be customizable.
4. optimize every N seconds, either by setting a timer in an additional
thread or by waiting for the next operation and comparing the timestamp.
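To make 3 and the "wait for the next operation" flavor of 4 concrete, here is a rough sketch of the trigger logic. OptimizeTrigger and all of its members are hypothetical names, nothing like this exists in Hibernate Search today, and the Runnable simply stands in for whatever would end up calling optimize() on the index. One instance per DirectoryProvider is assumed.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical helper (not part of Hibernate Search): decides when the
 * expensive optimize step should run for a single index.
 */
final class OptimizeTrigger {
    private final int maxOperations;        // the "N" of solution 3
    private final long maxIntervalMillis;   // the "N seconds" of solution 4, in ms
    private final Runnable optimizeAction;  // assumed to wrap the real optimize() call
    private final LongSupplier clock;       // injectable so the logic can be tested

    private int operationsSinceOptimize = 0;
    private long lastOptimizeMillis;

    OptimizeTrigger(int maxOperations, long maxIntervalMillis,
                    Runnable optimizeAction, LongSupplier clock) {
        this.maxOperations = maxOperations;
        this.maxIntervalMillis = maxIntervalMillis;
        this.optimizeAction = optimizeAction;
        this.clock = clock;
        this.lastOptimizeMillis = clock.getAsLong();
    }

    /** Call after each index change; returns true if optimize was run. */
    synchronized boolean operationPerformed() {
        operationsSinceOptimize++;
        long now = clock.getAsLong();
        boolean tooManyOperations = operationsSinceOptimize >= maxOperations;
        boolean tooOld = now - lastOptimizeMillis >= maxIntervalMillis;
        if (!tooManyOperations && !tooOld) {
            return false;
        }
        optimizeAction.run();
        // Reset both thresholds once the index has been optimized.
        operationsSinceOptimize = 0;
        lastOptimizeMillis = now;
        return true;
    }
}
```

In real use the clock would be System::currentTimeMillis, and both thresholds would come from configuration. Note that with this approach an idle index is never optimized, since the time check only happens on the next operation; the timer-thread variant of 4 would not have that limitation.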
2 (bis or not) and 3 are my favorites. It's really 2 or 3 hours of work.
If someone is interested, ping me.
http://opensource.atlassian.com/projects/hibernate/browse/ANN-495
Jin Yiqing wrote:
> Hi,
>
> I found out about the cool new Hibernate Search feature through
> Emmanuel's mail on the Lucene-user mailing list.
>
> You guys did a great job! Since I am now working on a system that
> uses Lucene to implement a search engine, I would like to know some
> more details about Hibernate Search.
>
> I have read some of the code in the Hibernate 3.2GA release. The code
> is pretty cool, but there is one thing I have doubts about:
>
> For the update operation, Hibernate Search uses remove &
> update, which relies on Lucene's deleteDocuments method. This
> works fine when the operation is not frequent and the
> index is new.
> But as far as I know, the remove operation in Lucene only marks
> the document as deleted without actually deleting the data from the
> index files. In some systems the data is updated at a very
> high frequency (e.g. a traffic status query system), so it will not
> take long before the index is filled with lots of expired
> document data. Even if the data is not updated as fast as traffic
> status, I think this problem is still critical, since the things we
> store in databases are always being updated.
>
> Is there any way to solve this in Hibernate Search?
>
> thanks,
> Richie
>