[EJB 3.0] - lucene integration questions - jboss-user

Wednesday, 1 November 2006

Hi, I'm currently working on an application that will make use of the experimental
lucene implementation. I managed to get things working in the sense that I have an
application with EJB3 entity beans that are being indexed by lucene in jboss 4.0.5.
I've examined the created index with Luke (a tool for lucene) and things seem to be
indexed properly. IMHO not bad for a few days of work.

Similar to getting the ejb 3.0 stuff going in the first place, I had a lot of problems
with inaccurate/incomplete documentation and had to figure out a lot of non-trivial
configuration details myself that the documentation seems to assume are self-evident. I'm
not whining here, just observing. I understand this is a work in progress and am generally
very excited about this cool new functionality in jboss. BTW I'll be glad to help out
any other users with more details on how I got things working. (you can contact me offline
at jilles AT jillesvangurp.com). You may also be interested in my reply to sergiu's
question here: http://www.jboss.com/index.html?module=bb&op=viewtopic&t=92392

Anyway I still have a lot of remaining questions regarding the lucene integration in
jboss/hibernate.

1) I'm expecting my application to be used in a clustered jboss environment
eventually. I happen to know that lucene is pretty intolerant with respect to having
multiple IndexWriter objects trying to get a file lock. How does jboss deal with this (if
at all)? To be clear: my application will run on a cluster. At any time there may be
multiple entity beans being created/updated and indexed on any of the nodes. At the same
time, users may be searching for entity beans using the lucene index. So there will be
simultaneous reads & writes on the index. Right now, I'm merely assuming that the
hibernate lucene includes functionality that will deal with this in such a way that I
don't get exceptions about file locks. Is that assumption at all correct? If not, what
is the recommended course of action?

2) Is it possible to configure the index that hibernate-lucene creates? Lucene has a lot
of configuration options that tend to get highly relevant if you work with large indexes
(think a few million records in the database and an index of several gigabytes). I'm
anticipating that I might find myself trying to configure/know about those kinds of things
at some point. How/where do I do this and how sensible are the defaults used by
hibernate/lucene?

3) Can I create an IndexReader the normal way (i.e. as the lucene documentation describes)
to search the indexes or do I need to do special things to avoid conflicts with the index
being written to simultaneously?
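
To make the concern in question 1 concrete, here is a minimal sketch of lock-file
contention between two would-be index writers. It uses only the JDK and does not call
Lucene or Hibernate; the class names (LockFileDemo, MarkerFileLock) and the file name
"write.lock" are placeholders chosen for illustration, and the assumption is that the
lock is taken by atomically creating a marker file. File.createNewFile may not be
dependable on every shared or network file system, so treat this as an illustration of
the failure mode, not as a recommended locking scheme.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical illustration; none of these names come from Lucene or Hibernate.
class LockFileDemo {

    /** Non-reentrant lock backed by atomic creation of a marker file. */
    static final class MarkerFileLock {
        private final File lockFile;
        private boolean held;

        MarkerFileLock(File dir, String name) {
            this.lockFile = new File(dir, name);
        }

        /** Returns true only if this call created the marker file. */
        synchronized boolean tryObtain() throws IOException {
            if (held) {
                return false;
            }
            held = lockFile.createNewFile(); // atomic create-if-absent
            return held;
        }

        /** Deletes the marker file if this instance created it. */
        synchronized void release() {
            if (held) {
                lockFile.delete();
                held = false;
            }
        }
    }

    /** Two would-be writers compete for the same marker file. */
    static boolean[] demo() throws IOException {
        File dir = Files.createTempDirectory("index-demo").toFile();
        MarkerFileLock writerA = new MarkerFileLock(dir, "write.lock");
        MarkerFileLock writerB = new MarkerFileLock(dir, "write.lock");

        boolean first = writerA.tryObtain();   // creates the file
        boolean second = writerB.tryObtain();  // file already exists, so refused
        writerA.release();
        boolean retry = writerB.tryObtain();   // succeeds once the file is gone
        writerB.release();
        dir.delete();
        return new boolean[] {first, second, retry};
    }

    public static void main(String[] args) throws IOException {
        boolean[] r = demo();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```

The second writer is refused for as long as the first one holds the marker file. In a
cluster, each node would be in the position of writerB unless something coordinates
the writers.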

BTW, I understand all this stuff is under construction and not recommended yet for
production usage. My backup plan, should things not work out with the hibernate
integration, is to create a separate, non-clustered, lucene-based indexing server.
That server will be called using rmi. I've worked with lucene before, so I know how to
do this. The reason I'm looking at hibernate lucene is that it seems to deal with most
of the boring work of interacting with lucene. Also, it seems to simplify my deployment
architecture since I won't need a separate lucene server. My indexing needs are pretty
straightforward (one or two simple entity beans will be all that is indexed).
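
The core idea of the backup plan (one process owns the index, everyone else sends it
updates) can be sketched as follows. This is a rough illustration under assumptions: the
class SingleWriterIndexer is hypothetical, the RMI layer is left out, and a plain list
stands in for the real lucene index. The stand-in list is synchronized only so the final
snapshot can be read safely from another thread; the point of the sketch is that a
single-threaded executor applies all updates one at a time.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; the RMI layer and the real Lucene calls are left out.
class SingleWriterIndexer {
    // One thread performs every update, so writes never overlap.
    private final ExecutorService writer = Executors.newSingleThreadExecutor();
    // Stand-in for the real index.
    private final List<String> index =
            Collections.synchronizedList(new ArrayList<String>());

    /** Queues one document for indexing; callable from any thread. */
    Future<?> submit(final String document) {
        return writer.submit(() -> { index.add(document); });
    }

    /** Stops accepting work, drains the queue, and returns a copy of the index. */
    List<String> shutdownAndSnapshot() throws InterruptedException {
        writer.shutdown();
        if (!writer.awaitTermination(30, TimeUnit.SECONDS)) {
            throw new IllegalStateException("pending updates did not finish in time");
        }
        return new ArrayList<>(index);
    }

    /** Several client threads submit updates concurrently; returns the index size. */
    static int demo(int clients, int docsPerClient) throws InterruptedException {
        SingleWriterIndexer indexer = new SingleWriterIndexer();
        List<Thread> threads = new ArrayList<>();
        for (int c = 0; c < clients; c++) {
            final int id = c;
            Thread t = new Thread(() -> {
                for (int d = 0; d < docsPerClient; d++) {
                    indexer.submit("client" + id + "-doc" + d);
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return indexer.shutdownAndSnapshot().size();
    }
}
```

Calling SingleWriterIndexer.demo(4, 100) has four client threads submit 100 documents
each. Because only the executor's single thread touches the index, no two updates ever
compete for a write lock.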

View the original post :
http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3982314#...

Reply to the post :
http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&a...
