[hibernate-dev] Hibernate web search
Adam Warski
adam at warski.org
Sun Sep 21 05:01:36 EDT 2008
Hello,
one feature I find missing from Hibernate Search is an easy way to
implement a web search.
A good example is a blog app, where you can search contents of posts.
A post is a simple entity with a "body" field, which is indexed by
Hibernate Search/Lucene.
You then have the normal search box, where the user enters his/her
query. And now the problem: what to do with this query?
Solution 1.
Pass it unchanged to the query parser, as is done for example in the
blog example in Seam. But that will in many cases generate exceptions.
For example, when there is an unclosed " or a :. You can of course
catch that exception and return an empty result list to the user, but
that's not what the user expects.
Solution 2.
Escape any special characters (using QueryParser.escape) and then pass
it safely to the query parser. But then all the special constructs
(like phrases: "...", including/excluding words: +/-, fuzzy
searches: ~ etc.) stop working. That is also not what the user expects.
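
To make the trade-off of Solution 2 concrete, here is a small sketch of
what blanket escaping does to a query. The escapeAll method is a
hand-written stand-in, not Lucene's QueryParser.escape, and the list of
special characters is my assumption about the classic query syntax.

```java
public class EscapeAllDemo {

    // Assumed list of characters that are special in Lucene's query syntax.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    /** Puts a backslash in front of every special character. */
    public static String escapeAll(String query) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The phrase quotes and the + are escaped, so both lose their meaning.
        System.out.println(escapeAll("\"hibernate search\" +lucene"));
    }
}
```

The query can no longer fail to parse, but the quotes and the + are now
ordinary characters, so the phrase and the required word are gone.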
My proposed solution.
The best way out, in my opinion, is to create a custom query pre-
parser. This parser would be very "forgiving" in case of any syntax
errors.
I think it would be best to support the query syntax that Google uses
(that's what the users are accustomed to):
* standard boolean operators AND, OR
* quotes "..."
* fuzzy/synonym search ~ (but in front of the word, not at the end)
* word inclusion/exclusion: +/-
Some Lucene constructs would be disabled, like boosting (^), field-
search (field_name:), *, ?.
Any special characters in invalid positions would be escaped, for
example an unmatched quote or a + without a word following it. The
parser wouldn't only repair syntax; it would also perform other
operations, like moving the ~ from the beginning of a word to the end,
or (maybe) adding a * to the end of each term.
The implementation shouldn't be too complicated using either ANTLR/
JavaCC or simply regular expressions and a string builder.
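
A rough sketch of the string-builder variant, to show the idea is
feasible. Everything here is hypothetical: the class and method names are
made up, the list of special characters is an assumption, and the rules
(keep matched quotes, keep a + or - that is directly followed by
something, move a leading ~ to the end, keep AND/OR only between two
operands, escape the rest) are one possible reading of the proposal. It
does not add a trailing *.

```java
import java.util.ArrayList;
import java.util.List;

public class ForgivingQueryPreParser {

    // Assumed list of characters that are special in Lucene's query syntax.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    /** Puts a backslash in front of every special character. */
    public static String escape(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Rewrites raw user input into a string the query parser should accept. */
    public static String repair(String input) {
        List<String> out = new ArrayList<>();
        boolean lastIsOperator = false;
        int i = 0;
        int n = input.length();
        while (i < n) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
                continue;
            }
            // Keep a leading + or - only when something follows it directly.
            String prefix = "";
            if ((c == '+' || c == '-') && i + 1 < n
                    && !Character.isWhitespace(input.charAt(i + 1))) {
                prefix = String.valueOf(c);
                c = input.charAt(++i);
            }
            // A quote with a matching partner (and non-empty content) is a phrase.
            int close = (c == '"') ? input.indexOf('"', i + 1) : -1;
            if (close > i + 1) {
                out.add(prefix + '"' + escape(input.substring(i + 1, close)) + '"');
                i = close + 1;
                lastIsOperator = false;
                continue;
            }
            // Otherwise read one whitespace-delimited word.
            int end = i;
            while (end < n && !Character.isWhitespace(input.charAt(end))) {
                end++;
            }
            String word = input.substring(i, end);
            i = end;
            boolean keyword = word.equals("AND") || word.equals("OR");
            if (keyword && prefix.isEmpty() && !out.isEmpty() && !lastIsOperator) {
                out.add(word); // operator with an operand before it
                lastIsOperator = true;
                continue;
            }
            if (keyword) {
                word = word.toLowerCase(); // misplaced operator becomes a plain word
            }
            // Google-style ~word becomes Lucene-style word~.
            String suffix = "";
            if (word.length() > 1 && word.charAt(0) == '~') {
                suffix = "~";
                word = word.substring(1);
            }
            out.add(prefix + escape(word) + suffix);
            lastIsOperator = false;
        }
        // An operator with nothing after it is demoted to a plain word as well.
        if (lastIsOperator) {
            int last = out.size() - 1;
            out.set(last, out.get(last).toLowerCase());
        }
        return String.join(" ", out);
    }
}
```

With these rules, ~car AND +"red paint" -rust comes out as
car~ AND +"red paint" -rust, while "unclosed phrase comes out as
\"unclosed phrase and title:foo + as title\:foo \+. The result would then
be handed to the normal query parser.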
Needless to say, this would also be useful in Seam :).
What do you think?
--
Adam