I like the idea of a parser using the Google syntax (you don't have
to disable explicit fields BTW - recognizing a term:term syntax
should be doable). The hard problem to crack is what's behind it. As I
explain in Hibernate Search in Action, a lot of good search
engines do searches in tiers:
- exact search
- phonetic search
- fuzzy search
- replace ANDs with ORs
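
To make the tier idea concrete, here is a minimal sketch in plain Java. It assumes one possible reading of "tiers": try each tier in order and fall back to the next one only when the previous tier returns nothing. The TieredSearch class and the idea of modelling a tier as a function from query string to hits are hypothetical, for illustration only; in a real setup each tier would run a Lucene query.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper: not part of Hibernate Search or Lucene.
public class TieredSearch {
    // Each tier maps the user's query to a (possibly empty) list of hits.
    private final List<Function<String, List<String>>> tiers = new ArrayList<>();

    // Tiers are tried in the order they are added (exact, phonetic, fuzzy, ...).
    public TieredSearch addTier(Function<String, List<String>> tier) {
        tiers.add(tier);
        return this;
    }

    // Returns the hits of the first tier that finds anything,
    // or an empty list if no tier matches.
    public List<String> search(String query) {
        for (Function<String, List<String>> tier : tiers) {
            List<String> hits = tier.apply(query);
            if (!hits.isEmpty()) {
                return hits;
            }
        }
        return Collections.emptyList();
    }
}
```

An alternative design is to run all tiers at once and rank them by boost rather than stopping at the first tier that matches.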
In one application, where I indexed relatively short names (which
could contain spaces etc.), along with each name I also indexed a
special version, which was one term, with all spaces converted to _
(so the name "Foo Bar" became "foo_bar"). When the user submitted the
query, I also created a special version of the query in the same way,
and added a fuzzy search on that - and this gave quite good results.
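
A rough sketch of that normalization in plain Java, under a few assumptions: the field name passed in (e.g. "name_joined") is made up, runs of whitespace are collapsed to a single _ (the original only says spaces were converted), and the fuzzy clause is written in Lucene's classic query syntax (a trailing ~). Characters that are special to the query parser are not escaped here.

```java
import java.util.Locale;

// Hypothetical helper: not part of Hibernate Search or Lucene.
public class SingleTermName {
    // "Foo Bar" -> "foo_bar": lowercase, trim, whitespace runs become "_".
    public static String toSingleTerm(String name) {
        return name.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", "_");
    }

    // Builds a fuzzy clause on the single-term field for the user's input.
    // The same normalization must be used at index time and at query time.
    public static String toFuzzyQuery(String field, String userQuery) {
        return field + ":" + toSingleTerm(userQuery) + "~";
    }
}
```

For example, SingleTermName.toFuzzyQuery("name_joined", "Foo Bar") yields name_joined:foo_bar~, which would be added to the regular query as an extra clause.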
I guess you can simulate part of it by boosting exact fields as
opposed to approximation fields in the multi field query parser.
This was not really possible until recently, but
SearchFactory.getAnalyzer(MyEntity.class) makes it much easier.
Right :) I think it would require some fine-tuning, but once we have
the query parsed to "our" representation (or maybe we could reuse the
parsed tree Lucene produces) we can manipulate it as we want. And the
output we produce could be configurable (which tiers to include, with
what boost factors etc.)
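
As a sketch of what "configurable output" could look like, the following plain-Java builder turns a single term into one clause per configured tier, each with its own boost, using Lucene's classic query syntax (field:term^boost, with ~ for fuzzy). The TierConfig class, the field names and the boost values are all made up for illustration, and it only handles a single unescaped term, not a parsed tree.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: not part of Hibernate Search or Lucene.
public class TierConfig {
    private static final class Tier {
        final String field;   // index field the tier searches
        final String suffix;  // "" for a plain term, "~" for fuzzy
        final float boost;    // relative weight of this tier

        Tier(String field, String suffix, float boost) {
            this.field = field;
            this.suffix = suffix;
            this.boost = boost;
        }
    }

    private final List<Tier> tiers = new ArrayList<>();

    // Adds a tier; which tiers exist and how they are boosted is up to the caller.
    public TierConfig tier(String field, String suffix, float boost) {
        tiers.add(new Tier(field, suffix, boost));
        return this;
    }

    // Emits one boosted clause per tier, separated by spaces.
    public String build(String term) {
        StringBuilder sb = new StringBuilder();
        for (Tier t : tiers) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(t.field).append(':').append(term)
              .append(t.suffix).append('^').append(t.boost);
        }
        return sb.toString();
    }
}
```

With tiers ("name", "", 4f), ("name_phonetic", "", 2f) and ("name", "~", 1f), build("foo") produces name:foo^4.0 name_phonetic:foo^2.0 name:foo~^1.0, so exact matches outrank the approximations.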
We should add the Google-like feature to the 3.2 list amongst other
higher level query enhancements like spell checking.
Who wants to take the lead? I have always considered grammar and
parser developments awkward for my tastes :)
Well, I could do it, if my boss (Mark Newton) and you agree, unless
there's somebody from the Hibernate team taking care of it :)