[hibernate-dev] [AND] Search: changing the way we search
emmanuel at hibernate.org
Tue Mar 4 12:27:19 EST 2014
On 04 Mar 2014, at 15:02, Guillaume Smet <guillaume.smet at gmail.com> wrote:
> On Tue, Mar 4, 2014 at 1:36 PM, Emmanuel Bernard <emmanuel at hibernate.org> wrote:
>> OK so you want the words hotel + swimming pool to be present somewhere in the sum of the corpus of title and description. That's the second case I was describing then. Indeed it kinda fails if you don't order by score but rather alphabetically or by distance.
>> Have you considered the following: your query should only consider the top n, or the results whose score reaches 70% of the top score and then do your business sort on this subset.
> It doesn't work. Users want the results which really match and we
> can't have missing results or additional results.
> They search for something and they want to find exactly what they are
> looking for.
> Note: this is really 99.9% of our use cases, probably because we
> mostly develop business applications.
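
For reference, the cut-off idea quoted above (keep only results scoring at least 70% of the top score, then apply the business sort) could be sketched roughly as follows in plain Java. `Hit` and `cutoffThenSort` are hypothetical names for illustration, not Hibernate Search or Lucene types, and the sort key (title) is only an assumption:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ScoreCutoff {

    // Hypothetical holder for one search hit; not a Hibernate Search type.
    public static final class Hit {
        public final String title;
        public final float score;

        public Hit(String title, float score) {
            this.title = title;
            this.score = score;
        }
    }

    // Keep hits whose score is at least ratio * (best score),
    // then apply the business sort (here: alphabetical by title).
    public static List<Hit> cutoffThenSort(List<Hit> hits, float ratio) {
        float top = 0f;
        for (Hit h : hits) {
            top = Math.max(top, h.score);
        }
        List<Hit> kept = new ArrayList<>();
        for (Hit h : hits) {
            if (h.score >= ratio * top) {
                kept.add(h);
            }
        }
        kept.sort(Comparator.comparing((Hit h) -> h.title));
        return kept;
    }
}
```

As the objection above points out, any such cut-off is a heuristic: it can drop documents that do match and keep ones that only partly match.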
>> Anyway, to address this, one needs to target fields that are:
>> - using the same fieldbridge
>> - using the same analyzer
>> - do the trick I was describing around filters like ngrams (and then or)
> That's when I stopped my work on HSEARCH-917. I wasn't sure I could
> decently require such conditions, at least not in the current API.
> I started to wonder if we could introduce a text() branch in parallel
> to keyword() and phrase() but never really posted about it.
> I would like to separate the user's responsibility from the developer's:
> - the user defines his search query. It's a little more clever than
> just a term search: he can use + - and "": that's why I would like to
> use a QueryParser directly (most of our users don't use it but some of
> them need it);
> - the developer defines how the search is done: it can search on
> several fields: for each field, the developer can define a boost (this
> is supported by the SimpleQueryParser) AND he can also define if it's
> a fuzzy query (not supported out of the box by the SimpleQueryParser).
> (we could even imagine supporting minimum should match as the dismax
> parser does)
> Because this is really what we need on a daily basis: my user doesn't
> really know if his search needs to be fuzzy or not. And I would like
> to be able to make the decision for him because I know the corpus of
> documents and I know it's going to be needed.
> I don't know if it looks like something interesting to you?
Yes, a text() branch that takes whatever the user typed and lets the developer customise how it is searched makes sense to me.
We can explore that, but I am a bit skeptical that it will end up as a true `text()` clause rather than as making the other branches of the DSL a bit more friendly.
I’m happy to be proven wrong. But I am a bit confused and would love a code example to start from (as in, one that does what is required).
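
As one possible starting point for that discussion, here is a minimal sketch of the split described in the quoted proposal: the user supplies the terms, the developer supplies per-field boost and fuzziness. It uses no Hibernate Search or Lucene API; `FieldSpec` and `build` are hypothetical names, and the output is only a string in Lucene's classic query syntax. It assumes every user term is required in at least one field, and it does not handle `+`, `-`, quotes, or escaping of special characters:

```java
import java.util.ArrayList;
import java.util.List;

public class TextQuerySketch {

    // Developer-side settings for one field (hypothetical type).
    public static final class FieldSpec {
        public final String name;
        public final float boost;
        public final boolean fuzzy;

        public FieldSpec(String name, float boost, boolean fuzzy) {
            this.name = name;
            this.boost = boost;
            this.fuzzy = fuzzy;
        }
    }

    // For each whitespace-separated user term, emit one required clause
    // that is satisfied if the term matches in any of the given fields.
    public static String build(String userInput, List<FieldSpec> fields) {
        List<String> clauses = new ArrayList<>();
        for (String term : userInput.trim().split("\\s+")) {
            if (term.isEmpty()) {
                continue; // blank input yields no clauses
            }
            List<String> perField = new ArrayList<>();
            for (FieldSpec f : fields) {
                StringBuilder sb = new StringBuilder(f.name).append(':').append(term);
                if (f.fuzzy) {
                    sb.append('~'); // fuzzy match, default edit distance
                }
                if (f.boost != 1.0f) {
                    sb.append('^').append(f.boost);
                }
                perField.add(sb.toString());
            }
            clauses.add("+(" + String.join(" ", perField) + ")");
        }
        return String.join(" ", clauses);
    }
}
```

With a `title` field boosted to 2.0 and a fuzzy `description` field, the input `hotel pool` yields `+(title:hotel^2.0 description:hotel~) +(title:pool^2.0 description:pool~)`. Each word must appear somewhere, but not necessarily in the same field. A real implementation would still need each field's own analyzer and field bridge applied to the terms, which is the constraint listed earlier in the thread.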