Re: [hibernate-dev] [AND] Search: changing the way we search

Tuesday, 4 March 2014

On 04 Mar 2014, at 15:02, Guillaume Smet <guillaume.smet(a)gmail.com> wrote:

 On Tue, Mar 4, 2014 at 1:36 PM, Emmanuel Bernard <emmanuel(a)hibernate.org> wrote:
> OK so you want the words hotel + swimming pool to be present somewhere in the sum of
> the corpus of title and description. That's the second case I was describing then.
> Indeed it kinda fails if you don't order by score but rather alphabetically or by
> distance.
> Have you considered the following: your query should only consider the top n, or the
> results whose score reaches 70% of the top score, and then do your business sort on this
> subset.

 It doesn't work. Users want the results which really match and we
 can't have missing results or additional results.

 They search for something and they want to find exactly what they are
 looking for.

 Note: this is really 99.9% of our use cases, probably because we
 mostly develop business applications.
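
For reference, the cut-off suggested in the quoted text (keep only the results whose score reaches 70% of the top score, then apply the business sort) might look roughly like the sketch below. It is plain Java with no Lucene or Hibernate Search types; `ScoreCutoffSketch`, `Hit` and `cutThenSort` are hypothetical names, and it assumes scores are positive. It also makes the objection concrete: a hit below the threshold disappears from the sorted list even though it matched the query.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class ScoreCutoffSketch {

    /** Hypothetical search hit: a title to sort on and a relevance score. */
    static final class Hit {
        final String title;
        final float score;

        Hit(String title, float score) {
            this.title = title;
            this.score = score;
        }
    }

    /** Keeps hits scoring at least ratio * topScore, then sorts the survivors by title. */
    static List<Hit> cutThenSort(List<Hit> hits, float ratio) {
        // Find the best score in the result set (assumes positive scores).
        float top = 0f;
        for (Hit h : hits) {
            top = Math.max(top, h.score);
        }
        // Drop everything below the relative threshold.
        List<Hit> kept = new ArrayList<>();
        for (Hit h : hits) {
            if (h.score >= ratio * top) {
                kept.add(h);
            }
        }
        // Business sort (alphabetical here) applied to the surviving subset only.
        kept.sort(Comparator.comparing((Hit h) -> h.title));
        return kept;
    }
}
```

With hits scored 1.0, 0.8 and 0.3 and a ratio of 0.7, only the first two survive the cut before being sorted by title.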

> Anyway, to address this, one needs to target fields that are:
> - using the same fieldbridge
> - using the same analyzer
> - do the trick I was describing around filters like ngrams (and then or)

 That's when I stopped my work on HSEARCH-917. I wasn't sure I could
 decently require such conditions, at least not in the current API.

 I started to wonder if we could introduce a text() branch in parallel
 to keyword() and phrase() but never really posted about it.

 I would like to separate the user responsibility from the developer
 responsibility:
 - the user defines his search query. It's a little more clever than
 just a term search: he can use + - and "": that's why I would like to
 use a QueryParser directly (most of our users don't use it but some of
 them need it);
 - the developer defines how the search is done: it can search on
 several fields: for each field, the developer can define a boost (this
 is supported by the SimpleQueryParser) AND he can also define if it's
 a fuzzy query (not supported out of the box by the SimpleQueryParser).
 (We could even imagine supporting "minimum should match" as the dismax
 parser does.)

 Because this is really what we need on a daily basis: my users don't
 really know if their search needs to be fuzzy or not. And I would like
 to be able to make the decision for them because I know the corpus of
 documents and I know it's going to be needed.

 I don't know if it looks like something interesting to you?

Yes, a text() branch injecting whatever comes from the user and letting the developer
customise what needs to be searched makes sense to me.
We can explore that, but I am a bit skeptical that it will turn into a true `text()` clause
rather than making other branches of the DSL a bit more friendly.
I’m happy to be proven wrong. But I am a bit confused and would love a code example to
start from (as in doing what is required).
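
As one possible starting point, here is a minimal sketch of the split described in the quoted message: the user types terms with `+`, `-` and `""`, and the developer decides which fields are searched, with which boost, and whether single terms are fuzzy. It is plain Java and does not use Lucene or the Hibernate Search DSL; `TextQuerySketch`, `FieldConfig` and `expand` are hypothetical names, and the result is only a Lucene-like query string for illustration, not what a real `text()` branch or the SimpleQueryParser would produce.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TextQuerySketch {

    /** Developer-side setting for one field (hypothetical): name, boost, fuzzy flag. */
    static final class FieldConfig {
        final String name;
        final float boost;
        final boolean fuzzy;

        FieldConfig(String name, float boost, boolean fuzzy) {
            this.name = name;
            this.boost = boost;
            this.fuzzy = fuzzy;
        }
    }

    // One user token: optional + or - prefix, then a quoted phrase or a bare word.
    private static final Pattern TOKEN =
            Pattern.compile("([+-]?)(\"[^\"]+\"|[^\\s\"]+)");

    /** Expands each user token over all configured fields. */
    static String expand(String userInput, List<FieldConfig> fields) {
        List<String> clauses = new ArrayList<>();
        Matcher m = TOKEN.matcher(userInput);
        while (m.find()) {
            String occur = m.group(1);          // "+", "-" or "" (user's choice)
            String text = m.group(2);           // word or "quoted phrase"
            boolean phrase = text.startsWith("\"");
            List<String> perField = new ArrayList<>();
            for (FieldConfig f : fields) {
                StringBuilder sb = new StringBuilder(f.name).append(':').append(text);
                // Fuzziness is the developer's choice; this sketch skips it for phrases.
                if (f.fuzzy && !phrase) {
                    sb.append('~');
                }
                if (f.boost != 1.0f) {
                    sb.append('^').append(f.boost);
                }
                perField.add(sb.toString());
            }
            clauses.add(occur + "(" + String.join(" ", perField) + ")");
        }
        return String.join(" ", clauses);
    }

    public static void main(String[] args) {
        List<FieldConfig> fields = Arrays.asList(
                new FieldConfig("title", 2.0f, false),
                new FieldConfig("description", 1.0f, true));
        System.out.println(expand("+hotel -motel \"swimming pool\"", fields));
    }
}
```

With the configuration in `main`, the input `+hotel -motel "swimming pool"` expands to `+(title:hotel^2.0 description:hotel~) -(title:motel^2.0 description:motel~) (title:"swimming pool"^2.0 description:"swimming pool")`. The user never has to say which fields are searched or whether a term is fuzzy; that stays in the developer's `FieldConfig` list.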
