Re: [hibernate-dev] [Wildcard] Search: changing the way we search

Tuesday, 4 March 2014

This email is about wildcard queries.

On 03 Mar 2014, at 17:11, Guillaume Smet <guillaume.smet(a)hibernate.org> wrote:
...

 And why is it not ideal:
 3/ wildcard and analyzers are really a pain with Lucene and you need
 to implement your own cleaning stuff to get a working wildcard query.

 IV. About wildcard queries
 --------------------------------------

 Let's say it frankly: wildcard queries are a pain in Lucene.

 Let's take an example:
 - You index "Parking" and you have a LowerCaseFilter so your index
 contains "parking";
 - You search for Parking without wildcard, it will work;
 - You search for Parki* with wildcard, yeah, it won't work.

 This is because analyzers are ignored for wildcard queries. The usual
 reason is that the ? or * characters could be altered or removed by
 the filters you use in your analyzers.

 While we all understand the Lucene point of view from a technical
 perspective, I don't think we can keep this position for Hibernate
 Search as a user friendly search framework on top of Hibernate.

 At Open Wide, we have a quite complex method which rewrites a search
 as a working autocompletion search which might work most of the time
 (with a high value of most...). It's kinda ugly, far from perfect and
 I'm wondering if we could have something more clever in Search. I once
 talked with Emmanuel about having different analyzers for Indexing,
 Querying (this is the Solr way) and Wildcards/Fuzzy search (this is
 IMHO a good idea as the way you want to normalize your wildcard query
 highly depends on the analyzer used to index your data). 

I would like to separate the notion of autosuggestion from the wildcard problem. To me
they are separate, and I would love Hibernate Search to offer an autosuggest and spell
checker API.

Back to wildcards. If we have an analyser stack that separates normaliser filters from
filters generating additional tokens (see my email [AND]), then it is a piece of cake to
apply the right filters, raise an exception if someone tries to wildcard on ngrams, and
simply ignore the synonym filter.
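
To make the idea concrete, here is a minimal sketch in plain Python. It is not the Lucene or Hibernate Search API: the `Filter` class, the three `kind` labels ("normalizer", "expander", "splitter") and all function names are hypothetical, chosen only to illustrate how an analyser stack that tags its filters could treat a wildcard pattern. Pattern matching uses the standard `fnmatch` module as a stand-in for a real wildcard query.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Callable, List

# Hypothetical synonym table, for illustration only.
SYNONYMS_TABLE = {"parking": "garage"}


@dataclass(frozen=True)
class Filter:
    """A token filter tagged with the role it plays in the analyser stack."""
    name: str
    kind: str  # "normalizer", "expander" (adds tokens) or "splitter" (ngrams)
    apply: Callable[[List[str]], List[str]]


LOWERCASE = Filter("lowercase", "normalizer",
                   lambda toks: [t.lower() for t in toks])
SYNONYMS = Filter("synonyms", "expander",
                  lambda toks: toks + [SYNONYMS_TABLE[t] for t in toks
                                       if t in SYNONYMS_TABLE])
NGRAMS = Filter("ngram", "splitter",
                lambda toks: [t[i:i + 3] for t in toks
                              for i in range(len(t) - 2)])


def index_terms(text: str, filters: List[Filter]) -> set:
    """Run the full stack at indexing time and return the indexed terms."""
    tokens = text.split()
    for f in filters:
        tokens = f.apply(tokens)
    return set(tokens)


def wildcard_pattern(pattern: str, filters: List[Filter]) -> str:
    """Prepare a wildcard pattern using only the normalising filters."""
    for f in filters:
        if f.kind == "splitter":
            # Wildcards on an ngram field make no sense: fail loudly.
            raise ValueError(f"wildcard query not supported with {f.name}")
        if f.kind == "normalizer":
            pattern = f.apply([pattern])[0]
        # "expander" filters such as synonyms are simply ignored.
    return pattern


def wildcard_search(pattern: str, terms: set) -> List[str]:
    """Case-sensitive wildcard match against the indexed terms."""
    return sorted(t for t in terms if fnmatchcase(t, pattern))
```

With the stack `[LOWERCASE, SYNONYMS]`, indexing "Parking" stores the lowercased term plus its synonym. Searching the raw pattern `Parki*` finds nothing, while searching `wildcard_pattern("Parki*", stack)` finds "parking". Adding `NGRAMS` to the stack makes `wildcard_pattern` raise instead of silently returning wrong results.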
