On Tue, Mar 4, 2014 at 11:09 AM, Emmanuel Bernard
<emmanuel(a)hibernate.org> wrote:
I would like to separate the notion of autosuggestion from the
wildcard problem. To me they are separate, and I would love Hibernate Search to offer an
autosuggest and spell checker API.
AFAICS from the changelog of each version, autosuggest is still a vast
work in progress in Lucene/Solr.
Back to wildcard. If we have an analyser stack that separates
normaliser filters from filters generating additional tokens (see my email [AND]), then it
is a piece of cake to apply the right filters, raise an exception if someone tries to
wildcard on ngrams, and simply ignore the synonym filter.
In theory, yes.
But for the tokenization, we use WhitespaceTokenizer and
WordDelimiterFilter, which generates new tokens (for example, depending
on the options you use, you can index wi-fi as wi and fi, wi-fi and
wifi).
The problem with this particular filter is also that we put it after the
ASCIIFoldingFilter, because we want the input to be as clean as
possible, but before the LowerCaseFilter, as WordDelimiterFilter can do
its magic on case changes too.
If you separate normalizer from tokenizer, I don't think it's going to
be easy to order them adequately.
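
To make the ordering concern concrete, here is a rough sketch in plain
Java. It does not use the Lucene classes: foldAscii, splitWordParts and
lowerCase are hypothetical, heavily simplified stand-ins for
ASCIIFoldingFilter, WordDelimiterFilter and LowerCaseFilter, and the
splitting rules (split on '-' and on a lower-to-upper case change, then
also emit the concatenated parts) are an assumption chosen for
illustration, not the real filter's option set.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Simplified stand-ins for illustration only, NOT the Lucene filters.
final class FilterOrderSketch {

    // Normalizer (one token in, one token out): strips diacritics.
    static String foldAscii(String token) {
        return Normalizer.normalize(token, Normalizer.Form.NFD)
                .replaceAll("\\p{M}", "");
    }

    // Normalizer (one token in, one token out).
    static String lowerCase(String token) {
        return token.toLowerCase(Locale.ROOT);
    }

    // Token generator: splits on '-' and on lower-to-upper case changes,
    // then also emits the concatenation of the parts.
    static List<String> splitWordParts(String token) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            boolean caseChange = i > 0
                    && Character.isLowerCase(token.charAt(i - 1))
                    && Character.isUpperCase(c);
            if ((c == '-' || caseChange) && current.length() > 0) {
                parts.add(current.toString());
                current.setLength(0);
            }
            if (c != '-') {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            parts.add(current.toString());
        }
        List<String> out = new ArrayList<>(parts);
        if (parts.size() > 1) {
            out.add(String.join("", parts));
        }
        return out;
    }

    // Order described above: fold, then split, then lowercase.
    static List<String> indexOrder(String token) {
        List<String> out = new ArrayList<>();
        for (String part : splitWordParts(foldAscii(token))) {
            out.add(lowerCase(part));
        }
        return out;
    }

    // All normalizers first, token generator last.
    static List<String> normalizersFirst(String token) {
        return splitWordParts(lowerCase(foldAscii(token)));
    }
}
```

With these stand-ins, indexOrder("WiFi") gives wi, fi and wifi, while
normalizersFirst("WiFi") gives only wifi, because the case change is
already gone by the time the splitting step runs. For "wi-fi" both
orders agree, so the difference only shows up on case-based splits.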