[hibernate-dev] HSEARCH - Different analyzers for Indexing and Querying

Thu Aug 22 07:40:24 EDT 2013

Hi Emmanuel,

On Wed, Aug 21, 2013 at 5:20 PM, Emmanuel Bernard
<emmanuel at hibernate.org> wrote:
> Can you explain to me why you would need a different analyzer for a
> wildcard query? My brain is still tanning on the beach.

Well, it's the Lucene way. Wildcard queries are usually not analyzed
(see QueryParser or what you've done in Hibernate Search to be
consistent with Lucene).

The reason often mentioned is that some analyzers remove * and ?,
which would break wildcard queries...

Apart from the filtering, you might also want a different tokenizer
for wildcard queries.

The fact is that you HAVE to "analyze" your search terms to get any
results. Typically, we use analyzers to remove the accents and
lowercase the search terms:
- for a standard search, we can pass the search terms as is;
- for wildcard and fuzzy queries, we are forced to filter the search
terms ourselves (lowercase + remove accents) before passing them to
Hibernate Search/Lucene.
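
To make the second case concrete, here is a minimal sketch of that
manual filtering, using only the JDK. The class and method names are
hypothetical (they are not part of Hibernate Search or Lucene), and a
real setup would have to match whatever the indexing analyzer does:

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper, not part of Hibernate Search or Lucene.
final class WildcardTermNormalizer {

    private WildcardTermNormalizer() {
    }

    // Lowercases the term and strips combining marks (accents),
    // leaving the wildcard characters * and ? untouched.
    static String normalize(String term) {
        // NFD splits an accented letter into base letter + combining mark.
        String decomposed = Normalizer.normalize(term, Normalizer.Form.NFD);
        // Drop every Unicode mark character left by the decomposition.
        String withoutAccents = decomposed.replaceAll("\\p{M}+", "");
        return withoutAccents.toLowerCase(Locale.ROOT);
    }
}
```

The result of normalize() is what we would then hand to the wildcard
query, since the query itself is not analyzed.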

That's why I thought having a specific analyzer for this might help.

> Brainstorming here we could do the following
>
> @AnalyzerDef.target
>
> enum AnalyzerTarget { ALL, INDEXING, QUERY, WILDCARD }

Instead of ALL, I would prefer DEFAULT, I think.
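
Purely as an illustration of the naming, here is how I picture it.
Nothing below exists in Hibernate Search: the enum is the one from
your brainstorming with ALL renamed, and the registry class and the
fallback rule are my own assumptions:

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical: neither this enum nor the registry exists in Hibernate Search.
enum AnalyzerTarget { DEFAULT, INDEXING, QUERY, WILDCARD }

final class AnalyzerNameRegistry {

    private final Map<AnalyzerTarget, String> names =
            new EnumMap<>(AnalyzerTarget.class);

    void register(AnalyzerTarget target, String analyzerName) {
        names.put(target, analyzerName);
    }

    // Returns the analyzer name registered for the target, or the one
    // registered for DEFAULT (possibly null) when nothing specific exists.
    String resolve(AnalyzerTarget target) {
        String specific = names.get(target);
        return specific != null ? specific : names.get(AnalyzerTarget.DEFAULT);
    }
}
```

With DEFAULT, a definition without an explicit target reads as "used
unless something more specific is defined", which is closer to what I
have in mind than ALL.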

> But that would also change the API for the dynamic analyzer I suppose.

Yep.

> It also does not cover the @Analyzer.impl usage.

Yep.

I haven't thought through all the consequences of this idea yet.
It's something that often gets in our way, so I thought it was worth
mentioning to see if people are interested in it.

If so, I can probably prototype something to see what issues
implementing this change would raise.

-- 
Guillaume