[hibernate-dev] HSEARCH: Removing dynamic analyzer mapping?

Emmanuel Bernard emmanuel at hibernate.org
Tue Jun 30 09:12:39 EDT 2015


If we feel short-handed, we could do the following:

1. Disable the feature and raise an exception when someone uses it, with a pointer to the JIRA issue to restore it.
    That way we will know how many people we pissed off, and we can feed the use cases to our Lucene friends.
2. Work on a workaround if the JIRA issue becomes popular or compelling. A mutable analyzer or the pre-analyzed approach has my preference.
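
To make the "mutable analyzer" idea concrete, here is a rough sketch in plain Java. It is not Hibernate Search or Lucene code: `MutableAnalyzerSketch`, `SimpleAnalyzer`, `overrideForCurrentDocument` and `clearOverrides` are hypothetical names, and the analyzer is modelled as a simple text-to-tokens function. The point is only that a single long-lived instance can be handed to the writer, while the per-document choice is kept in thread-local state so that concurrent indexing threads do not see each other's overrides.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical stand-in for a "mutable analyzer": one instance lives as long
 * as the writer, but the analyzer used for a field can be swapped per document.
 * All names here are illustrative assumptions, not Lucene or Hibernate Search APIs.
 */
final class MutableAnalyzerSketch {

    /** Stand-in for an analyzer: turns text into a list of tokens. */
    interface SimpleAnalyzer extends Function<String, List<String>> {}

    /** Splits on whitespace and keeps the original case. */
    static final SimpleAnalyzer WHITESPACE =
            text -> Arrays.asList(text.trim().split("\\s+"));

    /** Lowercases the text, then splits on whitespace. */
    static final SimpleAnalyzer LOWERCASE =
            text -> Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+"));

    private final SimpleAnalyzer defaultAnalyzer;

    // Overrides are kept per thread: this is the "caution re concurrent usage".
    private final ThreadLocal<Map<String, SimpleAnalyzer>> overrides =
            ThreadLocal.withInitial(HashMap::new);

    MutableAnalyzerSketch(SimpleAnalyzer defaultAnalyzer) {
        this.defaultAnalyzer = defaultAnalyzer;
    }

    /** Called before indexing a document that needs a different analyzer for a field. */
    void overrideForCurrentDocument(String field, SimpleAnalyzer analyzer) {
        overrides.get().put(field, analyzer);
    }

    /** Called after the document has been indexed, so the override does not leak. */
    void clearOverrides() {
        overrides.get().clear();
    }

    /** Uses the current thread's override for the field, else the default analyzer. */
    List<String> analyze(String field, String text) {
        return overrides.get().getOrDefault(field, defaultAnalyzer).apply(text);
    }
}
```

The caller would set the override, index the document, and clear the override in a `finally` block. Whether this plays well with the way Lucene reuses token streams per thread is an open question that this sketch does not answer.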

> On 30 Jun 2015, at 13:57, Sanne Grinovero <sanne at hibernate.org> wrote:
> 
> Among the many changes of Apache Lucene 5, it is no longer possible to
> override the Analyzer on a per-document basis.
> 
> You have to pick a single Analyzer when opening the IndexWriter.
> Of course the Analyzer can still return a different tokenization chain
> for each field, but the field->tokenizer mapping has to be consistent
> for the lifecycle of the IndexWriter.
> 
> This means we might need to drop our "Dynamic Analyzer" feature:
> http://docs.jboss.org/hibernate/search/5.4/reference/en-US/html_single/#_dynamic_analyzer_selection
> 
> I did ask to restore the functionality:
> https://issues.apache.org/jira/browse/LUCENE-6212
> 
> So, the alternatives I'm seeing:
> # Dropping the Dynamic Analyzer feature
> # Cheat and pass in a mutable Analyzer - needs some caution re concurrent usage
> # Cheat and pass in a pre-analyzed Document
> # Fork & patch the IndexWriter
> 
> Patching the functionality back into Lucene is trivial, but the Lucene
> team needs to agree on the use case and then the release time will be
> long.
> 
> We should discuss both a short-term solution and the better long-term solution.
> 
> My favourite long-term solution would be to do pre-analysis: in our
> master/slave clustering approach, that would have several other
> benefits:
> - move the analyzer work to the slaves
> - reduce the network payloads
> - remove the need to be able to serialize analyzers
> But I'd prefer to do this in a second "polishing phase" rather than
> consider such a backend rewrite as a blocker for Lucene 5.
> 
> WDYT?
> 
> Thanks,
> Sanne



