[SEARCH] Query-only analyzers with Elasticsearch - new annotation? - hibernate-dev

Yoann Rodiere

Wednesday, 4 January 2017

11 a.m.

Hello team,

I'm currently working on HSEARCH-2534, "Query-only analyzer definitions are never added to the index settings with Elasticsearch".

This issue is about using analyzers only when querying with Elasticsearch. It is already possible with Lucene, but not with Elasticsearch, because we assume that any analyzer definition that is not referenced by an @Analyzer annotation is a Lucene analyzer [1].

To be precise, the exact place where query-only analyzers are used is EntityContext.overridesForField [2], and the overrides are leveraged even with Elasticsearch, for instance in ConnectedMultiFieldsTermQueryBuilder [3].

I can see two solutions to the issue:

1. Make all analyzer definitions available for all indexing services.
2. Allow users to define, for each entity, which analyzer definitions will be necessary when querying, even though the definitions are not used when indexing.

Solution 1 seems quite hard to implement correctly. First, we'd have to have a different namespace for each indexing service, but I've already implemented that much. Second, some analyzer definitions are only valid for one indexing service, and not for the other. For instance, analyzer definitions using ElasticsearchTokenFilterFactory are specific to Elasticsearch, and analyzer definitions using the WhitespaceTokenizerFactory with the "rule" parameter are only valid with embedded Lucene. And so on. To sum up, I'm not sure we can do something smart.

Solution 2 is easier to implement, but requires adding a bit of API: a way for users to declare that a given analyzer definition is to be available when querying a given entity. I would add type-level @QueryAnalyzer(definition = "foo") and @QueryAnalyzers annotations.

I know nobody wants to add new annotations in a minor, but right now that seems to be the only workable solution.

What do you think?

[1] https://github.com/hibernate/hibernate-search/blob/1847bd222128395056cdf6e7cfb601ceed5e40c3/engine/src/main/java/org/hibernate/search/engine/impl/ConfigContext.java#L277
[2] https://github.com/hibernate/hibernate-search/blob/1847bd222128395056cdf6e7cfb601ceed5e40c3/engine/src/main/java/org/hibernate/search/query/dsl/EntityContext.java#L14
[3] https://github.com/hibernate/hibernate-search/blob/1847bd222128395056cdf6e7cfb601ceed5e40c3/engine/src/main/java/org/hibernate/search/query/dsl/impl/ConnectedMultiFieldsTermQueryBuilder.java#L222

Yoann Rodière <yoann(a)hibernate.org>
Hibernate NoORM Team
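To make solution 2 more concrete, here is a minimal, self-contained sketch of what the proposed type-level annotations might look like. Only the names @QueryAnalyzer, @QueryAnalyzers and the "definition" attribute come from the message above; the retention, the target, the use of @Repeatable, the Book entity and the queryAnalyzerDefinitions helper are assumptions made for illustration, not actual Hibernate Search API.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Repeatable;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

public class QueryAnalyzerSketch {

    // Hypothetical annotation: declares that the named analyzer definition
    // must be available when querying the annotated entity type.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @Repeatable(QueryAnalyzers.class) // assumption: the proposal does not mention @Repeatable
    public @interface QueryAnalyzer {
        String definition();
    }

    // Hypothetical container annotation holding several @QueryAnalyzer.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    public @interface QueryAnalyzers {
        QueryAnalyzer[] value();
    }

    // Hypothetical entity referencing two query-only analyzer definitions.
    @QueryAnalyzer(definition = "foo")
    @QueryAnalyzer(definition = "bar")
    public static class Book {
    }

    // Collects the definition names declared on an entity type,
    // whether they are declared directly or through the container.
    public static List<String> queryAnalyzerDefinitions(Class<?> entityType) {
        List<String> names = new ArrayList<>();
        for (QueryAnalyzer qa : entityType.getAnnotationsByType(QueryAnalyzer.class)) {
            names.add(qa.definition());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(queryAnalyzerDefinitions(Book.class));
    }
}
```

With such metadata, the integration would know which indexing service an otherwise unreferenced definition belongs to: the one backing the annotated entity's index.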

Sanne Grinovero

Thursday, 5 January 2017

7:04 a.m.

Hello,

I'm wondering how you'd all feel about the third solution:

3. don't do it.

This depends of course on how far it is blocking in practice. Maybe I'm missing something, but couldn't a user simply use an additional @AnalyzerDef, so that the analyzer definition is associated with a name, and use that?

I guess that I'm missing why you'd want to force people to express that a specific Analyzer is meant to be used only at query time differently than one which is used at indexing time. If there is a need to clearly make such a distinction then this should be made very clear to our users too, so I'd prefer if we could avoid introducing new concepts for people to learn, unless there's a strong need of course.

Is this issue related to a specific user request?

Thanks,
Sanne

Yoann Rodiere

8:06 a.m.

> I'm wondering how you'd all feel about the third solution:
> 3. don't do it.
> This depends of course on how far it is blocking in practice.

"Don't do it until 6.0" would be acceptable, I guess, since it's still just a technical preview. Though we would introduce a limitation that would only be our fault (since Elasticsearch supports query-time analyzers) and that would not exist with the Lucene integration.

"Don't do it ever" seems really bad. As we've already discussed at length (multiple times), not being able to define analyzers from Hibernate Search would be a real pain for users, especially in Elasticsearch 5. That's true for indexing analyzers, and that's also true for query-only analyzers. I wouldn't say that query-only analyzers are widespread, but they're at least useful, and I'm sure there are problems that can *only* be solved by using a different analyzer when querying than when indexing...

> I guess that I'm missing why you'd want to force people to express that a specific Analyzer is meant to be used only at query time differently than one which is used at indexing time.
> If there is a need to clearly make such a distinction then this should be made very clear to our users too, so I'd prefer if we could avoid introducing new concepts for people to learn, unless there's a strong need of course.

Analyzer definitions are interpreted as either Lucene analyzers (to be instantiated) or Elasticsearch analyzers (to be pushed to the ES index settings) based on where they are referenced (using @Analyzer).

When I say an analyzer definition is query-only, it means there is an @AnalyzerDef but there isn't any @Analyzer referencing it. So Hibernate Search wouldn't know how to interpret it (ES or Lucene). Currently, the default for those definitions is to interpret them as Lucene analyzers, which leads to HSEARCH-2534: we can't have Elasticsearch query-only analyzers.

Maybe with this piece of information, my original message makes more sense? I.e.:

1. Solution 1: interpret those definitions as both Lucene and Elasticsearch analyzers (there are problems with that, see my first message).
2. Solution 2: make users "reference" those definitions using a new @QueryAnalyzer annotation.
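The interpretation rule described above can be sketched as follows. This is a simplified model and not the actual ConfigContext code: the IndexingService enum, the interpret method and the map-based representation of @Analyzer references are all hypothetical names introduced for illustration.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class AnalyzerInterpretationSketch {

    public enum IndexingService { LUCENE, ELASTICSEARCH }

    // For each definition name, decides which indexing service(s) it is interpreted for.
    // "references" maps a definition name to the services of the indexes whose
    // fields reference it through @Analyzer.
    public static Map<String, Set<IndexingService>> interpret(
            Set<String> definitionNames,
            Map<String, Set<IndexingService>> references) {
        Map<String, Set<IndexingService>> result = new TreeMap<>();
        for (String name : definitionNames) {
            Set<IndexingService> services = references.get(name);
            if (services == null || services.isEmpty()) {
                // Unreferenced (query-only) definition: defaults to Lucene,
                // so it never reaches the Elasticsearch index settings.
                result.put(name, EnumSet.of(IndexingService.LUCENE));
            } else {
                result.put(name, EnumSet.copyOf(services));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> definitions = new HashSet<>(Arrays.asList("indexed", "myQueryOnlyAnalyzer"));
        Map<String, Set<IndexingService>> references = new HashMap<>();
        references.put("indexed", EnumSet.of(IndexingService.ELASTICSEARCH));
        System.out.println(interpret(definitions, references));
    }
}
```

In this model, "myQueryOnlyAnalyzer" ends up as a Lucene-only definition even when every entity is indexed in Elasticsearch, which is the behavior reported in HSEARCH-2534.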

> Maybe I'm missing something, but couldn't a user simply use an additional @AnalyzerDef, so that the analyzer definition is associated with a name, and use that?

As mentioned above, an @AnalyzerDef that is not referenced is considered a Lucene analyzer, so it's not pushed to Elasticsearch and it can't be used when querying Elasticsearch.

The only workaround I see would be to add a dummy, always-empty field like this:

    // Dummy field whose only purpose is to reference the analyzer definition
    @Transient
    @Field(name = "__dummy", analyzer = @Analyzer(definition = "myQueryOnlyAnalyzer"))
    public String getMyQueryOnlyAnalyzerDummyField() {
        return null;
    }

Which means there will be a useless field in the schema just to make Hibernate Search happy.

> Is this issue related to a specific user request?

No, it's just a feature that is available for Lucene but not for Elasticsearch.

Yoann Rodière <yoann(a)hibernate.org>
Hibernate NoORM Team

Sanne Grinovero

9:06 a.m.

On 5 January 2017 at 13:06, Yoann Rodiere <yoann(a)hibernate.org> wrote:

>> I'm wondering how you'd all feel about the third solution:
>> 3. don't do it.
>> This depends of course on how far it is blocking in practice.
>
> "Don't do it until 6.0" would be acceptable, I guess, since it's still just a technical preview. Though we would introduce a limitation that would only be our fault (since Elasticsearch supports query-time analyzers) and that would not exist with the Lucene integration.
>
> "Don't do it ever" seems really bad. As we've already discussed at length (multiple times), not being able to define analyzers from Hibernate Search would be a real pain for users, especially in Elasticsearch 5. That's true for indexing analyzers, and that's also true for query-only analyzers. I wouldn't say that query-only analyzers are widespread, but they're at least useful, and I'm sure there are problems that can only be solved by using a different analyzer when querying than when indexing...

I don't disagree, I'm merely aiming to have, in the future, Analyzer(s) defined in a non-Lucene-specific way, possibly allowing controlled exceptions. When changing the definitions API we'll be able to reconsider whether we want Analyzer definitions to be scoped per index like Elasticsearch does.

But since today the Analyzer map is "global" (as in one map per SearchIntegrator), I don't see why we can't treat them consistently on both technologies and consider them global on ES as well, i.e. we'd copy all definitions to each ES index definition.

Sure, that wouldn't allow mapping to Hibernate Search an existing ES cluster which uses conflicting names on different indexes, but reverse engineering of existing ES clusters isn't our focus at this time; people badly needing it can change their names to saner choices (I'd argue that reusing a name for different things wouldn't be a sane configuration, and it probably won't be common either).
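As a rough illustration of the "copy all definitions to each ES index definition" idea, here is a minimal sketch. Everything in it is hypothetical: definitions are modelled as plain name-to-description strings and index settings as nested maps, which is far simpler than what the Elasticsearch integration actually manipulates.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

public class GlobalDefinitionsSketch {

    // Gives every index its own copy of the whole global definition map,
    // regardless of which definitions its fields actually reference.
    public static Map<String, Map<String, String>> settingsPerIndex(
            Collection<String> indexNames,
            Map<String, String> globalDefinitions) {
        Map<String, Map<String, String>> settings = new LinkedHashMap<>();
        for (String indexName : indexNames) {
            settings.put(indexName, new LinkedHashMap<>(globalDefinitions));
        }
        return settings;
    }

    public static void main(String[] args) {
        Map<String, String> global = new LinkedHashMap<>();
        global.put("indexTimeAnalyzer", "standard tokenizer + lowercase filter");
        global.put("myQueryOnlyAnalyzer", "whitespace tokenizer");
        System.out.println(settingsPerIndex(Arrays.asList("books", "authors"), global));
    }
}
```

Under this scheme a query-only definition is present in every index simply because it is in the global map; the price is that one name can only mean one thing across all indexes, which is the naming limitation mentioned above.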

>> I guess that I'm missing why you'd want to force people to express that a specific Analyzer is meant to be used only at query time differently than one which is used at indexing time.
>> If there is a need to clearly make such a distinction then this should be made very clear to our users too, so I'd prefer if we could avoid introducing new concepts for people to learn, unless there's a strong need of course.
>
> Analyzer definitions are interpreted as either Lucene analyzers (to be instantiated) or Elasticsearch analyzers (to be pushed to the ES index settings) based on where they are referenced (using @Analyzer).
> When I say an analyzer definition is query-only, it means there is an @AnalyzerDef but there isn't any @Analyzer referencing it. So Hibernate Search wouldn't know how to interpret it (ES or Lucene).
> Currently, the default for those definitions is to interpret them as Lucene analyzers, which leads to HSEARCH-2534: we can't have Elasticsearch query-only analyzers.

Ok, I understand the status quo, but is there something which prevents us from refining this decision and generating ES definitions out of all known Analyzer definitions, rather than just the ones being referred to?

Let's keep in mind that we're only able to "translate" a very limited set of well-known Analyzer definitions so, while it's cool to help migrations where we can, our primary focus is to make sure that people can use any custom Analyzer configuration which they have defined "manually" on ES.

In short, I think what matters most now is not how to define such analyzers, as there are viable (better?) alternatives, but we need to make sure one can run a query with the right query-time overrides, and especially be able to refer to an Analyzer which has been manually defined on ES but is possibly not known to us. (As discussed previously, with the exception of More-Like-This queries, which will have to wait.)

Thanks,
Sanne

Yoann Rodiere

9:42 a.m.

> [...]

Actually, for HSEARCH-2534, it would be enough if analyzer definitions were scoped by indexing service (Lucene/ES). But sure, that would be a good solution, if we wait for 6.0.

> Sure, that wouldn't allow mapping to Hibernate Search an existing ES cluster which uses conflicting names on different indexes, but reverse engineering of existing ES clusters isn't our focus at this time; people badly needing it can change their names to saner choices (I'd argue that reusing a name for different things wouldn't be a sane configuration, and it probably won't be common either).

I agree with you on that. To be honest I didn't even think of such an issue, since currently the analyzer definitions are scoped globally.

> is there something which prevents us from refining this decision and generating ES definitions out of all known Analyzer definitions, rather than just the ones being referred to?

Well, yes, that was in my first message; see below:

> First, we'd have to have a different namespace for each indexing service, but I've already implemented that much. Second, some analyzer definitions are only valid for one indexing service, and not for the other. For instance, analyzer definitions using ElasticsearchTokenFilterFactory are specific to Elasticsearch, and analyzer definitions using the WhitespaceTokenizerFactory with the "rule" parameter are only valid with embedded Lucene. And so on. To sum up, I'm not sure we can do something smart.

What prevents us from generating ES definitions out of all known analyzer definitions is that there may be definitions that *cannot* be translated to ES, simply because they are supposed to be used only with Lucene. I guess we could say "let's try to generate ES definitions, and if it fails just ignore it and log a warning", but it seems a bit unsafe...
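The "try to translate, ignore failures and log a warning" approach mentioned above could look roughly like the sketch below. It is only an illustration under strong assumptions: the Definition class, its luceneOnly flag, the translate method and its placeholder output are hypothetical and do not reflect the real translation code.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.logging.Logger;

public class LenientTranslationSketch {

    private static final Logger LOG = Logger.getLogger(LenientTranslationSketch.class.getName());

    // Hypothetical stand-in for an analyzer definition.
    public static final class Definition {
        public final String name;
        public final boolean luceneOnly;

        public Definition(String name, boolean luceneOnly) {
            this.name = name;
            this.luceneOnly = luceneOnly;
        }
    }

    // Hypothetical translation: fails for definitions that only make sense with Lucene.
    static String translate(Definition definition) {
        if (definition.luceneOnly) {
            throw new IllegalArgumentException(
                    "Definition '" + definition.name + "' cannot be translated to Elasticsearch");
        }
        return "es-settings-for-" + definition.name; // placeholder for real index settings
    }

    // Translates what it can; untranslatable definitions are skipped with a warning.
    public static Map<String, String> translateAll(List<Definition> definitions) {
        Map<String, String> translated = new LinkedHashMap<>();
        for (Definition definition : definitions) {
            try {
                translated.put(definition.name, translate(definition));
            } catch (IllegalArgumentException e) {
                LOG.warning("Ignoring analyzer definition: " + e.getMessage());
            }
        }
        return translated;
    }

    public static void main(String[] args) {
        List<Definition> definitions = Arrays.asList(
                new Definition("esFriendly", false),
                new Definition("luceneSpecific", true));
        System.out.println(translateAll(definitions).keySet());
    }
}
```

One way to read the "unsafe" concern: with this approach, a definition that was meant for Elasticsearch but fails to translate is dropped just as quietly as a legitimately Lucene-only one, so the mistake would presumably only surface later, when a query refers to the missing analyzer.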

> Let's keep in mind that we're only able to "translate" a very limited set of well-known Analyzer definitions [...]

For translations that's true, but any ES analyzer definition can be expressed with Hibernate Search by using Elasticsearch*Factory. In fact, it's the recommended approach. See https://docs.jboss.org/hibernate/search/5.6/reference/en-US/html_single/#... .

> In short, I think what matters most now is not how to define such analyzers, as there are viable (better?) alternatives, but we need to make sure one can run a query with the right query-time overrides, and especially be able to refer to an Analyzer which has been manually defined on ES but is possibly not known to us. (As discussed previously, with the exception of More-Like-This queries, which will have to wait.)

We have already discussed this many times, but once again: users will not be able to define their analyzers manually on ES starting from ES 5.0, for various reasons. So that's clearly not a long-term solution. It's "viable" for now, but since it's not future-proof it's certainly not better.

As for the short term, if I understand correctly, what you're proposing is that users don't add an @AnalyzerDef for query-only analyzers, and that we allow using unknown analyzers in queries? I guess we could do that, but that basically amounts to solution 3, "don't do it". Which is fine as long as we plan to fix it later. Also note we'd still have to explain to users that query-only analyzer definitions are not supported with Elasticsearch.

Yoann Rodière <yoann(a)hibernate.org>
Hibernate NoORM Team
