Reasons to remove analyzer discriminators:
- We can't implement them in the Elasticsearch backend
- They are bad practice anyway: storing the result of applying different analyzers in the same field is a great way to get unpredictable search results.
There is, however, an alternative. Let's take the example of a language-based analyzer discriminator. Instead of using the same field for every language and just changing the analyzer based on the language of the document, we could change the field based on the language: for English we put the data in "myField_en", for French we put it in "myField_fr", etc. Each field then has a single, statically assigned analyzer.
As long as the list of supported languages is known in advance (and honestly, why wouldn't it be?), we can easily write a type bridge with this behavior. We could even write a type bridge with a custom annotation and custom marker annotations to re-use the same bridge on multiple entities. Example of use:
@SwitchingBridge(switcher = @SwitcherBeanReference(type = MySwitcher.class))
class MyType {
    @SwitchedOn
    String language;
    @SwitchedField(name = "content1")
    String content1;
    @SwitchedField(name = "content2")
    String content2;
}

public class MySwitcher extends Switcher {
    public void bind(...) {
    }
    public void write(String switchValue, …) {
        if ( switchValue.equals( … ) ) {
        } else if ( … ) {
        }
    }
}
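To make the routing idea concrete without committing to any API, here is a minimal, library-free sketch of what the switcher's write step could boil down to. Everything in it is an assumption made for illustration: the class name `LanguageFieldSwitcher`, the `fieldName`/`write` helpers, the hard-coded language set, and the use of a plain `Map` as a stand-in for the indexed document. It does not use the actual bridge API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical, library-free illustration of "one field per language".
public class LanguageFieldSwitcher {

    // Assumption: the list of supported languages is known in advance.
    private static final Set<String> SUPPORTED_LANGUAGES = Set.of("en", "fr");

    // Resolves the target field for a base name and a (non-null) language,
    // e.g. ("content1", "en") -> "content1_en".
    static String fieldName(String baseName, String language) {
        if (!SUPPORTED_LANGUAGES.contains(language)) {
            throw new IllegalArgumentException("Unsupported language: " + language);
        }
        return baseName + "_" + language;
    }

    // Writes every value to the field matching the document's language.
    // The returned map stands in for the indexed document.
    static Map<String, String> write(String language, Map<String, String> values) {
        Map<String, String> document = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : values.entrySet()) {
            document.put(fieldName(entry.getKey(), language), entry.getValue());
        }
        return document;
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("content1", "Bonjour");
        values.put("content2", "le monde");
        // prints {content1_fr=Bonjour, content2_fr=le monde}
        System.out.println(write("fr", values));
    }
}
```

In a real bridge, each "_en"/"_fr" field would presumably be declared once at bind time with its own analyzer, and queries would have to target the field matching the language being searched.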
Let's not expose this as an API right now, but at least let's test this solution. If it works, let's document it in the migration guide (HSEARCH-3283). If it doesn't... We might have to consider adding some sort of support for analyzer discriminators in the Lucene backend, perhaps as an extension.