Currently, when we reference an analyzer/normalizer (while defining a field) or a char filter/tokenizer/token filter (while defining an analyzer/normalizer), we don't check that the name actually matches a definition. That's because we expect some definitions to be available on the server side regardless of the user configuration, and we don't know exactly which ones will be (since the user can use server-side configuration to define them).

I think we should:

1. Throw exceptions when trying to reference unknown analyzers, normalizers, char filters, tokenizers or token filters.
2. Set up a whitelist of analyzers, normalizers, char filters, tokenizers and token filters that are expected to already be defined on the server side (see HSEARCH-2584).
3. Allow users to add even more names to this whitelist (context.analyzer( "myName" ).builtin() or something similar in the ElasticsearchAnalysisConfigurer).

Thoughts?
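
To make the proposal a bit more concrete, here is a rough sketch of what the check could look like. Everything in it is hypothetical: the AnalysisReferenceValidator class, its Kind enum and the define/builtin/checkReference methods are made up for illustration and are not existing Hibernate Search APIs. The names passed to builtin() stand in for the whitelist from point 2 and the user additions from point 3.

```java
import java.util.EnumMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch only: none of these types exist in Hibernate Search.
public class AnalysisReferenceValidator {

    public enum Kind { ANALYZER, NORMALIZER, CHAR_FILTER, TOKENIZER, TOKEN_FILTER }

    // Names that a reference is allowed to resolve to, per kind of definition.
    private final Map<Kind, Set<String>> knownNames = new EnumMap<>(Kind.class);

    public AnalysisReferenceValidator() {
        for (Kind kind : Kind.values()) {
            knownNames.put(kind, new HashSet<>());
        }
    }

    // Registers a name defined explicitly through the analysis configurer.
    public AnalysisReferenceValidator define(Kind kind, String name) {
        knownNames.get(kind).add(name);
        return this;
    }

    // Registers a name expected to exist on the server side
    // (default whitelist entry, or a user addition as in point 3).
    public AnalysisReferenceValidator builtin(Kind kind, String name) {
        knownNames.get(kind).add(name);
        return this;
    }

    // Point 1: fail fast when a reference matches neither a definition
    // nor a whitelisted server-side name.
    public void checkReference(Kind kind, String name) {
        if (!knownNames.get(kind).contains(name)) {
            throw new IllegalArgumentException(
                    "Unknown " + kind + " '" + name + "': define it or mark it as builtin");
        }
    }
}
```

With something like this, a field referencing "myName" would fail at bootstrap unless "myName" was either defined in the configurer or declared as builtin for the matching kind. Names are tracked per kind, so whitelisting an analyzer named "standard" would not make a tokenizer reference to "standard" pass.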