Hi there,
I would like to extend ScopedAnalyzer so that I can specify an analyzer
for some extra fields that my custom bridge is writing into the index.
I don't (and can't really) create dummy accessors in my class for
these as their names are based on database data.
For example, imagine we wanted to search for staff by their department,
so our field name might be:
people.sales
people.customerservice
Assume these departments come from the DB rather than being hard-coded,
and that the resulting fields require a custom analyzer.
To support specifying a custom analyzer for these fields, there would
need to be a way of specifying a wildcard in the ScopedAnalyzer that
would match "people.*".
This is a feature I need, and something which may be needed in the
future. I'm happy to develop a patch for it.
The way I would propose to approach this is:
1) Make ScopedAnalyzer match wildcards. This could be done by holding
a second map of analyzers, sorted by key with the longest first. If an
analyzer cannot be found directly for a given field, the wildcard
entries are checked. If a match is found, it is stored in the direct
map so later lookups are immediate. If there is no match, the default
analyzer is stored instead, saving further wildcard tests. The
performance penalty should be minimal, since each field name is tested
against the wildcards at most once.
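To make step 1 concrete, here is a rough sketch of the lookup. It is
not the existing ScopedAnalyzer code: the class name
WildcardAnalyzerRegistry and both method names are made up for
illustration, and the type parameter A stands in for Lucene's Analyzer
so the snippet compiles on its own. It assumes a pattern is a prefix
followed by a single trailing "*", and that all analyzers are
registered before the first lookup.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch only; A stands in for org.apache.lucene.analysis.Analyzer.
class WildcardAnalyzerRegistry<A> {

    // Longest prefix first, so the most specific wildcard wins; ties broken alphabetically.
    private static final Comparator<String> LONGEST_FIRST = (a, b) ->
            a.length() != b.length()
                    ? Integer.compare(b.length(), a.length())
                    : a.compareTo(b);

    private final A defaultAnalyzer;

    // Exact field name -> analyzer. Also caches resolved wildcard hits and misses.
    private final Map<String, A> direct = new ConcurrentHashMap<>();

    // Wildcard prefix (pattern without the trailing '*') -> analyzer.
    private final Map<String, A> wildcards = new TreeMap<>(LONGEST_FIRST);

    WildcardAnalyzerRegistry(A defaultAnalyzer) {
        this.defaultAnalyzer = defaultAnalyzer;
    }

    // Assumption: registration finishes before lookups start, so cached
    // entries never go stale and the TreeMap is not mutated concurrently.
    void addScopedAnalyzer(String fieldOrPattern, A analyzer) {
        if (fieldOrPattern.endsWith("*")) {
            String prefix = fieldOrPattern.substring(0, fieldOrPattern.length() - 1);
            wildcards.put(prefix, analyzer);
        } else {
            direct.put(fieldOrPattern, analyzer);
        }
    }

    A analyzerFor(String field) {
        A cached = direct.get(field);
        if (cached != null) {
            return cached;
        }
        // Fall back to the wildcard entries, most specific prefix first.
        A resolved = defaultAnalyzer;
        for (Map.Entry<String, A> entry : wildcards.entrySet()) {
            if (field.startsWith(entry.getKey())) {
                resolved = entry.getValue();
                break;
            }
        }
        // Store hit or miss so this field is never tested against wildcards again.
        direct.put(field, resolved);
        return resolved;
    }
}
```

With "people.*" registered, both people.sales and
people.customerservice would resolve to the same analyzer, and a field
that matches nothing would be pinned to the default after its first
lookup.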
2) Configure ScopedAnalyzer with the wildcards.
I can see two approaches to doing this:
a) Create a new class level annotation @AnalyzerBindings that takes
something like
@AnalyzerBindings({
@AnalyzerBinding(pattern="person.department.*", impl=MyAnalyzer.class)
})
This would be read by DocumentBuilder when the class is set up.
b) Create a callback or other mechanism that allows custom bridges to
specify which patterns they would like to match. I'd suggest a method
like
Map<String, Analyzer> getCustomAnalyzers()
could be added to the Bridge interface (or another interface to avoid
breaking changes) which could be called when the bridges are
configured in DocumentBuilder.
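For discussion, the two configuration options could look roughly like
the sketch below. Nothing here exists in the code base today: the
interface name AnalyzerProvidingBridge, the BindingReader helper and
the Staff class are invented for illustration, and the Analyzer class
is only a placeholder for Lucene's Analyzer so the snippet compiles by
itself. The reader method shows one way DocumentBuilder might collect
the patterns from option a).

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.LinkedHashMap;
import java.util.Map;

// Placeholder for org.apache.lucene.analysis.Analyzer (assumption, for a standalone sketch).
abstract class Analyzer {}

class MyAnalyzer extends Analyzer {}

// Option a): proposed annotations, one binding per wildcard pattern.
@Retention(RetentionPolicy.RUNTIME)
@interface AnalyzerBinding {
    String pattern();
    Class<? extends Analyzer> impl();
}

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface AnalyzerBindings {
    AnalyzerBinding[] value();
}

// Option b): hypothetical opt-in interface, kept separate from the
// bridge interface so existing bridges are not broken.
interface AnalyzerProvidingBridge {
    Map<String, Analyzer> getCustomAnalyzers();
}

// Example entity using option a).
@AnalyzerBindings({
    @AnalyzerBinding(pattern = "people.*", impl = MyAnalyzer.class)
})
class Staff {}

// Hypothetical helper showing how the bindings could be read reflectively.
class BindingReader {
    static Map<String, Class<? extends Analyzer>> readBindings(Class<?> entity) {
        Map<String, Class<? extends Analyzer>> result = new LinkedHashMap<>();
        AnalyzerBindings bindings = entity.getAnnotation(AnalyzerBindings.class);
        if (bindings != null) {
            for (AnalyzerBinding binding : bindings.value()) {
                result.put(binding.pattern(), binding.impl());
            }
        }
        return result;
    }
}
```

Option a) keeps the configuration declarative but fixes the patterns
at compile time, while option b) lets the bridge compute its patterns
at runtime.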
Any feedback on either approach is much appreciated.
Cheers,
Nick