Hi,
I think that seems rather harsh.
So, the alternatives I'm seeing:
# Dropping the Dynamic Analyzer feature
# Cheat and pass in a mutable Analyzer - needs some caution re concurrent usage
# Cheat and pass in a pre-analyzed Document
# Fork & patch the IndexWriter
What about the alternative of closing the IndexWriter and re-opening it? Obviously this
could be optimised by storing the field-to-analyzer map together with the open IndexWriter
and only re-opening if the mapping changes. As long as the mapping stays the same, the same
IndexWriter can be used.
This way we could keep the feature, with a potential performance hit for the people who are
using it. Still better than removing it, right? That said, what are the exact performance
impacts? Did you run a test?
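To make the re-open idea above a bit more concrete, here is a minimal sketch in plain Java.
Everything in it is made up for illustration: WriterHandle stands in for the real
IndexWriter, WriterCache and the opener function are hypothetical names, and analyzers are
identified by a simple name string. It is not the Lucene or Hibernate Search API, just the
caching logic under those assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical stand-in for the real writer; NOT Lucene's IndexWriter API. */
interface WriterHandle {
    void add(String document);
    void close();
}

/**
 * Keeps one writer open together with the field-to-analyzer mapping it was
 * opened with, and re-opens it only when a different mapping is requested.
 */
final class WriterCache {
    // Assumption: opening a writer only needs the mapping (field name -> analyzer name).
    private final Function<Map<String, String>, WriterHandle> opener;
    private Map<String, String> currentMapping;
    private WriterHandle currentWriter;
    private int openCount; // how often a writer had to be (re-)opened

    WriterCache(Function<Map<String, String>, WriterHandle> opener) {
        this.opener = opener;
    }

    /** Returns a writer configured for the given mapping, re-opening only if needed. */
    synchronized WriterHandle writerFor(Map<String, String> mapping) {
        if (currentWriter == null || !mapping.equals(currentMapping)) {
            if (currentWriter != null) {
                currentWriter.close(); // this is where the performance hit would be
            }
            currentMapping = new HashMap<>(mapping); // copy, so later caller changes don't leak in
            currentWriter = opener.apply(currentMapping);
            openCount++;
        }
        return currentWriter;
    }

    synchronized int openCount() {
        return openCount;
    }
}
```

Calling writerFor() repeatedly with an equal mapping hands back the same writer, so people
who never change analyzers pay nothing extra; only a changed mapping triggers close and
re-open. The synchronized methods are the simplest way to keep the sketch safe for
concurrent callers, not a tuned design.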
Funnily enough, what the Lucene guys are trying to prevent with the API change can still be
done, namely by just re-opening the IndexWriter. So they are effectively forcing people who
want to use this analyzer-per-document feature down an even more slippery slope. I would
not be surprised if this change gets reverted.
My favourite long-term solution would be to do pre-analysis:
What would that look like, and did we not once discuss exactly the opposite (i.e. letting
even the Document be built on the master)?
master/slave clustering approach, that would have several other benefits:
- move the analyzer work to the slaves
Why is that a benefit?
- reduce the network payloads
Really? Is it not actually increasing the payloads?
- remove the need to be able to serialize analyzers
We don't serialize analyzers, AFAIK.
--Hardy