2011/8/2 Emmanuel Bernard <emmanuel(a)hibernate.org>:
HSEARCH-681 HSEARCH-757
I've done a first analysis for the state to transfer. It's not too bad.
This doc is cryptic but people familiar with Lucene should get what I mean:
https://gist.github.com/1120651
Thanks!
I've added some minor comments on the format on github.
It describes the state to pass around for each operation and details
what Document and Fieldable contain state-wise and what would be transferred.
The two big annoyances are:
- Reader
- TokenStream
Both of them can be the source of a Field.
While we can envision some strategies for Reader (worst case being we read the data and
ship it over), I have not yet analyzed the implications of TokenStream.
I would be tempted to change our API to not allow writing them :)
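
The worst-case Reader strategy mentioned above (read the data and ship it over) could be sketched roughly as follows. This is only an illustration using plain JDK classes: `ReaderShipping` and `drain` are hypothetical names, not part of Lucene or Hibernate Search, and the sketch assumes the whole payload fits in memory on the sending node.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper, not a Lucene or Hibernate Search API.
class ReaderShipping {

    // Drains the Reader fully into memory so the field value can be
    // transferred as a plain String instead of a live stream.
    // Assumption: the content is small enough to buffer on the sender.
    static String drain(Reader reader) {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[4096];
        try {
            int read;
            while ((read = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        try (Reader r = new StringReader("some field content")) {
            // The resulting String is what would be shipped to the writing node.
            System.out.println(drain(r));
        }
    }
}
```

The obvious cost is that buffering gives up the streaming behaviour that a Reader-backed Field exists to provide.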
I'd expect that if people really want to use a Reader, they must
have good reasons, like a big payload that needs streaming to
prevent OOM; in such a case I don't see any option other than to
pre-analyze and send the analyzed TokenStream.
This might be better anyway as it takes work off the writing node,
which is traditionally the bottleneck, but I'm not sure how complex
that would be to implement.
For the rest, with a smart enough state machine and possibly some private attribute
reading, we should be able to serialize / deserialize Documents and Fieldables.
+1
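
To make the serialize / deserialize idea a bit more concrete, here is a minimal sketch of a field-state snapshot round-tripped through standard Java serialization. `FieldState` and `FieldStateCodec` are hypothetical names, and the set of attributes (name, value, stored, indexed, boost) is an assumption picked for illustration; the actual state to transfer is the one described in the gist, and the wire format need not be Java serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical snapshot of the state of one field; not a Lucene type.
class FieldState implements Serializable {
    private static final long serialVersionUID = 1L;

    final String name;
    final String value;     // String-valued fields only in this sketch
    final boolean stored;
    final boolean indexed;
    final float boost;

    FieldState(String name, String value, boolean stored, boolean indexed, float boost) {
        this.name = name;
        this.value = value;
        this.stored = stored;
        this.indexed = indexed;
        this.boost = boost;
    }
}

// Hypothetical codec: turns a snapshot into bytes and back.
class FieldStateCodec {

    static byte[] toBytes(FieldState state) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(state);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static FieldState fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (FieldState) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

On the receiving node, the snapshot would be used to rebuild the real field before handing it to the index writer; Reader- and TokenStream-backed fields are deliberately left out here, since those are the open cases discussed above.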
Cheers,
Sanne