[infinispan-dev] data interoperability and remote querying
Emmanuel Bernard
emmanuel at hibernate.org
Wed Apr 10 14:46:28 EDT 2013
On Wed 2013-04-10 18:55, Manik Surtani wrote:
>
> On 10 Apr 2013, at 18:18, Emmanuel Bernard <emmanuel at hibernate.org> wrote:
>
> > I favor the first option for a few reasons:
> >
> > - much easier client-side implementations
> > Frankly, rewriting the analyzer logic of Lucene in every language is
> > not a piece of cake, and you are out of luck for custom analyzers
>
> I'm not suggesting all the analyser logic. Just the extraction of indexed fields into name/value pairs, to be sent alongside the blob value.
That means the client has already made a selection and has possibly
already reduced the precision of a given field, which makes reindexing
impossible.
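
To illustrate the concern, here is a minimal sketch (Python, standard
library only). Everything in it is hypothetical: the entry layout, the
function names and the choice of fields are made up for illustration and
are not an Infinispan or Hot Rod API. The client ships an opaque blob
plus the name/value pairs it chose to extract; a later server-side
reindex can only reuse those pairs.

```python
from datetime import datetime


def client_extract(entry):
    """Client-side choice of what gets indexed, and at which precision."""
    return {
        "title": entry["title"],
        # Precision is reduced here: only the day survives, not the time.
        "created": entry["created"].strftime("%Y-%m-%d"),
    }


def client_put(store, key, entry, serialize):
    """Store the opaque blob together with the extracted name/value pairs."""
    store[key] = (serialize(entry), client_extract(entry))


def server_reindex(store, wanted_fields):
    """Rebuild an index on the server, which cannot read the blob.

    Returns the rebuilt index and the set of requested fields that
    could not be recovered from the shipped pairs.
    """
    index = {}
    missing = set()
    for key, (_blob, pairs) in store.items():
        index[key] = {f: pairs[f] for f in wanted_fields if f in pairs}
        missing.update(f for f in wanted_fields if f not in pairs)
    return index, missing
```

Asking the server to reindex on a field the client never extracted
(say "author"), or on "created" at second precision, fails here: the
information only exists inside the blob the server cannot read.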
>
> > - more robust client implementation: if we change how indexing is done
> > clients don't have to change
> > - reindexing: if there is a need to rebuild the index, or if the user
> > decides to reindex data differently, you must be able to read the data
> > on the server side
> > - validation: if you want to implement (cross entry) validation, the
> > server needs to be able to read the data.
> > - async: validation and indexing can be done asynchronously on the
> > server, avoiding perceived latency between a client request and the
> > result
>
> Valid points above though.
>
> > I'm not sure JSON should be the format though. As you said it's quite
> > verbose and string is not exactly the most efficient way to process
> > data.
>
> What would that format be, then?
Good question :) BSON is not necessarily smaller than JSON; it is meant
to be easier to parse, AFAIR. I did use Avro in Hibernate Search, as I
found Protocol Buffers and the others too rigid for my need to pass
arbitrary datasets. But if we have a schema and expect a given object
type, then we can start saving a lot of space.
In other words: no idea, this needs to be investigated.
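
To make the schema point concrete, here is a minimal sketch (Python,
standard library only). The record, its field names and the binary
layout are assumptions made up for illustration; this is not Avro,
Protocol Buffers or a proposed wire format. It only shows why a schema
known to both sides saves space: field names and text formatting no
longer travel with every value.

```python
import json
import struct

# Hypothetical record; the field names are illustrative only.
record = {"id": 12345, "age": 42, "score": 98.5, "name": "Emmanuel"}


def encode_json(rec):
    """Self-describing text encoding: field names travel with every value."""
    return json.dumps(rec, separators=(",", ":")).encode("utf-8")


# Assumed schema shared by client and server (little-endian, no padding):
# int64 id, uint8 age, float64 score, uint16 length of the UTF-8 name.
_HEADER = struct.Struct("<qBdH")


def encode_with_schema(rec):
    """Schema-based encoding: only the values are written, in a fixed order."""
    name = rec["name"].encode("utf-8")
    return _HEADER.pack(rec["id"], rec["age"], rec["score"], len(name)) + name


def decode_with_schema(data):
    """Read the values back using the same assumed schema."""
    rec_id, age, score, name_len = _HEADER.unpack_from(data, 0)
    start = _HEADER.size
    name = data[start:start + name_len].decode("utf-8")
    return {"id": rec_id, "age": age, "score": score, "name": name}
```

Comparing len(encode_json(record)) with len(encode_with_schema(record))
shows the difference for this record. The price is the rigidity
mentioned above: the schema-based form cannot carry arbitrary datasets
without changing the schema on both sides.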