Re: [infinispan-dev] data interoperability and remote querying

Wednesday, 10 April 2013

Yes.  We haven't quite designed how remote querying will work, but we have a few
ideas.  First, let me explain how in-VM indexing works.  An object's fields are
appropriately annotated so that when it is stored in Infinispan with a put(), Hibernate
Search can extract the fields and values, flatten them into a Lucene-friendly
"document", and associate it with the entry's key for searching later.

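To make the in-VM flow concrete, here is a rough sketch in Python rather than Java.  It is
not Hibernate Search's API: the "indexed" metadata flag, to_document() and the dict used as
an index are all made-up stand-ins for the annotations, the field extraction and the Lucene
index respectively.

```python
from dataclasses import dataclass, field, fields


@dataclass
class Person:
    # The "indexed" flag is a hypothetical stand-in for an indexing annotation.
    name: str = field(metadata={"indexed": True})
    age: int = field(metadata={"indexed": True})
    password_hash: str = ""  # not marked, so it never reaches the index


def to_document(obj):
    """Flatten the marked fields of obj into a flat name -> string mapping."""
    return {
        f.name: str(getattr(obj, f.name))
        for f in fields(obj)
        if f.metadata.get("indexed")
    }


index = {}  # hypothetical stand-in for a Lucene index: entry key -> document


def put(key, value):
    # Storing the entry also indexes it under the entry's key.
    index[key] = to_document(value)


put("p1", Person("Mircea", 30, "x"))
```

The point is only that the document is derived from the object at put() time and is tied
to the entry's key, so a later search can return keys.
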
Now, one approach to doing this when storing objects remotely is to choose a serialisation
format that can be parsed on the server side for easy indexing.  An example of this
could be JSON (an appropriate transformation will need to exist on the server side to
strip out irrelevant fields before indexing).  This would be completely
platform-independent, and would also support the interop you described below.  The drawback?
Slow JSON serialisation and deserialisation, and a very verbose data stream.
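
As a rough illustration of this first approach (a sketch only, nothing here is designed
yet): the server parses the JSON value itself and keeps only the fields it has been told to
index.  INDEXED_FIELDS and index_json_value() are hypothetical names.

```python
import json

# Hypothetical server-side configuration naming the fields worth indexing.
INDEXED_FIELDS = {"name", "age"}


def index_json_value(raw):
    """Parse a JSON-encoded value and strip out the fields that are not indexed."""
    obj = json.loads(raw.decode("utf-8"))
    return {k: v for k, v in obj.items() if k in INDEXED_FIELDS}


# What a client on any platform might send as the value.
payload = json.dumps({"name": "Mircea", "age": 30, "avatar": "..."}).encode("utf-8")
doc = index_json_value(payload)
```

Note that the whole value travels as JSON here, which is where the serialisation cost and
the verbosity come from.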

Another approach may be to perform the field extraction on the client side, so that the
data sent to the server would be key=XXX (binary), value=YYY (binary),
indexing_metadata=ZZZ (JSON).  This way the server does not need to be able to parse the
value for indexing, since the field data it needs is already provided in a
platform-independent manner (JSON).  The benefit here is that keys and values can still be
binary, and can use an efficient marshaller.  The drawback is that field extraction needs
to happen on the client.  This is not hard for the Java client (bits of Hibernate Search could be
reused), but for non-Java clients this may increase the complexity of those clients quite a
bit (though it is much easier for dynamic language clients such as Python or Ruby).  This approach does *not*
solve your problem below, because for interop you will still need a platform-independent
serialisation mechanism like Avro or ProtoBufs for the object <--> blob <-->
object conversion.
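
A minimal sketch of what a client might send under this second approach, assuming a
made-up request layout (this is not the Hot Rod wire format).  marshal_person() is a
hypothetical stand-in for whatever efficient binary marshaller the client uses; the server
only ever reads the JSON metadata.

```python
import json
import struct
from collections import namedtuple

Person = namedtuple("Person", ["name", "age"])


def marshal_person(p):
    """Hypothetical compact binary marshaller: length-prefixed name, one-byte age."""
    name = p.name.encode("utf-8")
    return struct.pack("!H", len(name)) + name + struct.pack("!B", p.age)


def build_put(key, person):
    """Client side: extract the indexed fields and ship them alongside the blobs."""
    return {
        "key": key.encode("utf-8"),                # opaque binary
        "value": marshal_person(person),           # opaque binary
        "indexing_metadata": json.dumps(           # platform-independent JSON
            {"name": person.name, "age": person.age}
        ),
    }


def server_index(index, request):
    """Server side: index from the metadata only; the value is never parsed."""
    index[request["key"]] = json.loads(request["indexing_metadata"])


index = {}
request = build_put("p1", Person("Mircea", 30))
server_index(index, request)
```

The value stays an opaque blob on the server, so reading it back from another platform
still needs a shared marshalling format on both clients.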

Personally, I prefer the second approach since it separates concerns (portable indexes vs.
portable values) plus would lead to (IMO) a better-performing implementation.  I'd
love to hear others' thoughts though.

Cheers
Manik

On 10 Apr 2013, at 17:11, Mircea Markus <mmarkus(a)redhat.com> wrote:

...
 That is, write the Person object in Java and read a Person object in
C#, assuming a hotrod client for simplicity.
 Now at some point we'll have to run a query over the same hotrod, something like
"give me all the Persons named Mircea".
 At this stage, the server side needs to be aware of the Person object in order to be able
to run the query and select the relevant Persons. It needs a schema. Instead of suggesting
Avro as a data interoperability protocol, we might want to define and use this schema
instead: we'd need it anyway for remote querying and we won't have two ways of
doing the same thing.
 Thoughts? 

 Cheers,
 -- 
 Mircea Markus
 Infinispan lead (www.infinispan.org)

 _______________________________________________
 infinispan-dev mailing list
 infinispan-dev(a)lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev 
--
Manik Surtani
manik(a)jboss.org
twitter.com/maniksurtani

Platform Architect, JBoss Data Grid
http://red.ht/data-grid
