[infinispan-dev] data interoperability and remote querying

Thu Apr 11 08:30:26 EDT 2013

On 11 April 2013 12:35, Dan Berindei <dan.berindei at gmail.com> wrote:
>
>
>
> On Thu, Apr 11, 2013 at 1:30 PM, Manik Surtani <msurtani at redhat.com> wrote:
>>
>>
>> On 10 Apr 2013, at 20:57, Sanne Grinovero <sanne at infinispan.org> wrote:
>>
>> > Right, let's keep this to collecting requirements:
>>
>> +1.  Ok, so it seems we're all pretty much in agreement that metadata
>> extraction and indexing should happen on the server side and not on the
>> client.  As I said before, this is good. Simple clients, support for
>> re-indexing, support for changes in indexing characteristics, and the
>> ability to save the world from AIDS.
>>
>> This puts a requirement on an efficient and portable serialisation format.
>> Again, +1 to starting with defining what we need.  Good start below, Sanne.
>>
>
> Besides the serialization format, how do we want to define the indexes on
> the server?
>
> Relying on Java classes with Lucene annotations on them doesn't sound like
> it would support indexing changes very well, because each node would index
> whatever annotations it had loaded at the moment. So I guess we need a
> separate indexing configuration, modifiable at runtime, and with annotations
> as a backup.

You're right. Something like this:
https://github.com/hibernate/hibernate-search/blob/master/hibernate-search-orm/src/test/java/org/hibernate/search/test/configuration/ProgrammaticSearchMappingFactory.java#L50
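
To make the idea concrete without depending on Hibernate Search, here is a minimal sketch of a per-node indexing configuration that can be changed at runtime and falls back to annotations when nothing has been configured. Everything in it is hypothetical and for illustration only: the class IndexingConfigSketch, the @Indexed annotation (a stand-in for the real Lucene/Hibernate Search annotations), and the Person type. It is not the API from the link above, and it leaves out how a change would be propagated across nodes.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

public class IndexingConfigSketch {

    /** Hypothetical stand-in for a real indexing annotation. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Indexed {}

    /** Example value type; only dateOfBirth is annotated. */
    static class Person {
        String name;
        @Indexed String dateOfBirth;
    }

    // Runtime overrides, keyed by type name rather than by Class object.
    private final Map<String, Set<String>> overrides = new ConcurrentHashMap<>();

    /** Replaces the set of indexed fields for a type at runtime. */
    void setIndexedFields(String typeName, Set<String> fields) {
        overrides.put(typeName, new TreeSet<>(fields));
    }

    /** Returns the runtime configuration if present, else the annotated fields. */
    Set<String> indexedFields(Class<?> type) {
        Set<String> configured = overrides.get(type.getName());
        if (configured != null) {
            return configured;
        }
        Set<String> fromAnnotations = new TreeSet<>();
        for (Field f : type.getDeclaredFields()) {
            if (f.isAnnotationPresent(Indexed.class)) {
                fromAnnotations.add(f.getName());
            }
        }
        return fromAnnotations;
    }

    public static void main(String[] args) {
        IndexingConfigSketch config = new IndexingConfigSketch();

        // No runtime configuration yet: annotations act as the backup.
        System.out.println(config.indexedFields(Person.class)); // [dateOfBirth]

        // Indexing characteristics change at runtime, no redeploy needed.
        config.setIndexedFields(Person.class.getName(),
                new TreeSet<>(Arrays.asList("name", "dateOfBirth")));
        System.out.println(config.indexedFields(Person.class)); // [dateOfBirth, name]
    }
}
```

Keying the overrides by type name is deliberate in this sketch: a server node could then hold the configuration for a type whose class it has never loaded.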

>>
>> > - being able to upgrade the server without losing data
>> > - being able to change the (soft) schema on the server
>> > - read/write fields from different languages
>>
>> > - deal with multi-version control of values (i.e. being able to read
>> > an older value through an evolved schema, doing comparisons of the same
>> > value even if it was stored using different schema generations)
>>
>> I'd add:
>>
>> * Support for fast and easy translation to/from object model in high level
>> language of choice (i.e., not manual parsing!  Maybe some form of tooling,
>> like a Maven plugin, to generate "IDL"-esque format)
>> * Serialisation efficiency (size and speed) should be considered
>>
>> And in addition, I'd also list out existing technologies that fulfil some
>> or all of these requirements that we can consider, look at extending, etc.
>>
>
> I'd add support for random access for reads. If the user only needs to index
> a Person's date of birth, it would be nice if we could read only the
> dateOfBirth field and index that.
>
>
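
As a rough illustration of reading a single field without decoding the rest, the sketch below uses a made-up record layout: each field is stored as a 4-byte tag, a 4-byte length, and a UTF-8 payload. The layout, the tag numbers, and the class FieldSkipSketch are assumptions for this example only, not a proposed wire format. Note that this is a skip-ahead scan, not true random access; jumping straight to a field would need an offset table in the record header.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class FieldSkipSketch {

    // Hypothetical field tags for a Person record.
    static final int NAME = 1;
    static final int DATE_OF_BIRTH = 2;

    /** Encodes a record as a sequence of (tag, length, payload) fields. */
    static byte[] encode(String name, String dateOfBirth) {
        byte[] n = name.getBytes(StandardCharsets.UTF_8);
        byte[] d = dateOfBirth.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(16 + n.length + d.length);
        buf.putInt(NAME).putInt(n.length).put(n);
        buf.putInt(DATE_OF_BIRTH).putInt(d.length).put(d);
        return buf.array();
    }

    /**
     * Returns the payload of the first field with the given tag, or null if
     * the record has no such field. Other fields are skipped by length and
     * their payloads are never decoded.
     */
    static String readField(byte[] record, int wantedTag) {
        ByteBuffer buf = ByteBuffer.wrap(record);
        while (buf.remaining() >= 8) {
            int tag = buf.getInt();
            int length = buf.getInt();
            if (tag == wantedTag) {
                byte[] payload = new byte[length];
                buf.get(payload);
                return new String(payload, StandardCharsets.UTF_8);
            }
            // Not the field we want: jump over its payload.
            buf.position(buf.position() + length);
        }
        return null;
    }

    public static void main(String[] args) {
        byte[] record = encode("Ada", "1815-12-10");
        // The indexer only asks for the one field it needs.
        System.out.println(readField(record, DATE_OF_BIRTH)); // 1815-12-10
    }
}
```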
>>
>> - Manik