On 11 April 2013 12:35, Dan Berindei <dan.berindei(a)gmail.com> wrote:
On Thu, Apr 11, 2013 at 1:30 PM, Manik Surtani <msurtani(a)redhat.com> wrote:
>
>
> On 10 Apr 2013, at 20:57, Sanne Grinovero <sanne(a)infinispan.org> wrote:
>
> > Right, let's keep this to collecting requirements:
>
> +1. Ok, so it seems we're all pretty much in agreement that metadata
> extraction and indexing should happen on the server side and not on the
> client. As I said before, this is good. Simple clients, support for
> re-indexing, support for changes in indexing characteristics, and the
> ability to save the world from AIDS.
>
> This puts a requirement on an efficient and portable serialisation format.
> Again, +1 to starting with defining what we need. Good start below, Sanne.
>
Besides the serialization format, how do we want to define the indexes on
the server?
Relying on Java classes with Lucene annotations on them doesn't sound like
it would support indexing changes very well, because each node would index
according to whatever annotations it had loaded at the time. So I guess we need a
separate indexing configuration, modifiable at runtime, and with annotations
as a backup.

You're right. Something like this:
https://github.com/hibernate/hibernate-search/blob/master/hibernate-searc...
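
To make the "separate indexing configuration, modifiable at runtime, with annotations as a backup" idea concrete, here is a minimal sketch. It is not Hibernate Search or Infinispan API: the class IndexingConfigRegistry and all its method names are hypothetical, and it assumes the indexing configuration for a type can be reduced to a set of field names.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical registry: a runtime-supplied configuration takes precedence,
 * and whatever was discovered from annotations is only a fallback.
 */
final class IndexingConfigRegistry {

    // Type name -> indexed field names, as set at runtime (e.g. by an admin operation).
    private final Map<String, Set<String>> runtimeConfig = new ConcurrentHashMap<>();

    // Type name -> indexed field names, as discovered from annotations at class-load time.
    private final Map<String, Set<String>> annotationDefaults = new ConcurrentHashMap<>();

    void registerAnnotationDefaults(String typeName, Set<String> fields) {
        annotationDefaults.put(typeName, immutableCopy(fields));
    }

    void updateRuntimeConfig(String typeName, Set<String> fields) {
        runtimeConfig.put(typeName, immutableCopy(fields));
    }

    /** Runtime configuration wins; annotations are the backup; otherwise nothing is indexed. */
    Set<String> indexedFields(String typeName) {
        Set<String> fields = runtimeConfig.get(typeName);
        if (fields == null) {
            fields = annotationDefaults.get(typeName);
        }
        return fields != null ? fields : Collections.<String>emptySet();
    }

    private static Set<String> immutableCopy(Set<String> source) {
        return Collections.unmodifiableSet(new LinkedHashSet<>(source));
    }
}
```

In a cluster the runtime map would itself have to be replicated (for example, stored in a cache) so that every node indexes with the same definition. The sketch leaves that part out.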
>
> > - being able to upgrade the server without losing data
> > - being able to change the (soft) schema on the server
> > - read/write fields from different languages
>
> > - deal with multi-version control of values (i.e. being able to read
> > an older value through an evolved schema, and comparing the same
> > value even if it was stored using different schema generations)
>
> I'd add:
>
> * Support for fast and easy translation to/from object model in high level
> language of choice (i.e., not manual parsing! Maybe some form of tooling,
> like a Maven plugin, to generate "IDL"-esque format)
> * Serialisation efficiency (size and speed) should be considered
>
> And in addition, I'd also list out existing technologies that fulfil some
> or all of these requirements that we can consider, look at extending, etc.
>
I'd add support for random access for reads. If the user only needs to index
a Person's date of birth, it would be nice if we could read only the
dateOfBirth field and index that.
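
A rough sketch of what random access could look like, assuming a made-up length-prefixed layout where every field is written as [fieldId:int][length:int][payload]. The class TaggedRecord, the layout, and the field ids are all hypothetical and are not an existing wire format. The point is only that a reader can skip fields it does not need without decoding them.

```java
import java.nio.ByteBuffer;
import java.util.Map;

/** Hypothetical record layout: [fieldId:int][length:int][payload bytes], repeated. */
final class TaggedRecord {

    /** Encodes the fields in the map's iteration order. */
    static byte[] encode(Map<Integer, byte[]> fields) {
        int size = 0;
        for (byte[] value : fields.values()) {
            size += 8 + value.length; // 4 bytes id + 4 bytes length + payload
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        for (Map.Entry<Integer, byte[]> e : fields.entrySet()) {
            buf.putInt(e.getKey());
            buf.putInt(e.getValue().length);
            buf.put(e.getValue());
        }
        return buf.array();
    }

    /**
     * Returns the payload of the wanted field, or null if it is absent.
     * Other fields are skipped using their length prefix and never decoded.
     * No validation of corrupt lengths is attempted in this sketch.
     */
    static byte[] readField(byte[] record, int wantedId) {
        ByteBuffer buf = ByteBuffer.wrap(record);
        while (buf.remaining() >= 8) {
            int id = buf.getInt();
            int length = buf.getInt();
            if (id == wantedId) {
                byte[] payload = new byte[length];
                buf.get(payload);
                return payload;
            }
            buf.position(buf.position() + length); // jump over the unwanted payload
        }
        return null;
    }
}
```

With this layout the indexer could call readField(record, DATE_OF_BIRTH_ID) and never touch the rest of the Person. The scan is still linear in the number of fields. A real format might add an offset table at the front to make the lookup constant time.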
>
> - Manik