[infinispan-dev] [hibernate-dev] Distributed queries

Emmanuel Bernard emmanuel at hibernate.org
Mon Sep 21 02:40:55 EDT 2009


Hibernate Search does not tackle any of this per se, as it is highly
dependent on the distribution algorithm:
  - finding all nodes (some discovery mechanism)
  - binding an index to the list of indexed documents (Hibernate Search
has an abstraction for sharding the index, but here we are talking about
executing queries on various nodes. Lucene has an abstraction for that
which should not be too hard to port to Hibernate Search)
  - distributing data without duplication: that's entirely dependent on the
distribution mechanism (i.e. Infinispan)
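
To make the three points above concrete, here is a minimal scatter-gather
sketch in plain Java. It is not Hibernate Search, Lucene or Infinispan code:
NodeSearcher, Hit and ScatterGatherQuery are hypothetical names invented for
this illustration, and the list of nodes is assumed to come from whatever
discovery mechanism the grid provides. It also assumes each key is indexed
on exactly one node, so the merge step does no de-duplication.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ScatterGatherQuery {

    /** One hit returned by a node: the cache key and its relevance score. */
    static final class Hit {
        final String key;
        final float score;
        Hit(String key, float score) { this.key = key; this.score = score; }
    }

    /** Hypothetical handle on one node's local index (assumption, not a real API). */
    interface NodeSearcher {
        List<Hit> search(String query, int topK) throws Exception;
    }

    /** Sends the query to every node, then merges the per-node results by score. */
    static List<Hit> search(Collection<NodeSearcher> nodes, String query, int topK)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, nodes.size()));
        try {
            List<Future<List<Hit>>> pending = new ArrayList<>();
            for (NodeSearcher node : nodes) {
                Callable<List<Hit>> task = () -> node.search(query, topK);
                pending.add(pool.submit(task));
            }
            List<Hit> merged = new ArrayList<>();
            for (Future<List<Hit>> f : pending) {
                // get() rethrows a node failure: a missed node would silently
                // drop results, so the whole query fails instead.
                merged.addAll(f.get());
            }
            // Assumes scores from different nodes are comparable and that no
            // key was indexed on more than one node.
            merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
            return new ArrayList<>(merged.subList(0, Math.min(topK, merged.size())));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<NodeSearcher> nodes = new ArrayList<>();
        nodes.add((q, k) -> List.of(new Hit("a", 0.9f), new Hit("b", 0.2f)));
        nodes.add((q, k) -> List.of(new Hit("c", 0.5f)));
        for (Hit h : search(nodes, "title:grid", 2)) {
            System.out.println(h.key + " " + h.score);
        }
    }
}
```

If the same key were indexed on two nodes, both copies would show up in the
merged list, which is the duplication problem mentioned in the last point.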


On 19 sept. 09, at 12:43, Michael Neale wrote:

> I think you just stuck a pin in the bubble that normally says "magic
> happens here" ;)
>
> How much of this did you tackle regarding hibernate search that could
> be applied here?
>
> (your final point re duplication may have some "flexibility", I think?)
>
> On Fri, Sep 18, 2009 at 6:18 PM, Emmanuel Bernard
> <emmanuel at hibernate.org> wrote:
>> Neither 1 nor 2 imply *distributed* queries.
>>
>> The hard parts with distributed queries (ie executed on a grid and
>> recomposed) are:
>>  - making sure you ask all the nodes where the index is distributed
>> (you can't miss a node)
>>  - find a way to index only a subset of the data in a given index (on
>> a given node). Applying the Infinispan distribution routine to the
>> InfinispanDirectory does not do that, it chunks data arbitrarily.
>>  - be able to rebuild a given index on a given node (ie remember
>> which elements were indexed)
>>  - you need to find a way to distribute your data without
>> duplication. If a key is indexed multiple times, then you end up with
>> duplicated results that can't trivially be de-duplicated.
>>
>> Happy thinking.
>>
>> On 17 sept. 09, at 10:32, Sanne Grinovero wrote:
>>
>>> 2009/9/17 Michael Neale <michael.neale at gmail.com>:
>>>> I am still not entirely sure what I am asking, but look forward to
>>>> your merged-in changes (they are in another branch right now, yes?).
>>>>
>>>> Yes I mean querying objects - I was under the impression that Lucene
>>>> was used for the indexing of the data to service these queries?
>>>
>>> Sure, to clarify: there's work going on on two different aspects, which
>>> complement each other in the ideal setup:
>>>
>>> 1) Be able to query a Lucene index (wherever you store that) to find
>>> objects which are located inside Infinispan; this is about how to search
>>> them and how to keep the index in sync with Infinispan's content.
>>>
>>> 2) Store a Lucene index inside Infinispan instead of, for example, the
>>> filesystem. In this case we're not concerned about what you index: the
>>> Lucene interface is the usual one and you should be able to replace the
>>> Directory implementation in existing applications.
>>>
>>> So 1) is the branch you've found, and Navin is working on that; 2) is
>>> not yet in subversion. The latest patch is attached to the other thread
>>> by Łukasz, and is to be applied on Hibernate Search's trunk (and depends
>>> on Infinispan).
>>>
>>>>
>>>> On Wed, Sep 16, 2009 at 10:32 PM, Navin Surtani
>>>> <nsurtani at redhat.com> wrote:
>>>>>
>>>>> On 16 Sep 2009, at 12:25, Michael Neale wrote:
>>>>>
>>>>>> oh ok nice - could you point me at which branch to try to find some
>>>>>> tests to play with?
>>>>>
>>>>> If you're talking about Querying objects in Infinispan: -
>>>>>
>>>>> The eventual goal is to be able to have different configurations on
>>>>> how you want to index your data. Manik has given me the 'OK' to push
>>>>> a simple query interface for CR1 for Monday/Tuesday.
>>>>>
>>>>> I'm kind of pressed with getting the code working for this and,
>>>>> between moving house and the lack of internet there, I'll be a bit
>>>>> quiet. However, I'll get a wiki up by the end of the week about how
>>>>> this all works.
>>>>>
>>>>> However, if you're not, then I assume you're talking about using
>>>>> Lucene to index into Infinispan?
>>>>>
>>>>>>
>>>>>> On Wed, Sep 16, 2009 at 6:05 PM, Sanne Grinovero
>>>>>> <sanne.grinovero at gmail.com> wrote:
>>>>>>> 2009/9/16 Michael Neale <michael.neale at gmail.com>:
>>>>>>>> regarding indexing and queries - is the current aim to not require
>>>>>>>> that the index for the entire data grid exist on a single node?
>>>>>>>>
>>>>>>>> (asking as a potential user who is wrestling with Lucene indexes
>>>>>>>> at the moment and is curious).
>>>>>>>
>>>>>>> Yes, the concept is to store the Lucene index itself in the grid,
>>>>>>> so it will be distributed, and the segments you use most get cached
>>>>>>> locally.
>>>>>>> At the moment you have to select only one node to write to the
>>>>>>> index, but all other nodes should be able to read.
>>>>>>> Feel free to test it, as we need feedback.
>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Michael D Neale
>>>>>>>> home: www.michaelneale.net
>>>>>>>> blog: michaelneale.blogspot.com
>>>>>>>> _______________________________________________
>>>>>>>> infinispan-dev mailing list
>>>>>>>> infinispan-dev at lists.jboss.org
>>>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> Navin Surtani
>>>>>
>>>>> Intern Infinispan
>>>>> Intern JBoss Cache Searchable
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>> _______________________________________________
>>> hibernate-dev mailing list
>>> hibernate-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/hibernate-dev
>>
>>
>
>
>




