[infinispan-dev] ISPN 200

Emmanuel Bernard emmanuel at hibernate.org
Wed Sep 15 08:57:28 EDT 2010


Ah yes sorry guys scratch what I said before. I forgot you were distributing the query load and were fully resolving the results before sending them back (ie not the two step process done by default - load sorted ids then load the necessary objects)

Well it's possible to have an algorithm that does something like that:
 - load the first few results of each node in sorted order
 - do a comparison between them on the client and return the highest results as long as there are other results of lower value queued up coming from all the other nodes (or that the end of result for a given node is reached)
 - if you are not in this situation, wait for results
One of the problem with this algo is that the server has to keep the next results ready (has to remember) unless you push everything on the client wo waiting the next() calls which makes it not so lazy anymore :).

Note that in practice, it does not work for relevance unless the dataset is known to be evenly distributed (term and stats wise) and we can compare the score value (which is a good first approximation until Sanne think more about the alternative he has explored with some other guys).

You also need to deduplicate data.

But I would do early perf comparisons between a solution where we load keys to the client and then load objects vs this super-lazy approach. I'm not convince the later is better.


On 15 sept. 2010, at 14:04, Israel Lacerra wrote:

> >I think if we allways block iterator.next() until we have at least a value of each node , we can apply sort in this values (since every node already made this). 
> I mean, the idea is to make a "merge" of the results... and the next() blocks until we have some results already sorted.
> 
> Emmanuel, do you think this can work?
> 
> On Wed, Sep 15, 2010 at 8:54 AM, Israel Lacerra <israeldl at gmail.com> wrote:
> >You give up any hope of implementing sorting (explicit or using relevance).
> 
> I think if we allways block iterator.next() until we have at least a value of each node , we can apply sort in this values (since every node already made this). That's what you said, Navin?
> 
> 
> On Wed, Sep 15, 2010 at 8:16 AM, Navin Surtani <nsurtani at redhat.com> wrote:
> Isn't that an issue that comes with lots of hits in a lazyIterator anyway?
> 
> I mean, if you're only going to load up a 'few hits at a time' then you
> can in theory only sort the few hits that you have loaded up anyway
> right? :S
> 
> On 15/09/10 12:01, Emmanuel Bernard wrote:
> > You give up any hope of implementing sorting (explicit or using relevance).
> >
> > On 15 sept. 2010, at 12:16, Manik Surtani wrote:
> >
> >> How about something like:
> >>
> >> - Broadcast the query
> >> - Every node creates the QueryHits inst, runs the query and collects
> >> results. Starts streaming the results back immediately.
> >> - The lazy iterator returns to the user immediately, and maintains an
> >> internal cache of results coming in from N remote QueryHits instances
> >> - iterator.next() blocks until this cache has available entries to return.
> >> - on an implementation level, the GetHItsCommand (or something like
> >> that) could return with a single hit, or N hits, with a flag of
> >> whether more hits are available or not.
> >>
> >> Does that help?
> >>
> >> Manik
> >>
> >>
> >> On 14 Sep 2010, at 20:46, Israel Lacerra wrote:
> >>
> >>> Hi guys,
> >>>
> >>> I'm still thinking about it, but I don't have a really good idea
> >>> about the lazy iterator yet. The only way (that I see) I could make
> >>> it more lazily is:
> >>>
> >>> - Broadcast the query.
> >>> - Every node creates a QueryHits instance with the query and keep it
> >>> in a simple little cache (array, hash, etc)
> >>> - A "state" of the query is created and every lazyIterator.next()
> >>> must send a command to a node and get the next hit (the next key).
> >>> - After a certain time, the instances of queryHits "dies".
> >>>
> >>> It seems to me that this is not too efficient. But I don't have any
> >>> other ideas.
> >>>
> >>> Do you have any suggestions about it?
> >>>
> >>> thanks!
> >>> Israel
> >>>
> 
> 
> --
> Navin Surtani
> Intern Infinispan
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> 
> 
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20100915/59b53ff9/attachment-0001.html 


More information about the infinispan-dev mailing list