Re: [infinispan-dev] QueryIterator inconsistencies

Tuesday, 25 September 2012

On 25 September 2012 09:35, Marko Lukša <marko.luksa(a)gmail.com&gt; wrote:
...
 On 24.9.2012 19:03, Sanne Grinovero wrote:
>> 3. QueryIterator.afterFirst() & beforeLast()
>> These two are OK. Calling next() after afterFirst() should return the
>> second element, while calling previous() should return the first element.
> Are you sure they are ok? I'm actually wondering why they exist at
> all; they don't look like very useful to me.
> Why should someone want a method to directly jump to the second element?
> Users already have the option to "beforeFirst && next" or use the
> "jumpTo" / seek operation.
>
 Oh yes, I was wondering about that myself when I first saw
 QueryIterator, but then figured someone needed/wanted it. If we
 deprecate the whole QueryIterator and introduce a new interface, I'd
 leave these methods out.

>> 4. isFirst(), isLast()
>> ListIterator does not have the concept of "current element", but the
>> javadoc of all the is*() methods talk about the current element, which
>> is ambiguous. Therefore isFirst() and isLast() should probably be
>> renamed to isBeforeFirst() and isAfterLast().
> I would remove these too. Am I missing some use case?
>
 I think you're right. Also, instead of calling isBeforeFirst() and
 isAfterLast(), users can simply call !hasPrevious() and !hasNext(),
 which already exist.

>> 5. jumpToResult(int index)
>> When jumping to index=1, what does next() return? What about previous()?
>> We should probably replace the method with two others: beforeResult(int
>> index) and afterResult(int index)
> I would add only "beforeResult(int)", again I don't see why we should
> have both versions.
> Since you have to do "next()" to retrieve a value, I'd consider
> "beforeResults" more user friendly than "afterResults"
 Hmm, what if you want to iterate backwards. If you only had
 beforeResult(int), you wouldn't be able to jump behind the last element.
 You'd have to use afterLast(). For usecases where you always jump to the
 end and iterate backwards, afterLast() would be ok. But what if you
 needed to implement a method like iterateBackwards(int fromIndex,
 Visitor visitor). Then the method would need to have something like:  if
 (fromIndex == col.size()) it.afterLast() else it.beforeResult(fromIndex).

 Of course, we could allow users to call beforeResult(col.size()), but
 that's not completely clean, since there's no element with index
 col.size(). But still, it may be cleaner than having an additional
 method (afterResult). 
Right I hadn't thought about backwards scrolling. Don't you think that
being able to revert the sort order at query level is enough?
If someone needs A-Z, runs the scroll in forward mode and defines
lexicographical ordering, if you then need Z-A you revert the sort and
run the iterator again in forward only mode:

AFAIR supporting the reverse iterator in distributed lazy queries is a
nightmare, so if we agree they are pointless that's a good
simplification in the codebase.

I can think of a single use case in which one can not revert the Sort:
when the search result is defined by relevance in a fulltext query,
but in this case the revert sort doesn't make any sense as the last
results (which would get on top) are a "tail" of uninteresting
information.

...

>> Any comments / other ideas?
>>
>> Marko
> That's a lot of changes needed. One would say we could deprecate all
> methods and rewrite the new ones, but wouldn't it be better to
> deprecate the interface and introduce a brand new one?
> What about "ResultsIterator" ? after all, we're not iterating on
queries..

 I agree. The only thing that bothers me is that CacheQuery would then
 need new methods called (lazy)resultIterator() (instead of iterator(),
 which already exists). 
Right. though I like "resultIterator()" more so we're having luck in
this case ;)

Any better suggestion for names? Anyone thinks it's worth to remove
the old ones without a deprecation step to reuse the old names?

...

 This reminds me. What is the difference between LazyIterator and
 EagerIterator? Both of them execute the Lucene query immediately. The
 only difference is that EagerIterator loads EntityInfos straight away,
 while LazyIterator (only) loads the DocumentExtractor and then at first
 invocation of next(), it loads the EntityInfo and cache value at the
 same time (when filling the buffer).

 IMO, the difference between them is negligible and we could either scrap
 LazyResultIterator or make EagerResultIterator preload everything? 
You have to consider that Infinispan could hold many more values than
what the current node/JVM is able to keep in memory;
for example when running a distributed query, it's likely that the
user doesn't want to preload all data; obviously when running "list()"
we *could* load things lazily on access, but that gets tricky as we
don't have a close() method and I would expect people to use an
iterator if they know the amount of data retrieved is having some
critical mass.
Of course if you know you're going to hit a limited number of entries
(and these entries are of limited size - some people store hd movie
streams) then the eager loading could be more efficient.

So the lazy-loaders are needed; it could be nice to think of a
different way to specify the option; we could have something like

  .iterate(LoadMode);

rather than having to define the methods twice.

Sanne

...

 Marko
 _______________________________________________
 infinispan-dev mailing list
 infinispan-dev(a)lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev 

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

Re: [infinispan-dev] QueryIterator inconsistencies