[hibernate-dev] Hibernate Search 3.1

Wed Feb 27 09:39:46 EST 2008

On  Feb 27, 2008, at 08:40, Nick Vincent wrote:

> Hi Emmanuel,
>
> On 26/02/2008, Emmanuel Bernard <emmanuel at hibernate.org> wrote:
>>
>>  On  Feb 26, 2008, at 06:41, Nick Vincent wrote:
>>>
>>>
>>> 2) Explaining results
>>>
>>> This uses the new DOCUMENT_ID projection introduced in 3.0.1 to
>>> explain query results (we need this so the customer can understand
>>> their search results in the backoffice interface).  I added an
>>> explain method to both implementations of FullTextQueryImpl which is
>>> only available by casting (i.e. no interface changes).  I think
>>> explain() is probably a fairly advanced function which it's
>>> acceptable to access by casting.
>>
>>
>> Wouldn't it make sense to expose the explain result (I imagine an
>>  Explanation object) as a projected field?
>
> The Lucene javadoc says "Computing an explanation is as expensive as
> executing the query over the entire index.".
>
> http://lucene.apache.org/java/2_3_1/api/core/org/apache/lucene/search/Searcher.html#explain(org.apache.lucene.search.Query,%20int)
>
> which is why I didn't consider projecting this.  If that's true, the
> effort to project an explanation onto the results will grow with the
> number of hits, at roughly one full-index query per hit.  For this
> reason I think the method of accessing an Explanation that I proposed
> is reasonable (although not necessarily right).

My real concern is that the reader used to explain might be different
from the reader that returned the hits, and hence be out of sync. But
the projection idea does seem too resource intensive.

What is your use case, then? Does the user ask for the explanation of a
single result manually after the query (i.e. is there human think time
between the query and the explanation)?

>
>>>
>>> 3) Counting results
>>>
>>> In the current implementation we only want to perform one Lucene
>>> query per search (all projected).  In order to get a result count and
>>> the results themselves it is currently necessary to invoke the Lucene
>>> query twice.
>>
>>
>> This is not true.
>>
>>  query.list(); //triggers a lucene query
>>  query.getResultSize(); //does not, since list() has already computed it
>
> You are right, and I don't need to make any alterations.  I've worked
> out what the problem we encountered was that made me think this was a
> problem.  It took a bit of digging around the source to work out what
> we'd done wrong, and perhaps it might be useful to include in an FAQ
> or the documentation.  If you make the calls in this order:
>
> query.getResultSize();  // Hits retrieved, hit count cached and returned
> query.list(); // Hits retrieved
>
> then the query gets run twice as resultCount is cached in
> FullTextQueryImpl but the Hits object is not.
>
> A subtle effect, but when you're using something like JSF you're not
> always sure in which order the properties of your underlying beans are
> retrieved during the render cycle.  This was the cause of our double
> querying behaviour.

http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-157

I don't keep the Hits around because it would mean keeping the readers
open. I guess a simple helper class could do what your code was
doing (i.e. build a result-size-aware list).
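
A minimal sketch of such a helper, in plain Java with no Hibernate
Search dependency. The class name SizeAwareResults and both suppliers
are hypothetical; the idea is to pass in query.list() and
query.getResultSize() as callbacks. The only behaviour assumed is the
one described in this thread: calling getResultSize() after list() does
not run the Lucene query a second time. Lambdas (Java 8+) are used for
brevity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.IntSupplier;
import java.util.function.Supplier;

/**
 * Hypothetical helper (not part of Hibernate Search): loads the result
 * list and the total hit count together, always calling list() first,
 * and then serves both from memory regardless of access order.
 */
final class SizeAwareResults<T> {
    private final Supplier<List<T>> listCall; // e.g. a wrapper around query.list()
    private final IntSupplier sizeCall;       // e.g. a wrapper around query.getResultSize()
    private List<T> results;                  // null until the first access
    private int totalSize;

    SizeAwareResults(Supplier<List<T>> listCall, IntSupplier sizeCall) {
        this.listCall = listCall;
        this.sizeCall = sizeCall;
    }

    // Runs both callbacks at most once, list first, then size.
    private void load() {
        if (results == null) {
            results = Collections.unmodifiableList(new ArrayList<>(listCall.get()));
            totalSize = sizeCall.getAsInt();
        }
    }

    /** The (possibly paginated) results, loaded on first access. */
    synchronized List<T> list() {
        load();
        return results;
    }

    /** The total hit count, which may exceed list().size() when paginating. */
    synchronized int totalSize() {
        load();
        return totalSize;
    }
}
```

A JSF bean holding one SizeAwareResults per search can then expose the
total and the list as properties, and the render cycle can read them in
either order without changing the order in which the underlying query
methods are called.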

>
> Cheers,
>
> Nick