[hibernate-issues] [Hibernate-JIRA] Commented: (HSEARCH-194) Inconsistent performance between hibernate search and pure lucene access

Thursday, 29 May 2008

    [ http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-194?pag... ]

Emmanuel Bernard commented on HSEARCH-194:
------------------------------------------

OK, I think I found the final bit of difference between pure Lucene and Hibernate Search:
when receiving 100 requests at the same time, an IndexReader will reach its
scalability limit (hence the monitor contention you have observed). Basically, for such a
high number of requests per second, it is better to provide a pool of
IndexReaders/IndexSearchers. The sweet spot depends on the index size, as warming up the
IndexReader is also an important factor.

I guess that, on top of the optimization proposals (from my last post), we should add a
pooling mechanism for IndexReaders.
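
To make the pooling idea concrete, here is a minimal sketch of a fixed-size round-robin pool. It is only an illustration under stated assumptions: the class name RoundRobinPool and its methods are hypothetical and are not part of Hibernate Search or Lucene, and the pooled type is left generic so that the factory can be whatever opens (and warms up) an IndexReader or IndexSearcher in your setup.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/**
 * Hypothetical fixed-size pool that hands out its instances in round-robin
 * order, so concurrent callers are spread over several instances instead of
 * all contending on the monitors of a single one.
 */
final class RoundRobinPool<T> {
    private final List<T> instances;
    private final AtomicInteger next = new AtomicInteger();

    /** Eagerly creates 'size' instances; any warm-up belongs in the factory. */
    RoundRobinPool(int size, Supplier<T> factory) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<T> created = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            created.add(factory.get());
        }
        this.instances = Collections.unmodifiableList(created);
    }

    /** Returns the next instance; instances stay shared, they are not checked out. */
    T acquire() {
        // floorMod keeps the index valid even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), instances.size());
        return instances.get(index);
    }

    int size() {
        return instances.size();
    }
}
```

With such a pool, each incoming query would call acquire() and search against the instance it receives. The pool size is the tuning knob mentioned above: more instances mean less contention per instance but more warm-up cost, and closing or refreshing the pooled readers after index updates is deliberately left out of this sketch.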

...
 Inconsistent performance between hibernate search and pure lucene access
 ------------------------------------------------------------------------

                 Key: HSEARCH-194
                 URL:
http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-194
             Project: Hibernate Search
          Issue Type: Bug
          Components: query
    Affects Versions: 3.0.1.GA
         Environment: Linux - Hibernate 3.2.6, Hibernate Annotations 3.3.1 - Lucene 2.3.1
            Reporter: Stephane Nicoll
            Priority: Critical
         Attachments: Monitor_Usage_Statistics.html

 I have a simple index that contains:
 * id (pk of the entity)
 * keywords (a list of tokens)
 The index contains 100,000 objects, and each keywords field holds 2 tokens taken from a
list of 40 different values.
 What I want to do is retrieve all the IDs that match a given Lucene query on the
keywords. For that, I'm doing something like:
     FullTextSession fullTextSession = Search.createFullTextSession(session);
     QueryParser parser = new QueryParser("keywords", luceneAnalyzer);
     org.apache.lucene.search.Query hibernateQuery = parser.parse("foo AND bar");
     FullTextQuery fullTextQuery = fullTextSession.createFullTextQuery(hibernateQuery, target);
     fullTextQuery.setProjection("id");
     fullTextQuery.setResultTransformer(resultTransformer);
     Iterator it = fullTextQuery.iterate();
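 where resultTransformer is an instance of:
     private static class FirstObjectResultTransformer implements ResultTransformer {
         // Unwrap the single projected value (the id) from each result tuple.
         public Object transformTuple(Object[] objects, String[] strings) {
             return objects[0];
         }
         // The list of already transformed tuples is returned unchanged.
         public List transformList(List list) {
             return list;
         }
     }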
 If I do a load test with a single thread, the execution time of my Lucene query is around
200 ms. If I do a load test with 10 threads, the execution time is 2 sec (per user!). If
I run the profiler on the service, I see lots of deadlocks on SegmentReader.
 Switching to a "non-shared" strategy removes the deadlocks, but it's still
slow (1.5 sec).
 Now, if I execute the same query on the same index and the same host with only the Lucene
API, the query takes around 100 ms with 10 concurrent users. I tried to use the Lucene
API from Hibernate Search but it did not change anything.
 What am I missing? I have attached the profiling result.
-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://opensource.atlassian.com/projects/hibernate/secure/Administrators....
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
