[hibernate-issues] [Hibernate-JIRA] Commented: (HSEARCH-194) Inconsistent performance between hibernate search and pure lucene access

Thursday, 29 May 2008

    [
http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-194?pag...
] 

Emmanuel Bernard commented on HSEARCH-194:
------------------------------------------

About a new ReaderProvider: let's start a discussion on the dev mailing list, with
some design ideas.

The scalability issues I saw compared to Lucene were due to the fact that my Lucene test
did not reuse the IndexSearcher: it had one searcher per thread, which made it much faster
in the end on my dual-core machine and on this extreme test (100 parallel queries).
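
The difference between the two test setups can be sketched roughly as below. This is only an illustration: FakeSearcher is a hypothetical stand-in for Lucene's IndexSearcher (it is not a Lucene class), and the class and method names are made up so that the example runs without Lucene on the classpath. It shows one searcher instance per thread versus one instance shared by all threads, nothing more.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SearcherPerThread {
    /** Hypothetical stand-in for an expensive-to-open searcher; not a Lucene class. */
    static final class FakeSearcher {
        int search(String query) { return query.length(); }
    }

    // One searcher instance shared by every thread.
    static final FakeSearcher SHARED = new FakeSearcher();

    // One searcher per thread, created lazily the first time a thread asks for it.
    static final ThreadLocal<FakeSearcher> PER_THREAD =
            ThreadLocal.withInitial(FakeSearcher::new);

    /** Runs `tasks` queries on `threads` threads; returns how many distinct searchers were used. */
    static int distinctSearchers(int threads, int tasks, boolean perThread) {
        Set<FakeSearcher> seen = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                FakeSearcher searcher = perThread ? PER_THREAD.get() : SHARED;
                searcher.search("foo AND bar");
                seen.add(searcher); // identity-based, FakeSearcher does not override equals
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.size();
    }
}
```

With the per-thread variant no searcher is ever touched by two threads, so there is nothing to contend on; the price is one open searcher per thread.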

The DirectoryProvider itself is locked when you update data, not when you read. Pure reads
(through the ReaderProvider) are not locked in that regard. But the structure that keeps the
cached IndexReaders for a given DirectoryProvider needs a lock around it. The "//needed
for same problem as the double-checked locking" comment is basically about ensuring a
memory barrier, to make sure the instructions are not reordered by the VM.
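
As a rough illustration of that point (not the actual Hibernate Search code), here is a minimal sketch of a reader cache whose bookkeeping structure is guarded by a lock. ReaderCache and FakeReader are hypothetical names, and directories are represented by plain strings as an assumption. The lock serves two purposes: it keeps the map consistent, and acquiring and releasing it gives the memory barrier that double-checked locking without proper synchronization lacks.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

public class ReaderCache {
    /** Hypothetical stand-in for an opened IndexReader; not a Lucene class. */
    static final class FakeReader {
        final String directory;
        FakeReader(String directory) { this.directory = directory; }
    }

    private final Map<String, FakeReader> readers = new HashMap<>(); // guarded by lock
    private final ReentrantLock lock = new ReentrantLock();
    private int opened = 0; // guarded by lock

    /** Returns the cached reader for a directory, opening it at most once. */
    FakeReader get(String directory) {
        lock.lock(); // acquiring the lock makes earlier writes by other threads visible
        try {
            FakeReader reader = readers.get(directory);
            if (reader == null) {
                reader = new FakeReader(directory);
                readers.put(directory, reader);
                opened++;
            }
            return reader;
        } finally {
            lock.unlock(); // releasing publishes our writes to the next thread that locks
        }
    }

    /** Number of readers opened so far. */
    int openedCount() {
        lock.lock();
        try {
            return opened;
        } finally {
            lock.unlock();
        }
    }
}
```

Only the cache lookup is serialized here; once a thread holds a reader it can search with it without taking this lock again.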

I will commit my test, probably today or tomorrow. You can commit the experiment in the
test directory and we can move it eventually. From the tests I've done, isCurrent
was not really the issue.

...
 Inconsistent performance between hibernate search and pure lucene
access
 ------------------------------------------------------------------------

                 Key: HSEARCH-194
                 URL:
http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-194
             Project: Hibernate Search
          Issue Type: Bug
          Components: query
    Affects Versions: 3.0.1.GA
         Environment: Linux - Hibernate 3.2.6, Hibernate Annotations 3.3.1 - Lucene 2.3.1
            Reporter: Stephane Nicoll
            Priority: Critical
         Attachments: Monitor_Usage_Statistics.html

 I have a simple index that contains:
 * id (pk of the entity)
 * keywords (a list of tokens)
 The index contains 100,000 objects, and each keywords field has 2 tokens taken from a list
of 40 different values.
 What I want to do is retrieve all the IDs that match a given Lucene query on the
keywords. So for that I'm doing something like:
 // Wrap the Hibernate session so it can run full-text queries
 FullTextSession fullTextSession = Search.createFullTextSession(session);
 QueryParser parser = new QueryParser("keywords", luceneAnalyzer);
 org.apache.lucene.search.Query hibernateQuery = parser.parse("foo AND bar");
 FullTextQuery fullTextQuery = fullTextSession.createFullTextQuery(hibernateQuery, target);
 // Project only the id field and unwrap each one-element tuple
 fullTextQuery.setProjection("id");
 fullTextQuery.setResultTransformer(resultTransformer);
 Iterator it = fullTextQuery.iterate();
 where resultTransformer is an instance of:
 private static class FirstObjectResultTransformer implements ResultTransformer {
     public Object transformTuple(Object[] objects, String[] strings) {
         return objects[0];
     }
     public List transformList(List list) {
         return list;
     }
 }
 If I do a load test with a single thread, the execution time of my Lucene query is around
200 msec. If I do a load test with 10 threads, the execution time is 2 sec (per user!). If
I run the profiler on the service, I see lots of deadlocks on SegmentReader.
 Switching to a "non-shared" strategy removes the deadlocks but it's still
slow (1.5 sec).
 Now, if I execute the same query on the same index and the same host with only the Lucene
API, the query takes around 100 msec with 10 concurrent users. I tried to use the Lucene
API from Hibernate Search but it did not change anything.
 What am I missing? The profiling result is attached.
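
 For anyone wanting to reproduce this kind of measurement, a load test with N concurrent users could be sketched along these lines. This is not the reporter's harness: LoadTest and run are hypothetical names, and the query is passed in as a plain Runnable so that either the Hibernate Search call or the pure Lucene call can be plugged in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class LoadTest {
    /**
     * Runs `query` `iterations` times on each of `users` threads and
     * returns every measured latency in nanoseconds.
     */
    static List<Long> run(int users, int iterations, Runnable query) {
        ExecutorService pool = Executors.newFixedThreadPool(users);
        CountDownLatch start = new CountDownLatch(1);
        List<Future<List<Long>>> futures = new ArrayList<>();
        for (int u = 0; u < users; u++) {
            futures.add(pool.submit(() -> {
                start.await(); // hold every user until all tasks are submitted
                List<Long> latencies = new ArrayList<>();
                for (int i = 0; i < iterations; i++) {
                    long t0 = System.nanoTime();
                    query.run();
                    latencies.add(System.nanoTime() - t0);
                }
                return latencies;
            }));
        }
        start.countDown(); // release all users at once
        List<Long> all = new ArrayList<>();
        try {
            for (Future<List<Long>> future : futures) {
                all.addAll(future.get());
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return all;
    }
}
```

 Calling run(1, n, query) and run(10, n, query) with the same query and comparing the average latencies gives the single-user versus 10-user numbers discussed above.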
-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://opensource.atlassian.com/projects/hibernate/secure/Administrators....
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
