[infinispan-issues] [JBoss JIRA] (ISPN-5452) Query Execution using Hibernate Search slow for large volume data
Prashant Thakur (JIRA)
issues at jboss.org
Thu Jul 23 13:02:06 EDT 2015
[ https://issues.jboss.org/browse/ISPN-5452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092356#comment-13092356 ]
Prashant Thakur commented on ISPN-5452:
---------------------------------------
Hi Sanne,
We have been profiling and making changes around the buildSearcher step. We observed that when the codec is changed to one from org.apache.lucene.codecs.bloom for fields that hold discrete values, the time spent in buildSearcher drops sharply. Could you please look at exposing this as a configuration option, alongside the other annotations?
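To illustrate why a Bloom filter helps for exact-match lookups on discrete fields, here is a minimal, self-contained sketch. It is not the Lucene implementation (Lucene ships its own in the org.apache.lucene.codecs.bloom package) and it says nothing about how Hibernate Search would expose the setting. The class name BloomGuardedSegment, the two hash functions, and the "key-N" terms are assumptions made up for this example.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for one segment's term dictionary for a single field. */
final class BloomGuardedSegment {
    // The "expensive" structure we want to avoid touching for absent terms.
    private final Map<String, Integer> termToDocId = new HashMap<>();
    private final BitSet bits;
    private final int numBits;
    /** Counts how often the expensive dictionary lookup actually ran. */
    int dictionaryLookups = 0;

    BloomGuardedSegment(int numBits) {
        this.numBits = numBits;
        this.bits = new BitSet(numBits);
    }

    // Two simple illustrative hash functions; a real filter would pick these more carefully.
    private int h1(String term) {
        return Math.floorMod(term.hashCode(), numBits);
    }

    private int h2(String term) {
        return Math.floorMod(Integer.rotateLeft(term.hashCode(), 16) ^ 0x9E3779B9, numBits);
    }

    void add(String term, int docId) {
        termToDocId.put(term, docId);
        bits.set(h1(term));
        bits.set(h2(term));
    }

    /** Returns the doc id for the term, or -1 if the term is not in this segment. */
    int lookup(String term) {
        // If either bit is clear, the term was never added: skip the dictionary.
        if (!bits.get(h1(term)) || !bits.get(h2(term))) {
            return -1;
        }
        // Both bits set: the term may be present (false positives are possible).
        dictionaryLookups++;
        Integer doc = termToDocId.get(term);
        return doc == null ? -1 : doc;
    }

    public static void main(String[] args) {
        BloomGuardedSegment segment = new BloomGuardedSegment(1 << 16);
        for (int i = 0; i < 1000; i++) {
            segment.add("key-" + i, i);
        }
        System.out.println("present: " + segment.lookup("key-42"));
        System.out.println("absent:  " + segment.lookup("no-such-key"));
        System.out.println("dictionary lookups: " + segment.dictionaryLookups);
    }
}
```

The filter never reports a stored term as absent, so correctness is unchanged; the saving comes from segments that can be ruled out without consulting the term dictionary at all.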
We also see that most of the time is spent in query parsing. If we had something like bind variables in Oracle, reusing the query plan and changing only the parameter values, it would save a lot of time, and we would expect a 2x improvement over the current speed.
I saw that a ticket has been raised for the same: https://issues.jboss.org/browse/ISPN-5414. If work is already in progress and you would like us to contribute to testing or coding, please let us know. This is an absolute must for achieving the performance figures, and we would like to avoid duplication of effort.
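To make the bind-variable idea concrete, here is a minimal sketch of parsing a query template once and rebinding only the parameter on later calls. It is not the Infinispan or Hibernate Search API and does not describe what ISPN-5414 implements; ParsedQueryCache, Plan, the "field = :param" template syntax, and the subscriberId field are all hypothetical names chosen for the example.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical cache of parsed query templates, keyed by the template text. */
final class ParsedQueryCache {

    /** Hypothetical parsed form: the field name plus the name of its placeholder. */
    static final class Plan {
        final String field;
        final String paramName;

        Plan(String field, String paramName) {
            this.field = field;
            this.paramName = paramName;
        }

        /** Binding is cheap: no parsing, only substitution into the prepared plan. */
        String bind(Map<String, String> params) {
            String value = params.get(paramName);
            if (value == null) {
                throw new IllegalArgumentException("missing parameter: " + paramName);
            }
            return field + ":" + value;
        }
    }

    private final Map<String, Plan> plans = new ConcurrentHashMap<>();
    /** Counts how many times the (expensive) parse step actually ran. */
    final AtomicInteger parseCount = new AtomicInteger();

    /** Returns the cached plan for the template, parsing it only on first use. */
    Plan prepare(String template) {
        return plans.computeIfAbsent(template, this::parse);
    }

    // Toy parser that accepts only templates of the form "field = :param".
    private Plan parse(String template) {
        parseCount.incrementAndGet();
        String[] parts = template.split("=");
        if (parts.length != 2 || !parts[1].trim().startsWith(":")) {
            throw new IllegalArgumentException("unsupported template: " + template);
        }
        return new Plan(parts[0].trim(), parts[1].trim().substring(1));
    }

    public static void main(String[] args) {
        ParsedQueryCache cache = new ParsedQueryCache();
        String template = "subscriberId = :id";
        for (String id : new String[] {"1001", "1002", "1003"}) {
            String bound = cache.prepare(template).bind(Collections.singletonMap("id", id));
            System.out.println(bound);
        }
        System.out.println("parse count: " + cache.parseCount.get());
    }
}
```

With this shape, the per-request cost is a map lookup plus a substitution, and the parse cost is paid once per distinct template rather than once per query.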
Regarding the second question: the test runs have the same number of keys and the same number of queries, so we are sure that multiple queries are not being fired. For sample data, I am attaching the profiling information in the attached Excel sheet, which shows the time spent in the various phases.
Please let us know if you require any extra information.
> Query Execution using Hibernate Search slow for large volume data
> -----------------------------------------------------------------
>
> Key: ISPN-5452
> URL: https://issues.jboss.org/browse/ISPN-5452
> Project: Infinispan
> Issue Type: Bug
> Components: Configuration, Remote Querying
> Affects Versions: 7.2.1.Final, 8.0.0.Final
> Environment: Linux
> Reporter: Prashant Thakur
> Attachments: Infinispan Dist Cache.xlsx, profiling_results.7z
>
>
> While benchmarking Infinispan we found that querying is very slow compared with Hibernate Search in isolation.
> Single node of Infinispan
> Memory allocated 230GB. No GC seen throughout query operation.
> Total required after full GC was 122GB.
> Setup: 240 million records, each of average size 330 bytes.
> System has 16 cores and 40 worker threads were allocated at server side.
> With a single client thread, throughput was 900 req/sec in remote mode and 3k req/sec in embedded mode; the same request with Hibernate Search in isolation gives a throughput of 14000 req/sec.
> For 50 client threads the throughput was limited to 15k req/sec, while Hibernate Search gives 80k req/sec for 10 threads.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)