[infinispan-issues] [JBoss JIRA] (ISPN-5947) Infinispan directory provider is a lot slower when lucene caches are distributed compared to replicated
Sanne Grinovero (JIRA)
issues at jboss.org
Wed Aug 2 06:04:00 EDT 2017
[ https://issues.jboss.org/browse/ISPN-5947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sanne Grinovero resolved ISPN-5947.
-----------------------------------
Resolution: Rejected
Not a bug: this is a direct consequence of the current design. For this reason, we recommend using replicated caches for index storage.
> Infinispan directory provider is a lot slower when lucene caches are distributed compared to replicated
> -------------------------------------------------------------------------------------------------------
>
> Key: ISPN-5947
> URL: https://issues.jboss.org/browse/ISPN-5947
> Project: Infinispan
> Issue Type: Bug
> Components: Embedded Querying
> Reporter: Jakub Markos
> Assignee: Gustavo Fernandes
>
> I noticed a large performance difference when the Infinispan Directory Provider's Lucene data cache runs in distributed mode rather than replicated mode. In numbers, on my computer, running a 4-node cluster with a distributed cache with indexing enabled:
> {code}
> <distributed-cache name="dist_lucene" owners="2" statistics="true">
>     <indexing index="LOCAL">
>         <property name="default.indexmanager">org.infinispan.query.indexmanager.InfinispanIndexManager</property>
>         <property name="default.exclusive_index_use">true</property>
>         <property name="default.metadata_cachename">lucene_metadata</property>
>         <property name="default.data_cachename">lucene_data</property>
>         <property name="default.locking_cachename">lucene_locking</property>
>     </indexing>
> </distributed-cache>
> <replicated-cache name="lucene_metadata" mode="SYNC" remote-timeout="25000">
>     <indexing index="NONE"/>
> </replicated-cache>
> <replicated-cache name="lucene_data" mode="SYNC" remote-timeout="25000">
>     <indexing index="NONE"/>
> </replicated-cache>
> <replicated-cache name="lucene_locking" mode="SYNC" remote-timeout="25000">
>     <indexing index="NONE"/>
> </replicated-cache>
> {code}
> Using 10 threads on each node, loading 100 000 entries takes ~2.5 minutes, and using 100 threads takes ~1 minute. Changing the configuration to use a distributed cache for the index data:
> {code}
> <distributed-cache name="lucene_data" mode="SYNC" remote-timeout="25000">
>     <indexing index="NONE"/>
> </distributed-cache>
> {code}
> leads to loading times of 3+ hours (10 threads; I stopped it at around 80 000 entries) and 22 minutes (100 threads), which is roughly a 20x slowdown.
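As a rough check on the figures quoted above, the slowdown can be estimated by comparing throughput (entries loaded per minute) in the two setups. This is only a sketch: the variable and function names are illustrative, "3+ hours" is taken as exactly 180 minutes, and "around 80000 entries" as exactly 80 000, so the 10-thread result is a lower bound.

```python
# Timings reported in the issue, in minutes, for loading 100 000 entries.
ENTRIES = 100_000


def throughput(entries, minutes):
    """Entries loaded per minute."""
    return entries / minutes


# 100 threads per node: ~1 minute replicated vs 22 minutes distributed.
slowdown_100 = throughput(ENTRIES, 1.0) / throughput(ENTRIES, 22.0)

# 10 threads per node: ~2.5 minutes replicated. The distributed run was
# stopped after 3+ hours at roughly 80 000 entries (assumed: 180 min, 80 000),
# so the real slowdown is at least this large.
slowdown_10 = throughput(ENTRIES, 2.5) / throughput(80_000, 180.0)

print(f"100 threads: ~{slowdown_100:.0f}x slower")
print(f"10 threads: at least ~{slowdown_10:.0f}x slower")
```

Under these assumptions the 100-thread case comes out at about 22x, in line with the "around 20x" estimate, while the 10-thread case is at least about 90x slower.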
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)