[infinispan-issues] [JBoss JIRA] (ISPN-6425) FileNotFoundException with async indexing backend
kostd kostd (JIRA)
issues at jboss.org
Tue Apr 19 09:34:00 EDT 2016
[ https://issues.jboss.org/browse/ISPN-6425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193766#comment-13193766 ]
kostd kostd commented on ISPN-6425:
-----------------------------------
[~gustavonalle], we have a similar issue in our production environment. Environment: WildFly 8.2.0.Final, Infinispan 6.0.2.Final, hibernate-search 4.5.1.Final, hibernate-search-infinispan 4.5.1.Final, and two nodes in a Hibernate Search cluster over JGroups 3.4.5.Final.
We use async data and metadata caches because this is recommended for performance:
{quote}
if you need high performance on writes with the Lucene Directory the best option is to disable any CacheStore; the second best option is to configure the CacheStore as async.
{quote}
{code:title=our infinispan config}
<global>
    <!-- Duplicate domains are allowed so that multiple deployments with default configuration of Hibernate Search applications
         work - if possible it would be better to use JNDI to share the CacheManager across applications -->
    <globalJmxStatistics enabled="true" cacheManagerName="HibernateSearch" allowDuplicateDomains="true" />
    <!-- If the transport is omitted, there is no way to create distributed or clustered caches. There is no added cost to
         defining a transport but not creating a cache that uses one, since the transport is created and initialized lazily. -->
    <transport clusterName="${argus.textsearch.infinispan.cluster-name}" distributedSyncTimeout="240000">
        <!-- Note that the JGroups transport uses sensible defaults if no configuration property is defined. See the JGroupsTransport
             javadocs for more flags -->
        <properties>
            <property name="configurationFile" value="${jboss.home.dir}/domain/configuration/hibernatesearch-infinispan-jgroups-tcp.xml" />
        </properties>
    </transport>
    <!-- Note that the JGroups transport uses sensible defaults if no configuration property is defined. See the Infinispan
         wiki for more JGroups settings: http://community.jboss.org/wiki/ClusteredConfigurationQuickStart -->
    <!-- Used to register JVM shutdown hooks. hookBehavior: DEFAULT, REGISTER, DONT_REGISTER. Hibernate Search takes care to
         stop the CacheManager so registering is not needed -->
    <shutdown hookBehavior="DONT_REGISTER" />
</global>
<!-- *************************** -->
<!-- Default "template" settings -->
<!-- *************************** -->
<default>
    <locking lockAcquisitionTimeout="20000" writeSkewCheck="false" concurrencyLevel="500" useLockStriping="false" />
    <invocationBatching enabled="false" />
    <!-- This element specifies that the cache is clustered. modes supported: distribution (d), replication (r) or invalidation
         (i). Don't use invalidation to store Lucene indexes (as with Hibernate Search DirectoryProvider). Replication is recommended
         for best performance of Lucene indexes, but make sure you have enough memory to store the index in your heap. Also distribution
         scales much better than replication on high number of nodes in the cluster. -->
    <clustering mode="replication">
        <!-- Prefer loading all data at startup than later -->
        <stateTransfer timeout="480000" fetchInMemoryState="true" />
        <!-- Network calls are synchronous by default -->
        <sync replTimeout="30000" />
    </clustering>
    <jmxStatistics enabled="true" />
    <eviction maxEntries="-1" strategy="NONE" />
    <expiration maxIdle="-1" />
</default>
<!-- *************************************** -->
<!-- Cache to store Lucene's file metadata -->
<!-- *************************************** -->
<namedCache name="LuceneIndexesMetadata">
    <persistence passivation="false">
        <singleFile fetchPersistentState="true" ignoreModifications="false" preload="true" purgeOnStartup="false"
                    shared="false" location="${jboss.server.data.dir}/textsearch-store/${argus.db.name}/">
            <async enabled="true" />
        </singleFile>
    </persistence>
</namedCache>
<!-- **************************** -->
<!-- Cache to store Lucene data -->
<!-- **************************** -->
<namedCache name="LuceneIndexesData">
    <persistence passivation="false">
        <singleFile fetchPersistentState="true" ignoreModifications="false" preload="true" purgeOnStartup="false"
                    shared="false" location="${jboss.server.data.dir}/textsearch-store/${argus.db.name}/">
            <async enabled="true" />
        </singleFile>
    </persistence>
</namedCache>
{code}
Why do the changes in this issue only correct the default value and do nothing for cases where an async metadata cache was selected explicitly?
We want a fast async metadata cache, but we do not want to regularly hit FileNotFoundException. Can we have both, or should we migrate to a synchronous metadata (and data?) cache?
Is it perhaps not possible to fix the FileNotFoundException for an async cache? Or is our old hibernate-search-infinispan-6.0.2.Final.jar perhaps not affected by this issue? Please help.
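For reference, the option named in the issue description below, {{write_metadata_async}}, is a property of the Infinispan directory and is separate from the {{<async>}} cache-store element in the config above. A minimal sketch of pinning it explicitly instead of relying on the default, assuming both keys take the usual {{hibernate.search.}} prefix and that the directory version in use actually reads {{write_metadata_async}} (not verified for the versions listed in this comment). The class and method names are hypothetical.

```java
import java.util.Properties;

class ExplicitMetadataWriteMode {
    // Assumption: full key names are the issue's short names plus the "hibernate.search." prefix.
    static final String WORKER_EXECUTION = "hibernate.search.default.worker.execution";
    static final String WRITE_METADATA_ASYNC = "hibernate.search.default.write_metadata_async";

    /** Keeps the indexing backend async but asks for synchronous metadata writes. */
    static Properties asyncBackendWithSyncMetadata() {
        Properties props = new Properties();
        props.setProperty(WORKER_EXECUTION, "async");
        props.setProperty(WRITE_METADATA_ASYNC, "false");
        return props;
    }

    public static void main(String[] args) {
        // Print the two settings so they can be copied into the application's configuration.
        asyncBackendWithSyncMetadata().forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

The resulting properties would be passed to whatever mechanism the application already uses to configure Hibernate Search.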
> FileNotFoundException with async indexing backend
> -------------------------------------------------
>
> Key: ISPN-6425
> URL: https://issues.jboss.org/browse/ISPN-6425
> Project: Infinispan
> Issue Type: Bug
> Components: Embedded Querying, Lucene Directory
> Affects Versions: 8.2.0.Final
> Reporter: Gustavo Fernandes
> Assignee: Gustavo Fernandes
> Fix For: 8.2.1.Final, 9.0.0.Alpha1, 9.0.0.Final
>
>
> The Infinispan directory defaults to {{write_metadata_async=true}} when the indexing backend is configured as async, i.e. {{default.worker.execution}} is {{async}}.
> With {{write_metadata_async=true}}, {{cache.putAsync}} is used to write the index file metadata, while files are still deleted and created synchronously. This can lead to
> stale metadata, causing FileNotFoundExceptions when executing queries:
> Suppose a Lucene directory contains the files \[segments_4, _4.si\]. During normal operation, apart from the user thread, two other threads could be changing the index: the periodic commit thread (since the backend is async) and the async file deletion thread.
> The following race can happen:
> ||Time||Thread||Work type||Work||
> |T1|Hibernate Search: Commit Scheduler for index|SYNC|write files segments_5 and _5.si to the index|
> |T2|Hibernate Search: Commit Scheduler for index|ASYNC|write the new file list containing \[segments_4, _4.si, segments_5, _5.si\]|
> |T3|Hibernate Search: Commit Scheduler for index|ASYNC|enqueue a deletion task for files segments_4 and _4.si|
> |T4|Hibernate Search: async deletion of index|SYNC|dequeue deletion task for files segments_4 and _4.si|
> |T5|Hibernate Search: async deletion of index|SYNC|delete files segments_4 and _4.si from the index|
> |T6|Hibernate Search: async deletion of index|ASYNC|write the new file list containing \[segments_5, _5.si\]|
> |T7|User-thread| |open index reader, file list is \[segments_4, _4.si\], highest segment number is 4 (file list is not updated yet)|
> |T8|User-thread| |open segments_4|
> |T9|User-thread| |FileNotFoundException!|
> |T10|remote-thread-User| |new file list received \[segments_4, _4.si, segments_5, _5.si\]|
> |T11|remote-thread-User| |new file list received \[segments_5, _5.si\]|
> This race can be observed in {{MassIndexerAsyncBackendTest#testMassIndexOnAsync}} that fails intermittently with the exception:
> {noformat}
> Caused by: java.io.FileNotFoundException: Error loading metadata for index file: M|segments_4|commonIndex|-1
> at org.infinispan.lucene.impl.DirectoryImplementor.openInput(DirectoryImplementor.java:138) ~[infinispan-lucene-directory-9.0.0-SNAPSHOT.jar:9.0.0-SNAPSHOT]
> at org.infinispan.lucene.impl.DirectoryLucene.openInput(DirectoryLucene.java:102) ~[infinispan-lucene-directory-9.0.0-SNAPSHOT.jar:9.0.0-SNAPSHOT]
> at org.apache.lucene.store.Directory.openChecksumInput(Directory.java:109) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:294) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:493) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:490) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:731) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:683) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.SegmentInfos.readLatestCommit(SegmentInfos.java:490) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.StandardDirectoryReader.isCurrent(StandardDirectoryReader.java:344) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.StandardDirectoryReader.doOpenNoWriter(StandardDirectoryReader.java:300) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:263) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:251) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> {noformat}
> We should not enable {{write_metadata_async=true}} for async backends. The file list is already {{DeltaAware}}, so writing should not pose a meaningful overhead when done synchronously.
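The race in the table above can be replayed deterministically with a toy model. This is only a sketch under stated assumptions: {{ToyDirectory}} and all of its members are hypothetical stand-ins, not Infinispan or Lucene classes, and the "async" file list write is modeled as a queue that is applied only when explicitly delivered.

```java
import java.io.FileNotFoundException;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Deque;
import java.util.Set;
import java.util.TreeSet;

class StaleFileListRace {

    /** Toy index: file changes apply immediately, the published file list may lag behind. */
    static class ToyDirectory {
        final Set<String> files = new TreeSet<>();                        // actual files (always sync)
        Set<String> visibleFileList;                                      // file list as readers see it
        final Deque<Set<String>> pendingListWrites = new ArrayDeque<>();  // queued async list writes
        final boolean asyncMetadata;

        ToyDirectory(boolean asyncMetadata, String... initial) {
            this.asyncMetadata = asyncMetadata;
            files.addAll(Arrays.asList(initial));
            visibleFileList = new TreeSet<>(files);
        }

        void createFiles(String... names) { files.addAll(Arrays.asList(names)); publishList(); }

        void deleteFiles(String... names) { files.removeAll(Arrays.asList(names)); publishList(); }

        private void publishList() {
            Set<String> snapshot = new TreeSet<>(files);
            if (asyncMetadata) pendingListWrites.add(snapshot);  // becomes visible only later
            else visibleFileList = snapshot;                     // visible before the call returns
        }

        /** Models T10-T11: queued file lists finally arrive, in order. */
        void deliverPendingListWrites() {
            while (!pendingListWrites.isEmpty()) visibleFileList = pendingListWrites.poll();
        }

        /** Reader: pick the highest segments_N from the visible list, then open it. */
        String openLatestCommit() throws FileNotFoundException {
            String latest = visibleFileList.stream()
                    .filter(n -> n.startsWith("segments_"))
                    .max(Comparator.comparingInt((String n) -> Integer.parseInt(n.substring("segments_".length()))))
                    .orElseThrow(() -> new FileNotFoundException("no segments file listed"));
            if (!files.contains(latest)) throw new FileNotFoundException(latest);
            return latest;
        }
    }

    static String run(boolean asyncMetadata, boolean deliverBeforeRead) {
        ToyDirectory dir = new ToyDirectory(asyncMetadata, "segments_4", "_4.si");
        dir.createFiles("segments_5", "_5.si");   // T1-T2: commit writes files, then the file list
        dir.deleteFiles("segments_4", "_4.si");   // T4-T6: deletion removes files, then writes the list
        if (deliverBeforeRead) dir.deliverPendingListWrites();
        try {
            return "opened " + dir.openLatestCommit();           // T7-T8
        } catch (FileNotFoundException e) {
            return "FileNotFoundException: " + e.getMessage();   // T9
        }
    }

    public static void main(String[] args) {
        System.out.println("async metadata, list not yet delivered: " + run(true, false));
        System.out.println("async metadata, list delivered:         " + run(true, true));
        System.out.println("sync metadata:                          " + run(false, false));
    }
}
```

In this model the first case fails on segments_4 because the reader still sees the old file list, while the other two cases open segments_5. That mirrors the proposal above: when the file list is written synchronously, a reader cannot observe a list that names already deleted files.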