[JBoss JIRA] (ISPN-6098) LockManagerTest.testMultipleCounterStripped random failures
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-6098?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-6098:
-------------------------------
Attachment: LockManagerTest_ISPN-6417_remove-replication-queue_20160322.log.gz
I got the test to hang again, and this time I have a trace log.
One more request: Please decrease the timeout from 1 day, I think 1 minute should be enough.
> LockManagerTest.testMultipleCounterStripped random failures
> -----------------------------------------------------------
>
> Key: ISPN-6098
> URL: https://issues.jboss.org/browse/ISPN-6098
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Affects Versions: 8.1.0.Final
> Reporter: Dan Berindei
> Assignee: Pedro Ruivo
> Fix For: 9.0.0.Alpha1
>
> Attachments: LockManagerTest_ISPN-6417_remove-replication-queue_20160322.log.gz
>
>
> Doesn't appear to fail in CI, but I did get a couple failures running the test on my machine:
> {noformat}
> 17:42:29,998 ERROR (testng-LockManagerTest:) [UnitTestTestNGListener] Test testMultipleCounterStripped(org.infinispan.lock.LockManagerTest) failed.
> java.util.concurrent.ExecutionException: org.infinispan.util.concurrent.TimeoutException: ISPN000299: Unable to acquire lock after 60 seconds for key key-6 and requestor Thread[pool-1127-thread-4,5,main]. Lock is held by null
> at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:1.8.0_60]
> at java.util.concurrent.FutureTask.get(FutureTask.java:192) ~[?:1.8.0_60]
> at org.infinispan.lock.LockManagerTest.doMultipleCounterTest(LockManagerTest.java:193) ~[test-classes/:?]
> at org.infinispan.lock.LockManagerTest.testMultipleCounterStripped(LockManagerTest.java:67) ~[test-classes/:?]
> Caused by: org.infinispan.util.concurrent.TimeoutException: ISPN000299: Unable to acquire lock after 60 seconds for key key-6 and requestor Thread[pool-1127-thread-4,5,main]. Lock is held by null
> at org.infinispan.util.concurrent.locks.impl.DefaultLockManager$KeyAwareExtendedLockPromise.lock(DefaultLockManager.java:236) ~[classes/:?]
> at org.infinispan.util.concurrent.locks.impl.DefaultLockManager$CompositeLockPromise.lock(DefaultLockManager.java:305) ~[classes/:?]
> at org.infinispan.lock.LockManagerTest.lambda$doMultipleCounterTest$483(LockManagerTest.java:163) ~[test-classes/:?]
> ... 4 more
> {noformat}
> In another run, I got a deadlock that I think is related in {{InfinispanLockTest}}:
> {noformat}
> "testng-InfinispanLockTest" #19 prio=5 os_prio=0 tid=0x00007ff7f0eeb800 nid=0x4f5d waiting on condition [0x00007ff770da5000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x0000000084202448> (a java.util.concurrent.FutureTask)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
> at java.util.concurrent.FutureTask.get(FutureTask.java:191)
> at org.infinispan.lock.InfinispanLockTest.testSingleCounter(InfinispanLockTest.java:262)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:84)
> at org.testng.internal.Invoker.invokeMethod(Invoker.java:714)
> at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:901)
> at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1231)
> at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
> at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
> at org.testng.TestRunner.privateRun(TestRunner.java:767)
> at org.testng.TestRunner.run(TestRunner.java:617)
> at org.testng.SuiteRunner.runTest(SuiteRunner.java:348)
> at org.testng.SuiteRunner.access$000(SuiteRunner.java:38)
> at org.testng.SuiteRunner$SuiteWorker.run(SuiteRunner.java:382)
> at org.testng.internal.thread.ThreadUtil$2.call(ThreadUtil.java:64)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Locked ownable synchronizers: - <0x0000000082875cc0> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> "pool-1011-thread-7" #9840 prio=5 os_prio=0 tid=0x00007ff7380da000 nid=0x75f7 waiting on condition [0x00007ff698409000]
> java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x00000000842020d8> (a java.util.concurrent.CompletableFuture$Signaller)
> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1695)
> at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
> at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1775)
> at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
> at org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:74)
> at org.infinispan.util.concurrent.locks.impl.InfinispanLock$LockPlaceHolder.lock(InfinispanLock.java:319)
> at org.infinispan.lock.InfinispanLockTest.lambda$testSingleCounter$509(InfinispanLockTest.java:243)
> at org.infinispan.lock.InfinispanLockTest$$Lambda$516/1905052434.call(Unknown Source)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Locked ownable synchronizers: - <0x00000000842021f8> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> "pool-1011-thread-6" #9839 prio=5 os_prio=0 tid=0x00007ff738130800 nid=0x75f6 waiting on condition [0x00007ff699c21000]
> java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x00000000842023a0> (a java.util.concurrent.CompletableFuture$Signaller)
> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1695)
> at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
> at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1775)
> at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
> at org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:74)
> at org.infinispan.util.concurrent.locks.impl.InfinispanLock$LockPlaceHolder.lock(InfinispanLock.java:319)
> at org.infinispan.lock.InfinispanLockTest.lambda$testSingleCounter$509(InfinispanLockTest.java:243)
> at org.infinispan.lock.InfinispanLockTest$$Lambda$516/1905052434.call(Unknown Source)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Locked ownable synchronizers: - <0x0000000084202468> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> {noformat}
> I also suggest using {{AbstractInfinispanTest.fork()}} instead of an explicit {{ExecutorService}} in both tests, because without the test name in the thread name it's impossible to filter the test logs.
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
10 years
[JBoss JIRA] (ISPN-6337) MassIndexerAsyncBackendTest.testMassIndexOnAsync random failures
by Gustavo Fernandes (JIRA)
[ https://issues.jboss.org/browse/ISPN-6337?page=com.atlassian.jira.plugin.... ]
Work on ISPN-6337 started by Gustavo Fernandes.
-----------------------------------------------
> MassIndexerAsyncBackendTest.testMassIndexOnAsync random failures
> ----------------------------------------------------------------
>
> Key: ISPN-6337
> URL: https://issues.jboss.org/browse/ISPN-6337
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Query
> Affects Versions: 8.2.0.CR1
> Reporter: Gustavo Fernandes
> Assignee: Gustavo Fernandes
>
> {code}
> java.lang.RuntimeException: Unexpected!
> at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:166)
> at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:146)
> at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:122)
> at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:456)
> at org.infinispan.query.distributed.MassIndexerAsyncBackendTest.assertAllIndexed(MassIndexerAsyncBackendTest.java:59)
> at org.infinispan.query.distributed.MassIndexerAsyncBackendTest.testMassIndexOnAsync(MassIndexerAsyncBackendTest.java:53)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:84)
> at org.testng.internal.MethodInvocationHelper$1.runTestMethod(MethodInvocationHelper.java:200)
> at org.jboss.byteman.contrib.bmunit.BMNGAbstractRunner.run(BMNGAbstractRunner.java:58)
> at sun.reflect.GeneratedMethodAccessor169.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at org.testng.internal.MethodInvocationHelper.invokeHookable(MethodInvocationHelper.java:212)
> at org.testng.internal.Invoker.invokeMethod(Invoker.java:707)
> at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:901)
> at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1231)
> at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
> at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
> at org.testng.TestRunner.privateRun(TestRunner.java:767)
> at org.testng.TestRunner.run(TestRunner.java:617)
> at org.testng.SuiteRunner.runTest(SuiteRunner.java:348)
> at org.testng.SuiteRunner.access$000(SuiteRunner.java:38)
> at org.testng.SuiteRunner$SuiteWorker.run(SuiteRunner.java:382)
> at org.testng.internal.thread.ThreadUtil$2.call(ThreadUtil.java:64)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.hibernate.search.exception.SearchException: Unable to reopen IndexReader
> at org.hibernate.search.indexes.impl.SharingBufferReaderProvider$PerDirectoryLatestReader.refreshAndGet(SharingBufferReaderProvider.java:242)
> at org.hibernate.search.indexes.impl.SharingBufferReaderProvider.openIndexReader(SharingBufferReaderProvider.java:73)
> at org.hibernate.search.indexes.impl.SharingBufferReaderProvider.openIndexReader(SharingBufferReaderProvider.java:35)
> at org.hibernate.search.reader.impl.ManagedMultiReader.createInstance(ManagedMultiReader.java:68)
> at org.hibernate.search.reader.impl.MultiReaderFactory.openReader(MultiReaderFactory.java:42)
> at org.hibernate.search.query.engine.impl.LuceneHSQuery.buildSearcher(LuceneHSQuery.java:477)
> at org.hibernate.search.query.engine.impl.LuceneHSQuery.buildSearcher(LuceneHSQuery.java:368)
> at org.hibernate.search.query.engine.impl.LuceneHSQuery.queryEntityInfos(LuceneHSQuery.java:131)
> at org.infinispan.query.impl.CacheQueryImpl.list(CacheQueryImpl.java:161)
> at org.infinispan.query.distributed.MassIndexerAsyncBackendTest$1.isSatisfied(MassIndexerAsyncBackendTest.java:62)
> at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:161)
> ... 31 more
> Caused by: java.io.FileNotFoundException: Error loading metadata for index file: M|segments_4|commonIndex|-1
> at org.infinispan.lucene.impl.DirectoryImplementor.openInput(DirectoryImplementor.java:138)
> at org.infinispan.lucene.impl.DirectoryLucene.openInput(DirectoryLucene.java:102)
> at org.apache.lucene.store.Directory.openChecksumInput(Directory.java:109)
> at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:294)
> at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:493)
> at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:490)
> at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:731)
> at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:683)
> at org.apache.lucene.index.SegmentInfos.readLatestCommit(SegmentInfos.java:490)
> at org.apache.lucene.index.StandardDirectoryReader.isCurrent(StandardDirectoryReader.java:344)
> at org.apache.lucene.index.StandardDirectoryReader.doOpenNoWriter(StandardDirectoryReader.java:300)
> at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:263)
> at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:251)
> at org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:137)
> at org.hibernate.search.indexes.impl.SharingBufferReaderProvider$PerDirectoryLatestReader.refreshAndGet(SharingBufferReaderProvider.java:239)
> ... 41 more
> {code}
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
10 years
[JBoss JIRA] (ISPN-6425) FileNotFoundException with async indexing backend
by Gustavo Fernandes (JIRA)
Gustavo Fernandes created ISPN-6425:
---------------------------------------
Summary: FileNotFoundException with async indexing backend
Key: ISPN-6425
URL: https://issues.jboss.org/browse/ISPN-6425
Project: Infinispan
Issue Type: Bug
Components: Embedded Querying, Lucene Directory
Affects Versions: 8.2.0.Final
Reporter: Gustavo Fernandes
Assignee: Gustavo Fernandes
Fix For: 9.0.0.Final, 8.2.1.Final
The Infinispan directory defaults to {{write_metadata_async=true}} when the indexing backend is configured as async, i.e. {{default.worker.execution}} is {{true}}.
The {{write_metadata_async=true}} will use {{cache.putAsync}} to write the index file metadata, while still deleting and creating files syncronously. This can lead to
a stale metadata causing FileNotFoundExceptions when executing queries:
Suppose a lucene directory contains files \[segments_4, _4.si\]. During normal regime, apart from the user thread, there could be other 2 threads that could be changing the index, the periodic commit thread (since backend is async) and the async deletion of files.
The following race can happen:
||Time||Thread||work type||work||
|T1|Hibernate Search: Commit Scheduler for index| SYNC | write files segments_5 and _5.si to the index
|T2|Hibernate Search: Commit Scheduler for index| ASYNC | write the new file list containing \[segments_4, _4.si, segments_5,_5.si\]
|T3|Hibernate Search: Commit Scheduler for index| ASYNC | enqueue a deletion task for files segments_4 and _4.si
|T4|Hibernate Search: async deletion of index| SYNC | dequeue deletion task for files segments_4 and _4.si
|T5|Hibernate Search: async deletion of index| SYNC | delete files segments_4 and _4.si from the index
|T6|Hibernate Search: async deletion of index| ASYNC | write the new file list containing \[segments_5,_5.si\]
|T7|User-thread| |open index reader, file list is \[segments_4, _4.si\], highest segment number is 4 (file list is not updated yet)
|T8|User-thread| |open segments_4
|T9|User-thread| |FileNotFoundException!
|T10|remote-thread-User| | new file list received \[segments_4, _4.si, segments_5,_5.si\]
|T11|remote-thread-User| | new file list received \[segments_5,_5.si\]
This race can be observed in {{MassIndexerAsyncBackendTest#testMassIndexOnAsync}} that fails intermittently with the exception:
{noformat}
Caused by: java.io.FileNotFoundException: Error loading metadata for index file: M|segments_4|commonIndex|-1
at org.infinispan.lucene.impl.DirectoryImplementor.openInput(DirectoryImplementor.java:138) ~[infinispan-lucene-directory-9.0.0-SNAPSHOT.jar:9.0.0-SNAPSHOT]
at org.infinispan.lucene.impl.DirectoryLucene.openInput(DirectoryLucene.java:102) ~[infinispan-lucene-directory-9.0.0-SNAPSHOT.jar:9.0.0-SNAPSHOT]
at org.apache.lucene.store.Directory.openChecksumInput(Directory.java:109) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:294) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:493) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:490) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:731) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:683) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
at org.apache.lucene.index.SegmentInfos.readLatestCommit(SegmentInfos.java:490) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
at org.apache.lucene.index.StandardDirectoryReader.isCurrent(StandardDirectoryReader.java:344) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
at org.apache.lucene.index.StandardDirectoryReader.doOpenNoWriter(StandardDirectoryReader.java:300) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:263) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:251) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
{noformat}
We should not enable {{write_metadata_async=true}} for async backends. The file list is already {{DeltaAware}}, so writing should not pose a meaningfull overhead when done synchronously.
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
10 years
[JBoss JIRA] (ISPN-6425) FileNotFoundException with async indexing backend
by Gustavo Fernandes (JIRA)
[ https://issues.jboss.org/browse/ISPN-6425?page=com.atlassian.jira.plugin.... ]
Work on ISPN-6425 started by Gustavo Fernandes.
-----------------------------------------------
> FileNotFoundException with async indexing backend
> -------------------------------------------------
>
> Key: ISPN-6425
> URL: https://issues.jboss.org/browse/ISPN-6425
> Project: Infinispan
> Issue Type: Bug
> Components: Embedded Querying, Lucene Directory
> Affects Versions: 8.2.0.Final
> Reporter: Gustavo Fernandes
> Assignee: Gustavo Fernandes
> Fix For: 9.0.0.Final, 8.2.1.Final
>
>
> The Infinispan directory defaults to {{write_metadata_async=true}} when the indexing backend is configured as async, i.e. {{default.worker.execution}} is {{true}}.
> The {{write_metadata_async=true}} will use {{cache.putAsync}} to write the index file metadata, while still deleting and creating files syncronously. This can lead to
> a stale metadata causing FileNotFoundExceptions when executing queries:
> Suppose a lucene directory contains files \[segments_4, _4.si\]. During normal regime, apart from the user thread, there could be other 2 threads that could be changing the index, the periodic commit thread (since backend is async) and the async deletion of files.
> The following race can happen:
> ||Time||Thread||work type||work||
> |T1|Hibernate Search: Commit Scheduler for index| SYNC | write files segments_5 and _5.si to the index
> |T2|Hibernate Search: Commit Scheduler for index| ASYNC | write the new file list containing \[segments_4, _4.si, segments_5,_5.si\]
> |T3|Hibernate Search: Commit Scheduler for index| ASYNC | enqueue a deletion task for files segments_4 and _4.si
> |T4|Hibernate Search: async deletion of index| SYNC | dequeue deletion task for files segments_4 and _4.si
> |T5|Hibernate Search: async deletion of index| SYNC | delete files segments_4 and _4.si from the index
> |T6|Hibernate Search: async deletion of index| ASYNC | write the new file list containing \[segments_5,_5.si\]
> |T7|User-thread| |open index reader, file list is \[segments_4, _4.si\], highest segment number is 4 (file list is not updated yet)
> |T8|User-thread| |open segments_4
> |T9|User-thread| |FileNotFoundException!
> |T10|remote-thread-User| | new file list received \[segments_4, _4.si, segments_5,_5.si\]
> |T11|remote-thread-User| | new file list received \[segments_5,_5.si\]
> This race can be observed in {{MassIndexerAsyncBackendTest#testMassIndexOnAsync}} that fails intermittently with the exception:
> {noformat}
> Caused by: java.io.FileNotFoundException: Error loading metadata for index file: M|segments_4|commonIndex|-1
> at org.infinispan.lucene.impl.DirectoryImplementor.openInput(DirectoryImplementor.java:138) ~[infinispan-lucene-directory-9.0.0-SNAPSHOT.jar:9.0.0-SNAPSHOT]
> at org.infinispan.lucene.impl.DirectoryLucene.openInput(DirectoryLucene.java:102) ~[infinispan-lucene-directory-9.0.0-SNAPSHOT.jar:9.0.0-SNAPSHOT]
> at org.apache.lucene.store.Directory.openChecksumInput(Directory.java:109) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:294) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:493) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:490) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:731) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:683) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.SegmentInfos.readLatestCommit(SegmentInfos.java:490) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.StandardDirectoryReader.isCurrent(StandardDirectoryReader.java:344) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.StandardDirectoryReader.doOpenNoWriter(StandardDirectoryReader.java:300) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:263) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:251) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> {noformat}
> We should not enable {{write_metadata_async=true}} for async backends. The file list is already {{DeltaAware}}, so writing should not pose a meaningfull overhead when done synchronously.
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
10 years
[JBoss JIRA] (ISPN-6425) FileNotFoundException with async indexing backend
by Gustavo Fernandes (JIRA)
[ https://issues.jboss.org/browse/ISPN-6425?page=com.atlassian.jira.plugin.... ]
Gustavo Fernandes updated ISPN-6425:
------------------------------------
Status: Open (was: New)
> FileNotFoundException with async indexing backend
> -------------------------------------------------
>
> Key: ISPN-6425
> URL: https://issues.jboss.org/browse/ISPN-6425
> Project: Infinispan
> Issue Type: Bug
> Components: Embedded Querying, Lucene Directory
> Affects Versions: 8.2.0.Final
> Reporter: Gustavo Fernandes
> Assignee: Gustavo Fernandes
> Fix For: 9.0.0.Final, 8.2.1.Final
>
>
> The Infinispan directory defaults to {{write_metadata_async=true}} when the indexing backend is configured as async, i.e. {{default.worker.execution}} is {{true}}.
> The {{write_metadata_async=true}} will use {{cache.putAsync}} to write the index file metadata, while still deleting and creating files syncronously. This can lead to
> a stale metadata causing FileNotFoundExceptions when executing queries:
> Suppose a lucene directory contains files \[segments_4, _4.si\]. During normal regime, apart from the user thread, there could be other 2 threads that could be changing the index, the periodic commit thread (since backend is async) and the async deletion of files.
> The following race can happen:
> ||Time||Thread||work type||work||
> |T1|Hibernate Search: Commit Scheduler for index| SYNC | write files segments_5 and _5.si to the index
> |T2|Hibernate Search: Commit Scheduler for index| ASYNC | write the new file list containing \[segments_4, _4.si, segments_5,_5.si\]
> |T3|Hibernate Search: Commit Scheduler for index| ASYNC | enqueue a deletion task for files segments_4 and _4.si
> |T4|Hibernate Search: async deletion of index| SYNC | dequeue deletion task for files segments_4 and _4.si
> |T5|Hibernate Search: async deletion of index| SYNC | delete files segments_4 and _4.si from the index
> |T6|Hibernate Search: async deletion of index| ASYNC | write the new file list containing \[segments_5,_5.si\]
> |T7|User-thread| |open index reader, file list is \[segments_4, _4.si\], highest segment number is 4 (file list is not updated yet)
> |T8|User-thread| |open segments_4
> |T9|User-thread| |FileNotFoundException!
> |T10|remote-thread-User| | new file list received \[segments_4, _4.si, segments_5,_5.si\]
> |T11|remote-thread-User| | new file list received \[segments_5,_5.si\]
> This race can be observed in {{MassIndexerAsyncBackendTest#testMassIndexOnAsync}} that fails intermittently with the exception:
> {noformat}
> Caused by: java.io.FileNotFoundException: Error loading metadata for index file: M|segments_4|commonIndex|-1
> at org.infinispan.lucene.impl.DirectoryImplementor.openInput(DirectoryImplementor.java:138) ~[infinispan-lucene-directory-9.0.0-SNAPSHOT.jar:9.0.0-SNAPSHOT]
> at org.infinispan.lucene.impl.DirectoryLucene.openInput(DirectoryLucene.java:102) ~[infinispan-lucene-directory-9.0.0-SNAPSHOT.jar:9.0.0-SNAPSHOT]
> at org.apache.lucene.store.Directory.openChecksumInput(Directory.java:109) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:294) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:493) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:490) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:731) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:683) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.SegmentInfos.readLatestCommit(SegmentInfos.java:490) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.StandardDirectoryReader.isCurrent(StandardDirectoryReader.java:344) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.StandardDirectoryReader.doOpenNoWriter(StandardDirectoryReader.java:300) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:263) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:251) ~[lucene-core-5.5.0.jar:5.5.0 2a228b3920a07f930f7afb6a42d0d20e184a943c - mike - 2016-02-16 15:18:34]
> {noformat}
> We should not enable {{write_metadata_async=true}} for async backends. The file list is already {{DeltaAware}}, so writing should not pose a meaningfull overhead when done synchronously.
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
10 years
[JBoss JIRA] (ISPN-6414) Hybrid query + compat mode applies type conversion too early (perf issue)
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-6414?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-6414:
----------------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done
> Hybrid query + compat mode applies type conversion too early (perf issue)
> -------------------------------------------------------------------------
>
> Key: ISPN-6414
> URL: https://issues.jboss.org/browse/ISPN-6414
> Project: Infinispan
> Issue Type: Bug
> Components: Remote Querying
> Affects Versions: 8.2.0.Final
> Reporter: Adrian Nistor
> Assignee: Adrian Nistor
> Fix For: 9.0.0.Final, 9.0.0.Alpha1, 8.2.1.Final
>
>
> To reproduce we need a compat mode cache and a remote hybrid query or any query that is executed in two phases (indexed + non indexed). The type conversion interceptor will convert the result of the first phase (from Java object to protobuf) and then the second phase will execute a matcher on the protobuf output of the first phase.
> This can be optimized by avoiding early type conversion. The output of the first phase should produce unconverted results, ie. java objects. Then the second phase matcher should filter on these objects (ReflectionMatcher). These results are then implicitly type-converted when are marshalled back to the client, but the type converter interceptor is no longer involved. To avoid type conversion kicking in early we need to stop obtaining the cache using the Flag.OPERATION_HOTROD in org.infinispan.query.remote.impl.LifecycleManager, and then there are some other places in RemoteQueryEngine and friends that need to be adjusted following this change.
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
10 years
[JBoss JIRA] (ISPN-5495) ConcurrentStartTest.testConcurrentStart random failures
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-5495?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-5495:
----------------------------------
Status: Resolved (was: Pull Request Sent)
Fix Version/s: 9.0.0.Final
9.0.0.Alpha1
8.2.1.Final
(was: 8.2.0.CR1)
Resolution: Done
> ConcurrentStartTest.testConcurrentStart random failures
> -------------------------------------------------------
>
> Key: ISPN-5495
> URL: https://issues.jboss.org/browse/ISPN-5495
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Test Suite - Core
> Affects Versions: 7.2.1.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Blocker
> Labels: testsuite_stability
> Fix For: 9.0.0.Final, 9.0.0.Alpha1, 8.2.1.Final
>
>
> {noformat}
> org.testng.internal.thread.ThreadTimeoutException: Method org.testng.internal.TestNGMethod.testConcurrentStart() didn't finish within the time-out 60000
> at sun.misc.Unsafe.park(Native Method)
> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:338)
> at org.infinispan.test.TestingUtil.waitForRehashToComplete(TestingUtil.java:253)
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
10 years