[hibernate-dev] MassIndexer have any known issues when InfinispanDirectory is used?

Tue Aug 30 03:50:51 EDT 2011

Is it using exclusive_index_use=true? Exclusive index use does not
work nicely with the MassIndexer in version 3.4; that's one of the
issues fixed in 4.0 (included in the last alpha release).
Is the same test and configuration working fine if you change it only
from Infinispan to a filesystem Directory?
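
As a rough sketch of the comparison being asked for here, the two settings could be varied as below. The property keys are what I would expect for Search 3.4 and should be checked against the reference documentation; the class name, method name, and the provider values passed in are illustrative assumptions, not taken from Tom's configuration.

```java
import java.util.Properties;

// Hypothetical helper, for illustration only: builds the two Hibernate Search
// settings discussed above so that only the directory provider differs
// between test runs.
public class SearchConfigSketch {

    // Assumed property keys (default index scope); verify against the 3.4 docs.
    static final String DIRECTORY_PROVIDER = "hibernate.search.default.directory_provider";
    static final String EXCLUSIVE_INDEX_USE = "hibernate.search.default.exclusive_index_use";

    public static Properties build(String directoryProvider, boolean exclusiveIndexUse) {
        Properties props = new Properties();
        props.setProperty(DIRECTORY_PROVIDER, directoryProvider);
        props.setProperty(EXCLUSIVE_INDEX_USE, String.valueOf(exclusiveIndexUse));
        return props;
    }

    public static void main(String[] args) {
        // The provider values are assumptions; use whatever name or class
        // your existing configuration already refers to.
        System.out.println(build("infinispan", false));
        System.out.println(build("filesystem", false));
    }
}
```

Running the same test once per configuration, with everything else unchanged, would show whether the problem is specific to the Infinispan directory.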

I'm asking these questions as the MassIndexer will grab this exclusive
lock from the main backend, and the "main backend" is supposed to not
have any pending activity when it's started. Are you aware of other
ongoing writes to the index while the MassIndexer starts?

If there is only a limited amount of activity it should be able to
acquire the lock, but the way Lucene's Lock.obtain(long) works (the
method BaseLuceneLock inherits) it's only polling, not a fair lock, so
if the standard backend is constantly writing it might time out waiting
for the writes to finish.
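
To make the "polling, not fair" point concrete, here is a minimal sketch of a lock that is obtained by polling. It is only loosely modelled on the behaviour described above and is not Lucene's or Infinispan's actual code: the class name, the poll-interval parameter, and returning false on timeout (rather than throwing) are all simplifying assumptions.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a polling lock; not the real Lucene Lock class.
public class PollingLockSketch {

    private final AtomicBoolean held = new AtomicBoolean(false);

    // Single non-blocking attempt: succeeds only if nobody holds the lock now.
    public boolean tryObtain() {
        return held.compareAndSet(false, true);
    }

    public void release() {
        held.set(false);
    }

    // Retry tryObtain() at fixed intervals until it succeeds or the timeout
    // elapses. There is no queue of waiters, so a caller only wins if one of
    // its polls happens to fall in a moment when the lock is free.
    public boolean obtain(long timeoutMillis, long pollIntervalMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!tryObtain()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out while another holder kept the lock
            }
            try {
                Thread.sleep(pollIntervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

With a scheme like this, a writer that releases and immediately re-acquires the lock can starve a waiter indefinitely, since the waiter is asleep between polls and nothing reserves its turn.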

A testcase would help; as pointed out in my previous mail I've written
one and found no issues, so please help me reproduce the issue. You
could also open an issue and attach both your Infinispan and Search
configurations.

Regards,
Sanne

2011/8/29 Tom Waterhouse <tomwaterhouse at gmail.com>:
> Here is the associated stack at the time of the call, if it would be
> helpful.  The stack is the same for both the initial call that obtains the
> lock and the second and subsequent calls that fail.
>
>     BaseLuceneLock(Lock).obtain(long) line: 72
>     IndexWriter.<init>(Directory, IndexWriterConfig) line: 1115
>     Workspace.createNewIndexWriter(DirectoryProvider<?>, IndexWriterConfig, ParameterSet) line: 202
>     Workspace.getIndexWriter(boolean, ErrorContextBuilder) line: 175
>     Workspace.getIndexWriter(boolean) line: 210
>     DirectoryProviderWorkspace.doWorkInSync(LuceneWork) line: 96
>     LuceneBatchBackend$SyncBatchPerDirectoryWorkProcessor.addWorkToDpProcessor(DirectoryProvider<?>, LuceneWork) line: 139
>     DpSelectionVisitor$PurgeAllSelectionDelegate.addAsPayLoadsToQueue(LuceneWork, IndexShardingStrategy, PerDirectoryWorkProcessor) line: 119
>     LuceneBatchBackend.sendWorkToShards(LuceneWork, PerDirectoryWorkProcessor) line: 119
>     LuceneBatchBackend.doWorkInSync(LuceneWork) line: 80
>     BatchCoordinator.beforeBatch() line: 153
>     BatchCoordinator.run() line: 96
>     MassIndexerImpl.startAndWait() line: 203
>     ArticleSearchControllerImpl.reindexAllArticles() line: 215
>     NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]
>     NativeMethodAccessorImpl.invoke(Object, Object[]) line: 39
>     DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 25
>     Method.invoke(Object, Object...) line: 597
>     AopUtils.invokeJoinpointUsingReflection(Object, Method, Object[]) line: 307
>     ReflectiveMethodInvocation.invokeJoinpoint() line: 183
>     ReflectiveMethodInvocation.proceed() line: 150
>     TransactionInterceptor.invoke(MethodInvocation) line: 110
>     ReflectiveMethodInvocation.proceed() line: 172
>     JdkDynamicAopProxy.invoke(Object, Method, Object[]) line: 202
>     $Proxy168.reindexAllArticles() line: not available
>     ReindexArticlesJob.executeJob(JobExecutionContext) line: 24
>     ReindexArticlesJob(AbstractJob).execute(JobExecutionContext) line: 79
>     JobRunShell.run() line: 214
>     SimpleThreadPool$WorkerThread.run() line: 549
>
>
> On Mon, Aug 29, 2011 at 2:49 PM, Tom Waterhouse <tomwaterhouse at gmail.com>
> wrote:
>>
>> I set a breakpoint inside of org.apache.lucene.store.Lock.obtain(long) and
>> noticed something peculiar - the method is called twice.  The first call
>> succeeds and the second fails, I'd guess because the first call already
>> obtained the lock.
>>
>> I added logging/stepped through our code to verify that we only make the
>> call one time, and in fact that is the case.
>>
>> We are using JPA and JTA, would that impact the call to obtain the lock in
>> such a way as to allow two calls?
>>
>> Here is the call we make, including the logging statements I've added.
>> This call is done inside of a JTA transaction.
>>
>>             FullTextEntityManager fullTextEntityManager = org.hibernate.search.jpa.Search.getFullTextEntityManager(em);
>>
>>             logger.info("creating MassIndexer");
>>             MassIndexer massIndexer = fullTextEntityManager.createIndexer(Article.class);
>>             massIndexer.batchSizeToLoadObjects(30);
>>             massIndexer.threadsForSubsequentFetching(8);
>>             massIndexer.threadsToLoadObjects(4);
>>             logger.info("starting MassIndexer");
>>             massIndexer.startAndWait();
>>
>>
>> On Sun, Aug 28, 2011 at 9:28 AM, Sanne Grinovero <sanne at hibernate.org>
>> wrote:
>>>
>>> Hi Tom,
>>> I've created a test checking for both event-driven changes and the
>>> MassIndexer, double-checking event-driven changes after the
>>> MassIndexer completion, starting two nodes initially and adding a
>>> third node dynamically during the test execution. All seems to work
>>> flawlessly, apart from taking so long due to the JGroups delays for
>>> starting a new node (it takes ~14 seconds to run).
>>> You can find it here:
>>>
>>> https://github.com/Sanne/hibernate-search/tree/MassIndexerWithInfinispan5.0-Search3.4
>>>
>>> Could you please check this test out and change it as much as you feel
>>> is needed to reproduce the problem?
>>>
>>> Also note the previous commit to change the Infinispan version to 5.0:
>>> you mentioned you're using 5.0 but Search 3.4 was intended to support
>>> Infinispan 4.2.x, so I had to apply some minimal changes.
>>>
>>> You might want to try out Hibernate Search 4.0.0.Alpha1, intended to
>>> support Infinispan 5.x, but I've created this test for Search 3.4 as
>>> the backends interaction in 4.0 is very different: there are not two
>>> competing backends anymore, but a unified access to the IndexWriter,
>>> so to try reproducing your issue it was pointless to try it out on
>>> master.
>>>
>>> Sanne
>>>
>>> 2011/8/27 Tom Waterhouse <tomwaterhouse at gmail.com>:
>>> > Sanne,
>>> >
>>> > There aren't any other nodes involved in the cluster.  This is the
>>> > 'just make it work' phase of the project, so the simplest
>>> > configuration is being used.
>>> >
>>> > Note that normal index access is fine.  Entity operations populate the
>>> > Lucene indexes as expected, and search operations work as expected.
>>> > It is only the MassIndexer that has had trouble to this point.
>>> >
>>> > Tom
>>> >
>>> > On Fri, Aug 26, 2011 at 7:02 AM, Sanne Grinovero <sanne at hibernate.org>
>>> > wrote:
>>> >>
>>> >> Hi Tom,
>>> >>
>>> >> the MassIndexer needs to acquire the Directory lock, which is in this
>>> >> case distributed, i.e. it's a single lock to coordinate writes across
>>> >> all nodes (searches can happen in parallel, but writes can not).
>>> >>
>>> >> Is it possible that another node is writing to the index, or is any
>>> >> node using exclusive_index_use=true ?
>>> >>
>>> >> Regards,
>>> >> Sanne
>>> >>
>>> >> 2011/8/25 Tom Waterhouse <tomwaterhouse at gmail.com>:
>>> >> > I'm trying to set up clustering of entities and Lucene indexes for
>>> >> > our app with Hibernate 3.6.5, Hibernate Search 3.4.0, and
>>> >> > Infinispan 5.0.  I'm using FileCacheStore for the Infinispan cache
>>> >> > loader (InfinispanDirectoryProvider).
>>> >> >
>>> >> > MassIndexerImpl.startAndWait() never returns with this
>>> >> > configuration.  The lock can never be obtained; see the stack from
>>> >> > a thread dump below.  The same MassIndexer call works fine when
>>> >> > using FSDirectoryProvider.
>>> >> >
>>> >> > Should MassIndexer work with Infinispan as the directory?
>>> >> >
>>> >> > Tom
>>> >> >
>>> >> >  java.lang.Thread.State: TIMED_WAITING (sleeping)
>>> >> >    at java.lang.Thread.sleep(Native Method)
>>> >> >    at org.apache.lucene.store.Lock.obtain(Lock.java:91)
>>> >> >    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1097)
>>> >> >    at org.hibernate.search.backend.Workspace.createNewIndexWriter(Workspace.java:202)
>>> >> >    at org.hibernate.search.backend.Workspace.getIndexWriter(Workspace.java:175)
>>> >> >    - locked <7793180e8> (a org.hibernate.search.backend.Workspace)
>>> >> >
>>> >
>>> >
>>
>
>