"Live async" MassIndexer
by Sanne Grinovero
I've written up some ideas for MassIndexer improvements in the forum: let it
work in the background using updates instead of wiping out the index, so that
running applications can keep searching while it runs (it's being asked for
now, but it's something I wanted to add anyway at some point).
An alternative would be to have it create a new index from scratch,
and switch at the end, but this introduces the problem of applying
changes of transactions which happen during the indexing; we could
apply transactional changes to both, but this seems more complex than
using the single index.
13 years
[HSEARCH] About HSEARCH-917 or DSL API and query parser
by Guillaume Smet
I opened https://hibernate.onjira.com/browse/HSEARCH-917 a few months
ago about something that really prevents us from using the DSL API in
a lot of cases.
I explained why in the JIRA issue. While I'm OK to do the ground work
of coding something and adding tests for it, I think it might be a
good idea to discuss it before.
Basically, the problem is that when I search for XXXX-AAAA-HAGYU-19910
using an analyzer with the WordDelimiterFilterFactory filter, the DSL
API searches for "XXXX" OR "AAAA" OR "HAGYU" OR "19910" (yes, OR). In
this case, the Lucene QueryParser is designed to look for "XXXX" AND
"AAAA" AND "HAGYU" AND "19910".
The underlying problem is that in
ConnectedMultiFieldsTermQueryBuilder, we don't use the QueryParser to
build the Lucene query but a getAllTermsFromText() method, which uses
the analyzer to get all the terms and then builds an OR query from them.
You can also observe the problem if you search for more than one word.
I think it's plain wrong in the case of a standard search and it
should be fixed by using the Lucene QueryParser to build the query.
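
To make the difference concrete, here is a minimal sketch in plain Java, with no Lucene dependency. Everything in it is hypothetical: analyze() is only a crude stand-in for an analyzer with a word-delimiter filter, and matchesAny()/matchesAll() stand in for an OR query and an AND query over the resulting terms. It is not the Hibernate Search or Lucene code, just an illustration of why OR semantics return too much here.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class OrVersusAndSketch {

    // Hypothetical stand-in for the analyzer: split on any non-alphanumeric
    // character and lowercase each part.
    static List<String> analyze(String text) {
        return Stream.of(text.split("[^A-Za-z0-9]+"))
                .filter(part -> !part.isEmpty())
                .map(part -> part.toLowerCase(Locale.ROOT))
                .collect(Collectors.toList());
    }

    // OR semantics: the document matches if it contains at least one query term.
    static boolean matchesAny(Set<String> docTerms, List<String> queryTerms) {
        return queryTerms.stream().anyMatch(docTerms::contains);
    }

    // AND semantics: the document matches only if it contains every query term.
    static boolean matchesAll(Set<String> docTerms, List<String> queryTerms) {
        return !queryTerms.isEmpty() && docTerms.containsAll(queryTerms);
    }

    public static void main(String[] args) {
        List<String> query = analyze("XXXX-AAAA-HAGYU-19910");
        Set<String> exact = new HashSet<>(analyze("XXXX-AAAA-HAGYU-19910"));
        Set<String> other = new HashSet<>(analyze("XXXX-BBBB-OTHER-00001"));

        System.out.println("query terms: " + query);
        // The other reference only shares the "xxxx" prefix with the query.
        System.out.println("OR  matches other reference: " + matchesAny(other, query));
        System.out.println("AND matches other reference: " + matchesAll(other, query));
        System.out.println("AND matches exact reference: " + matchesAll(exact, query));
    }
}
```

With OR semantics, every reference that shares a single fragment with the searched one comes back as a hit, which is the behaviour described above.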
The only problem I see with using the Lucene query parser is that it
doesn't pass the text through the analyzer for a fuzzy or wildcard
search (but we have a special case for wildcard so I think it's
already working this way with the current code). I'm not sure it's
really a problem, considering that it's a well-known Lucene behaviour.
But it's probably why the current code works the way it does (maybe
someone who knows the history can confirm).
It would be cool to discuss this problem and find an acceptable solution.
Have a nice day.
13 years
[HSEARCH] Lucene lock problems with Hibernate Search 4.0
by Guillaume Smet
After our upgrade to 4.0 (from 3.4.1), we've started to have a lot of
lock errors on our development boxes. I thought it was due to brutal
kills of the JVM (typically Terminate in Eclipse) but we also have
them in our CI environment on a project where every test is OK.
The stack traces look like this:
[2011-12-28 12:17:55,963] ERROR - LuceneBackendQueueTask - HSEARCH000072: Couldn't open the IndexWriter because of previous error: operation skipped, index ouf of sync!
[2011-12-28 12:17:56,974] ERROR - LogErrorHandler - HSEARCH000058: Exception occurred org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: SimpleFSLock@/data/services/test/data/helios/lucene/fr.openwide.helios.core.business.contract.model.Company/write.lock
Primary Failure:
Entity fr.openwide.helios.core.business.contract.model.Company Id 1
Work Type org.hibernate.search.backend.DeleteLuceneWork
org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: SimpleFSLock@/data/services/test/data/helios/lucene/fr.openwide.helios.core.business.contract.model.Company/write.lock
at org.apache.lucene.store.Lock.obtain(Lock.java:84)
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1115)
at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.createNewIndexWriter(IndexWriterHolder.java:125)
at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.getIndexWriter(IndexWriterHolder.java:100)
at org.hibernate.search.backend.impl.lucene.AbstractWorkspaceImpl.getIndexWriter(AbstractWorkspaceImpl.java:114)
at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.applyUpdates(LuceneBackendQueueTask.java:101)
at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.run(LuceneBackendQueueTask.java:69)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Does it ring a bell? Does anyone have any idea of where I should start
to investigate?
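
For anyone following along, here is a minimal sketch of one way such a timeout can arise, assuming the lock is a plain "create a file to hold it" lock, as the SimpleFSLock name in the trace suggests. The class and method names are hypothetical and this is not Lucene's implementation; it only shows that a lock file left behind by a holder that never released it blocks every later attempt until the file is removed.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

final class StaleLockSketch {

    // Try to take the lock by atomically creating the lock file.
    // Returns false when the file already exists, i.e. someone holds the lock
    // or a previous holder went away without releasing it.
    static boolean tryObtain(Path lockFile) throws IOException {
        try {
            Files.createFile(lockFile);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        }
    }

    // Release the lock by deleting the lock file.
    static void release(Path lockFile) throws IOException {
        Files.deleteIfExists(lockFile);
    }

    public static void main(String[] args) throws IOException {
        Path indexDir = Files.createTempDirectory("index-sketch");
        Path lock = indexDir.resolve("write.lock");

        System.out.println("first writer obtains lock: " + tryObtain(lock));
        // Simulate a writer that never releases: release() is not called.
        System.out.println("second writer obtains lock: " + tryObtain(lock));
        // Removing the stale file makes the lock available again.
        release(lock);
        System.out.println("after cleanup, lock obtained: " + tryObtain(lock));

        release(lock);
        Files.deleteIfExists(indexDir);
    }
}
```

If this is what happens here, the question becomes which writer is still holding (or has left behind) write.lock when the second IndexWriter is opened.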
As you can imagine, it's pretty annoying (especially when it's a
project with 100k entities and we need to reindex after this error to
sync the indexes again).
I never saw this problem prior to 4.0 (and we have quite a lot of
applications in production that use Hibernate Search).
Thanks for your feedback.
13 years