[hibernate-dev] Hibernate Search: making changes to different indexes concurrently

Sanne Grinovero sanne.grinovero at gmail.com
Thu Oct 23 13:26:50 EDT 2008

After the latest changes, all locks between different DirectoryProviders
are gone and "sequentiality" is no longer needed. I would like to
have the changes made on different DPs happen concurrently;
this is needed for several other optimizations.

The workQueue gets split by DP (the code exists already), and then:

in case of "sync" indexing:
  schedule each sub-queue to the executor as a separate task,
  and wait for them all.

in case of "async" indexing:
  just schedule each sub-queue, no need to wait.
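
The sync/async scheduling described above could be sketched roughly as
follows. This is only an illustration under stated assumptions: the
Work class, splitByProvider and dispatch are hypothetical names invented
for this sketch and are not the Hibernate Search classes; the only real
API used is java.util.concurrent.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in types; not the Hibernate Search API.
class PerProviderDispatchSketch {

    // A unit of work tagged with the name of the DirectoryProvider it targets.
    static final class Work {
        final String provider;
        final String description;

        Work(String provider, String description) {
            this.provider = provider;
            this.description = description;
        }
    }

    // Split one queue into per-provider sub-queues, keeping the order within each.
    static Map<String, List<Work>> splitByProvider(List<Work> queue) {
        Map<String, List<Work>> subQueues = new LinkedHashMap<>();
        for (Work w : queue) {
            subQueues.computeIfAbsent(w.provider, k -> new ArrayList<>()).add(w);
        }
        return subQueues;
    }

    // Submit each sub-queue as its own task; block on the results only when sync is true.
    // "applied" records what each task processed and must be a thread-safe map.
    static void dispatch(Map<String, List<Work>> subQueues, ExecutorService executor,
                         boolean sync, Map<String, List<String>> applied) {
        List<Future<?>> pending = new ArrayList<>();
        for (Map.Entry<String, List<Work>> e : subQueues.entrySet()) {
            pending.add(executor.submit(() -> {
                List<String> log = new ArrayList<>();
                for (Work w : e.getValue()) {
                    log.add(w.description);
                }
                applied.put(e.getKey(), log);
            }));
        }
        if (!sync) {
            return; // async: the tasks are scheduled, nothing to wait for
        }
        for (Future<?> f : pending) {
            try {
                f.get(); // sync: wait for every sub-queue to finish
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(ex);
            } catch (ExecutionException ex) {
                throw new IllegalStateException(ex.getCause());
            }
        }
    }

    public static void main(String[] args) {
        List<Work> queue = new ArrayList<>();
        queue.add(new Work("books", "add#1"));
        queue.add(new Work("authors", "update#2"));
        queue.add(new Work("books", "delete#3"));

        ExecutorService executor = Executors.newFixedThreadPool(2);
        Map<String, List<String>> applied = new ConcurrentHashMap<>();
        dispatch(splitByProvider(queue), executor, true, applied);
        executor.shutdown();
        System.out.println(applied);
    }
}
```

In this sketch the only difference between the two modes is whether the
caller blocks on the returned Futures; work for the same provider stays
in order because each sub-queue runs as a single task.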

This is quite easy to implement -inside- the luceneBackend, but there I have no
reference to the original ThreadPool I should use for this task.
If I could move this code to the BatchedQueueingProcessor I would see several
advantages:
1) the optimization is "free" for other existing backends.
2) You could create a different Runnable for each

However, if I do this the naive way I would break the current
BackendQueueProcessorFactory API, which, from reading around the forum,
I know for sure many customers implement.

The most elegant solution is IMHO: instantiate a different
for each DP; this way the APIs don't change, you get full
concurrency without having to worry about support by the backends,
and it would be very easy to have different implementations for
each DP's backend, configured independently as "sync" or "async":
you could use jms for some indexes and lucene for others, or other
instantiated with different parameters (think about different JMS topics).
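
As a rough illustration of the idea of one independently configured
backend per DP, the following sketch builds a backend instance per
provider from its own settings. All names here (QueueBackend,
ProviderSettings, createBackends) are hypothetical and chosen only for
this example; the real backends would of course do actual indexing or
messaging instead of returning a string.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; not the Hibernate Search API.
class PerProviderBackendSketch {

    // Minimal backend contract: consume the works destined for one provider.
    interface QueueBackend {
        String send(List<String> works);
    }

    // Per-provider settings: which kind of backend to use and whether to wait for it.
    static final class ProviderSettings {
        final String backendKind; // e.g. "lucene" or "jms"
        final boolean sync;

        ProviderSettings(String backendKind, boolean sync) {
            this.backendKind = backendKind;
            this.sync = sync;
        }
    }

    // Build one backend instance per provider, each from its own settings.
    static Map<String, QueueBackend> createBackends(Map<String, ProviderSettings> settings) {
        Map<String, QueueBackend> backends = new LinkedHashMap<>();
        for (Map.Entry<String, ProviderSettings> e : settings.entrySet()) {
            String provider = e.getKey();
            ProviderSettings s = e.getValue();
            String mode = s.sync ? "sync" : "async";
            // Stand-in backend: only describes what it was asked to do.
            backends.put(provider, works ->
                s.backendKind + "/" + mode + " -> " + provider + ": " + works.size() + " works");
        }
        return backends;
    }

    public static void main(String[] args) {
        Map<String, ProviderSettings> settings = new LinkedHashMap<>();
        settings.put("books", new ProviderSettings("lucene", true));
        settings.put("authors", new ProviderSettings("jms", false));

        Map<String, QueueBackend> backends = createBackends(settings);
        System.out.println(backends.get("books").send(Arrays.asList("add#1", "delete#3")));
        System.out.println(backends.get("authors").send(Arrays.asList("update#2")));
    }
}
```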

The code is pretty simple and can be done in steps: first changing only
the BatchedQueueingProcessor, to solve for example
"New Feature - HSEARCH-268 - Run optimize() (all classes) in parallel",
then with two more patches: one to remove unneeded code from the
current lucene backend, and finally one to enable independent backend
configuration (just adapt the cfg reading and document it all).

Having the BatchedQueueingProcessor "have a look" in the queue has
other benefits: it would be the right place to implement
"Bug - HSEARCH-257 - Ignore delete operation when Core does update
then delete on the same entity",
as a map is being built for each "todo" (I'm going to need it anyway
to classify work per-DP).

I would be happy to implement the first step soon (before tomorrow),
that is, creating N BackendQueueProcessorFactory instances
instead of one, to submit the work to concurrent queues.


kind regards,

