See some comments inline
On May 25, 2009, at 11:53, Sanne Grinovero wrote:
Hello,
I'm forwarding this email to Emmanuel and Hibernate Search dev, as I
believe we should join the discussion.
Could we keep both dev-lists (jbosscache-dev@lists.jboss.org,
hibernate-dev@lists.jboss.org) on CC?
Sanne
2009/4/29 Manik Surtani <manik@jboss.org>:
>
> On 27 Apr 2009, at 05:18, Andrew Duckworth wrote:
>
>> Hello,
>>
>> I have been working on a Lucene Directory provider based on JBoss
>> Cache,
>> my starting point was an implementation Manik had already written
>> which
>> pretty much worked with a few minor tweaks. Our use case was to
>> cluster a
>> Lucene index being used with Hibernate Search in our application,
>> with the
>> requirements that searching needed to be fast, there was no shared
>> file
>> system and it was important that the index was consistent across
>> the cluster
>> in a relatively short time frame.
>>
>> Manik's code used a token node in the cache to implement the
>> distributed
>> lock. During my testing I set up multiple cache copies with
>> multiple threads
>> reading/writing to each cache copy. I was finding a lot of
>> transactions to
>> acquire or release this lock were timing out, not understanding
>> JBC well I
>> modified the distributed lock to use JGroups
>> DistributedLockManager. This
>> worked quite well, however the time taken to acquire/release the
>> lock (~100
>> ms for both) dwarfed the time to process the index update, lowering
>> throughput. Even using Hibernate Search with an async worker
>> thread, there
>> was still a lot of contention for the single lock which seemed to
>> limit the
>> scalability of the solution. I think part of the problem was that
>> our use
>> of HB Search generates a lot of small units of work (remove index
>> entry, add
>> index entry) and each of these UOW acquire a new IndexWriter and
>> new write
>> lock on the underlying Lucene Directory implementation.
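The contention pattern described above, where each small unit of work opens a new IndexWriter and takes its own write lock, can be illustrated with a plain-Java sketch. This is not the actual Hibernate Search or Lucene code; a ReentrantLock stands in for the Lucene write lock, and all names are illustrative:

```java
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Sketch: one lock round-trip per unit of work vs. one per batch.
// The ReentrantLock models the Lucene Directory write lock.
public class LockAmortisation {
    static final ReentrantLock writeLock = new ReentrantLock();
    static int acquisitions = 0;

    // One lock acquisition per unit of work (remove entry, add entry, ...),
    // as happens when every UOW opens its own IndexWriter.
    static void applyOneByOne(List<String> units) {
        for (String unit : units) {
            writeLock.lock();            // ~ opening a new IndexWriter
            try { /* apply 'unit' to the index */ }
            finally { writeLock.unlock(); }
            acquisitions++;
        }
    }

    // One lock acquisition for the whole batch (a long-lived IndexWriter).
    static void applyBatched(List<String> units) {
        writeLock.lock();
        try {
            for (String unit : units) { /* apply 'unit' */ }
        } finally { writeLock.unlock(); }
        acquisitions++;
    }

    public static void main(String[] args) {
        List<String> work = List.of("remove:42", "add:42", "remove:7", "add:7");
        applyOneByOne(work);
        int oneByOne = acquisitions;     // 4 acquisitions
        acquisitions = 0;
        applyBatched(work);
        System.out.println(oneByOne + " vs " + acquisitions); // prints "4 vs 1"
    }
}
```

When each lock round-trip costs ~100 ms over the network, amortising it across a batch is what makes the difference.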
>>
>>
>> Out of curiosity, I created an alternative implementation based on
>> the
>> Hibernate Search JMS clustering strategy. Inside JBoss Cache I
>> created a
>> queue node and each slave node in the cluster creates a separate
>> queue
>> underneath where indexing work is written:
>>
>> /queue/slave1/[work0, work1, work2 ....]
>>       /slave2
>>       /slave3
>>
>> etc
>>
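The layout above, one queue node per slave that only that slave writes to, can be sketched in plain Java. A ConcurrentHashMap of per-slave queues stands in for the JBoss Cache tree nodes; class and field names are illustrative:

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;

// Sketch of the /queue/<slave> layout: each slave appends indexing work
// to its own queue, so no cluster-wide lock is needed on the write path.
public class SlaveQueues {
    record Work(long timestamp, String payload) {}

    // stands in for the cache nodes /queue/slave1, /queue/slave2, ...
    static final ConcurrentMap<String, Queue<Work>> queues =
            new ConcurrentHashMap<>();

    // Called locally on a slave; touches only that slave's queue.
    static void enqueue(String slaveId, String payload) {
        queues.computeIfAbsent(slaveId, id -> new ConcurrentLinkedQueue<>())
              .add(new Work(System.currentTimeMillis(), payload));
    }
}
```

The key property is that writers never contend with each other: contention is confined to the single master that drains the queues.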
>> In each cluster member a background thread runs continuously when
>> it wakes
>> up, it decides if it is the master node or not (currently checks
>> if it is
>> the view coordinator, but I'm considering changing it to use a
>> longer lived
>> distributed lock). If it is the master it merges the tasks from
>> each slave
>> queue, and updates the JBCDirectory in one go, it can safely do
>> this with
>> only local VM locking. This approach means that in all the slave
>> nodes they
>> can write to their queue without needing a global lock that any
>> other slave
>> or the master would be using. On the master, it can perform
>> multiple updates
>> in the context of a single Lucene index writer. With a cache loader
>> configured, work that is written into the slave queue is
>> persistent, so it
>> can survive the master node crashing with automatic fail over to a
>> new
>> master meaning that eventually all updates should be applied to
>> the index.
>> Each work element in the queue is time stamped to allow them to be
>> processed
>> in order (requires time synchronisation across the cluster) by the
>> master. For our workload
>> workload
>> the master/slave pattern seems to improve the throughput of the
>> system.
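The master's merge step described above can be sketched as follows. This is an illustrative plain-Java version, not the actual implementation: it drains every slave queue, orders the combined work by timestamp (which, as noted, assumes roughly synchronised clocks), and returns one batch to apply under a single local writer:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Sketch of the master's merge: drain all slave queues, order by
// timestamp, apply the whole batch with one IndexWriter and only
// local-VM locking.
public class MasterMerge {
    record Work(long timestamp, String payload) {}

    static List<Work> drainAndOrder(Map<String, Queue<Work>> slaveQueues) {
        List<Work> batch = new ArrayList<>();
        for (Queue<Work> q : slaveQueues.values()) {
            Work w;
            while ((w = q.poll()) != null) {
                batch.add(w);            // removes the work from the queue
            }
        }
        // process in (approximate) cluster-wide order
        batch.sort(Comparator.comparingLong(Work::timestamp));
        return batch;                    // apply under a single IndexWriter
    }
}
```

Because only the elected master runs this loop, the expensive cross-cluster lock disappears from the hot path entirely.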
Interestingly, we are working in similar directions.
Sanne has been working on a new model where the master is guaranteed
not to share indexes with other writers. In this case we keep the IW
open for a long time (single lock), which yields significant
improvements. In parallel, the new index needs to be distributed to
the slaves; the current model is file copy (which avoids any lock
issue), but a JGroups version has been discussed. Now that I think
about it more, it might make sense to use JBoss Cache for the
distribution simply by reusing the file copy model:
- no write lock is shared amongst nodes
- each slave has an active and a passive directory: the passive one
can receive new index data from the master while the active one is
used for search; when the copy is done, active and passive switch
- each master copies the index on a regular basis to the shared model
(in this case the passive slave)?
I am not 100% sure it will work as we should only replicate data to
the passive node but that's a good thing to explore.
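The active/passive pair sketched above amounts to double-buffering the directory. A minimal plain-Java sketch of the switch (an AtomicReference stands in for the two Lucene Directory instances; all names are illustrative):

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch of the active/passive directory pair: searches read the active
// directory; the master copies new index data into the passive one; when
// the copy completes, the two are swapped.
public class DirectoryPair {
    // stand-ins for two Lucene Directory instances
    private final AtomicReference<String> active = new AtomicReference<>("dirA");
    private volatile String passive = "dirB";

    // used by searchers; always sees a fully-copied index
    String activeDir() { return active.get(); }

    // called once the master has finished copying into the passive directory
    synchronized void swap() {
        String old = active.getAndSet(passive);
        passive = old;
    }
}
```

Searchers never observe a half-copied index, because the swap is the only point at which the new data becomes visible.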
Note that this approach requires much less locking than the current
JBoss Cache Directory implementation (as we use an async writing
approach).