Sanne,
Agreed. Could everyone involved please make sure we post to both hibernate-
dev and infinispan-dev (rather than jbosscache-dev) when discussing
anything to do with this integration work, as there are parallel efforts
which can be brought together.
Cheers
Manik
On 25 May 2009, at 10:53, Sanne Grinovero wrote:
Hello,
I'm forwarding this email to Emmanuel and Hibernate Search dev, as I
believe we should join the discussion.
Could we keep both dev lists (jbosscache-dev@lists.jboss.org and
hibernate-dev@lists.jboss.org) on CC?
Sanne
2009/4/29 Manik Surtani <manik@jboss.org>:
>
> On 27 Apr 2009, at 05:18, Andrew Duckworth wrote:
>
>> Hello,
>>
>> I have been working on a Lucene Directory provider based on JBoss Cache;
>> my starting point was an implementation Manik had already written, which
>> pretty much worked with a few minor tweaks. Our use case was to cluster a
>> Lucene index being used with Hibernate Search in our application, with
>> the requirements that searching needed to be fast, there was no shared
>> file system, and it was important that the index was consistent across
>> the cluster within a relatively short time frame.
>>
>> Manik's code used a token node in the cache to implement the distributed
>> lock. During my testing I set up multiple cache copies with multiple
>> threads reading/writing to each cache copy. I was finding that a lot of
>> transactions trying to acquire or release this lock were timing out; not
>> understanding JBC well, I modified the distributed lock to use the
>> JGroups DistributedLockManager. This worked quite well, however the time
>> taken to acquire/release the lock (~100 ms for both) dwarfed the time to
>> process the index update, lowering throughput. Even using Hibernate
>> Search with an async worker thread, there was still a lot of contention
>> for the single lock, which seemed to limit the scalability of the
>> solution. I think part of the problem was that our use of HB Search
>> generates a lot of small units of work (remove index entry, add index
>> entry), and each of these units of work acquires a new IndexWriter and a
>> new write lock on the underlying Lucene Directory implementation.
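>>
>> To make the cost concrete, each unit of work was effectively doing the
>> following (a sketch only; distributedLock, openWriter() and IndexUpdate
>> are hypothetical stand-ins for the JGroups DistributedLockManager calls
>> and our plumbing around the JBC-backed Directory):
>>
>> import org.apache.lucene.index.IndexWriter;
>>
>> void applyUnitOfWork(IndexUpdate work) throws Exception {
>>     distributedLock.acquire();             // ~100 ms cluster round trip
>>     try {
>>         IndexWriter writer = openWriter(); // also takes Lucene's own write lock
>>         try {
>>             work.apply(writer);            // the actual index change is tiny
>>         } finally {
>>             writer.close();
>>         }
>>     } finally {
>>         distributedLock.release();         // another ~100 ms round trip
>>     }
>> }
>>
>> So the two lock round trips dominate every small unit of work, which is
>> where the throughput goes.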
>>
>>
>> Out of curiosity, I created an alternative implementation based on the
>> Hibernate Search JMS clustering strategy. Inside JBoss Cache I created a
>> queue node, and each slave node in the cluster creates a separate queue
>> underneath it where indexing work is written:
>>
>> /queue/slave1/[work0, work1, work2 ....]
>>       /slave2
>>       /slave3
>>
>> etc.
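>>
>> On the slave side, enqueueing a unit of work is then just a put under the
>> slave's own queue node, something like this (a sketch against the JBC 3.x
>> API; the node and key names are just the ones we chose):
>>
>> import org.jboss.cache.Cache;
>> import org.jboss.cache.Fqn;
>>
>> // each slave only ever writes under its own /queue/<slaveName> subtree
>> void enqueue(Cache<String, Object> cache, String slaveName, Object work) {
>>     long ts = System.currentTimeMillis();  // the master sorts by this later
>>     Fqn node = Fqn.fromElements("queue", slaveName, "work-" + ts);
>>     cache.put(node, "work", work);         // no cluster-wide lock needed
>> }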
>>
>> In each cluster member a background thread runs continuously; when it
>> wakes up, it decides whether it is the master node (currently it checks
>> whether it is the view coordinator, but I'm considering changing that to
>> use a longer-lived distributed lock). If it is the master, it merges the
>> tasks from each slave queue and updates the JBCDirectory in one go, which
>> it can safely do with only local-VM locking. This approach means that all
>> the slave nodes can write to their queues without needing a global lock
>> that any other slave or the master would be using. On the master,
>> multiple updates can be performed in the context of a single Lucene index
>> writer. With a cache loader configured, work that is written into the
>> slave queues is persistent, so it can survive the master node crashing,
>> with automatic failover to a new master, meaning that eventually all
>> updates should be applied to the index. Each work element in the queue is
>> timestamped so that the master can process them in order (this requires
>> time synchronisation across the cluster). For our workload the
>> master/slave pattern seems to improve the throughput of the system.
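>>
>> The master side of the loop looks roughly like this (a sketch, assuming
>> the queue layout above and that the coordinator is the first member of
>> the current view; the actual apply/cleanup step is elided):
>>
>> import java.util.ArrayList;
>> import java.util.List;
>> import org.jboss.cache.Cache;
>> import org.jboss.cache.Fqn;
>> import org.jboss.cache.Node;
>>
>> // master election: am I the view coordinator?
>> boolean isMaster(Cache<String, Object> cache) {
>>     List<?> members = cache.getMembers();
>>     return !members.isEmpty()
>>         && members.get(0).equals(cache.getLocalAddress());
>> }
>>
>> void masterPass(Cache<String, Object> cache) {
>>     if (!isMaster(cache)) return;
>>     Node<String, Object> queue =
>>         cache.getRoot().getChild(Fqn.fromElements("queue"));
>>     if (queue == null) return;
>>     List<Node<String, Object>> work = new ArrayList<Node<String, Object>>();
>>     for (Node<String, Object> slave : queue.getChildren())
>>         work.addAll(slave.getChildren());  // gather all pending work
>>     // sort 'work' by the timestamp in each node name, apply everything
>>     // through a single IndexWriter (local-VM locking only), then remove
>>     // the processed nodes
>> }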
>>
>>
>> Currently I'm refining the code, and I have a few JBoss Cache questions
>> which I hope you can help me with:
>>
>> 1) I have noticed that under high load I get LockTimeoutExceptions
>> writing to /queue/slave0 when the lock owner is a transaction working on
>> /queue/slave1, i.e. the same lock seems to be used for two unrelated
>> nodes in the cache. I'm assuming this is a result of the lock striping
>> algorithm; if you could give me some insight into how this works, that
>> would be very helpful. Bumping the cache concurrency level up from 500 to
>> 2000 seemed to reduce this problem, however I'm not sure if that just
>> reduces the probability of a random event or if there is some level that
>> will be sufficient to eliminate the issue.
>
> It could well be the lock striping at work. As of JBoss Cache 3.1.0 you
> can disable lock striping and have one lock per node. This can be
> expensive, in that a lot of nodes means a lot of locks, but if you have a
> finite number of nodes it may help you a lot.
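>
> Something like this in the <locking /> element of your config should do
> it (from memory; double-check the attribute names against the 3.1 schema):
>
> <locking isolationLevel="READ_COMMITTED"
>          lockParentForChildInsertRemove="false"
>          lockAcquisitionTimeout="15000"
>          useLockStriping="false"
>          concurrencyLevel="500"/>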
>
>> 2) Is there a reason to use separate nodes for each slave queue? Will it
>> help with locking, or can each slave safely insert to the same parent
>> node in separate transactions without interfering or blocking each
>> other? If I can reduce it to a single queue I think that would be a more
>> elegant solution. I am setting lockParentForChildInsertRemove to false
>> for the queue nodes.
>
> It depends. Are the work objects attributes in /queue/slaveN? Remember
> that the granularity for all locks is the node itself, so if all slaves
> write to a single node, they will all compete for the same lock.
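>
> To illustrate (a sketch; the Fqns are your layout, not anything special):
>
> // all slaves writing attributes into a single node: every put competes
> // for the one lock on /queue
> cache.put(Fqn.fromString("/queue"), "work-" + ts, work);
>
> // one child node per slave (or per work item): independent locks, and
> // with lockParentForChildInsertRemove="false" the inserts don't lock
> // /queue itself
> cache.put(Fqn.fromString("/queue/slave1/work-" + ts), "work", work);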
>
>> 3) Similarly, is there any reason why the master should/shouldn't take
>> responsibility for removing work nodes that have been processed?
>
> Not quite sure I understand your design - so this distributes the work
> objects, and each cluster member maintains indexes locally? If so, you
> need to know when all members have processed the work objects before
> removing these.
>
>> Thanks in advance for your help. I hope to make this solution
>> general-purpose enough to be able to contribute back to the Hibernate
>> Search and JBC teams.
>
> Thanks for offering to contribute. :-)  One other thing that may be of
> interest is that I just launched Infinispan [1] [2] - a new data grid
> product. You could implement a directory provider on Infinispan too - it
> is a lot more efficient than JBC at many things, including concurrency.
> Also, Infinispan's lock granularity is per key/value pair, so a single
> distributed cache would be all you need for the work objects. Another
> thing that could help is the eager locking we have on the roadmap [3],
> which may make a more traditional approach of locking and writing indexes
> to the cache more feasible. I'd encourage you to check it out.
>
> [1] http://www.infinispan.org
> [2] http://infinispan.blogspot.com/2009/04/infinispan-start-of-new-era-in-ope...
> [3] https://jira.jboss.org/jira/browse/ISPN-48
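>
> For a flavour of what the work queue could look like on Infinispan (a
> sketch against the early API; the config file, cache name and keys are
> illustrative):
>
> import org.infinispan.Cache;
> import org.infinispan.manager.DefaultCacheManager;
>
> public class WorkQueueSketch {
>     public static void main(String[] args) throws Exception {
>         DefaultCacheManager cm = new DefaultCacheManager("infinispan-cfg.xml");
>         Cache<String, Object> workQueue = cm.getCache("indexWork");
>         // locking is per key/value pair, so writes to different work
>         // items never contend; one flat cache replaces the per-slave
>         // subtrees
>         workQueue.put("slave1/work-" + System.currentTimeMillis(), "some work");
>         cm.stop();
>     }
> }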
> --
> Manik Surtani
> manik@jboss.org
> Lead, Infinispan
> Lead, JBoss Cache
>
> http://www.infinispan.org
> http://www.jbosscache.org
>
> _______________________________________________
> jbosscache-dev mailing list
> jbosscache-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/jbosscache-dev