[jbosscache-dev] JBoss Cache Lucene Directory

Thu Apr 30 05:17:59 EDT 2009

On 29 Apr 2009, at 23:43, Andrew Duckworth wrote:
>> Not quite sure I understand your design - so this distributes the
>> work objects and each cluster member maintains indexes locally?  If
>> so, you need to know when all members have processed the work
>> objects before removing these.
>
> The master node processes all work objects written by the slaves and
> then updates the JBCDirectory held in the cache to distribute the
> index back to all the slave nodes. I think this works well for
> Hibernate Search for a few reasons:
>
> - On each slave, HB Search will continue to use the shared index
> reader until the master publishes the next version of the index, so
> fewer index updates translate into faster searching at the expense
> of returning slightly out-of-date data. There is an obvious trade-off
> here between search performance and index accuracy: the current JMS
> HB Search solution, based on file copying, works best when the index
> can be quite out of date without impacting the application. For our
> application we'd prefer to have the index up to date within a few
> seconds of the entity being modified, and it's not a requirement
> that the index be updated as part of the transaction.
>
> - The index updates are batched, which means each individual update
> takes less time due to some of the mechanisms at work inside HB
> Search and Lucene.
>
> - No distributed locking, so slaves are never blocked, which provides
> some limited insurance against one node impacting every other node.

Well, as long as the master node completes the work objects, it should  
be fine for it to then remove them.
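
To make the removal rule concrete, here is a rough sketch of the master's
loop as I read the design above. It is a toy model in plain Java, not
JBoss Cache or Hibernate Search code: MasterBatchSketch, submitWork,
processBatch and the "work id" keys are all made-up names, and a plain
list stands in for the published JBCDirectory. The only point is that the
master removes exactly the work objects it has applied, so anything a
slave writes while a batch is in flight stays queued for the next one.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;

// Toy model only: none of these types are JBoss Cache or Hibernate Search classes.
final class MasterBatchSketch {
    // Work objects written by slaves, keyed by a hypothetical work id.
    private final Map<String, String> pendingWork = new ConcurrentHashMap<>();
    // Last published index version; stands in for the JBCDirectory in the cache.
    private final AtomicReference<List<String>> publishedIndex =
            new AtomicReference<>(Collections.<String>emptyList());

    // Slave side: queue a change without waiting for the master.
    void submitWork(String workId, String document) {
        pendingWork.put(workId, document);
    }

    // Slave side: keep reading whatever version was published last.
    List<String> currentIndex() {
        return publishedIndex.get();
    }

    // Master side: apply a snapshot of the queue as one batch, publish the
    // new version, then remove only the work objects that were applied.
    int processBatch() {
        Map<String, String> snapshot = new TreeMap<>(pendingWork);
        if (snapshot.isEmpty()) {
            return 0;
        }
        List<String> next = new ArrayList<>(publishedIndex.get());
        next.addAll(snapshot.values());
        publishedIndex.set(Collections.unmodifiableList(next));
        for (Map.Entry<String, String> done : snapshot.entrySet()) {
            // Two-argument remove leaves an entry alone if it was overwritten meanwhile.
            pendingWork.remove(done.getKey(), done.getValue());
        }
        return snapshot.size();
    }

    int pendingCount() {
        return pendingWork.size();
    }

    public static void main(String[] args) {
        MasterBatchSketch master = new MasterBatchSketch();
        master.submitWork("w1", "doc-a");
        master.submitWork("w2", "doc-b");
        System.out.println("before batch: " + master.currentIndex());
        System.out.println("processed: " + master.processBatch());
        System.out.println("after batch: " + master.currentIndex());
    }
}
```

In the real design the publish step and the removals would presumably go
through the cache, ideally in one transaction, rather than through local
collections as here.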

> One more question: does the JDBCCacheLoader integrate with the cache
> transaction, i.e. does it create a transaction with equivalent DB
> updates to the updates being applied to the cache? For the current
> JBCDirectory structure, it is important that the parent node is
> updated in the same transaction as the node holding the file chunk,
> to avoid the index being left in a corrupt state. Does it matter
> whether the loader is async or sync?

The JDBCCL *does* participate in the transaction, but only  
indirectly.  I.e., it does not register with the transaction manager  
directly, but the cache does and the cache propagates prepares and  
commits to the JDBCCL.  So you do have atomicity guarantees there.
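
A rough illustration of that indirect participation, as a toy model in
plain Java. ToyCache and ToyLoader are hypothetical classes written for
this mail, not the JBoss Cache CacheLoader SPI, and the "transaction" is
just a buffer of writes. The sketch only shows the shape of the
propagation: the cache is what takes part in the transaction, and it
forwards prepare and commit to the loader, so the parent node and the
chunk node reach the store together or not at all.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: hypothetical stand-in for a store-backed cache loader.
final class ToyLoader {
    final Map<String, String> store = new HashMap<>(); // stands in for the DB table
    private Map<String, String> prepared;              // writes staged by prepare()

    void prepare(Map<String, String> writes) {
        prepared = new HashMap<>(writes);
    }

    void commit() {
        if (prepared != null) {
            store.putAll(prepared); // all staged writes become visible together
            prepared = null;
        }
    }

    void rollback() {
        prepared = null; // nothing staged ever reaches the store
    }
}

// Toy model only: the cache owns the transaction and drives the loader.
final class ToyCache {
    final Map<String, String> memory = new HashMap<>();
    private final ToyLoader loader;
    private Map<String, String> txWrites; // null when no transaction is active

    ToyCache(ToyLoader loader) {
        this.loader = loader;
    }

    void begin() {
        txWrites = new LinkedHashMap<>();
    }

    void put(String fqn, String value) {
        if (txWrites == null) {
            throw new IllegalStateException("no active transaction");
        }
        txWrites.put(fqn, value);
    }

    void commit() {
        if (txWrites == null) {
            throw new IllegalStateException("no active transaction");
        }
        loader.prepare(txWrites); // cache forwards the prepare phase
        memory.putAll(txWrites);
        loader.commit();          // and then the commit phase
        txWrites = null;
    }

    void rollback() {
        loader.rollback();
        txWrites = null;
    }

    public static void main(String[] args) {
        ToyLoader loader = new ToyLoader();
        ToyCache cache = new ToyCache(loader);
        cache.begin();
        cache.put("/index/file", "length=1");          // parent node
        cache.put("/index/file/chunk0", "chunk-bytes"); // node holding the file chunk
        cache.commit();
        System.out.println(loader.store);
    }
}
```

The node names in main are invented for the example; the real
JBCDirectory layout may differ.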

Sync and async only apply to non-transactional behaviour with the  
cache loaders.

Cheers
--
Manik Surtani
manik at jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org