On 29 Apr 2009, at 23:43, Andrew Duckworth wrote:
> Not quite sure I understand your design - so this distributes the work
> objects and each cluster member maintains indexes locally? If so, you
> need to know when all members have processed the work objects before
> removing these.
The master node processes all work objects written by the slaves and
then updates the JBCDirectory held in the cache to distribute the
index back to all the slave nodes. I think this works well for
Hibernate Search for a few reasons:
- On each slave, HB Search will continue to use the shared index
reader until the master publishes the next version of the index, so
fewer index updates translate into faster searching at the expense
of returning slightly out-of-date data. There is an obvious trade-off
here between search performance and index accuracy. The current
JMS-based HB Search solution, which relies on file copying, works
best when the index can be quite out of date without impacting the
application. For our application we'd prefer to have the index up to
date within a few seconds of the entity being modified, and it's not
a requirement that the index be updated as part of the transaction.
- The index updates are batched, which means each individual update
takes less time due to some of the mechanisms at work inside HB
Search and Lucene.
- No distributed locking, so slaves are never blocked, which provides
some limited insurance against one node impacting every other node.
Well, as long as the master node completes the work objects, it should
be fine for it to then remove them.
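
To make the flow discussed above concrete, here is a minimal plain-Java
sketch of a master that drains queued work objects, applies them as one
batch, publishes a new index version and only then discards the work.
Every name in it (WorkObject, IndexSnapshot, BatchingMaster) is
hypothetical, and an in-memory map stands in for the Lucene index and
the JBCDirectory; it is not the Hibernate Search or JBoss Cache API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical unit of indexing work written by a slave.
final class WorkObject {
    final String entityId;
    final String text; // null means "remove this entity from the index"

    WorkObject(String entityId, String text) {
        this.entityId = entityId;
        this.text = text;
    }
}

// Immutable index version; slaves read it without taking any lock.
final class IndexSnapshot {
    final long version;
    final Map<String, String> documents;

    IndexSnapshot(long version, Map<String, String> documents) {
        this.version = version;
        this.documents = Collections.unmodifiableMap(new HashMap<>(documents));
    }
}

final class BatchingMaster {
    private final ConcurrentLinkedQueue<WorkObject> pending = new ConcurrentLinkedQueue<>();
    private final AtomicReference<IndexSnapshot> published =
            new AtomicReference<>(new IndexSnapshot(0, new HashMap<>()));

    // Called by slaves: enqueue only, never waits for index work.
    void submit(WorkObject work) {
        pending.add(work);
    }

    // Called by slaves: returns the last version the master published.
    IndexSnapshot currentIndex() {
        return published.get();
    }

    // Called on the master only: drain the queue, apply everything as one
    // batch and publish once. Polling removes each work object from the queue,
    // so nothing is left to clean up after the batch has been applied.
    int processBatch() {
        List<WorkObject> batch = new ArrayList<>();
        for (WorkObject w; (w = pending.poll()) != null; ) {
            batch.add(w);
        }
        if (batch.isEmpty()) {
            return 0;
        }
        IndexSnapshot old = published.get();
        Map<String, String> docs = new HashMap<>(old.documents);
        for (WorkObject w : batch) {
            if (w.text == null) {
                docs.remove(w.entityId);
            } else {
                docs.put(w.entityId, w.text);
            }
        }
        published.set(new IndexSnapshot(old.version + 1, docs));
        return batch.size();
    }
}
```

Until processBatch() runs, currentIndex() keeps returning the previous
version, which is the "slightly out of date but never blocked"
behaviour described in the list above.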
One more question: does the JDBCCacheLoader integrate with the cache
transaction, i.e. does it create a transaction with equivalent DB
updates to the updates being applied to the cache? For the current
JBCDirectory structure, it is important that the parent node is
updated in the same transaction as the node holding the file chunk
to avoid the index being left in a corrupt state. Does it matter if
the loader is async vs sync?
The JDBCCL *does* participate in the transaction, but only
indirectly. I.e., it does not register with the transaction manager
directly, but the cache does and the cache propagates prepares and
commits to the JDBCCL. So you do have atomicity guarantees there.
Sync and async only apply to non-transactional behaviour with the
cache loaders.
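
As a rough illustration of that indirect participation, the toy model
below has only the cache register with the transaction, while the
loader receives prepare and commit from the cache. All class names
(ToyTransaction, ToyCache, ToyLoader) are made up for this sketch, maps
stand in for the cache and the database, and the real JBoss Cache
internals are considerably more involved.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical participant contract, loosely modelled on two-phase commit.
interface TxParticipant {
    void prepare();
    void commit();
    void rollback();
}

// Stands in for one transaction driven by a transaction manager.
final class ToyTransaction {
    private final List<TxParticipant> participants = new ArrayList<>();

    void register(TxParticipant p) {
        participants.add(p);
    }

    void commit() {
        for (TxParticipant p : participants) p.prepare();
        for (TxParticipant p : participants) p.commit();
    }

    void rollback() {
        for (TxParticipant p : participants) p.rollback();
    }
}

// Stands in for the JDBC cache loader. It never sees the transaction
// object; it only reacts to calls made by the cache.
final class ToyLoader {
    final Map<String, String> store = new HashMap<>(); // the "database"
    private Map<String, String> staged = new HashMap<>();

    void prepare(Map<String, String> modifications) {
        staged = new HashMap<>(modifications);
    }

    void commit() {
        store.putAll(staged); // all staged writes become visible together
        staged = new HashMap<>();
    }

    void rollback() {
        staged = new HashMap<>();
    }
}

// The cache is the only registered participant and forwards each phase.
final class ToyCache implements TxParticipant {
    private final Map<String, String> data = new HashMap<>();
    private final Map<String, String> pending = new LinkedHashMap<>();
    private final ToyLoader loader;

    ToyCache(ToyLoader loader) {
        this.loader = loader;
    }

    // Simplification: every put is assumed to belong to the given transaction.
    void put(ToyTransaction tx, String fqn, String value) {
        if (pending.isEmpty()) tx.register(this);
        pending.put(fqn, value);
    }

    String get(String fqn) {
        return data.get(fqn);
    }

    public void prepare() {
        loader.prepare(pending);
    }

    public void commit() {
        loader.commit();
        data.putAll(pending);
        pending.clear();
    }

    public void rollback() {
        loader.rollback();
        pending.clear();
    }
}
```

In this model a parent node and a chunk node written in the same
transaction reach the loader's store together on commit, or not at all
on rollback.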
Cheers
--
Manik Surtani
manik(a)jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org