[infinispan-dev] Hibernate Search alternative Directory distribution
Manik Surtani
manik at jboss.org
Thu Jul 9 10:16:15 EDT 2009
On 9 Jul 2009, at 14:57, Emmanuel Bernard wrote:
> Here is the concall notes on how to cluster and copy Hibernate
> indexes using non file system approaches.
>
> Forget JBoss Cache, forget plain JGroups and focus on Infinispan
> Start with Infinispan in replication mode (the most stable code) and
> then try distribution. It should be interesting to test the dist
> algo and see how well L1 cache behaves in a search environment.
> For the architecture, we will try the following approaches in
> decreasing order of interest (if the first one works like a charm we
> stick with it):
> 1. share the same grid cache between the master and the slaves
> 2. have a local cache on the master where indexing is done and
> manually copy the changed chunks over to the grid
> This requires storing some metadata (namely the list of chunks for
> a given index and the last update for each chunk) to implement the
> same algorithm as the one implemented in FSMaster/
> SlaveDirectoryProvider (incremental copy).
> 3. have a local cache on the master where indexing is done and
> manually copy the changed chunks over to the grid. Each slave
> copies from the grid to a local version of the index and uses the
> local version for search.
>
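The incremental copy in options 2 and 3 could look something like the
sketch below. Plain maps stand in for the local index and the grid, and
the names (ChunkMeta, copyChangedChunks) are invented for illustration,
not anything we ship:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch only: plain maps stand in for the local index and the grid.
// ChunkMeta and copyChangedChunks are invented names, not shipped API.
class IncrementalCopySketch {

    // Per-chunk metadata: when the chunk was last written.
    static final class ChunkMeta {
        final long lastUpdate;
        ChunkMeta(long lastUpdate) { this.lastUpdate = lastUpdate; }
    }

    // Push every chunk that is missing or stale on the grid;
    // return the names of the chunks actually copied.
    static List<String> copyChangedChunks(Map<String, byte[]> localChunks,
                                          Map<String, ChunkMeta> localMeta,
                                          Map<String, byte[]> gridChunks,
                                          Map<String, ChunkMeta> gridMeta) {
        List<String> copied = new ArrayList<>();
        for (Map.Entry<String, byte[]> e : localChunks.entrySet()) {
            String chunk = e.getKey();
            ChunkMeta local = localMeta.get(chunk);
            ChunkMeta remote = gridMeta.get(chunk);
            // Copy only if the grid has never seen the chunk, or its
            // copy is older than the master's.
            if (remote == null || remote.lastUpdate < local.lastUpdate) {
                gridChunks.put(chunk, e.getValue());
                gridMeta.put(chunk, local);
                copied.add(chunk);
            }
        }
        return copied;
    }
}
```

The point being that only the stale chunks cross the wire, which is the
same idea as the FSMasterDirectoryProvider's timestamp-based copy.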
> When writing the InfinispanDirectory (inspired by the RAMDirectory
> and the JBossCacheDirectory), one needs to consider that Infinispan
> has a flat structure. The key has to contain:
> - the index name
> - the chunk name
> Together these will essentially form the unique identifier.
> Each chunk should have its size limited (Lucene does that already
> AFAIK)
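Given the flat structure, the cache key could just be a small value
object wrapping the two names. A minimal sketch (ChunkKey is an
invented name; the real thing would want efficient externalization):

```java
import java.io.Serializable;
import java.util.Objects;

// Sketch of a flat-cache key combining index name and chunk name,
// since Infinispan has no hierarchical structure to lean on.
final class ChunkKey implements Serializable {
    private final String indexName;
    private final String chunkName;

    ChunkKey(String indexName, String chunkName) {
        this.indexName = indexName;
        this.chunkName = chunkName;
    }

    // equals/hashCode make the (index, chunk) pair usable as a
    // unique cache key.
    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ChunkKey)) return false;
        ChunkKey k = (ChunkKey) o;
        return indexName.equals(k.indexName) && chunkName.equals(k.chunkName);
    }
    @Override public int hashCode() { return Objects.hash(indexName, chunkName); }
    @Override public String toString() { return indexName + "|" + chunkName; }
}
```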
> Question on the metadata: one needs to keep the last update and the
> list of chunks. Because Infinispan is not queryable, we need to
> store that as metadata:
> - should it be on each chunk (i.e. the last update time and the size
> of each chunk)
> - on a dedicated metadata chunk, i.e. one metadata chunk per chunk +
> a chunk containing the list
> - on a single metadata chunk (I fear conflicts and inconsistencies)
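To make the second option concrete — one metadata entry per chunk plus
one entry holding the chunk list — here is a rough sketch, with a plain
map standing in for the flat grid; all the key shapes and names are
invented:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Sketch only: a plain map stands in for the flat grid, and the key
// shapes ("index|chunk", "index|chunk|meta", "index|list") are invented.
class MetadataLayoutSketch {
    final Map<String, Object> cache = new HashMap<>();

    // Write a chunk, its own metadata entry, and register it in the
    // per-index chunk-list entry.
    void writeChunk(String index, String chunk, byte[] data, long now) {
        cache.put(index + "|" + chunk, data);          // the chunk itself
        cache.put(index + "|" + chunk + "|meta", now); // per-chunk last update
        @SuppressWarnings("unchecked")
        Set<String> list = (Set<String>) cache.computeIfAbsent(
                index + "|list", k -> new TreeSet<String>());
        list.add(chunk);
    }

    @SuppressWarnings("unchecked")
    Set<String> chunks(String index) {
        Set<String> list = (Set<String>) cache.get(index + "|list");
        return list == null ? Collections.<String>emptySet() : list;
    }
}
```

Note the list entry is still a single shared entry, so concurrent
writers would contend on it — less than a single metadata chunk holding
everything, but not zero.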
>
> On changes or reads, explore the use of Infinispan transactions to
> ensure repeatable-read (RR) semantics. Is it necessary? A file
> system does not guarantee that anyway.
>
> In the case of replication, make sure a FD back end can be activated
> in case the grid goes to the unreachable clouds of total inactivity.
FD backend? I presume you mean a cache store. Have a look at the
different cache stores we ship with, I reckon a FileCacheStore would
do the trick for you.
http://infinispan.sourceforge.net/4.0/apidocs/org/infinispan/loaders/CacheStore.html
http://infinispan.sourceforge.net/4.0/apidocs/org/infinispan/loaders/file/FileCacheStore.html
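For reference, wiring in a FileCacheStore looks roughly like the XML
below — I'm writing the attribute names from memory of the 4.0
configuration schema, and the location value is illustrative, so
double-check against the docs above:

```xml
<namedCache name="luceneIndex">
   <loaders passivation="false" preload="true">
      <loader class="org.infinispan.loaders.file.FileCacheStore"
              fetchPersistentState="true" purgeOnStartup="false">
         <properties>
            <!-- illustrative path; point this at real storage -->
            <property name="location" value="/var/infinispan/store"/>
         </properties>
      </loader>
   </loaders>
</namedCache>
```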
> Question to Manik: do you have a cluster to play with once we reach
> this stage?
The cluster team does have a set of lab servers used to test,
benchmark, etc. You will need to "book" time on this cluster though
since it is shared between JBC/Infinispan, JGroups and JBoss AS
clustering devs.
Cheers
--
Manik Surtani
manik at jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org