On 24 Sep 2012, at 16:22, Dan Berindei <dan.berindei(a)gmail.com> wrote:
Hi guys
During the final push for NBST I found a bug with preloading (entries that didn't
belong on a joiner weren't removed after the initial state transfer). I decided to fix
it and https://issues.jboss.org/browse/ISPN-1586 at the same time, since it was a
longstanding bug and I had a reasonable idea of what to do. However, I missed some
implications and I need to fix them: there is at least one Query test failing because of
my change (SharedCacheLoaderQueryIndexTest).
In 5.1, preloading worked like this:
1. Start the CacheLoaderManager, which preloads everything from the cache store into
memory.
2. Start the StateTransferManager, retrieving data from the other cache members and
overwriting already-preloaded values.
3. When the initial state transfer ends, entries not owned by the local node are
deleted.
The main issue with this, raised in ISPN-1586, is that entries that were deleted on the
other cache members are "revived" on the joiner when it reads the data from the
cache store. There is also a performance issue, because we load a lot of data that we
then discard, but that's less important.
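The "revived" entries described above can be reproduced with a small model of the three 5.1 steps. This is only a sketch: the class Preload51Sketch and its join method are hypothetical names made up for illustration, plain maps stand in for the cache store, the transferred state and the in-memory container, and none of it is Infinispan code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the 5.1 join sequence; not Infinispan code.
final class Preload51Sketch {

    /**
     * @param localStore   contents of the joiner's cache store
     * @param clusterState entries the other members send during state transfer
     * @param ownedKeys    keys the consistent hash assigns to the joiner
     * @return the joiner's in-memory contents once the join has completed
     */
    static Map<String, String> join(Map<String, String> localStore,
                                    Map<String, String> clusterState,
                                    Set<String> ownedKeys) {
        // Step 1: preload everything from the cache store into memory.
        Map<String, String> memory = new HashMap<>(localStore);
        // Step 2: state transfer overwrites already-preloaded values.
        memory.putAll(clusterState);
        // Step 3: delete entries that the local node does not own.
        memory.keySet().retainAll(ownedKeys);
        return memory;
    }
}
```

If the store holds a, b and c, the cluster only sends a (b was deleted while the joiner was down) and the joiner owns a and b, the result still contains the stale b from the store. Nothing in the three steps can tell "deleted elsewhere" apart from "never transferred".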
With the ISPN-1586 fix, preloading should work like this:
1. Start the StateTransferManager, receive initial CH.
2. If the local node is not the first to start up, state fetching (either in-memory or
persistent) is enabled, and the cache store is non-shared, clear the cache store.
3. Start the CacheLoaderManager, which preloads the cache store into memory, but only if
the local node is the first one to start the cache OR state fetching is disabled.
4. Run the initial state transfer, retrieving data from the other cache members (if any,
and if fetching state is enabled).
This solves ISPN-1586, but it does mean that data from non-shared cache stores will be
lost on all the nodes except the first that starts up. So if the last node to shut down is
not the first node to start back up, the cluster will lose data.
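The two conditions in steps 2 and 3 can be written down as predicates. This is a minimal sketch of the rules as stated in this mail; the class, method and parameter names are hypothetical and do not correspond to the actual StateTransferManager or CacheLoaderManager code.

```java
// Hypothetical predicates for the ISPN-1586 fix; not Infinispan code.
final class Ispn1586FixSketch {

    /** Step 2: clear the store on a joiner that fetches state and has a non-shared store. */
    static boolean shouldClearStore(boolean firstToStart,
                                    boolean fetchStateEnabled,
                                    boolean storeShared) {
        return !firstToStart && fetchStateEnabled && !storeShared;
    }

    /** Step 3: preload only on the first node, or when state fetching is disabled. */
    static boolean shouldPreload(boolean firstToStart, boolean fetchStateEnabled) {
        return firstToStart || !fetchStateEnabled;
    }
}
```

For a joiner with a non-shared store and state fetching enabled, shouldClearStore is true and shouldPreload is false, so whatever that node had persisted is dropped. That is the data-loss case described above when the joiner was the last node to shut down.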
These are the alternatives I'm considering:
a) Finish the ISPN-1586 fix and clearly document that non-shared cache stores don't
guarantee persistence after cluster restart (unless the last cache to stop is the first to
start back up and shutdown was spaced out to allow state transfer to move everything to
the last node).
b) Revert my ISPN-1586 fix and allow "zombie" cache entries on the joiners
(leaving ISPN-1586 open).
Maybe another approach could be:
1. Start the STM, retrieve initial CH
2. If the local node… (as above) … is non-shared, *don't clear it*, but mark the node
so preloading is *deferred*.
3. Start the CLM … skip preload if we marked it as deferred in step 2.
4. Run initial state transfer. This will write newer versions of entries to the cache
store if needed.
5. Now, if preloading has been deferred in step 2, start a preload, if we're
configured to do any preloading.
This should give us consistency.
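To make the ordering concrete, here is a rough model of the deferred-preload steps. It is a sketch under assumptions: DeferredPreloadSketch and its parameters are hypothetical names, plain maps stand in for the cache store and the in-memory container, and the deferPreload flag is assumed to have been computed in step 2.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the deferred-preload proposal; not Infinispan code.
final class DeferredPreloadSketch {

    /**
     * @param localStore        the node's cache store (updated in place by state transfer)
     * @param clusterState      entries received during the initial state transfer
     * @param deferPreload      outcome of step 2: true if preloading was marked as deferred
     * @param preloadConfigured whether the cache is configured to preload at all
     * @return the in-memory contents after step 5
     */
    static Map<String, String> join(Map<String, String> localStore,
                                    Map<String, String> clusterState,
                                    boolean deferPreload,
                                    boolean preloadConfigured) {
        Map<String, String> memory = new HashMap<>();
        // Step 3: start the CLM, preloading right away only if not deferred.
        if (preloadConfigured && !deferPreload) {
            memory.putAll(localStore);
        }
        // Step 4: state transfer updates memory and writes newer versions to the store.
        memory.putAll(clusterState);
        localStore.putAll(clusterState);
        // Step 5: run the deferred preload, keeping values that state transfer delivered.
        if (preloadConfigured && deferPreload) {
            for (Map.Entry<String, String> e : localStore.entrySet()) {
                memory.putIfAbsent(e.getKey(), e.getValue());
            }
        }
        return memory;
    }
}
```

The sketch only models writes. How entries that were removed on other members while this node was down would be handled is not spelled out in the steps above, so the model leaves such entries in the store untouched.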
I think there may be a third option:
c) Make preload a JMX operation and allow the user to run a cluster-wide preload once all
the nodes in the cluster have started up. But this looks a little complicated, and it
would require either versioning or prohibiting external cache writes until the
cluster-wide preload is done to ensure consistency.
I'm not sure how having this as a JMX option helps. Having versioning, etc. solves
the problem even with an automatic preload.
What do you guys think? Sanne, I'm particularly interested how you think option a)
would fit with the query module.
Cheers
Dan
--
Manik Surtani
manik(a)jboss.org
twitter.com/maniksurtani
Platform Architect, JBoss Data Grid
http://red.ht/data-grid