On Mon, Oct 1, 2012 at 12:48 PM, Galder Zamarreño
<galder(a)redhat.com> wrote:
On Sep 24, 2012, at 5:22 PM, Dan Berindei <dan.berindei(a)gmail.com> wrote:
> Hi guys
>
> During the final push for NBST I found a bug with preloading (entries that
> didn't belong on a joiner weren't removed after the initial state transfer). I
> decided to fix it and https://issues.jboss.org/browse/ISPN-1586 at the same time,
> since it was a longstanding bug and I had a reasonable idea of what to do. However,
> I missed some implications and I need to fix them - there is at least one Query
> test failing because of my change (SharedCacheLoaderQueryIndexTest).
>
> In 5.1, preloading worked like this:
> 1. Start the CacheLoaderManager, which preloads everything from the cache store
> into memory.
> 2. Start the StateTransferManager, retrieving data from the other cache members and
> overwriting already-preloaded values.
> 3. When the initial state transfer ends, entries not owned by the local node are
> deleted.
>
> The main issue with this, raised in ISPN-1586, is that entries that were deleted on
> the other cache members are "revived" on the joiner when it reads the data from
> the cache store. There is also a performance issue, because we load a lot of data
> that we then discard, but that's less important.
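To make the 5.1 "revival" problem concrete, here is a minimal Java sketch that models the three steps with plain maps. It is only an illustration: the names Preload51Sketch and joinAs51 are hypothetical and not Infinispan APIs, and the cache store, the cluster state and key ownership are reduced to a Map and a Set.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the 5.1 join sequence; not Infinispan code.
final class Preload51Sketch {

    // Returns the joiner's in-memory contents after the three 5.1 steps.
    static Map<String, String> joinAs51(Map<String, String> localStore,
                                        Map<String, String> clusterState,
                                        Set<String> ownedKeys) {
        // 1. Preload everything from the local cache store into memory.
        Map<String, String> memory = new HashMap<>(localStore);
        // 2. State transfer overwrites preloaded values, but it cannot
        //    remove keys that the cluster no longer has.
        memory.putAll(clusterState);
        // 3. Drop entries that the local node does not own.
        memory.keySet().retainAll(ownedKeys);
        return memory;
    }

    public static void main(String[] args) {
        Map<String, String> localStore = new HashMap<>();
        localStore.put("a", "old");
        localStore.put("b", "stale"); // deleted cluster-wide while the joiner was down

        Map<String, String> clusterState = new HashMap<>();
        clusterState.put("a", "new"); // "b" is absent: it was deleted

        Set<String> owned = new HashSet<>(Arrays.asList("a", "b"));
        System.out.println(joinAs51(localStore, clusterState, owned));
    }
}
```

In this model the joiner ends up holding "b" again even though the rest of the cluster deleted it, because nothing in steps 2 or 3 removes an owned key that exists only in the joiner's own store.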
>
> With the ISPN-1586 fix, preloading should work like this:
> 1. Start the StateTransferManager, receive the initial CH (consistent hash).
> 2. If the local node is not the first to start up, fetching state (either in-memory
> or persistent) is enabled, and the cache store is non-shared, clear the cache store.
> 3. Start the CacheLoaderManager, which preloads the cache store into memory - but only
> if the local node is the first one to start the cache OR if fetching state is
> disabled.
> 4. Run the initial state transfer, retrieving data from the other cache members (if
> any, and if fetching state is enabled).
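The conditions in steps 2 to 4 can be written down as a small decision table. The Java sketch below is one reading of those steps, not the actual Infinispan implementation: StartupPlanSketch, plan and the three boolean parameters are hypothetical names, and "first node" is assumed to be known from the initial CH.

```java
// Hypothetical sketch of the start-up decisions in the proposed fix; not Infinispan code.
final class StartupPlanSketch {
    final boolean clearStore;
    final boolean preload;
    final boolean stateTransfer;

    private StartupPlanSketch(boolean clearStore, boolean preload, boolean stateTransfer) {
        this.clearStore = clearStore;
        this.preload = preload;
        this.stateTransfer = stateTransfer;
    }

    static StartupPlanSketch plan(boolean firstNode, boolean fetchStateEnabled, boolean sharedStore) {
        // Step 2: a joiner that will fetch state clears its non-shared store.
        boolean clearStore = !firstNode && fetchStateEnabled && !sharedStore;
        // Step 3: preload only on the first node, or when fetching state is disabled.
        boolean preload = firstNode || !fetchStateEnabled;
        // Step 4: state transfer needs other members and fetching state enabled.
        boolean stateTransfer = !firstNode && fetchStateEnabled;
        return new StartupPlanSketch(clearStore, preload, stateTransfer);
    }

    @Override
    public String toString() {
        return "clearStore=" + clearStore + ", preload=" + preload
                + ", stateTransfer=" + stateTransfer;
    }

    public static void main(String[] args) {
        // First node to start, non-shared store, fetching state enabled.
        System.out.println(plan(true, true, false));
        // Joiner, non-shared store, fetching state enabled.
        System.out.println(plan(false, true, false));
    }
}
```

Under this reading, a joiner with a non-shared store and fetching state enabled always clears its store and never preloads, so whatever it had persisted is discarded in favour of the state held by the running members.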
>
> This solves ISPN-1586, but it does mean that data from non-shared cache stores will
> be lost on all the nodes except the first that starts up. So if the last node to shut
> down is not the first node to start back up, the cluster will lose data.
>
> These are the alternatives I'm considering:
> a) Finish the ISPN-1586 fix and clearly document that non-shared cache stores
> don't guarantee persistence after cluster restart (unless the last cache to stop is
> the first to start back up and shutdown was spaced out to allow state transfer to move
> everything to the last node).
^ What if the whole cluster goes down for other reasons? The in-memory state would be
gone, but having these non-shared cache stores should provide the opportunity to
recover.
If this is implemented, that option would be gone and partial state would be lost.
That's not good.

Good point Galder... unfortunately, we can't tell at the moment whether the whole cluster
was shut down (and the state on the joiner is up to date, maybe even essential) or only
the joiner was shut down and its state is stale.
In the spirit of my JMX proposal, maybe we could leave preloading enabled by default but
add a cluster-wide "stop preloading" flag that the admin can set once the
cluster has finished starting?

So if "stop preloading" is on, the newcomers would not load data from the cache
store and would assume it is stale?
In a restart after a clean shutdown it would be nice to have no state transfer at all;
instead, each cache would load its data from the store and keep it locally. This would
mean that the routing table is also serialised as part of the clean shutdown.
Cheers,
--
Mircea Markus
Infinispan lead (