On Tue, Mar 20, 2012 at 2:51 PM, Mircea Markus <mircea.markus@jboss.com> wrote:
Hi Dan,
Some notes:
Ownership Information:
- I remember the discussion with Sanne about an algorithm that wouldn't
require view prepare/rollback, but I think it would be very interesting to
see it described in detail somewhere, as all the points you raised in the
document are very closely tied to it
Yes, actually I'm trying to describe precisely that. Of course, there
are some complications...
- "However, it's not that clear what to do when a node
leaves while the
cache is in "steady state"" --> if (numOwners-1) nodes crash a state
transfer is (most likely) wanted in order not to loose consistency in the
eventuality another node goes down. By default numOwners is 2, so in the
first release, can't we just assume that for every leave we'd want
immediately issue a state transfer? I think this would cover most of our
user's needs and simplify the problem considerably.
Not necessarily: if the user has a TopologyAwareConsistentHash and has
split his cluster in two "sites" or "racks", he can bring down an
entire site/rack before rebalancing, without the risk of losing any
data.
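For example, something along these lines (just a sketch; the builder and
method names are from memory and may not match the configuration API
exactly) - each node declares its topology on the transport, and the
TopologyAwareConsistentHash then spreads the numOwners copies across sites:

import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.configuration.global.GlobalConfiguration;
import org.infinispan.configuration.global.GlobalConfigurationBuilder;

public class TopologyAwareConfigSketch {
   public static void main(String[] args) {
      // Each node says where it lives; the topology-aware CH tries to place
      // the numOwners copies of a key on different sites/racks/machines.
      GlobalConfiguration global = new GlobalConfigurationBuilder()
            .transport().defaultTransport()
            .siteId("siteA").rackId("rack1").machineId("node1")
            .build();

      Configuration cfg = new ConfigurationBuilder()
            .clustering().cacheMode(CacheMode.DIST_SYNC)
            .hash().numOwners(2)
            .build();

      // With one copy of every key on each site, all of siteB can be shut
      // down before rebalancing without losing any data.
   }
}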
Recovery
- I agree with your point re: recovery. This shouldn't be considered high
prio in the first release. The recovery info is kept in an independent
cache, which allows a lot of flexibility: e.g. it can point to a shared
cache store so that recovery caches on other nodes can read that info when
needed.
I don't agree about the shared cache store, because then every cache
transaction would have to write to that store. I think that would be
cost-prohibitive.
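To make that concrete, today the recovery info goes into a dedicated cache
configured roughly like this (builder method names from memory, so treat it
as a sketch):

import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;

public class RecoveryConfigSketch {
   public static void main(String[] args) {
      // Recovery state is kept in a separate cache, named here for clarity.
      Configuration cfg = new ConfigurationBuilder()
            .transaction().recovery().enable()
            .recoveryInfoCacheName("recoveryInfoCache")
            .build();
      // Pointing that cache at a shared cache store would let other nodes
      // read the info when needed, but every transaction would then also pay
      // for a write to that store.
   }
}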
Locking/sync..
"The state transfer task will acquire the DCL in exclusive mode after
updating the cache view id and release it after obtaining the data container
snapshot. This will ensure the state transfer task can't proceed on the old
owner until all write commands that knew about the old CH have finished
executing." <-- that means that incoming writes would block for the duration
of iterating the DataContainer? That shouldn't be too bad, unless a cache
store is present..
Ok, first of all, in the new draft I merged the data container lock and
the cache view lock; I don't think keeping them separate actually helped
with anything, but it did make things harder to explain.
The idea was that incoming writes would only block while state
transfer gets the iterator to the data container, which is very short.
But the state transfer will also block on all the writes to the data
container - to make sure that we won't miss any writes.
The algorithm is like this:
Write command
===
1. acquire read lock
2. check cache view id
3. write entry to data container
4. release read lock
State Transfer
===
1. acquire write lock
2. update ch
3. update cache view id
4. release write lock
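In Java it would look roughly like this (just a sketch with made-up class
names and a ConcurrentMap standing in for the data container, not the real
code):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class CacheViewLockSketch {
   private final ReentrantReadWriteLock viewLock = new ReentrantReadWriteLock();
   private final ConcurrentMap<Object, Object> dataContainer =
         new ConcurrentHashMap<Object, Object>();
   private int cacheViewId;

   // Write command: many writers share the read lock, so they only block
   // while state transfer holds the write lock (a very short window).
   void writeCommand(Object key, Object value, int commandViewId) {
      viewLock.readLock().lock();                 // 1. acquire read lock
      try {
         if (commandViewId != cacheViewId) {      // 2. check cache view id
            throw new IllegalStateException("Stale view, retry the command");
         }
         dataContainer.put(key, value);           // 3. write entry to data container
      } finally {
         viewLock.readLock().unlock();            // 4. release read lock
      }
   }

   // State transfer: the write lock waits for every in-flight write that saw
   // the old view, then installs the new view id before releasing it.
   void installView(int newViewId) {
      viewLock.writeLock().lock();                // 1. acquire write lock
      try {
         // 2. update the consistent hash (omitted in this sketch)
         cacheViewId = newViewId;                 // 3. update cache view id
         // ... obtain the data container snapshot/iterator here ...
      } finally {
         viewLock.writeLock().unlock();           // 4. release write lock
      }
   }
}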
"CacheViewInterceptor [..] also needs to block in the interval
between a
node noticing a leave and actually receiving the new cache view from the
coordinator" <-- why can't the local cache star computing the new CH
independently and not wait for the coordinator..?
Because the local cache doesn't listen for join and leave requests,
only the coordinator does, and the cache view is computed based on the
requests that the coordinator has seen (and not on the JGroups cluster
joins and leaves).
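Conceptually the coordinator side is just this (names are invented for the
sketch, it's not the actual cache view manager code); the other nodes never
see the pending join/leave requests, so they have to wait for the view the
coordinator broadcasts:

import java.util.LinkedHashSet;
import java.util.Set;

class ViewCoordinatorSketch {
   private final Set<String> pendingMembers = new LinkedHashSet<String>();
   private int lastViewId;

   // Caches send explicit join/leave requests to the coordinator; JGroups
   // cluster view changes alone are not enough to build the cache view.
   synchronized void requestJoin(String node) {
      pendingMembers.add(node);
   }

   synchronized void requestLeave(String node) {
      pendingMembers.remove(node);
   }

   // Called when the coordinator decides to install a new view; the result
   // is broadcast to the cluster, never computed locally on the members.
   synchronized PendingViewSketch computeNextView() {
      return new PendingViewSketch(++lastViewId,
            new LinkedHashSet<String>(pendingMembers));
   }
}

class PendingViewSketch {
   final int viewId;
   final Set<String> members;

   PendingViewSketch(int viewId, Set<String> members) {
      this.viewId = viewId;
      this.members = members;
   }
}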
Cheers
Dan