[infinispan-dev] Non-blocking state transfer (ISPN-1424)

Fri Mar 16 03:16:31 EDT 2012

On 3/15/12 11:29 AM, Dan Berindei wrote:

> That was basically what we did in the blocking design: the ST commands
> could execute during ST, but regular commands would block until the
> end of the ST. With async caches, that meant we would use JGroups' 1
> queue per sender (so not a global queue, but close).
>
> The problem was not with the regular commands that arrived after the
> start of the ST, but with the commands that had already started
> executing when ST started. This is the classic example:
> 1. A prepare command for Tx1 locks k1 on node A
> 2. A prepare command for Tx2 tries to acquire lock k1 on node A
> 3. State transfer starts up and blocks all write commands
> 4. The Tx1 commit command, which will unlock k1, arrives but can't run
> until state transfer has ended
> 5. The Tx2 prepare command times out on the lock acquisition after 10
> seconds (by default)
> 6. State transfer can now proceed and push or receive data.
> 7. The Tx1 commit can now run and unlock k1. It's too late for Tx2, however.
>
> The solution I had in mind for the old design was to add some kind of
> deadlock detection to the LockManager and throw a
> StateTransferInProgress when a deadlock with the state transfer is
> detected.

OK. I don't like the old design, as ST has to wait until all pending TXs
(those with locks held) have committed before it can make progress. If
the lock acquisition timeout is high, we'll have to wait for a long time.
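
For reference, the deadlock-detection idea quoted above could look roughly
like the sketch below. This is only an illustration under my own
assumptions, not Infinispan's LockManager: ToyLockManager, its methods and
StateTransferInProgressException are hypothetical names, and a lock is
owned by a transaction name rather than by a thread. A waiter that finds
the key held while state transfer is running fails right away instead of
sitting out the full lock timeout.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical exception, named after the "StateTransferInProgress" idea.
class StateTransferInProgressException extends RuntimeException {
    StateTransferInProgressException(String msg) { super(msg); }
}

// Hypothetical toy lock manager; not the real Infinispan LockManager.
class ToyLockManager {
    private final ReentrantLock mutex = new ReentrantLock();
    private final Condition changed = mutex.newCondition();
    private final Map<String, String> owners = new HashMap<>(); // key -> owning tx
    private boolean stateTransferInProgress;

    // Returns true if tx owns the key, false on timeout or interrupt.
    // Throws if the key is held by another tx while state transfer is running.
    boolean acquire(String key, String tx, long timeoutMillis) {
        long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        mutex.lock();
        try {
            while (true) {
                String owner = owners.get(key);
                if (owner == null || owner.equals(tx)) {
                    owners.put(key, tx);
                    return true;
                }
                if (stateTransferInProgress) {
                    throw new StateTransferInProgressException(
                        tx + " would wait for " + key + " held by " + owner);
                }
                if (nanos <= 0) return false;
                nanos = changed.awaitNanos(nanos); // woken by release or ST start
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            mutex.unlock();
        }
    }

    void release(String key, String tx) {
        mutex.lock();
        try {
            if (tx.equals(owners.get(key))) owners.remove(key);
            changed.signalAll();
        } finally {
            mutex.unlock();
        }
    }

    void setStateTransferInProgress(boolean inProgress) {
        mutex.lock();
        try {
            stateTransferInProgress = inProgress;
            changed.signalAll(); // wake waiters so they can re-check the flag
        } finally {
            mutex.unlock();
        }
    }
}

class DeadlockDetectionDemo {
    public static void main(String[] args) throws InterruptedException {
        ToyLockManager lm = new ToyLockManager();
        lm.acquire("k1", "Tx1", 1000);            // step 1: Tx1 locks k1
        Thread tx2 = new Thread(() -> {
            try {
                lm.acquire("k1", "Tx2", 10_000);  // step 2: Tx2 wants k1
            } catch (StateTransferInProgressException e) {
                System.out.println("Tx2 aborted early: " + e.getMessage());
            }
        });
        tx2.start();
        lm.setStateTransferInProgress(true);      // step 3: ST starts
        tx2.join();                               // Tx2 fails without the 10 s wait
        lm.release("k1", "Tx1");                  // the Tx1 commit unlocks k1
        lm.setStateTransferInProgress(false);
    }
}
```

Tx2 still fails here, so this only shortens the wait; it doesn't remove
the dependency of ST on pending TXs.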

> With the new design I thought it would be simpler to not acquire a big
> lock for the entire duration of the write command that would prevent
> state transfer. Instead I would acquire different locks for much
> shorter amounts of time, and at the beginning of each lock acquisition
> we would just check that the command's view id is still the correct
> one.

OK. Perhaps an overview of the new design in the document is warranted. 
There's a section on transfer of CacheEntries and one on locks, but I 
didn't see a combined discussion. Perhaps an example like the one above 
would be good?
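
To make sure I understand the view id check, here is a minimal sketch of
how I read it. The names (ViewGuard, OutdatedViewException,
runShortSection) are made up for illustration and are not from the design
document. Each short critical section first compares the view id the
command was started in with the current one, and rejects the command if a
new view has been installed in the meantime.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical exception for a command that was started in an older view.
class OutdatedViewException extends RuntimeException {
    OutdatedViewException(int commandViewId, int currentViewId) {
        super("command view " + commandViewId + " != current view " + currentViewId);
    }
}

// Hypothetical guard: a short lock plus a view id check at acquisition time.
class ViewGuard {
    private final ReentrantLock lock = new ReentrantLock();
    private int currentViewId = 1; // guarded by lock

    // Called when a new cluster view is installed (e.g. at the start of ST).
    int installNewView() {
        lock.lock();
        try {
            return ++currentViewId;
        } finally {
            lock.unlock();
        }
    }

    int currentViewId() {
        lock.lock();
        try {
            return currentViewId;
        } finally {
            lock.unlock();
        }
    }

    // Runs one short section of a command. The lock is held only for the
    // duration of the section, not for the whole write command.
    void runShortSection(int commandViewId, Runnable section) {
        lock.lock();
        try {
            if (commandViewId != currentViewId) {
                throw new OutdatedViewException(commandViewId, currentViewId);
            }
            section.run();
        } finally {
            lock.unlock();
        }
    }
}

class ViewGuardDemo {
    public static void main(String[] args) {
        ViewGuard guard = new ViewGuard();
        int viewAtStart = guard.currentViewId();
        guard.runShortSection(viewAtStart, () -> System.out.println("section 1 ran"));
        guard.installNewView(); // a view change happens between two sections
        try {
            guard.runShortSection(viewAtStart, () -> System.out.println("not reached"));
        } catch (OutdatedViewException e) {
            System.out.println("command rejected: " + e.getMessage());
        }
    }
}
```

What the caller does with a rejected command (retry in the new view, or
fail the TX) is left open here, as I don't know what the design intends.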

I now realize how much simpler the use of total order is here: since all 
updates in a cluster happen in total order, we don't need to acquire 
locks in 1 phase and release them in another phase. ST is then just 
another update, inserted at a certain place in the stream of updates.
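
A toy illustration of what I mean, with made-up names (OrderedMsg,
Replica, TotalOrderStateTransferDemo) and no real JGroups or Infinispan
APIs: every node delivers the same stream in the same order, and the state
transfer is just a marker in that stream. The joiner ignores updates
ordered before the marker, takes a snapshot at the marker, and applies
everything ordered after it.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical message in the totally ordered stream.
// A null key marks the state transfer; anything else is a put.
class OrderedMsg {
    final String key;
    final String value;

    private OrderedMsg(String key, String value) {
        this.key = key;
        this.value = value;
    }

    static OrderedMsg put(String key, String value) { return new OrderedMsg(key, value); }
    static OrderedMsg stateTransfer() { return new OrderedMsg(null, null); }
    boolean isStateTransfer() { return key == null; }
}

// Hypothetical replica: a map plus a flag saying whether it has state yet.
class Replica {
    final Map<String, String> data = new HashMap<>();
    private boolean hasState;

    Replica(boolean hasState) { this.hasState = hasState; }

    // Updates ordered before the ST marker are skipped by a joiner,
    // because the snapshot taken at the marker already contains them.
    void apply(OrderedMsg m) {
        if (hasState) data.put(m.key, m.value);
    }

    void receiveState(Map<String, String> snapshot) {
        data.clear();
        data.putAll(snapshot);
        hasState = true;
    }
}

class TotalOrderStateTransferDemo {
    // Delivers the same stream, in the same order, to both nodes.
    static void deliver(List<OrderedMsg> stream, Replica member, Replica joiner) {
        for (OrderedMsg m : stream) {
            if (m.isStateTransfer()) {
                joiner.receiveState(member.data); // snapshot at this point in the order
            } else {
                member.apply(m);
                joiner.apply(m);
            }
        }
    }

    public static void main(String[] args) {
        List<OrderedMsg> stream = Arrays.asList(
            OrderedMsg.put("k1", "a"),
            OrderedMsg.put("k2", "b"),
            OrderedMsg.stateTransfer(),
            OrderedMsg.put("k1", "c"));
        Replica member = new Replica(true);
        Replica joiner = new Replica(false);
        deliver(stream, member, joiner);
        System.out.println("states equal: " + member.data.equals(joiner.data));
    }
}
```

No locks are held across phases here; the ordering alone decides which
updates are part of the snapshot and which come after it.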

I assume the Cloud-TM guys don't do state transfer in their prototype,
or do they? Pedro? If not, then there needs to be an implementation of
ST for TO (total order).

Cheers,

-- 
Bela Ban, JGroups lead (http://www.jgroups.org)