[infinispan-dev] Non-blocking state transfer (ISPN-1424)

Thu Mar 22 04:48:55 EDT 2012

Hi guys

I've updated the document at
https://community.jboss.org/wiki/Non-blockingStateTransfer
This time I've added an overview and changed "locking requirements" to
"command execution during state transfer".
I also assumed that we are going to use the tree of cache views.

More comments inline.

On Fri, Mar 16, 2012 at 9:16 AM, Bela Ban <bban at redhat.com> wrote:
>
>
> On 3/15/12 11:29 AM, Dan Berindei wrote:
>
>> That was basically what we did in the blocking design: the ST commands
>> could execute during ST, but regular commands would block until the
>> end of the ST. With async caches, that meant we would use JGroups' 1
>> queue per sender (so not a global queue, but close).
>>
>> The problem was not with the regular commands that arrived after the
>> start of the ST, but with the commands that had already started
>> executing when ST started. This is the classic example:
>> 1. A prepare command for Tx1 locks k1 on node A
>> 2. A prepare command for Tx2 tries to acquire lock k1 on node A
>> 3. State transfer starts up and blocks all write commands
>> 4. The Tx1 commit command, which will unlock k1, arrives but can't run
>> until state transfer has ended
>> 5. The Tx2 prepare command times out on the lock acquisition after 10
>> seconds (by default)
>> 6. State transfer can now proceed and push or receive data.
>> 7. The Tx1 commit can now run and unlock k1. It's too late for Tx2, however.
>>
>> The solution I had in mind for the old design was to add some kind of
>> deadlock detection to the LockManager and throw a
>> StateTransferInProgress when a deadlock with the state transfer is
>> detected.
>
>
> OK. I don't like the old design, as ST has to wait until all pending TXs
> (those with locks held) have committed before we can make progress. If
> the lock acquisition timeout is high, we'll have to wait for a long time.
>
>
>> With the new design I thought it would be simpler to not acquire a big
>> lock for the entire duration of the write command that would prevent
>> state transfer. Instead I would acquire different locks for much
>> shorter amounts of time, and at the beginning of each lock acquisition
>> we would just check that the command's view id is still the correct
>> one.
>
>
> OK. Perhaps an overview of the new design in the document is warranted.
> There's a section on transfer of CacheEntries and one on locks, but I
> didn't see a combined discussion. Perhaps an example like the one above
> would be good ?
>

I hope I've improved on this in the new version.
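
To make the "check the view id at the beginning of each lock acquisition"
idea a bit more concrete, here is a minimal sketch. It is not the actual
Infinispan code: ViewCheckedLock, StaleViewException, installNewView and
runUnderLock are all hypothetical names, and a single ReentrantLock stands
in for the per-key locks.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch only; none of these names are Infinispan API.
public class ViewCheckedLock {

    /** Thrown when a command was created in a view that is no longer current. */
    public static class StaleViewException extends RuntimeException {
        public StaleViewException(int commandViewId, int currentViewId) {
            super("command view " + commandViewId + " != current view " + currentViewId);
        }
    }

    private final AtomicInteger viewId = new AtomicInteger(1);
    // Stand-in for a per-key lock; a real cache would have one per key or stripe.
    private final ReentrantLock keyLock = new ReentrantLock();

    /** Assumed hook: called when state transfer installs a new cache view. */
    public int installNewView() {
        return viewId.incrementAndGet();
    }

    public int currentViewId() {
        return viewId.get();
    }

    /**
     * Checks the command's view id first, then holds the lock only for the
     * short critical section instead of for the whole command.
     * Returns false if the lock could not be acquired within the timeout.
     */
    public boolean runUnderLock(int commandViewId, long timeoutMillis, Runnable action) {
        int current = viewId.get();
        if (commandViewId != current) {
            // The view changed: fail fast instead of waiting on the lock.
            throw new StaleViewException(commandViewId, current);
        }
        try {
            if (!keyLock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return false;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        try {
            action.run();
            return true;
        } finally {
            keyLock.unlock();
        }
    }
}
```

A command that started in the old view gets a StaleViewException on its
next lock acquisition and never sits on a lock until the timeout, which is
what went wrong in step 5 of the example above. What the caller does with
the exception (retry in the new view, forward to the new owner) is left
open here.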

> I now realize how much simpler the use of total order is here: since all
> updates in a cluster happen in total order, we don't need to acquire
> locks in 1 phase and release them in another phase. ST is then just
> another update, inserted at a certain place in the stream of updates.
>

Unfortunately I think that would mean a blocking state transfer,
because all the other updates would have to wait on state to be
transferred.
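
A toy model of what I mean, assuming the updates in the total order are
applied strictly one after another on each node (that is my reading of
the proposal, not something I have checked against the Cloud-TM code).
TotalOrderDelay and Update are hypothetical names and the durations are
made-up numbers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: models sequential application of a totally ordered stream.
public class TotalOrderDelay {

    /** One item in the totally ordered stream, with an assumed apply time. */
    public static final class Update {
        final String name;
        final long durationMillis;

        public Update(String name, long durationMillis) {
            this.name = name;
            this.durationMillis = durationMillis;
        }
    }

    /**
     * Items are applied one at a time in stream order, so each item
     * completes at the sum of all durations up to and including its own.
     */
    public static List<Long> completionTimes(List<Update> stream) {
        List<Long> result = new ArrayList<>();
        long clock = 0;
        for (Update u : stream) {
            clock += u.durationMillis;
            result.add(clock);
        }
        return result;
    }
}
```

With a stream of put (1 ms), state transfer (5000 ms), put (1 ms), the
second put completes at 5002 ms: it pays for the whole state transfer
even though it may touch an unrelated key.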

> I assume the Cloud-TM guys don't do state transfer in their prototype,
> or do they ? Pedro ? If not, then there needs to be an implementation of
> ST for TO.
>

I'm curious about this as well.

Cheers
Dan