[infinispan-dev] Partial state transfer

Wed Jun 15 11:39:31 EDT 2011

I looked into adding partial state transfer back into JGroups, but found 
out that partial state transfer is fundamentally flawed, something I've 
always suspected! (Regular state transfer is correct, and has always 
been correct.)

- Say we have node A and B. B requests the state from A
- There are partial states X and Y
- Message M1 modifies X, M2 modifies Y

Here's what happens:

T1: A multicasts M1
T2: A delivers M1, and changes X
T3: B sends a GET_STATE("Y") request to A    // partial state request for state Y
T4: A multicasts M2
T5: A delivers M2, changing Y
T6: A receives the GET_STATE request and sends back a SET_STATE response 
containing Y and the digest (which covers the seqnos of M1 and M2)
T7: B receives the SET_STATE response, installs the digest (which now 
covers M1 and M2) and state Y, *BUT NOT* state X!
T8: *** B receives M1, discards it because it is already in its digest ***
T9: B receives M2, and also discards it

At time T8, M1 (which would have changed state X) is discarded because 
its seqno is already covered by the digest sent with the SET_STATE 
response. State X on B is therefore incorrect, as M1 was never applied 
there!
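
To make the timeline concrete, here is a small toy model of it in Java. 
This is not JGroups code: Msg, Node, deliver() and setPartialState() are 
made up for the example, and the digest is reduced to a single "highest 
seqno seen from A" number.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the T1..T9 timeline. Not JGroups code: Msg, Node and the
// single-number "digest" are simplifications invented for this example.
class PartialStateTransferDemo {

    // A multicast from A that sets one named partial state to a value.
    static final class Msg {
        final long seqno;
        final String key;
        final int value;

        Msg(long seqno, String key, int value) {
            this.seqno = seqno;
            this.key = key;
            this.value = value;
        }
    }

    static final class Node {
        final Map<String, Integer> state = new HashMap<>();
        long digest = 0; // highest seqno from A considered "already seen"

        Node() {
            state.put("X", 0);
            state.put("Y", 0);
        }

        // Applies the message unless the digest already covers its seqno.
        boolean deliver(Msg m) {
            if (m.seqno <= digest) {
                return false; // discarded as a duplicate
            }
            state.put(m.key, m.value);
            digest = m.seqno;
            return true;
        }

        // SET_STATE for one partial state: installs that value AND the digest.
        void setPartialState(String key, int value, long newDigest) {
            state.put(key, value);
            digest = newDigest;
        }
    }

    // Replays T1..T9 and returns {A, B}.
    static Node[] run() {
        Node a = new Node();
        Node b = new Node();
        Msg m1 = new Msg(1, "X", 1);
        Msg m2 = new Msg(2, "Y", 2);

        a.deliver(m1);                                      // T1, T2
        // T3: B asks A for partial state "Y"
        a.deliver(m2);                                      // T4, T5
        b.setPartialState("Y", a.state.get("Y"), a.digest); // T6, T7
        b.deliver(m1);                                      // T8: discarded
        b.deliver(m2);                                      // T9: discarded
        return new Node[] { a, b };
    }

    public static void main(String[] args) {
        Node[] n = run();
        System.out.println("A: X=" + n[0].state.get("X") + " Y=" + n[0].state.get("Y"));
        System.out.println("B: X=" + n[1].state.get("X") + " Y=" + n[1].state.get("Y"));
    }
}
```

In this model A ends up with X=1 and Y=2, while B ends up with X=0 and 
Y=2: Y was transferred correctly, but the update to X is lost for good.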

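In summary, if there are updates to several partial states and the 
requester has not received all of them before it requests a partial 
state, the digest in the response wins: every update it covers is 
discarded on arrival, including updates to partial states that were not 
part of the transfer.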

I'm a real idiot, as I've written this down before, in 2006: see [1] for 
details.

In a nutshell, [1] shows that partial state transfer doesn't work, 
unless virtual synchrony (FLUSH) is used.
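
A rough sketch of why flushing helps, under the assumption that the 
flush makes B deliver every message already sent before the state is 
captured. Again this is a toy model, not JGroups code; all names (Msg, 
Node, deliverPending, run) are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy comparison of partial state transfer with and without a flush-like
// step. All names here are invented for the example.
class FlushBeforeStateDemo {

    static final class Msg {
        final long seqno;
        final boolean targetsX; // true: sets X, false: sets Y
        final int value;

        Msg(long seqno, boolean targetsX, int value) {
            this.seqno = seqno;
            this.targetsX = targetsX;
            this.value = value;
        }
    }

    static final class Node {
        int x, y;
        long digest; // highest seqno from A considered "already seen"
        final Deque<Msg> inFlight = new ArrayDeque<>();

        // Delivers all queued messages, skipping those covered by the digest.
        void deliverPending() {
            while (!inFlight.isEmpty()) {
                Msg m = inFlight.poll();
                if (m.seqno <= digest) continue;
                if (m.targetsX) x = m.value; else y = m.value;
                digest = m.seqno;
            }
        }
    }

    // Returns B after a partial transfer of Y, optionally flushing first.
    static Node run(boolean flushFirst) {
        Node a = new Node();
        Node b = new Node();
        for (Node n : new Node[] { a, b }) {
            n.inFlight.add(new Msg(1, true, 1));  // M1 modifies X
            n.inFlight.add(new Msg(2, false, 2)); // M2 modifies Y
        }
        a.deliverPending();
        if (flushFirst) {
            b.deliverPending(); // flush-like step: B catches up before the transfer
        }
        b.y = a.y;              // partial state Y ...
        b.digest = a.digest;    // ... plus the digest
        b.deliverPending();     // late messages are checked against the digest
        return b;
    }

    public static void main(String[] args) {
        System.out.println("no flush: B.x=" + run(false).x);
        System.out.println("flush:    B.x=" + run(true).x);
    }
}
```

Without the flush-like step B keeps X=0; with it, B has applied M1 
before the digest is installed, so X=1 and nothing is lost.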

So I propose that Infinispan and JBoss AS look into how they can replace 
their use of partial state transfer. I suggest Infinispan use the same 
approach already used for state transfer with mode=distribution.

Opinions?

[1] https://github.com/belaban/JGroups/blob/master/doc/design/PartialStateTransfer.txt

-- 
Bela Ban
Lead JGroups / Clustering Team
JBoss