[infinispan-dev] Non-blocking state transfer (ISPN-1424)

Thu Mar 22 05:26:33 EDT 2012

Thanks Dan,

here are some comments / questions:

"2. Each cache member receives the ownership information from the 
coordinator and starts rejecting all commands started in the old cache view"

How do you know a command was started in the old cache view;  does this 
mean you're shipping a cache view ID with every request ?

"2.1. Commands with the new cache view id will be blocked until we have 
installed the new CH and we have received lock information from the 
previous owners"

Doesn't this make this design *blocking* again ? Or do you queue 
requests with the new view-ID, return immediately and apply them when 
the new view-id is installed ? If the latter is the case, what do you 
return ? An OK (how do you know the request will apply OK) ?

"A merge can be coalesced with multiple state transfers, one running in 
each partition. So in the general case a coalesced state transfer 
contain a tree of cache view changes."

Hmm, this can make a state transfer message quite large. Are we trimming 
the modification list ? E.g if we have 10 PUTs on K, 1 removal of K, and 
another 4 PUTs, do we just send the *last* PUT, or do we send a 
modification list of 15 ?

"Get commands can write to the data container as well when L1 is 
enabled, so we can't block just write commands."

Another downside of the L1 cache being part of the regular cache. IMO it 
would be much better to separate the 2, as I wrote in previous emails 
yesterday.

"The new owners will start receiving commands before they have received 
all the state
     * In order to handle those commands, the new owners will have to 
get the values from the old owners
     * We know that the owners in the later views have a newer version 
of the entry (if they have it at all). So we need to go back on the 
cache views tree and ask all the nodes on one level at the same time - 
if we don't get any certain anwer we go to the next level and repeat."

How does a member C know that it *will* receive any state at all ? E.g. 
if we had key K on A and B, and now B crashed, then A would push a copy 
of K to C.

So when C receives a request R from another member E, but hasn't 
received a state transfer from A yet, how does it know whether to apply 
or queue R ? Does C wait until it get an END-OF-ST message from the 
coordinator ?

(skipping the rest due to exhaustion by complexity :-))

Over the last couple of days, I've exchanged a couple of emails with the 
Cloud-TM guys, and I'm more and more convinced that their total order 
solution is the simpler approach to (1) transactional updates and (2) 
state transfer. They don't have a solution for (2) yet, but I believe 
this can be done as another totally ordered transaction, applied in the 
correct location within the update stream. Or, we could possibly use a 
flush: as we don't need to wait for pending TXs to complete and release 
their locks, this should be quite fast.

So my 5 cents:

#1 We should focus on the total order approach, and get rid of the 2PC 
and locking business for transactional updates

#2 Really focus on the eventual consistency approach

Thoughts ?

-- 
Bela Ban, JGroups lead (http://www.jgroups.org)