[jboss-dev-forums] [Design of JBossCache] - Re: New state transfer in JBoss Cache

bela@jboss.com do-not-reply at jboss.com
Fri Dec 14 16:56:43 EST 2007


"bstansberry at jboss.com" wrote : Do you mean you see a problem with getting the state even with MVCC in place?
  | 

I've only looked at optimistic locking so far. Pessimistic locking is probably harder to get right because nodes are more likely to be locked than with O/L.

My current ideas tend towards a queue for prepares, state transfer requests and commits/rollbacks.

Prepares acquire locks; commits copy data back into the tree from the workspace. I can argue that if D wants to acquire state from A in a cluster {A,B,C,D}, then (in no particular order):
- the first prepare in the queue will always succeed
- there is never a commit for TX-N that was started after D joined and asked for state, because D will not reply to prepares during state transfer
- a state request from D has to be pushed to the top of the queue (commits can be ahead of the state request because commits only copy data and release locks)
- the queue is suspended during state transfer. I don't think we should reply YES to a prepare for a TX if we cannot be 100% certain we'll be able to commit it later
- I can add queueing to JGroups, but I'm not sure it's a gain, because missing messages get retransmitted anyway. If we wait for more than LockAcquisitionTimeout ms, all prepares will result in corresponding rollbacks later anyway
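
A rough Java sketch of the queue ordering described above. All names here (TxRequestQueue, Kind, stateTransferDone) are made up for illustration and are not JBoss Cache or JGroups API. It only models ordering and suspension, not locking or the workspace, and it assumes rollbacks may stay ahead of a state request the same way commits do.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: one queue for prepares, commits/rollbacks and state requests.
final class TxRequestQueue {
    enum Kind { PREPARE, COMMIT, ROLLBACK, STATE_REQUEST }

    static final class Request {
        final Kind kind;
        final String txId; // null for a state request

        Request(Kind kind, String txId) {
            this.kind = kind;
            this.txId = txId;
        }

        @Override
        public String toString() {
            return kind + (txId == null ? "" : ":" + txId);
        }
    }

    private final List<Request> queue = new ArrayList<>();
    private boolean suspended;

    void add(Request r) {
        if (r.kind != Kind.STATE_REQUEST) {
            queue.add(r); // normal requests keep arrival order
            return;
        }
        // A state request jumps ahead of every prepare. Commits/rollbacks already
        // at the head may stay in front: they only copy data and release locks.
        int i = 0;
        while (i < queue.size() && queue.get(i).kind != Kind.PREPARE) {
            i++;
        }
        queue.add(i, r);
    }

    // Returns the next request to process, or null if the queue is empty or suspended.
    Request poll() {
        if (suspended || queue.isEmpty()) {
            return null;
        }
        Request r = queue.remove(0);
        if (r.kind == Kind.STATE_REQUEST) {
            suspended = true; // nothing else is processed during state transfer
        }
        return r;
    }

    void stateTransferDone() {
        suspended = false;
    }
}
```

With COMMIT:tx0, PREPARE:tx1 and PREPARE:tx2 queued, a state request added afterwards is processed right after COMMIT:tx0, and the two prepares wait until stateTransferDone() is called.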

I need to write these ideas down in a more coherent form next week...

anonymous wrote : 
  | Either way, MVCC is quite a ways off and we discussed support for P/L and O/L for at least a year after it's available. So, I think we need to get a reasonable solution for the P/L and O/L cases ASAP.
  | 

Agreed.

anonymous wrote : 
  | Re: the "queue messages during state transfer" approach discussed on the wiki, we have (in the 1.4.X branch at least) an existing implementation that is very close to what is discussed.  Partial state transfer done during region activation implements the message queueing; it should be fairly trivial to extend the same logic to the full initial state transfer.
  | 

Well, we cannot indiscriminately queue all messages. It is important that we know exactly *when* to start queueing, otherwise we either (a) get duplicate messages or (b) miss some messages. That's why I use consistent cuts (digests) in JGroups.
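
To illustrate why the cut matters, here is a much simplified Java sketch. It assumes a digest can be reduced to "highest sequence number per sender already reflected in the transferred state"; the real JGroups digest carries more than that, and DigestCut and its methods are hypothetical names, not JGroups API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of a consistent cut taken with the state.
final class DigestCut {
    // sender -> highest seqno whose effect is already contained in the state
    private final Map<String, Long> highestSeqnoInState = new HashMap<>();

    void record(String sender, long seqno) {
        highestSeqnoInState.put(sender, seqno);
    }

    // true  -> message is newer than the cut and must be applied after the state
    // false -> message is already reflected in the state (applying it would duplicate it)
    boolean mustApplyAfterState(String sender, long seqno) {
        Long covered = highestSeqnoInState.get(sender);
        // Assumption: a sender unknown to the digest started sending after the cut.
        return covered == null || seqno > covered;
    }
}
```

If the digest records A at 22, then A's message 22 is dropped as a duplicate and 23 is the first one that has to be applied on top of the state; starting to queue any earlier or later than the cut gives exactly cases (a) or (b).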

anonymous wrote : 
  | Responding positively to prepare even though we just queue the message still worries me; one concern is D receives things in different order than A did due to message retransmission. One possible way to mitigate this is to add some intelligence to how the queue gets applied; e.g. before applying a prepare, look for rollback or commit; discard both if rollback, move commit up in queue so it gets applied immediately.
  | 

Sounds complex. I agree with your assertion above that responding with YES to a prepare is probably not a good idea, as we cannot really be sure we can commit later. We can only be sure if we hold the lock on the real tree and have the data that was shipped with the prepare in the workspace.
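
For reference, the queue post-processing suggested in the quote could look roughly like the Java sketch below. This is only one reading of that suggestion, with hypothetical names (QueuedTxReplay, Msg, reorder), and it ignores locking entirely, which is the part that actually worries me.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of "look ahead for the outcome before applying a prepare".
final class QueuedTxReplay {
    enum Kind { PREPARE, COMMIT, ROLLBACK }

    static final class Msg {
        final Kind kind;
        final String txId;

        Msg(Kind kind, String txId) {
            this.kind = kind;
            this.txId = txId;
        }

        @Override
        public String toString() {
            return kind + ":" + txId;
        }
    }

    // Returns the order in which the queued messages would be applied.
    static List<Msg> reorder(List<Msg> queued) {
        List<Msg> pending = new ArrayList<>(queued);
        List<Msg> out = new ArrayList<>();
        while (!pending.isEmpty()) {
            Msg m = pending.remove(0);
            if (m.kind != Kind.PREPARE) {
                out.add(m); // outcome without a queued prepare: apply as is
                continue;
            }
            // Look ahead for the first commit or rollback of the same TX.
            int outcome = -1;
            for (int i = 0; i < pending.size(); i++) {
                Msg c = pending.get(i);
                if (c.kind != Kind.PREPARE && c.txId.equals(m.txId)) {
                    outcome = i;
                    break;
                }
            }
            if (outcome < 0) {
                out.add(m); // outcome not known yet: apply the prepare alone
                continue;
            }
            Msg o = pending.remove(outcome);
            if (o.kind == Kind.ROLLBACK) {
                continue; // rolled back: discard both prepare and rollback
            }
            out.add(m);
            out.add(o); // commit moved up so it is applied right after its prepare
        }
        return out;
    }
}
```

Even in this toy form, the reordering says nothing about whether the prepare could have acquired its locks on the real tree.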

anonymous wrote : 
  | The queuing bit for partial state transfer was implemented because we were using RPC calls for that rather than JGroups state transfer.  With JG state transfer we could get the same effect from the JGroups layer using digests/retransmission.  However, JGroups couldn't do the prepare/rollback/commit matching I discuss above.  Also, JGroups will only detect and retransmit missing messages next time a sender sends, which potentially could be a long time (maybe never).
  | 

No, JGroups uses stability messages to detect the 'last message missing'. So this is not an issue. Maybe some amount of queueing can be done in the JGroups stack itself, e.g. by Channel.getState() sending down a START_QUEUEING event and - when the state has been received - the complementary event to stop the queueing.
Queueing would reduce the number of messages that have to get retransmitted. Of course, any queue has to be bounded, otherwise we'd happily queue until an OOME occurs.
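
A minimal Java sketch of such a bounded queue, toggled by a start/stop pair. Nothing here is existing JGroups code: BoundedStateTransferQueue, startQueueing(), stopQueueing() and the Result values are hypothetical, and what to do on overflow is left open (the sketch just reports it to the caller).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch: queue messages between a start and a stop event, with a hard bound.
final class BoundedStateTransferQueue<T> {
    enum Result { PASS_UP, QUEUED, OVERFLOW }

    private final int capacity;
    private final Queue<T> queue = new ArrayDeque<>();
    private boolean queueing;

    BoundedStateTransferQueue(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be > 0");
        }
        this.capacity = capacity;
    }

    // Corresponds to the START_QUEUEING idea: called when the state request goes out.
    void startQueueing() {
        queueing = true;
    }

    // Complementary event: state received; returns the queued messages in arrival order.
    List<T> stopQueueing() {
        queueing = false;
        List<T> drained = new ArrayList<>(queue);
        queue.clear();
        return drained;
    }

    Result receive(T msg) {
        if (!queueing) {
            return Result.PASS_UP; // normal operation, nothing is held back
        }
        if (queue.size() >= capacity) {
            return Result.OVERFLOW; // bound reached: never grow towards an OOME
        }
        queue.add(msg);
        return Result.QUEUED;
    }
}
```

The bound would have to be chosen with LockAcquisitionTimeout in mind, since anything queued for longer than that ends in a rollback anyway.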

View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4113094#4113094
