Here is a transcript of a conversation I had with Brian regarding the algorithm details
of JBCACHE-315:
Replying privately, but I think we should take this to jbcache-dev or the forum. This is
complex.
Vladimir Blagojevic wrote:
Hey Brian,
I thought a bit more about the locking algorithm and I would like to
bounce it off you. If you recall we agreed on our phone call that we
have to go through the steps of:
a) set a flag or something that will block any ACTIVE transactions
from proceeding (i.e. entering prepare() phase of 2PC)
We then revised this by saying that in fact "any ACTIVE transactions"
should rather be "any ACTIVE locally initiated transactions". We also
agreed that we can do this by using a latch in TxInterceptor.
b) wait for completion of any transactions that are beyond ACTIVE.
We thought that this was a great idea but soon realized that
transactions can still deadlock. You said: "For example, a locally
initiated transaction is holding a lock on some node and you have a
remote prepare that comes in. The remote prepare won't be able to acquire the lock. At
some point we have to deal with that. Whoever sent that prepare call
isn't going to proceed - the sender will block on that synchronous call.
So on the remote node the prepare is not going to progress."
To be even more specific:
On the remote node (i.e. the one we're working on) the JG up-handler thread will be
blocked while the prepare() call waits to acquire a lock. That thread will block until
there is a lock timeout. This will occur whether we are using REPL_ASYNC or REPL_SYNC.
One effect of this is that no other JG messages will be received until there is a timeout.
Note also that *I think* that having another thread roll back the tx associated with the
prepare call will not cause the JG up-handler thread to unblock!!
If REPL_SYNC, on the node that originated the GTX, the client thread that committed the tx
will be blocking waiting for a response to the prepare() call.
I have another proposal. If we already have to introduce a latch, why
not introduce it in a "better" location? So the proposal is to introduce
our latch in InvocationContextInterceptor rather than in
TxInterceptor.
InvocationContextInterceptor is always the first interceptor in the chain.
By introducing a latch here we can inspect a call, determine its
origin and transactional status, and block transactions before they
grab any locks.
Can this be done in the TxInterceptor? I.e. isn't it always before any
LockInterceptor? I would think it would be. I expect Manik would put up a fuss about
doing tx-related stuff outside TxInterceptor; the whole reason it was added in 1.3.0 was
to encapsulate stuff that was previously spread around other interceptors.
If a transaction originates locally and has not been registered in the
TransactionTable (i.e. it has not yet performed any operation), block it on the
latch before it has a chance to acquire any locks.
+1. No reason to let a tx that hasn't acquired any locks go through and cause
trouble.
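To make the latch idea concrete, here is a rough sketch of the kind of gate such an
interceptor could consult before a locally originated, not-yet-registered transaction grabs
any locks. Every name below is hypothetical; only the JDK classes are real, and the real
interceptor API differs:

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical gate an interceptor (InvocationContextInterceptor or
// TxInterceptor) could check; this is not existing JBoss Cache code.
public class TransactionGate
{
   // open by default; a fresh latch is swapped in when block() arrives
   private volatile CountDownLatch latch = new CountDownLatch(0);

   /** Flip the latch: subsequent local transactions must wait here. */
   public void close()
   {
      latch = new CountDownLatch(1);
   }

   /** Flip the latch back: release every thread parked in await(). */
   public void open()
   {
      latch.countDown();
   }

   /**
    * Called for a locally originated call whose transaction is not yet in
    * the TransactionTable, i.e. before it has acquired any locks.
    */
   public void await(long timeoutMillis) throws InterruptedException
   {
      if (!latch.await(timeoutMillis, TimeUnit.MILLISECONDS))
         throw new RuntimeException("Blocked for state transfer; timed out waiting for the latch");
   }
}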
Then we look at the table and roll back any local transactions that
have not yet gone to prepare, i.e. transactions that we have missed with
our latch. If any rolled-back transaction retries it will be caught by
our latch :) All other transactions we let go through. Start a timer
and give it enough time for beyond-prepare transactions to finish.
So in pseudocode, the algorithm executed on each node:
- receive block call
- flip a latch in InvocationContextInterceptor and block any subsequent local transactions
- roll back local transactions that are not yet in prepare phase and start timer T (allow some time for beyond-prepare transactions to finish)
- if a lock still exists at the integration node after T expires, roll back our local transaction
- flip the latch back and allow transactions to proceed
- return from block ok
- flush blocks all down threads (thus no prepare will go through, although local transactions will proceed on each node)
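A compressed sketch of that per-node block() handling, reusing the hypothetical gate above;
the transaction-table and lock helpers below are also stand-ins, not real JBoss Cache
methods:

// Hypothetical outline of the block() handling above; only the flow is from
// the pseudocode, everything else is a stand-in.
public abstract class BlockHandler
{
   private final TransactionGate gate;   // the latch sketched earlier
   private final long timerT;            // grace period for beyond-prepare txs

   protected BlockHandler(TransactionGate gate, long timerT)
   {
      this.gate = gate;
      this.timerT = timerT;
   }

   public void block() throws Exception
   {
      gate.close();   // flip the latch: block any subsequent local transactions

      // roll back local transactions that have not reached prepare yet;
      // if they retry, the latch above catches them
      for (Object gtx : localTransactionsNotYetInPrepare())
         rollback(gtx);

      Thread.sleep(timerT);   // let beyond-prepare transactions finish

      // if a lock still exists at the integration point after T expires,
      // roll back our local transaction holding it
      if (lockStillHeldAtIntegrationPoint())
         rollbackIntegrationPointLockOwner();

      gate.open();    // flip the latch back and allow transactions to proceed
      // returning from block() lets FLUSH block all down threads,
      // so no prepare goes through while local transactions proceed
   }

   // stand-ins for whatever the TransactionTable / lock code really offers
   protected abstract Iterable<Object> localTransactionsNotYetInPrepare();
   protected abstract void rollback(Object gtx);
   protected abstract boolean lockStillHeldAtIntegrationPoint();
   protected abstract void rollbackIntegrationPointLockOwner();
}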
Proceed with the algorithm on the state provider:
- receive getState
- grab a lock on the integration point using a LockUtil.breakLock variant, possibly rolling back some local transactions
- read the state and do the state transfer with the state receiver
- when the state transfer is done, prepare messages will hit the cluster and state will be consistent no matter what happens with all global transactions
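And the state-provider part, again only as a hypothetical skeleton; the lock-breaking and
read helpers stand in for the real LockUtil.breakLock variant and the state transfer code:

// Hypothetical skeleton of the state provider steps above.
public abstract class StateProvider
{
   public byte[] getState(Object integrationPoint) throws Exception
   {
      // grab the lock at the integration point, breaking it if needed;
      // this may roll back some local transactions that still hold it
      breakLockAt(integrationPoint);

      // read the state and ship it to the state receiver; once the transfer
      // is done, the queued prepare messages hit the whole cluster, so state
      // stays consistent no matter what happens with the global transactions
      return readState(integrationPoint);
   }

   protected abstract void breakLockAt(Object integrationPoint);   // LockUtil.breakLock variant
   protected abstract byte[] readState(Object integrationPoint);
}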
The concern I have with this is we give up one of the key goals -- not rolling back a tx
if it's not hurting anything. Here we assume that an ACTIVE locally originated tx is going
to cause a problem by blocking a remote prepare() call. So we roll back the tx. Actually
the odds of a remote prepare() call being blocked are pretty low.
How about this:
1) receive block call
2) flip a latch in TxInterceptor (I'm assuming it will work putting it here instead of
InvocationContextInterceptor). This latch is used at 2 or 3 different control points to
block any threads that:
a) are not associated with a GTX (i.e. non-transactional reads/writes from the local
server)
b) are associated with a GTX, but not yet in TransactionTable (your idea above)
   c) are associated with a locally originated GTX and are about to enter the
beforeCompletion phase (i.e. the original idea of preventing the tx from proceeding to
make a prepare() call).
3) Loop through the GTXs in the TransactionTable. Create a little object for each GTX and
throw it in a map (a rough sketch of this object follows the list below). The object is a
state machine that uses the JTA Transaction status, the elapsed time and whether the tx is
locally or remotely originated to govern its state transitions. Keep looping through the
TransactionTable, creating more of these objects if more GTXs appear, and for each GTX
update the object with the current Transaction status, then read the object state. The
object state tells you whether you need to roll back the tx, etc.
4) If the state machine is for a *remotely initiated* GTX that's in ACTIVE status,
after some elapsed time its state will tell you that it's likely held up by a lock conflict
with a locally originated tx. At that point we have a choice.
   a) roll back all locally originated tx's that are ACTIVE or PREPARED. Con:
indiscriminately breaks transactions. Con: if the tx has already entered beforeCompletion() we
don't know whether it's in the prepare() call or later. We can only roll it back during
beforeCompletion(); otherwise we introduce a heuristic.
b) roll back the remotely originated tx. Pro: doesn't indiscriminately break
transactions. Con: I *think* this rollback won't unblock the JG up-handler thread.
5) We'd need to work out all the state transitions; i.e. what conditions lead to tx
rollback.
6) flip latch back and allow transactions to proceed
7) return from block ok
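Here is a minimal sketch of the per-GTX state machine object from step 3. Only
javax.transaction.Status is real; the class, its method and its decisions are hypothetical:

import javax.transaction.Status;

// Hypothetical per-GTX tracker: fed the latest JTA status on each pass over
// the TransactionTable, it decides whether anything needs attention.
public class GtxTracker
{
   public enum Decision { KEEP_WAITING, RESOLVE_LOCK_CONFLICT, FORGET }

   private final boolean locallyOriginated;
   private final long createdAt = System.currentTimeMillis();

   public GtxTracker(boolean locallyOriginated)
   {
      this.locallyOriginated = locallyOriginated;
   }

   /** Update with the current Transaction.getStatus() value and read the decision. */
   public Decision update(int jtaStatus, long maxActiveMillis)
   {
      long elapsed = System.currentTimeMillis() - createdAt;

      // Step 4: a remotely initiated GTX still ACTIVE after the grace period
      // is likely blocked on a lock held by a locally originated tx.
      if (!locallyOriginated && jtaStatus == Status.STATUS_ACTIVE && elapsed > maxActiveMillis)
         return Decision.RESOLVE_LOCK_CONFLICT;

      // Finished transactions can simply be dropped from the map.
      if (jtaStatus == Status.STATUS_COMMITTED || jtaStatus == Status.STATUS_ROLLEDBACK)
         return Decision.FORGET;

      return Decision.KEEP_WAITING;
   }
}

RESOLVE_LOCK_CONFLICT is exactly where the 4a/4b choice above comes in; the tracker only
flags the situation, it doesn't pick a victim.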
So in summary, the goal of the first part of the algorithm is to allow
transactions beyond prepare to finish and prevent any local
transactions from hitting the cluster and becoming global. That leaves
us dealing only with local transactions at the state provider in the
second part of the algorithm. In the second part we deal just with the
state provider. We grab the lock at the integration point, possibly roll back
any local transactions there, do the state transfer, and let the prepares
hit the cluster, thus preserving state consistency and disturbing the fewest
global transactions.
It seems like if we had that blockDone JGroups callback then things
would be nicer. Algorithm executed on each node:
- receive block call
- flip a latch in InvocationContextInterceptor and block any subsequent local transactions
- roll back local transactions that are not yet in prepare phase and start timer T
- if a lock still exists at the integration node after T expires, roll back our local transaction
- return from block ok
- flush blocks all down threads (thus no prepare will go through, although local transactions will proceed on each node)
Proceed with the algorithm on the state receiver and provider:
- do state transfer
Proceed with the algorithm executed on each node:
- flip the latch back and allow transactions to proceed
Here's a question for you about FLUSH: When a service returns from block() or sends
blockOK, does the channel immediately block? Is there coordination across the cluster?
My concern:
Node A doesn't have much going on and quickly returns from block(), so its channel is
blocked.
Node B takes a little longer; it has some txs whose completion requires sending
messages to A. Those messages don't get through because A is blocked.