At JBoss World, Jason had proposed a mechanism for eager deadlock detection when using a 2
phase commit and pessimistic locking or MVCC.
A description of the problem:
* Server 1 updates /a, /b and broadcasts a prepare.
* Server 2 updates /a, /c and broadcasts a prepare at the same time.
---> Deadlock, even though both txs succeed in getting local locks.
With OL, one will fail, with PL, both will fail after TIMING OUT (LockAcquisitionTimeout
millisecs), which can be unnecessarily slow. MVCC will have the same behaviour as PL
since MVCCâs write lock strategy is similar to PL.
Eager Deadlock Detection (EDD) requires:
* lock() method on the lock manager to inspect other owners of the lock attempting to be
acquired if the lock CANNOT be acquired.
* Can be achieved with acquire(someSmallNumber); in a loop to achieve the real timeout
required:
| lock(int timeout)
| {
| while (currentTime < startTime + timeout)
| {
| if (acquire(smallTimeout)) break;
| testEDD();
| }
| }
|
Assume we have 2 servers, S1 and S2. S1 has thread TL1 which is a thread pertaining to a
local transaction. S1-TL1 will have a corresponding thread, S2-TR1 pertaining to the
remote prepare broadcast to S2. Similarly, S1-TL2 is a local prepare on S2, and
corresponds to S1-TR2.
This process will be followed by either S1-TR2 or S2-TR1.
1) If the lock is owned and the current thread originates remotely, inspect the owner.
2) If the owner is a LOCAL tx, inspect itâs state. If it is COMMITTING or COMMITTED,
then there will definitely be a deadlock, since the remote tx wants a lock on a node that
has already been locally locked and is committing - which means the local tx thread will
fail on other remote nodes for the same reason.
3) If the deadlock is detected abort the local transaction. (e.g., if the current thread
is S1-TR2, S2-TL2 will be forced to roll back). This will allow S2-TR1 to get the remote
locks it needs, allowing S1-TL1 to succeed.
4) Only one instance should do this with the local transaction, not both (since otherwise
both will not succeed). I.e., only S2-TR1 or S1-TR2 should follow this process; not
both.
5) Sufficiently randomise the probability of "winning" in a deadlock by each
node sending out a "coin toss" - a random number generated for each transaction
and stored in the TransactionEntry. This way all cache instances know, deterministically,
whether it will yield it's local transaction during a deadlock. I.e., both instances
know which of the two threads, S1-TR2 or S2-TR1, will yield.
The upside of such an approach is that the deadlock can be determined very quickly
(milliseconds?) and one of the participants in the deadlock forced to abort, allowing at
least one transaction to complete quickly.
View the original post :
http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4135088#...
Reply to the post :
http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&a...