[infinispan-dev] eager locking and deadlocks

Tue Jul 6 11:06:37 EDT 2010

Hi,

With eager locking enabled, cache.put(k,v) acquires locks as follows:
1. acquire local lock on k
2. acquire remote lock on k
3. if 2 fails release local lock on k

The sequence above (i.e. local then remote) makes more sense to me than the reverse:
1. acquire remote lock on k
2. acquire local lock on k
3. if 2 fails release remote lock on k 

This is because it doesn't make sense to go remote only to then realise that you cannot acquire the local lock.
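
For illustration, here is a minimal Java sketch of the two sequences for a single key. RemoteLocker and EagerLockOrdering are hypothetical names made up for this example, not Infinispan classes, and the remote call is reduced to a boolean try/unlock pair. It only shows the ordering and the rollback in step 3, not how Infinispan actually does it.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for the RPC that locks a key on the other nodes. */
interface RemoteLocker {
    boolean tryLock(String key);
    void unlock(String key);
}

/** Sketch of both acquisition orders for one key k (assumed names). */
final class EagerLockOrdering {
    // Lock on k held on this node only; package-private so it can be inspected.
    final ReentrantLock localLock = new ReentrantLock();
    private final RemoteLocker remote;

    EagerLockOrdering(RemoteLocker remote) {
        this.remote = remote;
    }

    /** First sequence: local, then remote. */
    boolean lockLocalThenRemote(String key) {
        if (!localLock.tryLock()) {
            return false;           // failed early, no network round trip spent
        }
        if (!remote.tryLock(key)) {
            localLock.unlock();     // step 3: release the local lock
            return false;
        }
        return true;
    }

    /** Second sequence: remote, then local. */
    boolean lockRemoteThenLocal(String key) {
        if (!remote.tryLock(key)) {
            return false;
        }
        if (!localLock.tryLock()) {
            remote.unlock(key);     // step 3: release the remote lock
            return false;
        }
        return true;
    }
}
```

With the first sequence a busy local lock costs nothing on the network; with the second one the remote nodes are always contacted, even when the local lock turns out to be unavailable.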

However, the first approach does not work with deadlock detection (DLD) [1], which requires the second approach.
Here are my options:
1. forbid deadlock detection on eager locking
2. allow deadlock detection with eager locking: if DLD is on, use the second locking sequence, otherwise use the first one.
3.?

Option 2 is doable, and I think DLD could benefit eager locking in particular: transactions hold locks longer there, so the chance of a deadlock is higher.

Wdyt?

Cheers,
Mircea

[1] here is why:
tx1 and tx2 are two transactions.
tx1 originates on node A, tx2 originates on node B

Step 1 (simultaneously):
tx1: A.put(k1) -> lock(tx1) = {A_k1, B_k1}
tx2: B.put(k2) -> lock(tx2) = {A_k2, B_k2}

Step 2:
tx1: A.put(k2) -> status(tx1) = Holds lock on {A_k1, B_k1} and tries to lock A_k2 
tx2: B.put(k1) -> status(tx2) = Holds lock on {A_k2, B_k2} and tries to lock B_k1

DLD on tx1 cannot progress: tx1 tries to lock A_k2 and cannot. It knows that A_k2 is locked by tx2, but it doesn't know that tx2 intends to lock A_k1 (which would mean a deadlock). Why? Because tx2 hasn't replicated its intention to the other node: with local-then-remote ordering it is still waiting to acquire its local lock on B_k1.
The same applies to tx2.
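
To make the visibility problem concrete, here is a small Java sketch of what node A can see in each case. It uses a generic wait-for graph with a cycle check, which is an assumption made purely for illustration and is not a description of how Infinispan's DLD is implemented. WaitForView and DldVisibilityDemo are hypothetical names.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** One node's view of which transaction waits for which lock owner. */
final class WaitForView {
    // waiter -> owner of the lock it is blocked on (one pending lock per tx)
    private final Map<String, String> waitsFor = new HashMap<>();

    void recordWait(String waiter, String owner) {
        waitsFor.put(waiter, owner);
    }

    /** True if following the wait edges from tx leads back to tx. */
    boolean seesDeadlock(String tx) {
        Set<String> visited = new HashSet<>();
        String current = waitsFor.get(tx);
        while (current != null && visited.add(current)) {
            if (current.equals(tx)) {
                return true;
            }
            current = waitsFor.get(current);
        }
        return false;
    }
}

final class DldVisibilityDemo {
    /** Node A's view when tx2 is still blocked on its own node B. */
    static WaitForView nodeAViewLocalFirst() {
        WaitForView a = new WaitForView();
        a.recordWait("tx1", "tx2"); // tx1 wants k2, which tx2 holds
        // tx2's wish to lock k1 never reached A, so no tx2 -> tx1 edge here
        return a;
    }

    /** Node A's view when tx2 contacts A before locking locally. */
    static WaitForView nodeAViewRemoteFirst() {
        WaitForView a = new WaitForView();
        a.recordWait("tx1", "tx2"); // tx1 wants k2, which tx2 holds
        a.recordWait("tx2", "tx1"); // tx2's remote request for k1 arrived on A
        return a;
    }

    public static void main(String[] args) {
        System.out.println(nodeAViewLocalFirst().seesDeadlock("tx1"));  // false
        System.out.println(nodeAViewRemoteFirst().seesDeadlock("tx1")); // true
    }
}
```

In the local-first view the cycle exists cluster-wide but no single node holds both edges, which is why detection stalls until the lock timeout.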