[infinispan-dev] Re: Locking

Manik Surtani manik at jboss.org
Tue Apr 28 05:16:19 EDT 2009


Hi Vladimir

I haven't spent too much time thinking about eager locking yet -  
swamped with the finishing touches on 4.0.0.ALPHA2 and getting a  
public splash ready.  :-)

Also, infinispan-dev (now cc'd!) is pretty informal so don't worry  
about getting a perfect, distilled design.  The purpose of the list is  
to distil the design.

I've included my initial thoughts inline, below.

On 28 Apr 2009, at 03:18, Vladimir Blagojevic wrote:

> Manik,
>
> Have you thought about if we need non-eager locking? I have put  
> sketches of requirements for locking. Maybe you can put final  
> touches and we can post it to infinispan-dev.
>
> Regards,
> Vladimir
>
>
> Introduction
>
> The difference between eager and non-eager locking is related to the
> timing and method of lock acquisition across the cluster. Eager
> locking is used by a user who wants to lock a set of keys prior to
> invoking a batch of operations on those keys. If locks are
> successfully obtained on all keys across all cluster nodes, the user
> can be sure that he/she will not get a lock acquisition exception
> during those batch operations.
>
> As far as implementation goes, eager locking is executed by
> synchronously replicating a lock command across the network. Depending
> on the cache mode, lock acquisition is attempted on all specified
> keys on all cluster nodes, or on a subset of cluster nodes in the
> case of a DIST cache set-up. The lock operation either succeeds or
> fails. Eager locking can be used with or without transactions.
>

All makes sense, except that we need to think about who the lock
'owner' is going to be if we are not running in a tx.  The way the
cache works, locks are held by the thread (if not in a tx) or the
associated GlobalTransaction object (if in a tx).
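
A minimal sketch of that ownership rule, just to make it concrete - the
class and method names here are illustrative, not actual Infinispan code:

public class LockOwnerResolution {
   public static Object lockOwnerFor(Object gtx) {
      // the associated GlobalTransaction if running in a tx,
      // the calling thread otherwise
      return gtx != null ? gtx : Thread.currentThread();
   }
}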

Let's take these cases one at a time.  Let's start with (what I think
is) the most pertinent:

A.  Transactions + Eager locking
---------------------------------

when you do the following:

1.  tx.begin()
2.  cache.put(K, V)
3.  tx.commit()

what happens is, at 1., a new GTX instance is associated with the
transaction.  At 2., locks are acquired on K on the *local* cache, the
owner being the GTX.  At 3., a prepare() is broadcast and the locks on
K are acquired on all remote participating caches, the owner again
being the GTX.

The GTX is propagated with the PrepareCommand as it is a field in  
AbstractTransactionBoundaryCommand.

So now, an explicit lock in the scope of a transaction would also need  
to propagate the GTX - just so we know who the lock owner should be.   
E.g.,

1.  tx.begin()
2.  cache.lock(K)
3.  // read K,V and calculate V2 which is a function of V.  E.g., V2 =  
V + 1
4.  cache.put(K, V2)
5.  tx.commit()

In the above scenario, step 2 broadcasts a LockControlCommand which is
constructed with the GTX and the necessary key(s).  Remote caches
acquire the locks using the GTX as the owner, and respond positively.
(This RPC call would *have* to be sync regardless of cache mode.)
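
To make the remote side of that concrete, here is a rough sketch of what
a receiving cache would do with the shipped GTX - class and method names
(RemoteLockHandler, handleLockControl) are illustrative, not the actual
implementation:

import java.util.Collection;
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class RemoteLockHandler {

   // key -> current owner; for this case the owner is the
   // GlobalTransaction shipped with the LockControlCommand
   private final ConcurrentMap<Object, Object> locks =
         new ConcurrentHashMap<Object, Object>();

   // Attempts to lock every key on behalf of the remote tx (gtx); responds
   // positively only if all keys were acquired.  A real implementation
   // would also release any partially acquired locks on failure.
   public boolean handleLockControl(Object gtx, Collection<?> keys) {
      for (Object key : keys) {
         Object existing = locks.putIfAbsent(key, gtx);
         if (existing != null && !existing.equals(gtx)) return false;
      }
      return true;
   }

   // Releases every lock held by gtx, e.g. when the tx commits or rolls back.
   public void releaseAll(Object gtx) {
      for (Iterator<Map.Entry<Object, Object>> i = locks.entrySet().iterator(); i.hasNext(); ) {
         if (gtx.equals(i.next().getValue())) i.remove();
      }
   }
}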

The way I see it, cache.unlock() should be *optional*.  If unlock() is
not called, all locks associated with the transaction are released
anyway when the transaction completes.

If unlock() is used, it would need some special behaviour.  Consider:

1.  tx.begin()
2.  cache.lock(K)
3.  cache.put(K2, V2)
4.  cache.unlock(K)
5.  tx.commit()

In the above case, unlock() would release locks at step 4 since the tx  
has not actually modified K.  If, however, we have:

1.  tx.begin()
2.  cache.lock(K)
3.  cache.put(K, V)
4.  cache.unlock(K)
5.  tx.commit()

then the unlock() should be a no-op (maybe don't even bother with the  
RPC) since we know that the tx has modified K and we can only feasibly  
release the lock once the transaction completes in 5.
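
As a rough sketch of that unlock() behaviour (names like
TxLockBookkeeping, releaseLocal() and broadcastUnlock() are made up for
illustration, not real API):

import java.util.Collection;
import java.util.Set;

public class TxLockBookkeeping {

   private final Set<Object> keysModifiedByTx;   // keys written by this tx so far
   private final Set<Object> keysLockedEagerly;  // keys locked via cache.lock(...)

   public TxLockBookkeeping(Set<Object> modified, Set<Object> locked) {
      this.keysModifiedByTx = modified;
      this.keysLockedEagerly = locked;
   }

   // cache.unlock(keys) called in the scope of a transaction
   public void unlock(Collection<?> keys) {
      for (Object key : keys) {
         if (keysModifiedByTx.contains(key)) {
            // The tx has modified this key, so unlock() is a no-op (we could
            // even skip the RPC); the lock is only released when the tx completes.
            continue;
         }
         if (keysLockedEagerly.remove(key)) {
            releaseLocal(key);     // free the local lock held by the GTX
            broadcastUnlock(key);  // sync RPC to release on the other lock-holding nodes
         }
      }
   }

   private void releaseLocal(Object key) { /* ... */ }
   private void broadcastUnlock(Object key) { /* ... */ }
}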

B.  Transactions + Non-eager locking
---------------------------------

So I guess non-eager would be that the acquisition of remote locks is
deferred to when the prepare() broadcasts, and any potential lock
acquisition failures happen at the time of prepare().  This is what we
have in place right now anyway; I need to think about whether lock()
and unlock() make sense in a non-eager context.
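
Just to pin down the timing difference, a sketch (the lock() method is
the one proposed in this thread, and LockableCache is a stand-in
interface, not real API):

import javax.transaction.TransactionManager;

public class NonEagerTiming {

   public static <K, V> void update(TransactionManager tm, LockableCache<K, V> cache,
                                    K k, V v) throws Exception {
      tm.begin();
      cache.lock(k);   // non-eager: at most the local lock is acquired here
      cache.put(k, v); // still a local operation within the tx
      tm.commit();     // prepare() broadcasts; a remote lock acquisition
                       // failure only surfaces here
   }

   // minimal stand-in for the proposed cache API, to keep the sketch self-contained
   public interface LockableCache<K, V> {
      void lock(K key);
      void put(K key, V value);
   }
}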

C.  No transaction + Eager locking
---------------------------------

This gets a bit more tricky, because in this case the lock owner is  
the Thread.  And this does not translate to anything meaningful  
cluster-wide.  Assume:

1.  cache.lock(K)
2.  cache.get(K)
3.  cache.put(K, V)
4.  cache.unlock(K)

Here, we notice that explicit use of unlock() is needed to release  
locks acquired by lock(), unlike the case where a committing (or  
rolling back) transaction would do this for you.

The tricky thing here is, who holds lock ownership across a  
cluster?  :-)  If this is standalone mode, this is simple - the Thread- 
owner paradigm works fine.  Perhaps we need to change the locking code  
so that the non-tx lock owner is a combination of thread-name + cache  
address?  I can see this leading to tricky stale locks though if we  
have a join/rejoin midway during a code sequence like above.
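
A rough sketch of such a composite owner (name and shape are illustrative
only; this does nothing about the stale-lock problem mentioned above):

public final class NonTxLockOwner {

   private final String threadName;   // identifies the caller on the originating node
   private final Object cacheAddress; // the originating cache's cluster address

   public NonTxLockOwner(String threadName, Object cacheAddress) {
      this.threadName = threadName;
      this.cacheAddress = cacheAddress;
   }

   @Override
   public boolean equals(Object o) {
      if (this == o) return true;
      if (!(o instanceof NonTxLockOwner)) return false;
      NonTxLockOwner other = (NonTxLockOwner) o;
      return threadName.equals(other.threadName) && cacheAddress.equals(other.cacheAddress);
   }

   @Override
   public int hashCode() {
      return 31 * threadName.hashCode() + cacheAddress.hashCode();
   }

   @Override
   public String toString() {
      return threadName + "@" + cacheAddress;
   }
}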

D.  No transaction + Non-eager locking
---------------------------------

Just as in B., I'm not sure if this is a valid use case - i.e., how is
this useful and what is its purpose?  (Perhaps just acquire local
locks on lock() and only acquire the remote locks when the actual
put() is performed?)



Anyway, to sum things up: the way I see it, B and D are cases that
still need further thought (i.e., the non-eager cases).  C needs a lot
more in-depth thought as it has the potential to break a lot of stuff,
since it changes lock ownership for all non-transactional cases.  So I
am inclined to suggest that lock() and unlock() only work for case A.
For other cases, we do not support the use of these methods - at least
for now, until we think about C in greater detail.

What do you think?  Apart from that, the detailed changes you  
mentioned below all make sense (except that I would add lockOwner as a  
field on LockControlCommand, etc).
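
Roughly, I'd picture the command carrying something like the following
(field names are just for illustration - this is item 3 from your list
below, plus the lockOwner field):

public class LockControlCommand {
   boolean unlock;     // false = lock, true = unlock
   boolean eager;      // eager vs. non-eager semantics
   Object[] keys;      // keys to be locked/unlocked
   Object lockOwner;   // the GTX when running in a tx, something else otherwise
}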

Cheers
Manik




>
> Example usage
>
> Cache.lock(k1, k2, k3)
> // lots of r/w operations with k1, k2, k3, while being sure no lock
> // acquisition exceptions will be raised
> Cache.unlock(k1, k2, k3)
>
>
>
> Implementation
>
>
>
> 1. Add the following API to AdvancedCache and implementation to  
> CacheDelegate
>
> void lock(K key, boolean eager);
> void lock(Collection<? extends K> keys, boolean eager);
> void unlock(K key);
> void unlock(Collection<? extends K> keys);
>
>
>
> 2. Add the following API to CommandsFactory along with  
> implementation in CommandsFactoryImpl
>
> public LockControlCommand buildLockControlCommand();
>
>
>
> 3. LockControlCommand has the following fields:
>
> a boolean to indicate whether this is a lock or unlock
> a boolean to indicate whether this is an eager lock
> an array of keys to be locked/unlocked
>
>
>
> 4. Add a LockingInterceptor that intercepts LockControlCommand and
> handles acquiring/releasing of locks as needed. Lock acquisition/
> release is implemented using the already existing LockManager. If the
> LockManager is able to lock all required keys, true is returned,
> otherwise false.
>

--
Manik Surtani
manik at jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org



