Hi Vladimir
I haven't spent too much time thinking about eager locking yet -
swamped with the finishing touches on 4.0.0.ALPHA2 and getting a
public splash ready. :-)
Also, infinispan-dev (now cc'd!) is pretty informal so don't worry
about getting a perfect, distilled design. The purpose of the list is
to distil the design.
I've included my initial thoughts inline, below.
On 28 Apr 2009, at 03:18, Vladimir Blagojevic wrote:
Manik,
Have you thought about if we need non-eager locking? I have put
sketches of requirements for locking. Maybe you can put final
touches and we can post it to infinispan-dev.
Regards,
Vladimir
Introduction
The difference between eager and non-eager locking is related to the
timing and method of lock acquisition across the cluster. Eager
locking is used by a user who wants to lock a set of keys prior to
invoking a batch of operations on those keys. If locks are
successfully obtained on all keys across all cluster nodes, the user
can be sure that he/she will not get a lock acquisition exception
during those batch operations.
As far as implementation goes, eager locking is executed by
synchronously replicating a lock command across the network. Depending
on the cache mode, lock acquisitions are attempted on all specified
keys either on all cluster nodes or on a subset of cluster nodes - in
the case of a DIST cache set-up. The lock operation either succeeds or
fails. Eager locking can be used with or without transactions.
All makes sense, except that we need to think about who the lock
'owner' is going to be if we are not running in a tx. The way the
cache works, locks are held by the thread (if not in a tx) or by the
associated GlobalTransaction object (if in a tx).
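To make the ownership point concrete, here is a rough sketch of that rule. None of this is actual Infinispan code - GlobalTransaction here is a hypothetical stand-in - it just illustrates "thread if no tx, GTX if in a tx":

```java
// Simplified sketch (not actual Infinispan code) of how a lock owner
// could be resolved: the GlobalTransaction if one is in scope,
// otherwise the calling Thread.
public class LockOwnerSketch {
    // Hypothetical stand-in for Infinispan's GlobalTransaction
    static class GlobalTransaction {
        final long id;
        GlobalTransaction(long id) { this.id = id; }
        @Override public String toString() { return "GTX-" + id; }
    }

    // Resolve the owner used for lock bookkeeping
    static Object resolveOwner(GlobalTransaction gtx) {
        return gtx != null ? gtx : Thread.currentThread();
    }

    public static void main(String[] args) {
        // outside a transaction, the thread itself owns the lock
        System.out.println(resolveOwner(null) == Thread.currentThread()); // true
        // inside a transaction, the GTX owns the lock
        System.out.println(resolveOwner(new GlobalTransaction(42))); // GTX-42
    }
}
```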
Let's take these cases one at a time, starting with what I think is
the most pertinent:
A. Transactions + Eager locking
---------------------------------
when you do the following:
1. tx.begin()
2. cache.put(K, V)
3. tx.commit()
what happens is, at 1., a new GTX instance is associated with the
transaction. At 2. locks are acquired on K on the *local* cache, the
owner being GTX. At 3., a prepare() is broadcast and the locks on K
are acquired on all remote participating caches, the owner being GTX
again.
The GTX is propagated with the PrepareCommand as it is a field in
AbstractTransactionBoundaryCommand.
So now, an explicit lock in the scope of a transaction would also need
to propagate the GTX - just so we know who the lock owner should be.
E.g.,
1. tx.begin()
2. cache.lock(K)
3. // read K,V and calculate V2 which is a function of V. E.g., V2 = V + 1
4. cache.put(K, V2)
5. tx.commit()
In the above scenario, step 2 broadcasts a LockControlCommand which is
constructed with the GTX and the necessary key(s). Remote caches
acquire the locks using the GTX as the owner, and respond
positively. (This RPC call would *have* to be sync regardless of
cache mode.)
The way I see it, cache.unlock() should be *optional*. If unlock() is
not called, when the transaction completes all locks associated with
the transaction are released anyway.
If unlock() is used, it would need some special behaviour. Consider:
1. tx.begin()
2. cache.lock(K)
3. cache.put(K2, V2)
4. cache.unlock(K)
5. tx.commit()
In the above case, unlock() would release locks at step 4 since the tx
has not actually modified K. If, however, we have:
1. tx.begin()
2. cache.lock(K)
3. cache.put(K, V)
4. cache.unlock(K)
5. tx.commit()
then the unlock() should be a no-op (maybe don't even bother with the
RPC) since we know that the tx has modified K and we can only feasibly
release the lock once the transaction completes in 5.
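The proposed unlock() rule - release if the tx has not touched K, no-op if it has - can be sketched like so. This is a toy model with hypothetical names, purely to pin down the semantics, not an implementation:

```java
import java.util.HashSet;
import java.util.Set;

// Toy sketch of the proposed in-tx unlock() semantics: release the lock
// only if the tx has not modified the key; otherwise unlock() is a no-op
// and the lock is held until the tx completes.
public class TxUnlockSketch {
    final Set<String> locked = new HashSet<>();
    final Set<String> modified = new HashSet<>();

    void lock(String key) { locked.add(key); }
    void put(String key)  { locked.add(key); modified.add(key); }

    // returns true if the lock was actually released
    boolean unlock(String key) {
        if (modified.contains(key)) return false; // no-op: tx modified K
        return locked.remove(key);                // safe to release now
    }

    // commit/rollback releases whatever is still held
    void complete() { locked.clear(); modified.clear(); }

    public static void main(String[] args) {
        TxUnlockSketch tx = new TxUnlockSketch();
        tx.lock("K");
        tx.put("K2");
        System.out.println(tx.unlock("K")); // true: tx never modified K
        tx = new TxUnlockSketch();
        tx.lock("K");
        tx.put("K");
        System.out.println(tx.unlock("K")); // false: held until completion
    }
}
```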
B. Transactions + Non-eager locking
---------------------------------
So I guess non-eager would mean that the acquisition of remote locks
is deferred to when the prepare() broadcasts, and any potential lock
acquisition failures happen at the time of prepare(). This is what we
have in place right now anyway; I need to think about whether lock()
and unlock() make sense in a non-eager context.
C. No transaction + Eager locking
---------------------------------
This gets a bit more tricky, because in this case the lock owner is
the Thread. And this does not translate to anything meaningful
cluster-wide. Assume:
1. cache.lock(K)
2. cache.get(K)
3. cache.put(K, V)
4. cache.unlock(K)
Here, we notice that explicit use of unlock() is needed to release
locks acquired by lock(), unlike the case where a committing (or
rolling back) transaction would do this for you.
The tricky thing here is, who holds lock ownership across a
cluster? :-) If this is standalone mode, this is simple - the
Thread-as-owner paradigm works fine. Perhaps we need to change the
locking code so that the non-tx lock owner is a combination of thread
name + cache address? I can see this leading to tricky stale locks
though if we have a join/rejoin midway through a code sequence like
the one above.
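The composite-owner idea could look something like this. ClusterLockOwner is a made-up name, not an Infinispan class; the only real point is that equality has to be value-based so a remote cache can recognise the same owner:

```java
import java.util.Objects;

// Sketch of the composite non-tx owner floated above: thread name plus
// cache address. ClusterLockOwner is a hypothetical name for illustration.
public class ClusterLockOwner {
    final String threadName;
    final String cacheAddress;

    ClusterLockOwner(String threadName, String cacheAddress) {
        this.threadName = threadName;
        this.cacheAddress = cacheAddress;
    }

    // equality must be value-based so a remote cache can match the owner
    @Override public boolean equals(Object o) {
        if (!(o instanceof ClusterLockOwner)) return false;
        ClusterLockOwner other = (ClusterLockOwner) o;
        return threadName.equals(other.threadName)
            && cacheAddress.equals(other.cacheAddress);
    }
    @Override public int hashCode() {
        return Objects.hash(threadName, cacheAddress);
    }

    public static void main(String[] args) {
        ClusterLockOwner a = new ClusterLockOwner("worker-1", "node-A:7800");
        ClusterLockOwner b = new ClusterLockOwner("worker-1", "node-A:7800");
        System.out.println(a.equals(b)); // true: same owner cluster-wide
    }
}
```

Note this does nothing for the stale-lock problem on join/rejoin - that would still need separate handling.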
D. No transaction + Non-eager locking
---------------------------------
Just as in B., I'm not sure if this is a valid use case - i.e., how is
this useful and what is its purpose? (Perhaps just acquire local
locks on lock() and only acquire the remote locks when the actual
put() is performed?)
Anyway, to sum things up: the way I see it, B and D (the non-eager
cases) still need further thought. C needs a lot more in-depth
thought as it has the potential to break a lot of stuff, since it
changes lock ownership for all non-transactional cases. So I am
inclined to suggest that lock() and unlock() only work for case A.
For the other cases, we do not support the use of these methods - at
least for now, until we think through C in greater detail.
What do you think? Apart from that, the detailed changes you
mentioned below all make sense (except that I would add lockOwner as a
field on LockControlCommand, etc).
Cheers
Manik
Example usage
Cache.lock(k1, k2, k3)
// lots of r/w operations with k1, k2, k3, while being sure no lock
// acquisition exceptions will be raised
Cache.unlock(k1, k2, k3)
Implementation
1. Add the following API to AdvancedCache and implementation to
CacheDelegate
void lock(K key, boolean eager);
void lock(Collection<? extends K> keys, boolean eager);
void unlock(K key);
void unlock(Collection<? extends K> keys);
2. Add the following API to CommandsFactory along with
implementation in CommandsFactoryImpl
public LockControlCommand buildLockControlCommand();
3. LockControlCommand has following fields:
a boolean to indicate whether this is a lock or unlock
a boolean to indicate whether this is an eager lock
an array of keys to be locked/unlocked
4. Add a LockingInterceptor that intercepts LockControlCommand and
handles acquiring/releasing locks as needed. Lock acquisition/release
is implemented using the already existing LockManager. If the
LockManager is able to lock all required keys, true is returned;
otherwise false.
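For the avoidance of doubt, here's roughly the shape I have in mind for step 4. All types below are simplified stand-ins (the real LockControlCommand, LockManager and interceptor APIs will differ) - this just shows the lock/unlock branching and the all-or-nothing return value:

```java
import java.util.HashMap;
import java.util.Map;

// Rough sketch of step 4: an interceptor-style handler that reacts to a
// LockControlCommand by acquiring or releasing locks via a LockManager.
// All types here are simplified stand-ins for illustration only.
public class LockingInterceptorSketch {
    static class LockControlCommand {
        final boolean lock;   // lock vs unlock
        final boolean eager;  // eager vs non-eager
        final Object[] keys;  // keys to lock/unlock
        LockControlCommand(boolean lock, boolean eager, Object... keys) {
            this.lock = lock; this.eager = eager; this.keys = keys;
        }
    }

    // Minimal stand-in for the existing LockManager
    static class LockManager {
        final Map<Object, Object> owners = new HashMap<>();
        boolean lock(Object key, Object owner) {
            Object current = owners.putIfAbsent(key, owner);
            return current == null || current.equals(owner);
        }
        void unlock(Object key, Object owner) {
            owners.remove(key, owner); // only removes if held by this owner
        }
    }

    final LockManager lockManager = new LockManager();

    // Returns true only if all requested keys could be locked.
    // (A real implementation would also roll back partial locks on failure.)
    boolean visitLockControlCommand(LockControlCommand cmd, Object owner) {
        if (!cmd.lock) {
            for (Object k : cmd.keys) lockManager.unlock(k, owner);
            return true;
        }
        for (Object k : cmd.keys) {
            if (!lockManager.lock(k, owner)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        LockingInterceptorSketch interceptor = new LockingInterceptorSketch();
        System.out.println(interceptor.visitLockControlCommand(
            new LockControlCommand(true, true, "k1", "k2"), "GTX-1")); // true
        // a different owner cannot lock k1 while GTX-1 holds it
        System.out.println(interceptor.visitLockControlCommand(
            new LockControlCommand(true, true, "k1"), "GTX-2")); // false
    }
}
```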
--
Manik Surtani
manik(a)jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org