[jbosscache-dev] Locking parents for insertion and removal with MVCC

Fri Jul 4 08:35:40 EDT 2008

On 3 Jul 2008, at 21:21, Jason T. Greene wrote:

> Manik Surtani wrote:
>
>> Now with MVCC - since there are no read locks - we have a problem.   
>> Consider:
>> 1.  Tx begns.
>> 2.  Tx reads /a (no locks)
>> 3.  Tx removes /a/b (locks /a/b)
>> 4.  Tx counts children of /a (will still see that /a/b is in /a's  
>> child map).
>> he approaches I have considered are:
>> 1.  Removing /a/b from /a's child map.
>> This will provide a consistent view to the tx removing /a/b but  
>> will mean that other txs will also see /a/b disappear when  
>> counting /a's children, but not when querying the cache for /a/b!   
>> Worse, rollbacks would mean re-adding /a/b to /a's child map,  
>> providing weird views on other readers (/a/b disappears for a  
>> while, then reappears)
>> 2.  Make a copy of /a for the tx to work off, when removing /a/b.   
>> This is the same as having LockParentForChildInsertRemove semantics.
>> Copy the parent as well and work off it, but to prevent problems  
>> with concurrent child removes and adds, we'd have to lock the parent.
>> My vote is for approach 2.  In fact, LockParentForChildInsertRemove  
>> would always need to be enabled when using MVCC.  Perhaps this  
>> should be a deprecated property that just supports additional  
>> consistency for OL and PL, and be removed when OL/PL eventually get  
>> removed?
>> Thoughts?
>
> Why not treat a delete like a modify (as I think it has been done in  
> the past)? When a node is deleted, it is locked, and either a  
> sentinel or just an fqn is stored in the TX context that issued the  
> delete. Then the parent still conains the original node, allowing  
> concurrent readers, while the calling tx knows to filter deleted  
> nodes from child operations.

Wasn't as simple - phantoms are problematic since we have no read  
locks.  Let me explain: Node /a/b exists.  T1 starts, adds child /a/b/ 
c.  T2 reads /a/b.  Gets child count for /a/b.  Sees 1 (phantom).   
Then tries to do a get() on /a/b/c and sees this as well - since this  
is added to the parent and the parent is not locked for writing (i.e.,  
copied).

I have worked around this by only attaching new children to their  
parents at commit time.  This completely gets rid of phantoms, but  
introduces a problem where when retrieving children newly added in a  
tx by accessing it's parent, you won't see the new children.  So I've  
updated GetChildrenNamesCommand to retrieve the child list, and then  
correct it based on children added and removed in the current scope  
before returning.  This works well, save for the overhead of checking  
and correcting the child list with respect to anything added or  
removed in the current scope.

There is one more consequence of not locking parents for adding  
children though.  The way things work as per current designs, we have:

1.  T1 reads /a/b - no locks needed.
2.  T1 writes to /a/b/c - /a/b/c is locked.
	2.1.  /a/b/c is copied (call this copy /a/b/c'), version updated.
	2.2.  Versions checked for write skew.
3.  T1 commits.
	3.1.  /a/b/c is replaced with /a/b/c' in /a/b's child map.
	3.2. Lock on /a/b/c is released.

So that works, except that there can be a race where updates can be  
lost.  Consider:

1.  T1 reads /a/b - no locks needed.
2.  T1 writes to /a/b/c - /a/b/c is locked.
	2.1.  /a/b/c is copied (call this copy /a/b/c'), version updated.
	2.2.  Versions checked for write skew.
3.  T2 writes to /a/b - /a/b is locked.
	3.1.  /a/b is copied to /a/b' and version is updated.
	3.2.  Write skew chk
4.  T1 commits.
	4.1.  /a/b/c is replaced with /a/b/c' in /a/b's child map.
	4.2. Lock on /a/b/c is released.
5.  T2 commits.
	5.1.  /a/b is replaced with /a/b' in /a's child map.
	5.2.  Lock on /a/b is released.

In the above sequence, changes made to /a/b/c are lost in step 5.1  
since changes to /a/b/c are made in /a/b/c' which is attached to /a/b  
and /a/b is replaced with /a/b'.

Now this can be solved by either:

1.  Acquiring parent lock at *commit* time.
Simpler to implement and more secure, but involves more locking, even  
if for a short while.

2.  Merge child maps during commit (steps 4.1 and 5.1) between the  
copied node and the original node.
E.g., in 5.1., merge child maps between /a/b and /a/b' before swapping  
refs on /a.  Slightly more complex to implement (may need to track  
whether children have been modified on each copied node).

I'd vote for the 2nd approach above.

Any other thoughts around this?

Cheers,
--
Manik Surtani
Lead, JBoss Cache
manik at jboss.org