On 4 Jul 2008, at 09:43, Jason T. Greene wrote:
Manik Surtani wrote:
> On 3 Jul 2008, at 21:21, Jason T. Greene wrote:
>> Manik Surtani wrote:
>>
>>> Now with MVCC - since there are no read locks - we have a
>>> problem. Consider:
>>> 1. Tx begns.
>>> 2. Tx reads /a (no locks)
>>> 3. Tx removes /a/b (locks /a/b)
>>> 4. Tx counts children of /a (will still see that /a/b is in /a's
>>> child map).
>>> he approaches I have considered are:
>>> 1. Removing /a/b from /a's child map.
>>> This will provide a consistent view to the tx removing /a/b but
>>> will mean that other txs will also see /a/b disappear when
>>> counting /a's children, but not when querying the cache for /a/
>>> b! Worse, rollbacks would mean re-adding /a/b to /a's child map,
>>> providing weird views on other readers (/a/b disappears for a
>>> while, then reappears)
>>> 2. Make a copy of /a for the tx to work off, when removing /a/
>>> b. This is the same as having LockParentForChildInsertRemove
>>> semantics.
>>> Copy the parent as well and work off it, but to prevent problems
>>> with concurrent child removes and adds, we'd have to lock the
>>> parent.
>>> My vote is for approach 2. In fact,
>>> LockParentForChildInsertRemove would always need to be enabled
>>> when using MVCC. Perhaps this should be a deprecated property
>>> that just supports additional consistency for OL and PL, and be
>>> removed when OL/PL eventually get removed?
>>> Thoughts?
>>
>> Why not treat a delete like a modify (as I think it has been done
>> in the past)? When a node is deleted, it is locked, and either a
>> sentinel or just an fqn is stored in the TX context that issued
>> the delete. Then the parent still conains the original node,
>> allowing concurrent readers, while the calling tx knows to filter
>> deleted nodes from child operations.
> I guess that approach can still be taken. Just that then it
> introduces the potential for phantom reads (nodes appearing in the
> child set of a parent), but by that stage lockParent can be a cfg
> option to prevent that.
Right, but phantoms are still allowed by RR, and locking the parent
can significantly reduce performance. As an example, in POJO Cache,
a list is mapped as a root node, and its child elements are child
nodes. Locking the parent effectively serializes concurrent list
adds. This is especially a problem if the TX is not short-lived. Now
it could be possible to change the POJO Cache list impl to use hash
branches in the fqn to stripe the parent locking (this is what the
internal area currently does), however, this is stll less optimal
than having no parent locking.
Yeah I agree though, from a perf perspective, locking parents for
inserting and removing children sucks.
I think the real desire for locking the parent is for doing certain
atomic operations, like those on ConcurrentMap. However we could
provide a special API for such things, and in it's absence the user
can still do a cooperative lock on the parent with an update.
Not really - we only have atomic ops on the data level and not on the
structural level. E.g., Node.putIfAbsent(), etc. only operate on the
node's data map. There is no Node.addChildIfAbsent() or anything of
the sort - yet. :-)
Cheers,
--
Manik Surtani
Lead, JBoss Cache
manik(a)jboss.org