[jbosscache-dev] Problems with node locking semantics in JBC1.4.1.GA

Sun Jan 28 16:11:01 EST 2007

I played around with one way to deal with this.

Speaking in db isolation terms, let's say a non-repeatable-read means a
tx reads a node's *data map* twice and gets different results. A phantom
read means a tx reads a node's *child map* twice and gets different
results. Conceptually, a JBC node represents both a "row" (data map) and
a query result set (children map -- set of rows that share a common
characteristic that the parent represents). (I know this analogy is
inexact, but, ...).
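
To make the two terms concrete, here is a minimal sketch in plain Java. The Node class below is a hypothetical stand-in that only models the two maps; it is not the JBC node API, and the helper names are made up for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class ReadAnomalies {
    /** Hypothetical stand-in for a cache node: a data map plus a child map. */
    static class Node {
        final Map<String, Object> data = new HashMap<>();
        final Map<String, Node> children = new HashMap<>();
    }

    /** True if two reads of the data map differ: a non-repeatable read. */
    static boolean nonRepeatableRead(Map<String, Object> first, Map<String, Object> second) {
        return !first.equals(second);
    }

    /** True if two reads of the child names differ: a phantom read. */
    static boolean phantomRead(Set<String> first, Set<String> second) {
        return !first.equals(second);
    }

    public static void main(String[] args) {
        Node person = new Node();
        person.data.put("name", "Ann");

        // First read by tx A: snapshot both maps.
        Map<String, Object> data1 = new HashMap<>(person.data);
        Set<String> kids1 = new HashSet<>(person.children.keySet());

        // Another tx adds a child node but leaves the data map alone.
        person.children.put("children", new Node());

        // Second read by tx A.
        Map<String, Object> data2 = new HashMap<>(person.data);
        Set<String> kids2 = new HashSet<>(person.children.keySet());

        System.out.println("non-repeatable read: " + nonRepeatableRead(data1, data2)); // false
        System.out.println("phantom read: " + phantomRead(kids1, kids2));              // true
    }
}
```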

In 1.4.0.SP1, REPEATABLE_READ meant you wouldn't get
non-repeatable-reads, but could get phantom reads. With 1.4.1, you also
don't get phantom reads (at least with respect to a node's immediate
children). But, in many cases this behavior is unneeded and undesirable,
and JBC no longer offers the old behavior.

I attempted to remedy this with a new IsolationLevel
REPEATABLE_DATA_READ, which basically provides the 1.4.0 semantics.
This was actually very easy to do.

1) LockStrategyFactory creates the same LockStrategy for
REPEATABLE_DATA_READ as for REPEATABLE_READ -- hence lock behavior at
the node level is unaffected.

2) PessimisticLockInterceptor uses the new level when deciding whether
to WL a parent before adding/removing a node. Manik nicely encapsulated
that logic, so it was simple to change. Here's a diff to the head of
Branch_JBossCache_1_4_0:

### Eclipse Workspace Patch 1.0
#P JBossCache
Index: src/org/jboss/cache/interceptors/PessimisticLockInterceptor.java
===================================================================
RCS file: /cvsroot/jboss/JBossCache/src/org/jboss/cache/interceptors/PessimisticLockInterceptor.java,v
retrieving revision 1.20.2.8
diff -u -r1.20.2.8 PessimisticLockInterceptor.java
--- src/org/jboss/cache/interceptors/PessimisticLockInterceptor.java	14 Dec 2006 12:51:48 -0000	1.20.2.8
+++ src/org/jboss/cache/interceptors/PessimisticLockInterceptor.java	28 Jan 2007 21:08:55 -0000
@@ -41,6 +41,8 @@
 public class PessimisticLockInterceptor extends Interceptor
 {
    TransactionTable tx_table = null;
+   
+   boolean writeLockOnChildInsertRemove = true;   
 
    /**
     * Map<Object, java.util.List>. Keys = threads, values = lists of locks held by that thread
@@ -55,6 +57,7 @@
       tx_table = cache.getTransactionTable();
       lock_table = cache.getLockTable();
       lock_acquisition_timeout = cache.getLockAcquisitionTimeout();
+      writeLockOnChildInsertRemove = (cache.getIsolationLevelClass() != IsolationLevel.REPEATABLE_DATA_READ);
    }
 
 
@@ -323,12 +326,14 @@
 
    private boolean writeLockNeeded(int lock_type, int currentNodeIndex, int treeNodeSize, boolean isRemoveOperation, boolean createIfNotExists, Fqn targetFqn, Fqn currentFqn)
    {
-      if (isRemoveOperation && currentNodeIndex == treeNodeSize - 2)
-         return true; // we're doing a remove and we've reached the PARENT node of the target to be removed.
-
-      if (!isTargetNode(currentNodeIndex, treeNodeSize) && !cache.exists(new Fqn(currentFqn, targetFqn.get(currentNodeIndex + 1))))
-         return createIfNotExists; // we're at a node in the tree, not yet at the target node, and we need to create the next node.  So we need a WL here.
-
+      if (writeLockOnChildInsertRemove)
+      {
+         if (isRemoveOperation && currentNodeIndex == treeNodeSize - 2)
+            return true; // we're doing a remove and we've reached the PARENT node of the target to be removed.
+   
+         if (!isTargetNode(currentNodeIndex, treeNodeSize) && !cache.exists(new Fqn(currentFqn, targetFqn.get(currentNodeIndex + 1))))
+            return createIfNotExists; // we're at a node in the tree, not yet at the target node, and we need to create the next node.  So we need a WL here.
+      }
       return lock_type == DataNode.LOCK_TYPE_WRITE && isTargetNode(currentNodeIndex, treeNodeSize) && (createIfNotExists || isRemoveOperation); //normal operation, write lock explicitly requested and this is the target to be written to.
    }
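
For anyone who wants to poke at the decision logic without a JBC checkout, here is a simplified, self-contained model of the patched writeLockNeeded(). It is only a sketch under assumptions: the Level enum, the fqn() helper, and representing an Fqn as a List<String> plus a Set of existing paths are all made up for illustration, and I am assuming isTargetNode() means "last element of the Fqn".

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ParentLockDecision {
    // Hypothetical enum; it only mirrors the two levels discussed here.
    enum Level { REPEATABLE_READ, REPEATABLE_DATA_READ }

    /** Hypothetical helper: builds "/a/b" from the first lastIndex + 1 path elements. */
    static String fqn(List<String> path, int lastIndex) {
        return "/" + String.join("/", path.subList(0, lastIndex + 1));
    }

    /**
     * Simplified model of the patched check. 'index' is the position of the node
     * currently being locked while walking down 'path' towards the target.
     */
    static boolean writeLockNeeded(Level level, boolean writeRequested, List<String> path, int index,
                                   boolean isRemove, boolean createIfNotExists, Set<String> existing) {
        boolean isTarget = index == path.size() - 1; // assumption about isTargetNode()
        if (level != Level.REPEATABLE_DATA_READ) {
            // Parent of the node being removed gets a WL.
            if (isRemove && index == path.size() - 2) return true;
            // Next node down does not exist yet, so this node gets a WL if we will create it.
            if (!isTarget && !existing.contains(fqn(path, index + 1))) return createIfNotExists;
        }
        // Normal case: WL only on the target itself when a write was requested.
        return writeRequested && isTarget && (createIfNotExists || isRemove);
    }

    public static void main(String[] args) {
        List<String> path = Arrays.asList("Person", "Person#1", "children");
        Set<String> existing = new HashSet<>(Arrays.asList("/Person", "/Person/Person#1"));
        for (Level level : Level.values()) {
            // index 1 is /Person/Person#1, the parent of the node being inserted.
            boolean parentWl = writeLockNeeded(level, true, path, 1, false, true, existing);
            boolean targetWl = writeLockNeeded(level, true, path, 2, false, true, existing);
            System.out.println(level + ": WL on parent=" + parentWl + ", WL on target=" + targetWl);
        }
    }
}
```

With these inputs the parent is write-locked under REPEATABLE_READ but not under REPEATABLE_DATA_READ, while the target node is write-locked in both cases.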


jbosscache-dev-bounces at lists.jboss.org wrote:
> In 1.4.1.GA, we introduced node locking semantics that
> weren't there in the previous few releases (and maybe never
> were there).  This is causing significant integration issues.
> 
> Specifically, before inserting or removing a child node, we
> acquire a WL on the parent.  Previously, only a RL was acquired.
> 
> I'm concerned about anomalies this might cause in apps that
> intentionally or unintentionally depend on the old behavior.
> For example, the AS http session repl code assumes it has
> freedom to add/remove cache nodes that represent a session
> underneath the parent that represents the webapp. Now these
> activities will block until any activity on any other session
> is complete.  With FIELD granularity, that's a big issue, as
> locks are held throughout the web request. It will likely
> cause issues with other granularities as well. (We already
> saw a minor issue in a support case when a customer tried to upgrade
> 4.0.5 to 1.4.1).
> 
> This definitely breaks the Hibernate (and thus EJB3 entity bean
> clustering) integration with 1.4.1.GA. Basically, assume
> there is an entity Person, with a child collection
> "children".  Hibernate caches the entity, and later caches
> the child collection in a child node.
> 
> 1) Transaction tx = tm.begin();
> 2) cache.get("/Person/Person#1/children", ITEM); // returns
> null -- not cached yet
> 
> 3) read db to get the ids of the collection elements
> 
> 4) tx.suspend();
> 5) cache.putFailFast("/Person/Person#1/children", ITEM, data);
> 6) tx.resume();
> 
> This fails in step 5, because with the tx suspended, the
> putFailFast() call cannot acquire the WL on /Person/Person#1
> that it needs to insert /Person/Person#1/children.
> 
> Hmm, an EJB3 unit test shows that problem occurring, but I
> assume even simple entity caching will break:
> 
> 1) Transaction tx = tm.begin();
> 2) cache.get("/Person/Person#2", ITEM); // returns null --
> not cached yet
> 
> 3) read db to get the Person
> 
> 4) tx.suspend();
> 5) cache.putFailFast("/Person/Person#2", ITEM, data);
> 6) tx.resume();
> 
> We need a quick workaround for this in the 1.4 branch, as
> it's a blocker for EJB3 RC10 and AS 4.2.
> 
> Brian Stansberry
> Lead, AS Clustering
> JBoss, a division of Red Hat
> Ph: 510-396-3864
> skype: bstansberry
> 
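
The step 5 failure in the quoted scenario can be imitated in miniature with java.util.concurrent, assuming (my reading, not confirmed above) that the tx still holds a RL on the parent from the earlier get() and that the put with the tx suspended counts as a different lock owner. This is only a sketch: ReentrantReadWriteLock stands in for the JBC node lock, a second thread stands in for the transaction, and the class and method names are made up.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class SuspendedTxPut {
    /** Returns {writeLockAcquired, readLockAcquired} for a second owner while a first owner holds the RL. */
    static boolean[] tryAsOtherOwner() {
        final ReentrantReadWriteLock parent = new ReentrantReadWriteLock(); // stands in for /Person/Person#1
        final CountDownLatch readHeld = new CountDownLatch(1);
        final CountDownLatch done = new CountDownLatch(1);

        // "Transaction" owner: takes the RL on the parent (as in step 2) and keeps it.
        Thread tx = new Thread(() -> {
            parent.readLock().lock();
            readHeld.countDown();
            try {
                done.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                parent.readLock().unlock();
            }
        });
        tx.start();

        try {
            readHeld.await();
            // Caller with the tx suspended: a different owner making a fail-fast attempt.
            boolean gotWrite = parent.writeLock().tryLock(0, TimeUnit.MILLISECONDS);
            if (gotWrite) parent.writeLock().unlock();
            boolean gotRead = parent.readLock().tryLock();
            if (gotRead) parent.readLock().unlock();
            return new boolean[] { gotWrite, gotRead };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            done.countDown(); // let the "transaction" release its RL
            try {
                tx.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        boolean[] r = tryAsOtherOwner();
        System.out.println("WL on parent while tx holds RL: " + r[0]); // false
        System.out.println("RL on parent while tx holds RL: " + r[1]); // true
    }
}
```

The WL attempt fails immediately while the RL attempt succeeds, which is the difference between needing a WL on the parent (1.4.1.GA) and only a RL (earlier releases) when inserting the child.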