[jbosscache-dev] Issue with JBCACHE-1309 WAS: Branch for 2.1.X

Brian Stansberry brian.stansberry at redhat.com
Tue Apr 1 15:33:16 EDT 2008


Thanks. :) Good news, bad news.

Good news is the JBCACHE-1170 problem seems to be gone!

Bad news is a problem that seems to come from the JBCACHE-1309 fix. 
Hibernate 2nd Level Cache, OPTIMISTIC locking.

Here's what happens:

Hibernate registers a tx Synchronization. So does JBC. But Hibernate's 
executes before JBC's. So any beforeCompletion()/afterCompletion() will 
be invoked by Hibernate before the same is done for JBC. So,

1) Say a tx updates an entity, resulting in a put: to Fqn

cache.put("/test/Accounts/ENTITY/id at 1", ITEM, value);

2) Tx goes into commit. beforeCompletion() gets called on on Hibernate 
and JBC synchronizations. JBC.beforeCompletion() acquires locks on the 
main tree.

3) Hibernate.afterCompletion() gets called. JBC.afterCompletion() not 
yet called, so locks are held on main cache. Hibernate uses the callback 
to update the timestamps caches, which will invalidate any queries using 
entity types affected by the tx. Logic boils down to this pseudo-code:

// Suspend any ongoing tx
Transaction tx = tm.suspend();
// Timestamp updates must go async
cache.getInvocationContext().getOptionOverrides().setForceAsynchronous(true);
cache.put("/TS/test/org/hibernate/cache/UpdateTimestampsCache/Accounts",
           ITEM, value);
tm.resume(tx);

Note there is no overlap in Fqn between the two puts. The call with the 
transaction suspended is not trying to write to the same subtree as the 
uncommitted tx.

The put is failing with a lock acquisition timeout:

org.jboss.cache.lock.TimeoutException: write lock for / could not be 
acquired after 15000 ms. Locks: Read lock owners: 
[GlobalTransaction:<127.0.0.1:43985>:72]
Write lock owner: null
  (caller=GlobalTransaction:<127.0.0.1:43985>:74, lock info: read 
owners=[GlobalTransaction:<127.0.0.1:43985>:72] (activeReaders=1, 
activeWriter=null, waitingReaders=0, waitingWriters=0, waitingUpgrader=0))
	at 
org.jboss.cache.lock.IdentityLock.acquireWriteLock0(IdentityLock.java:249)
	at 
org.jboss.cache.lock.IdentityLock.acquireWriteLock(IdentityLock.java:172)
	at 
org.jboss.cache.interceptors.OptimisticInterceptor.lockAndCreateWorkspaceNode(OptimisticInterceptor.java:112)
	at 
org.jboss.cache.interceptors.OptimisticNodeInterceptor.fetchWorkspaceNode(OptimisticNodeInterceptor.java:542)
	at 
org.jboss.cache.interceptors.OptimisticNodeInterceptor.fetchWorkspaceNode(OptimisticNodeInterceptor.java:571)
	at 
org.jboss.cache.interceptors.OptimisticNodeInterceptor.fetchWorkspaceNode(OptimisticNodeInterceptor.java:571)
	at 
org.jboss.cache.interceptors.OptimisticNodeInterceptor.fetchWorkspaceNode(OptimisticNodeInterceptor.java:571)
	at 
org.jboss.cache.interceptors.OptimisticNodeInterceptor.fetchWorkspaceNode(OptimisticNodeInterceptor.java:571)
	at 
org.jboss.cache.interceptors.OptimisticNodeInterceptor.fetchWorkspaceNode(OptimisticNodeInterceptor.java:571)
	at 
org.jboss.cache.interceptors.OptimisticNodeInterceptor.fetchWorkspaceNode(OptimisticNodeInterceptor.java:571)
	at 
org.jboss.cache.interceptors.OptimisticNodeInterceptor.fetchWorkspaceNode(OptimisticNodeInterceptor.java:571)
	at 
org.jboss.cache.interceptors.OptimisticNodeInterceptor.invoke(OptimisticNodeInterceptor.java:81)
	at
   ......
org.jboss.cache.invocation.CacheInvocationDelegate.put(CacheInvocationDelegate.java:488)

What's going on here is fetchWorkspaceNode() is walking up the tree from 
/TS/test/org/hibernate/cache/UpdateTimestampsCache/Accounts to Fqn.ROOT, 
calling lockAndCreateWorkspaceNode() on each Fqn.  When it gets to 
Fqn.ROOT it fails.  This is because lockAndCreateWorkspaceNode() wants a 
write lock on the target node before making the workspace copy. In this 
case the WL can't be obtained, because a RL is held by the suspended tx.

Does lockAndCreateWorkspaceNode() need a write lock here?

I'll add a unit test for this; seems like a good candidate for the start 
of the integration tests I'd talked about.  This one is for sure a weird 
usage; hard to imagine a regular JBC test exercising this call pattern.

- Brian

Manik Surtani wrote:
> Guys, this is now done.
> 
> It is pretty much a snapshot of trunk as of today, with Elias' Amazon S3 
> cache loader stuff removed (don't want to introduce this in a micro 
> version - it will be there in 2.2.0).
> 
> Brian, I have even built a snapshot on the JBoss snapshot Maven repo, so 
> please give it a whirl and see if issues like JBCACHE-1170 have 
> successfully been taken care of.
> 
> Cheers,
> -- 
> Manik Surtani
> Lead, JBoss Cache
> manik at jboss.org
> 
> 
> 
> 
> 
> 
> _______________________________________________
> jbosscache-dev mailing list
> jbosscache-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/jbosscache-dev

-- 
Brian Stansberry
Lead, AS Clustering
JBoss, a division of Red Hat
brian.stansberry at redhat.com



More information about the jbosscache-dev mailing list