[jbosscache-dev] ClusteredCacheLoader deadlocks and JBCCACHE-1103

Manik Surtani manik at jboss.org
Tue Jun 19 07:27:05 EDT 2007


On 19 Jun 2007, at 09:04, Bela Ban wrote:

> Is this going to solve JBCACHE-1103 ? I mean, +1 for the lock  
> refactoring, sounds like it needs to be done anyway. But, because  
> you have to be thread safe at one point, aren't you deferring the  
> issue to a later stage ? At one point you have to acquire a lock to  
> prevent concurrent loading, and I'm afraid we'll have the same issue.
>
> But, wait, if you defer lock acquisition till after the synchronous  
> cluster method call, you should be fine. Is this what you have in  
> mind ?

There are 2 locks here that need to be done.  One lock is on the  
Node, via the lock interceptor, which happens *after* the cluster  
method call.  This is not the problem.

We also had a level of sync in the cache loader interceptor to  
prevent the same Fqn being loaded twice from the CL (or a  
simultaneous load and store, etc).  This bit was the problem, since  
this sync had to happen *before* any loading was done by the CL.

Regarding the other CL impls, some of them already were threadsafe  
(BDBJE, Jdbm) and some weren't (JDBC, File). I created a StripedLock  
class which uses a set of ReentrantReadWriteLocks applied to the Fqn  
in question, and the JDBC and File CLs now use this to achieve thread  
safety (more efficient than synchronized blocks and better mem usage  
than one lock per Fqn - usual benefits of lock striping).

You're right that the improvements in locking here are not quite  
central to solving 1103, but somewhat related.

The real solution to 1103 is in the CCL - remotely originating put 
()'s shouldn't need a load (and hence not need to wait on the Fqn  
being loaded) *only if* we're talking about the CCL, since CCLs don't  
perform a corresponding store (they're read-only).  Therefore this  
logic cannot be in the CLI as it doesn't apply to all CL impls.


>
> On the same topic: I looked into interrupting clustered method  
> calls in JGroups (http://jira.jboss.com/jira/browse/JGRP-533), and  
> came to the conclusion that it isn't feasible. When a thread is  
> blocked on FC.down(), waiting for credits, then I can interrupt it,  
> but it will go right back in its loop waiting for credits,  
> therefore blocking again. Unless I change the loop condition  
> (running & waiting-for-credits), the thread will always block if  
> there aren't enough credits available.
>
> Thoughts ?
>
>
> Manik Surtani wrote:
>>
>>
>> Ok, I spent a bit of time investigating this, and the original  
>> sync blocks in the CacheLoaderInterceptor, superceded by the  
>> synchronization on an Fqn in subsequent versions are all hacks to  
>> get around the fact that some cache loader implementations  
>> themselves aren't thread safe.
>>
>> I wrote a test to check the thread safety of cache loader impls,  
>> and most of what we have fail.  I patched the  
>> DummyInMemoryCacheLoader to be thread safe (a lot of  
>> synchronization within the impl) and removed the unnecessary locks  
>> in the interceptors and things work fine.  I tried this with some  
>> of the few thread-safe cache loader impls we have (BDBJE, for  
>> example) and this worked well as well.
>>
>> So what I propose is this:
>>
>> 1) Patch CLI to NOT use the Fqn locks (in BaseCacheLoaderInterceptor)
>> 2) Make sure we acquire locks (lock() call up the interceptor  
>> stack) *before* attempting to load node, but after creating temp node
>> 3) Make sure cache loader impls are thread safe
>> 4) Add thread safety test to CacheLoaderTestBase
>> 5) Patch loaders that aren't threadsafe and fail test in (4) -  
>> FileCacheLoader, JDBCCacheLoader, etc.
>>
>> I'm going to do this on my local checkout of HEAD and run the  
>> regression tests overnight.  If this works, I suggest backporting  
>> this to 1.4.x and regressing there.
>>
>> WDYT?
>
> -- 
> Bela Ban
> Lead JGroups / JBoss Clustering team
> JBoss - a division of Red Hat

--
Manik Surtani

Lead, JBoss Cache
JBoss, a division of Red Hat






More information about the jbosscache-dev mailing list