[jbosscache-dev] ClusteredCacheLoader deadlocks and JBCCACHE-1103

Tue Jun 19 04:04:31 EDT 2007

Is this going to solve JBCACHE-1103 ? I mean, +1 for the lock 
refactoring, sounds like it needs to be done anyway. But, because you 
have to be thread safe at one point, aren't you deferring the issue to a 
later stage ? At one point you have to acquire a lock to prevent 
concurrent loading, and I'm afraid we'll have the same issue.

But, wait, if you defer lock acquisition till after the synchronous 
cluster method call, you should be fine. Is this what you have in mind ?

On the same topic: I looked into interrupting clustered method calls in 
JGroups (http://jira.jboss.com/jira/browse/JGRP-533), and came to the 
conclusion that it isn't feasible. When a thread is blocked on 
FC.down(), waiting for credits, then I can interrupt it, but it will go 
right back in its loop waiting for credits, therefore blocking again. 
Unless I change the loop condition (running & waiting-for-credits), the 
thread will always block if there aren't enough credits available.

Thoughts ?

Manik Surtani wrote:
>
>
> Ok, I spent a bit of time investigating this, and the original sync 
> blocks in the CacheLoaderInterceptor, superceded by the 
> synchronization on an Fqn in subsequent versions are all hacks to get 
> around the fact that some cache loader implementations themselves 
> aren't thread safe.
>
> I wrote a test to check the thread safety of cache loader impls, and 
> most of what we have fail.  I patched the DummyInMemoryCacheLoader to 
> be thread safe (a lot of synchronization within the impl) and removed 
> the unnecessary locks in the interceptors and things work fine.  I 
> tried this with some of the few thread-safe cache loader impls we have 
> (BDBJE, for example) and this worked well as well.
>
> So what I propose is this:
>
> 1) Patch CLI to NOT use the Fqn locks (in BaseCacheLoaderInterceptor)
> 2) Make sure we acquire locks (lock() call up the interceptor stack) 
> *before* attempting to load node, but after creating temp node
> 3) Make sure cache loader impls are thread safe
> 4) Add thread safety test to CacheLoaderTestBase
> 5) Patch loaders that aren't threadsafe and fail test in (4) - 
> FileCacheLoader, JDBCCacheLoader, etc.
>
> I'm going to do this on my local checkout of HEAD and run the 
> regression tests overnight.  If this works, I suggest backporting this 
> to 1.4.x and regressing there.
>
> WDYT?

-- 
Bela Ban
Lead JGroups / JBoss Clustering team
JBoss - a division of Red Hat