[jbosscache-dev] ClusteredCacheLoader deadlocks and JBCCACHE-1103

Bruno Georges bgeorges at redhat.com
Tue Jun 19 01:18:34 EDT 2007


+1
On 19 Jun 2007, at 01:14, Jason T. Greene wrote:

> Simply awesome!
>
> On Tue, 2007-06-19 at 00:08 +0100, Manik Surtani wrote:
>> On 18 Jun 2007, at 11:48, Manik Surtani wrote:
>>
>>>
>>> On 15 Jun 2007, at 13:09, Bela Ban wrote:
>>>
>>>>
>>>>
>>>> Jason T. Greene wrote:
>>>>> That should be ok though because the CCL will still timeout on  
>>>>> that
>>>>> lock. The original problem was that the same thread with the CCL
>>>>> lock
>>>>> was blocking on an FC lock so that the CCL lock would never be
>>>>> released
>>>>> (since the FC lock was higher in the stack).
>>>>
>>>> See my previous reply. Yes, it blocked on FC.down() because it
>>>> didn't receive credits in up(). But up() wasn't called because
>>>> there was a replication message ahead of it in the queue that
>>>> blocked on the FQN held by the CCL.
>>>>
>>>> So to tackle this, my suggestion were, in this order:
>>>> #1 Don't hold a lock while making a synchronous cluster method
>>>> call. That's a big no no, especially in pre-2.5 releases. We had
>>>> lots of bugs in the clustering code due to such code. Then Brian
>>>> cleaned up all of it... :-)
>>>> #2 The timeout mechanism in JGroups which uses threads. Ugly, and
>>>> a hack, and only needed for 2.4. As I argued, this will avoid the
>>>> deadlock, but it will constantly time out (assuming some traffic).
>>>>
>>>> The root cause of this is #1
>>>
>>> Let me look into why we had #1 anyway.  Originally the 1.2.x
>>> codebase used a synchronized block on the CacheLoaderInterceptor
>>> for this which meant that only one thread could pass through this
>>> interceptor at any given time.  I changed this to lock on the Fqn
>>> in question so at least if the Fqns didn't overlap multiple threads
>>> could go thru this interceptor.
>>>
>>> The reason behind it seems to be so that the CacheLoader impl does
>>> not have to deal with concurrent calls on the same node, but
>>> thinking about it, I feel this is something that should be handled
>>> in each CacheLoader impl, which should be thread safe.
>>
>> Ok, I spent a bit of time investigating this, and the original sync
>> blocks in the CacheLoaderInterceptor, superceded by the
>> synchronization on an Fqn in subsequent versions are all hacks to get
>> around the fact that some cache loader implementations themselves
>> aren't thread safe.
>>
>> I wrote a test to check the thread safety of cache loader impls, and
>> most of what we have fail.  I patched the DummyInMemoryCacheLoader to
>> be thread safe (a lot of synchronization within the impl) and removed
>> the unnecessary locks in the interceptors and things work fine.  I
>> tried this with some of the few thread-safe cache loader impls we
>> have (BDBJE, for example) and this worked well as well.
>>
>> So what I propose is this:
>>
>> 1) Patch CLI to NOT use the Fqn locks (in BaseCacheLoaderInterceptor)
>> 2) Make sure we acquire locks (lock() call up the interceptor stack)
>> *before* attempting to load node, but after creating temp node
>> 3) Make sure cache loader impls are thread safe
>> 4) Add thread safety test to CacheLoaderTestBase
>> 5) Patch loaders that aren't threadsafe and fail test in (4) -
>> FileCacheLoader, JDBCCacheLoader, etc.
>>
>> I'm going to do this on my local checkout of HEAD and run the
>> regression tests overnight.  If this works, I suggest backporting
>> this to 1.4.x and regressing there.
>>
>> WDYT?
>>
>> Cheers,
>> --
>> Manik Surtani
>>
>> Lead, JBoss Cache
>> JBoss, a division of Red Hat
>>
>>
>>
>> _______________________________________________
>> jbosscache-dev mailing list
>> jbosscache-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/jbosscache-dev
> -- 
> Jason T. Greene
> Lead, POJO Cache
> JBoss, a division of Red Hat
>
> _______________________________________________
> jbosscache-dev mailing list
> jbosscache-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/jbosscache-dev




More information about the jbosscache-dev mailing list