Re: [jbosscache-dev] ClusteredCacheLoader deadlocks and JBCCACHE-1103

Monday, 18 June 2007

On 15 Jun 2007, at 13:09, Bela Ban wrote:

...

 Jason T. Greene wrote:
> That should be ok though because the CCL will still timeout on that
> lock. The original problem was that the same thread with the CCL lock
> was blocking on an FC lock so that the CCL lock would never be
> released (since the FC lock was higher in the stack).

 See my previous reply. Yes, it blocked on FC.down() because it  
 didn't receive credits in up(). But up() wasn't called because  
 there was a replication message ahead of it in the queue that  
 blocked on the FQN held by the CCL.

 So to tackle this, my suggestions were, in this order:
 #1 Don't hold a lock while making a synchronous cluster method  
 call. That's a big no no, especially in pre-2.5 releases. We had  
 lots of bugs in the clustering code due to such code. Then Brian  
 cleaned up all of it... :-)
 #2 The timeout mechanism in JGroups which uses threads. Ugly, and a  
 hack, and only needed for 2.4. As I argued, this will avoid the  
 deadlock, but it will constantly time out (assuming some traffic).
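Suggestion #1 can be sketched roughly as follows. This is only an illustration in present-day Java, not code from JBoss Cache or JGroups: the class name, the `cached` field and the `remoteCall` parameter (standing in for a blocking cluster RPC) are all hypothetical. The point is that the lock is released before the blocking call and re-acquired afterwards.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical sketch: never hold the local lock across a blocking remote call.
class NoLockAcrossRemoteCall {
    private final ReentrantLock lock = new ReentrantLock();
    private Object cached; // guarded by lock

    // remoteCall stands in for a synchronous cluster method call (assumption).
    Object getOrFetch(Supplier<Object> remoteCall) {
        lock.lock();
        try {
            if (cached != null) {
                return cached; // fast path, no remote call needed
            }
        } finally {
            lock.unlock(); // release BEFORE going to the cluster
        }

        // The lock is not held here, so other threads (for example one
        // delivering a replication message) are not blocked on it.
        Object fetched = remoteCall.get();

        lock.lock();
        try {
            if (cached == null) {
                cached = fetched; // first writer wins
            }
            return cached;
        } finally {
            lock.unlock();
        }
    }

    // Exposed only so the sketch can be checked.
    boolean isHeldByCurrentThread() {
        return lock.isHeldByCurrentThread();
    }
}
```

The trade-off is that two threads may both miss and both make the remote call; the second result is simply discarded.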

 The root cause of this is #1.

Let me look into why we had #1 anyway.  Originally the 1.2.x codebase  
used a synchronized block on the CacheLoaderInterceptor for this  
which meant that only one thread could pass through this interceptor  
at any given time.  I changed this to lock on the Fqn in question so  
that, at least when the Fqns don't overlap, multiple threads can go  
through this interceptor.
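To illustrate the difference between one interceptor-wide synchronized block and a lock per Fqn, here is a minimal sketch in present-day Java. It is not the actual CacheLoaderInterceptor code: `PerFqnLocks` and its methods are hypothetical names, Fqns are represented as plain strings, and only exact-match Fqns share a lock (parent/child overlap is not modelled). Locks are also never removed from the map, which a real implementation would have to deal with.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical sketch: one lock per Fqn instead of one lock for everything.
class PerFqnLocks {
    private final ConcurrentMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    // Runs the action while holding the lock for this Fqn only.
    <T> T withLock(String fqn, Supplier<T> action) {
        ReentrantLock lock = locks.computeIfAbsent(fqn, k -> new ReentrantLock());
        lock.lock();
        try {
            return action.get();
        } finally {
            lock.unlock();
        }
    }

    // True if some thread currently holds the lock for this Fqn.
    boolean isLocked(String fqn) {
        ReentrantLock lock = locks.get(fqn);
        return lock != null && lock.isLocked();
    }
}
```

Threads working on "/a/b" and "/c/d" take different locks and do not wait for each other; threads working on the same Fqn are still serialized.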

The reason behind it seems to be so that the CacheLoader impl does  
not have to deal with concurrent calls on the same node, but thinking  
about it, I feel this is something that should be handled in each  
CacheLoader impl, which should be thread safe.
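As a rough idea of what "handled in each CacheLoader impl" could look like, here is a sketch of a loader that copes with concurrent calls on the same node by itself, so the interceptor needs no lock around it. It does not implement the real CacheLoader interface: `ThreadSafeLoader`, the string Fqns and the `backend` function are hypothetical, and the example is written in present-day Java.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch: the loader itself serializes concurrent loads per node.
class ThreadSafeLoader {
    private final ConcurrentMap<String, Object> store = new ConcurrentHashMap<>();
    private final AtomicInteger backendReads = new AtomicInteger();
    private final Function<String, Object> backend; // stands in for the real store

    ThreadSafeLoader(Function<String, Object> backend) {
        this.backend = backend;
    }

    // computeIfAbsent runs the mapping function at most once per key on a
    // ConcurrentHashMap, so concurrent callers for the same Fqn share one load.
    Object get(String fqn) {
        return store.computeIfAbsent(fqn, k -> {
            backendReads.incrementAndGet();
            return backend.apply(k);
        });
    }

    // Number of times the backend was actually consulted.
    int backendReads() {
        return backendReads.get();
    }
}
```

Note that computeIfAbsent blocks other updates to the same map bin while the mapping function runs, so this particular shape only suits a local, fast backend. For a loader that goes over the network, such as the ClusteredCacheLoader, the caveat about holding locks across synchronous cluster calls applies again.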
