Re: [jbosscache-dev] ClusteredCacheLoader deadlocks and JBCCACHE-1103

Friday, 15 June 2007

Jason T. Greene wrote:
...
 That should be ok though because the CCL will still timeout on that
 lock. The original problem was that the same thread with the CCL lock
 was blocking on an FC lock so that the CCL lock would never be released
 (since the FC lock was higher in the stack). 
See my previous reply. Yes, it blocked on FC.down() because it didn't 
receive credits in up(). But up() wasn't called because there was a 
replication message ahead of it in the queue that blocked on the FQN 
held by the CCL.

So to tackle this, my suggestions were, in this order:
#1 Don't hold a lock while making a synchronous cluster method call. 
That's a big no no, especially in pre-2.5 releases. We had lots of bugs 
in the clustering code due to such code. Then Brian cleaned up all of 
it... :-)
#2 Use the timeout mechanism in JGroups, which uses threads. Ugly, and a 
hack, and only needed for 2.4. As I argued, this will avoid the 
deadlock, but it will constantly time out (assuming some traffic).

The root cause of this is #1.
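
To make #1 concrete, here is a minimal sketch of the pattern, not the 
actual ClusteredCacheLoader code. The class name, the nodeLock field and 
the remoteGet function are hypothetical stand-ins: remoteGet plays the 
role of the synchronous cluster call, and nodeLock plays the role of the 
lock on the FQN. The lock is only held around local reads and writes, 
and is released before the blocking call is made.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

// Hypothetical sketch: none of these names come from JBoss Cache or JGroups.
final class LockFreeRemoteLoad {
    // Stand-in for the lock guarding a node (FQN) in the local cache.
    private final ReentrantLock nodeLock = new ReentrantLock();
    private final Map<String, String> local = new HashMap<>();
    // Stand-in for a synchronous cluster call that may block on the network.
    private final Function<String, String> remoteGet;

    LockFreeRemoteLoad(Function<String, String> remoteGet) {
        this.remoteGet = remoteGet;
    }

    String get(String fqn) {
        // Step 1: look at local state under the lock, then release it.
        nodeLock.lock();
        try {
            String cached = local.get(fqn);
            if (cached != null) {
                return cached;
            }
        } finally {
            nodeLock.unlock();
        }

        // Step 2: make the blocking call with no lock held, so an incoming
        // replication message for the same FQN is not stuck behind us.
        String fetched = remoteGet.apply(fqn);

        // Step 3: re-acquire the lock only to store the result.
        nodeLock.lock();
        try {
            if (fetched != null) {
                // Another thread may have stored a value meanwhile; keep it.
                local.putIfAbsent(fqn, fetched);
            }
            return local.get(fqn);
        } finally {
            nodeLock.unlock();
        }
    }

    // Exposed so a caller can check that the remote call runs unlocked.
    boolean lockHeldByCurrentThread() {
        return nodeLock.isHeldByCurrentThread();
    }
}
```

The price is that two threads may both go remote for the same FQN, and 
the value may change between steps 1 and 3, which is why step 3 
re-checks under the lock instead of writing blindly.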

-- 
Bela Ban
Lead JGroups / JBoss Clustering team
JBoss - a division of Red Hat
