[jbosscache-dev] Cache unable to write to cluster

Brian Stansberry brian.stansberry at redhat.com
Wed Nov 12 10:01:18 EST 2008


Bela Ban wrote:
> 
> 
> Manik Surtani wrote:
>> Ok - so we could add an extra check into the view change listener to 
>> force an unblock if a member who initiated a FLUSH dies.  We would 
>> also have to record the address of the member initiating the FLUSH in 
>> the flushBlockGate.
> 
> +1.
> 
> I'm also puzzled as to why this could evade detection for so long... 
> Have we started to use channels in a different way now ? E.g. concurrent 
> startup ?
> 

3 possible factors:

1) The AS now creates/starts a cache when it needs it, i.e. as part of 
deploy of a clustered webapp or SFSB, rather than at AS start.  The effect 
is that during a testsuite run there is a lot more starting/stopping 
of caches and associated channels than there was back in the day. So 
intermittent failures will happen more.  This is a fairly old change 
though. But for sure it increases odds of these failures vs. say the 
first half of this year.

2) AS upgraded to 2.6.5 on October 15. I saw and reported an 
intermittent flush failure on October 18. Maybe a relationship; don't know.

3) In our second week in Brno, I changed the protocol stacks to add > 1 
min_threads to the pools. Clebert seems to feel the JBM failures he 
started reporting popped up when he changed the stack JBM is testing 
against to match the AS stack; making localhost=true and min_threads > 1 
were the most significant changes.
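
For reference, the check Manik proposes in the quoted text above could look 
roughly like the sketch below. This is only an illustration under stated 
assumptions, not the actual JBoss Cache code: the class name FlushBlockGate, 
its methods, and the use of plain Strings for member addresses (real code 
would use the JGroups Address type) are all hypothetical.

```java
import java.util.Collection;

// Hypothetical stand-in for the flushBlockGate; not the real JBoss Cache class.
class FlushBlockGate {
    private boolean blocked;
    private String initiator; // address of the member that started the FLUSH (assumed String)

    // Called when a FLUSH begins: close the gate and remember who started it.
    synchronized void block(String flushInitiator) {
        blocked = true;
        initiator = flushInitiator;
    }

    // Called on a normal unblock, or forced from onViewChange().
    synchronized void unblock() {
        blocked = false;
        initiator = null;
        notifyAll(); // wake any writers waiting in await()
    }

    synchronized boolean isBlocked() {
        return blocked;
    }

    // Called from the view change listener with the new membership.
    // Forces an unblock if the recorded FLUSH initiator has left the view.
    // Returns true if an unblock was forced.
    synchronized boolean onViewChange(Collection<String> members) {
        if (blocked && initiator != null && !members.contains(initiator)) {
            unblock();
            return true;
        }
        return false;
    }

    // Writers wait here until the gate opens; returns false on timeout.
    synchronized boolean await(long timeoutMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (blocked) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return false;
            }
            wait(remaining);
        }
        return true;
    }
}
```

With this shape, a view that still contains the initiator leaves the gate 
closed, while a view without it opens the gate so blocked writers can 
proceed instead of timing out.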

-- 
Brian Stansberry
Lead, AS Clustering
JBoss, a division of Red Hat
brian.stansberry at redhat.com



