Bela Ban wrote:
Manik Surtani wrote:
> Ok - so we could add an extra check into the view change listener to
> force an unblock if a member who initiated a FLUSH dies. We would
> also have to record the address of the member initiating the FLUSH in
> the flushBlockGate.
+1.
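For what it's worth, a minimal sketch of what that check could look like. This is hypothetical, not the actual FLUSH code: the flushBlockGate is modeled as a CountDownLatch and member addresses as plain strings rather than the real JGroups Address type.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch of the proposed fix: record the FLUSH initiator
// when blocking, and force an unblock from the view-change listener
// if that member is no longer in the new view.
class FlushGateSketch {
    private final CountDownLatch flushBlockGate = new CountDownLatch(1);
    private volatile String flushInitiator; // would be an Address in JGroups

    // Called when a FLUSH blocks the channel.
    void onBlock(String initiator) {
        flushInitiator = initiator; // remember who initiated the FLUSH
    }

    // Called from the view-change listener with the new membership.
    void onViewChange(List<String> newMembers) {
        // The initiator died (left the view) without completing the FLUSH:
        // open the gate so waiting threads are not stuck forever.
        if (flushInitiator != null && !newMembers.contains(flushInitiator)) {
            flushBlockGate.countDown();
        }
    }

    boolean isUnblocked() {
        return flushBlockGate.getCount() == 0;
    }
}
```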
I'm also puzzled as to why this could have evaded detection for so
long... Have we started to use channels in a different way now, e.g.
concurrent startup?
3 possible factors:
1) The AS now creates/starts a cache when it needs it, i.e. as part of
deploying a clustered webapp or SFSB, rather than at AS start. The
effect is that a testsuite run involves far more starting/stopping of
caches and their associated channels than it used to, so intermittent
failures will show up more often. This is a fairly old change, though;
but it certainly increases the odds of these failures versus, say, the
first half of this year.
2) The AS upgraded to 2.6.5 on October 15. I saw and reported an
intermittent flush failure on October 18. There may be a relationship;
I don't know.
3) In our second week in Brno, I changed the protocol stacks to set
min_threads > 1 on the thread pools. Clebert seems to feel the JBM
failures he started reporting popped up when he changed the stack JBM
tests against to match the AS stack; setting localhost=true and
min_threads > 1 were the most significant changes.
--
Brian Stansberry
Lead, AS Clustering
JBoss, a division of Red Hat
brian.stansberry(a)redhat.com