New subject: JGroups and concurrent FLUSHes

Friday, 27 February 2009

Sounds like an overly large change for a micro release on the highly 
stable 2.6 branch. JGroups' 2.7 branch seems more appropriate.  NBST in 
JBC is not stable tech, so I don't like the idea of destabilizing a 
highly stable branch to cater to it.

From a technical POV, what's discussed sounds fine. ;)

Bela Ban wrote:
...
 Copied Brian, who will probably not like this. I guess... :-)

 Actually, we should move this discussion over to jbosscache-dev... 
 (copied). Please reply to jbosscache-dev from now on

 Vladimir Blagojevic wrote:
> As Bela and I discussed, this option simplifies FLUSH quite a bit but puts 
> the burden and the *freedom* of retry management on application code. 
> My only concern is compatibility if we are going to stick this into 
> 2.6.9. This changes flush semantics quite a bit and perhaps we can 
> talk about this as well.
>
> On 2/27/09 9:52 AM, Bela Ban wrote:
>> Manik and I discussed this over the phone. Some items we came up with:
>>
>>    * Concurrent partial flushes won't happen because if a state
>>      requester or provider is in the process of transferring state,
>>      it'll reject the new state transfer
>>    * Concurrent total and partial flushes *are* possible: a view change
>>      and a partial state, executed concurrently
>>          o We cannot disable flushing for view changes, because the
>>            user probably wanted this when placing FLUSH into the stack.
>>            OTOH, because JBC NBST requires FLUSH (because of the
>>            partial flushing), we need to be able to disable flushing
>>            for view changes. Hence the previous email.
>>          o If a view change flush is in progress, currently a partial
>>            flush would fail. Manik will change code to make the partial
>>            flushing back off and retry, up to a number of times, if
>>            this happens
>>          o If a view change happens, but a partial flush is already in
>>            progress, the view change will fail! We'll change the code such
>>            that the coordinator backs off and retries a number of
>>            times, before giving up.
>>          o Because total flushes and partial flushes are usually very
>>            short, the backing off and retrying mechanism should work
>>            most of the time
>>          o Would establishing total order between total and partial
>>            flushes help? Vladimir and Bela to investigate
>>          o Vladimir: make sure that if A flushes A,B and C
>>            successfully, but fails for D and E, A only aborts the flush
>>            on A,B and C, but *not* on D or E! A member cannot abort
>>            flushes started by someone else!
>>
>>
>
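
For reference, the "back off and retry, up to a number of times" idea from the list above could look roughly like the sketch below. This is only an illustration: FlushRetry, retryWithBackoff and the randomized, linearly growing delay are assumptions made for the example, not existing JGroups or JBC code, and the attempt callback stands in for whatever call actually starts the flush.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.BooleanSupplier;

/** Hypothetical helper for illustration; not part of JGroups or JBoss Cache. */
final class FlushRetry {
    private FlushRetry() {}

    /**
     * Runs the attempt up to maxAttempts times. After each failed attempt
     * (except the last) it sleeps for a random time between 0 and
     * baseDelayMs * attemptNumber, so two members that collide are unlikely
     * to retry in lockstep. Returns true as soon as one attempt succeeds,
     * false if all attempts fail or the thread is interrupted.
     */
    static boolean retryWithBackoff(BooleanSupplier attempt, int maxAttempts, long baseDelayMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int i = 1; i <= maxAttempts; i++) {
            if (attempt.getAsBoolean()) {
                return true; // flush started, nothing more to do
            }
            if (i < maxAttempts) {
                long ceiling = Math.max(0L, baseDelayMs * i);
                long sleepMs = ThreadLocalRandom.current().nextLong(ceiling + 1);
                try {
                    Thread.sleep(sleepMs);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    return false;
                }
            }
        }
        return false; // caller decides what giving up means
    }
}
```

The same helper would serve both sides: the partial flusher backing off while a view change flush is running, and the coordinator backing off while a partial flush is running. The random component is there so that both sides do not keep colliding on every retry.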

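The last item in the list (A aborts only on the members it flushed itself) can be pictured with the small sketch below. Again this is a guess at the shape of the logic, not the actual FLUSH code: PartialFlushAbort, membersToAbort and the startFlush predicate are made-up names, and members are plain strings only to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical illustration; not part of JGroups. */
final class PartialFlushAbort {
    private PartialFlushAbort() {}

    /**
     * Tries to start the flush on every target. If at least one target
     * refuses, returns only the members on which this initiator's flush
     * succeeded; those are the only ones it may send an abort to.
     * Returns an empty list if the flush succeeded everywhere.
     */
    static List<String> membersToAbort(List<String> targets, Predicate<String> startFlush) {
        List<String> flushedByUs = new ArrayList<>();
        boolean anyFailed = false;
        for (String member : targets) {
            if (startFlush.test(member)) {
                flushedByUs.add(member);
            } else {
                anyFailed = true; // possibly busy with someone else's flush
            }
        }
        return anyFailed ? flushedByUs : new ArrayList<>();
    }
}
```

With targets A to E and a predicate that fails for D and E, the result is A, B and C, so D and E are never sent an abort for a flush that someone else may have started.
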
-- 
Brian Stansberry
Lead, AS Clustering
JBoss, a division of Red Hat
brian.stansberry(a)redhat.com
