[jbosscache-dev] Issues with FLUSH and JBC

Ben Wang ben.wang at jboss.com
Wed Sep 27 08:47:55 EDT 2006


Is solution D out of the picture already? I mean, if we really can't find a good enough solution, why not just accept it? State transfer, IMO, should not happen that often anyway if it is just a new node joining. I think the important thing is to keep the state valid/consistent.

If it is because of network instability, then we will see lots of tx problems anyway. Are we throwing more fuel into the fire? 

-Ben

-----Original Message-----
From: jbosscache-dev-bounces at lists.jboss.org [mailto:jbosscache-dev-bounces at lists.jboss.org] On Behalf Of Bela Ban
Sent: Wednesday, September 27, 2006 4:37 PM
To: Brian Stansberry
Cc: jbosscache-dev at lists.jboss.org
Subject: Re: [jbosscache-dev] Issues with FLUSH and JBC

What's the consensus as to how we should proceed ?

   1. Solution A with not blocking unicasts during a flush or
   2. Solution B where we block later (on FLUSH_COMPLETED) rather than
      on START_FLUSH

?

Vladimir is for #2.

How about adding the unblock() callback in a separate listener interface ? I'd rather add this sooner than later. We would, however, also have to make sure that we actually do call this method.
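
To make the "separate listener interface" idea concrete, here is a minimal sketch. All names in it (UnblockSketch, BlockListener, UnblockListener, Notifier, RecordingListener) are hypothetical stand-ins and not JGroups API; it only illustrates how an optional unblock() could be added without touching the interface that already carries block(), and where the caller has to remember to invoke it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; none of these types are JGroups API.
final class UnblockSketch {

    /** Stand-in for the existing listener that already receives block(). */
    interface BlockListener {
        void block();
    }

    /** Separate, optional interface that carries only the new callback. */
    interface UnblockListener {
        void unblock();
    }

    /** Stand-in for the channel side, which must make sure unblock() is called. */
    static final class Notifier {
        private final List<BlockListener> listeners = new ArrayList<BlockListener>();

        void register(BlockListener listener) {
            listeners.add(listener);
        }

        /** Flush is about to start: every listener is told to stop sending. */
        void flushStarting() {
            for (BlockListener listener : listeners) {
                listener.block();
            }
        }

        /** Flush is over: only listeners that opted in receive unblock(). */
        void flushCompleted() {
            for (BlockListener listener : listeners) {
                if (listener instanceof UnblockListener) {
                    ((UnblockListener) listener).unblock();
                }
            }
        }
    }

    /** Example listener that opts into both callbacks and records what it saw. */
    static final class RecordingListener implements BlockListener, UnblockListener {
        final List<String> events = new ArrayList<String>();

        public void block() {
            events.add("block");
        }

        public void unblock() {
            events.add("unblock");
        }
    }
}
```

Existing listeners that implement only block() keep compiling and are simply skipped in flushCompleted(), which is the point of keeping the new callback in its own interface.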

Vladimir: let's have a call on this today, so we can see how to proceed...


Brian Stansberry wrote:
> Bela Ban wrote:
>> Okay, my comments will be available in book form at Prentice hall 
>> this fall... :-)
>>
>
> LOL. I'll try to reform. At least my overly long messages are on 
> e-mail and so don't kill trees. :)
>
>> Just kidding, here are some comments:
>>
>> * I don't want to change the entire implementation of FLUSH this 
>> late, 2.4 is overdue for a final release. So option B doesn't look 
>> that appealing to me. OTOH: if we can resolve the issue, why not...
>> * A: what if we block only **multicast** messages, but not
>> **unicast** messages ? This would solve issue A, but maybe there are 
>> use cases that it won't solve... We can assume that unicast messages 
>> are always responses to multicasts, so they should be allowed to 
>> complete. If this solution flies, then we have a quickfix for our 
>> problem and can *really* cleanly fix it in the next release...
>
> We'd need to be sure JBC didn't make any unicast calls (besides RPC
> responses) during the state transfer. Possible unicast calls I can 
> think of are:
>
> 1) Request for partial state transfer (with the current RPC-based 
> mechanism). E.g. 3 node cluster, node B redeploys a webapp and asks 
> for partial state transfer while node C is doing an initial state transfer.
> This would be an odd case though; typically you disable initial state 
> transfer if you're going to use the activate/inactivateRegion API.
>
> 2) Calls related to buddy group assignments. Need to think about this 
> a bit. But if they are using BR they won't be using initial state 
> transfer, so probably not an issue.
>
>> * B: okay, but if my proposed solution above works, we can do this in 
>> 2.5...
>> * C: this is essentially implementing the flush protocol at the 
>> application level, which is not a bad idea because the app always has 
>> more information than JGroups. However, it is probably a bit too 
>> redundant, and also requires quite a number of changes, which is also 
>> late for JBC 1.4 (SP?)...
>
> Yeah, it is a lot for 1.4. IMHO definitely moves it beyond the realm 
> of an SP2, into 1.4.1.
>
>> * I might have to add an additional callback blockCompleted() or
>> unblock() to JGroups, to notify members that the FLUSH phase has 
>> completed and everybody can resume sending messages. I'm currently
>> investigating this... Downside: an API change, so possibly a new 
>> ExtendedXXX interface which would get merged in JGroups 3.0
>>
>
> This would be needed with B if our current algorithm for JBC is going 
> to work.
>
>>> A downside of this idea is it changes the semantics of flush and 
>>> requires JGroups changes. We'd definitely like input from Bela on 
>>> this. Also, since we initially rejected it, we haven't fully 
>>> thought it through. (As I'm editing this to send out I see there is 
>>> no way to tell JBC after it returns from block() to not let any 
>>> "new" activity through -- big hole. I'm back to rejecting this
>>> approach.)
>> Here, we might have to introduce additional callbacks, e.g.
>> - block(): stop sending messages. FLUSH doesn't block yet though, so 
>> if an app ignores the convention and keeps sending messages it will 
>> succeed
>> - No callback when FLUSH actually does block sending of messages
>> - unblock(): called when the app can resume sending messages.
>> FLUSH does not block sending of messages anymore
>>
>
> Yep. Our current algorithm does the following during the block() call:
>
> 1) Create a latch or something that prevents new transactions 
> acquiring locks or existing transactions proceeding into the 2PC (i.e. 
> prevent the prepare() call.)
> 2) Give transactions already in the 2PC time to complete. If they 
> don't, eventually roll them back.
> 3) Release the latch.
> 4) Immediately return from block(). (Vladimir -- problem here; there's 
> a race condition between threads released in #3 and the return from 
> block(). We need to figure out how to deal with that.)
>
> We count on FLUSH preventing the threads released in #3 sending any
> prepare() calls until the state transfer is done. Solution B breaks 
> this for the period until FLUSH_COMPLETED is sent.
>
> An unblock() callback would help here, as we'd release the latch then.
>
>> I don't think the semantic changes are that big, actually you could 
>> argue there are *no* semantic changes as block() is an indication 
>> that message sending will block, here we're just saying it will block 
>> some time in the (near) future.
>
> +1.
>
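
For reference, the block() steps listed above, with the latch released from unblock() rather than before block() returns, could look roughly like the following. This is a minimal sketch under stated assumptions: FlushGate and its method names are hypothetical and not JBoss Cache or JGroups code, timeouts are passed in milliseconds, and rolling back transactions that fail to drain is left to the caller.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch; not JBoss Cache code.
final class FlushGate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition changed = lock.newCondition();
    private boolean closed;   // true between block() and unblock()
    private int inTwoPhase;   // transactions currently inside the 2PC

    /** Called before prepare(); returns false if the gate stays closed too long. */
    boolean enterTwoPhase(long timeoutMs) {
        long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        lock.lock();
        try {
            while (closed) {
                if (nanos <= 0) return false;
                nanos = changed.awaitNanos(nanos);
            }
            inTwoPhase++;
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            lock.unlock();
        }
    }

    /** Called once a transaction has committed or rolled back. */
    void exitTwoPhase() {
        lock.lock();
        try {
            inTwoPhase--;
            changed.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /**
     * Steps 1 and 2: close the gate, then give in-flight 2PC transactions
     * time to finish. Returns false if some are still running, in which
     * case the caller would roll them back. The gate stays closed.
     */
    boolean block(long drainTimeoutMs) {
        long nanos = TimeUnit.MILLISECONDS.toNanos(drainTimeoutMs);
        lock.lock();
        try {
            closed = true;
            while (inTwoPhase > 0) {
                if (nanos <= 0) return false;
                nanos = changed.awaitNanos(nanos);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            lock.unlock();
        }
    }

    /** Step 3 moved here: reopen the gate only once the flush is over. */
    void unblock() {
        lock.lock();
        try {
            closed = false;
            changed.signalAll();
        } finally {
            lock.unlock();
        }
    }
}
```

Because the gate is reopened in unblock() instead of at the end of block(), no waiting thread is released between the return from block() and the point where FLUSH actually stops message sending, which is the window the race in step 4 depends on.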

--
Bela Ban
Lead JGroups / Manager JBoss Clustering Group
JBoss - a division of Red Hat
_______________________________________________
jbosscache-dev mailing list
jbosscache-dev at lists.jboss.org
https://lists.jboss.org/mailman/listinfo/jbosscache-dev



