Bela Ban wrote:
Okay, my comments will be available in book form at Prentice
Hall this fall... :-)
LOL. I'll try to reform. At least my overly long messages are on e-mail
and so don't kill trees. :)
Just kidding, here are some comments:
* I don't want to change the entire implementation of FLUSH this
late; 2.4 is overdue for a final release. So option B doesn't look
that appealing to me
o OTOH: if we can resolve the issue, why not...
* A: what if we block only **multicast** messages, but not
**unicast** messages? This would solve issue A, but maybe there
are use cases that it won't solve... We can assume that unicast
messages are always responses to multicasts, so they should be
allowed to complete. If this solution flies, then we have a
quickfix for our problem and can *really* cleanly fix it in the
next release...
We'd need to be sure JBC didn't make any unicast calls (besides RPC
responses) during the state transfer. Possible unicast calls I can
think of are:
1) Request for partial state transfer (with the current RPC-based
mechanism). E.g. 3 node cluster, node B redeploys a webapp and asks for
partial state transfer while node C is doing an initial state transfer.
This would be an odd case though; typically you disable initial state
transfer if you're going to use the activate/inactivateRegion API.
2) Calls related to buddy group assignments. Need to think about this a
bit. But if they are using BR they won't be using initial state
transfer, so probably not an issue.
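
To make option A concrete, here is a minimal sketch of a gate that holds back multicasts during a flush but lets unicasts through. It is not the FLUSH implementation: MulticastOnlyGate, startFlush(), stopFlush() and awaitSendPermit() are hypothetical names, and treating a null destination as "multicast" is an assumption made for the sketch.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical gate: while a flush is in progress, multicasts wait but unicasts pass. */
final class MulticastOnlyGate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition flushDone = lock.newCondition();
    private boolean flushing;

    /** Start holding back multicasts. */
    void startFlush() {
        lock.lock();
        try {
            flushing = true;
        } finally {
            lock.unlock();
        }
    }

    /** Stop holding back multicasts and wake up any waiting senders. */
    void stopFlush() {
        lock.lock();
        try {
            flushing = false;
            flushDone.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /**
     * Assumption for this sketch: dest == null means "send to the whole group".
     * Returns true if the send may proceed, false if the wait timed out or was interrupted.
     */
    boolean awaitSendPermit(Object dest, long timeoutMillis) {
        if (dest != null) {
            return true; // unicast: never held back
        }
        long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        lock.lock();
        try {
            while (flushing) {
                if (nanos <= 0) {
                    return false; // still flushing when the timeout expired
                }
                nanos = flushDone.awaitNanos(nanos);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            lock.unlock();
        }
    }
}
```

With a gate like this, the two unicast cases listed above (a partial state transfer request, buddy group assignment calls) would go straight through during a flush, which is exactly why we'd need to be sure they can't happen during the state transfer.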
* B: okay, but if my proposed solution above works, we can do
this in 2.5...
* C: this is essentially implementing the flush protocol at the
application level, which is not a bad idea because the app always
has more information than JGroups. However, it is probably a bit
too redundant, and also requires quite a number of changes, which
is also too late for JBC 1.4 (SP?)...
Yeah, it is a lot for 1.4. IMHO definitely moves it beyond the realm of
an SP2, into 1.4.1.
* I might have to add an additional callback blockCompleted() or
unblock() to JGroups, to notify members that the FLUSH phase has
completed and everybody can resume sending messages. I'm currently
investigating this... Downside: an API change, so possibly a new
ExtendedXXX interface which would get merged in JGroups 3.0
This would be needed with B if our current algorithm for JBC is going to
work.
> A downside of this idea is it changes the semantics of flush and
> requires JGroups changes. We'd definitely like input from Bela on
> this. Also, since we initially rejected it, we haven't fully
> thought it through. (As I'm editing this to send out I see there is
> no way to tell JBC after it returns from block() to not let any
> "new" activity through -- big hole. I'm back to rejecting this
> approach.)
Here, we might have to introduce additional callbacks, e.g.
- block(): stop sending messages. FLUSH doesn't block yet
though, so if an app ignores the convention and keeps sending
messages it will succeed
- No callback when FLUSH actually does block sending of messages
- unblock(): called when the app can resume sending messages.
FLUSH does not block sending of messages anymore
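
The callback ordering described above could look roughly like the following sketch. None of this is existing JGroups API: FlushAwareListener, RecordingListener and FlushRound are hypothetical names, and the only thing the sketch takes from the discussion is the order block(), then the phase where FLUSH really blocks (no callback), then unblock().

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical listener with the proposed pair of callbacks. */
interface FlushAwareListener {
    /** Convention only: the app should stop sending; FLUSH is not enforcing it yet. */
    void block();

    /** Proposed addition: FLUSH no longer blocks, the app may resume sending. */
    void unblock();
}

/** Toy application-side listener that records what it was told. */
final class RecordingListener implements FlushAwareListener {
    final List<String> events = new ArrayList<>();
    private volatile boolean sendingAllowed = true;

    @Override
    public void block() {
        sendingAllowed = false;
        events.add("block");
    }

    @Override
    public void unblock() {
        sendingAllowed = true;
        events.add("unblock");
    }

    boolean maySend() {
        return sendingAllowed;
    }
}

/** Hypothetical driver showing the order in which the callbacks would fire. */
final class FlushRound {
    static void run(FlushAwareListener listener, Runnable stateTransfer) {
        listener.block();
        // At this point FLUSH would start blocking sends for real.
        // There is deliberately no callback for that transition.
        try {
            stateTransfer.run();
        } finally {
            listener.unblock(); // always tell the app it can send again
        }
    }
}
```

The point of the sketch is that an application which honours block() never sends between block() and unblock(), so it doesn't matter exactly when FLUSH starts enforcing.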
Yep. Our current algorithm does the following during the block() call:
1) Create a latch or something that prevents new transactions acquiring
locks or existing transactions proceeding into the 2PC (i.e. prevents
the prepare() call).
2) Give transactions already in the 2PC time to complete. If they
don't, eventually roll them back.
3) Release the latch.
4) Immediately return from block(). (Vladimir -- problem here; there's a
race condition between threads released in #3 and the return from
block(). We need to figure out how to deal with that.)
We count on FLUSH preventing the threads released in #3 sending any
prepare() calls until the state transfer is done. Solution B breaks
this for the period until FLUSH_COMPLETED is sent.
An unblock() callback would help here, as we'd release the latch then.
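
A rough sketch of what that could look like on the JBC side, assuming an unblock() callback existed. This is not the actual JBC code: TransactionGate and its method names are hypothetical, and the rollback of transactions that fail to drain is left to the caller. The only difference from steps 1-4 above is that the latch is released in unblock() rather than just before returning from block().

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical latch that keeps transactions out of the 2PC while a flush is pending. */
final class TransactionGate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition changed = lock.newCondition();
    private boolean closed;
    private int inTwoPhaseCommit;

    /** Called before prepare(). Returns false if the gate stayed closed past the timeout. */
    boolean enterTwoPhaseCommit(long timeoutMillis) {
        long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        lock.lock();
        try {
            while (closed) {
                if (nanos <= 0) {
                    return false;
                }
                nanos = changed.awaitNanos(nanos);
            }
            inTwoPhaseCommit++;
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            lock.unlock();
        }
    }

    /** Called when a transaction leaves the 2PC (commit or rollback). */
    void exitTwoPhaseCommit() {
        lock.lock();
        try {
            inTwoPhaseCommit--;
            changed.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /**
     * Steps 1 and 2: close the gate, then give in-flight 2PCs time to finish.
     * Returns how many are still in flight; the caller would roll those back.
     * The gate stays closed on return (no step 3 here).
     */
    int block(long drainTimeoutMillis) {
        long nanos = TimeUnit.MILLISECONDS.toNanos(drainTimeoutMillis);
        lock.lock();
        try {
            closed = true;
            while (inTwoPhaseCommit > 0 && nanos > 0) {
                nanos = changed.awaitNanos(nanos);
            }
            return inTwoPhaseCommit;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return inTwoPhaseCommit;
        } finally {
            lock.unlock();
        }
    }

    /** Release the latch once FLUSH has completed. */
    void unblock() {
        lock.lock();
        try {
            closed = false;
            changed.signalAll();
        } finally {
            lock.unlock();
        }
    }
}
```

Because the gate is still closed when block() returns, the race between the threads released in #3 and the return from block() goes away in this variant; nothing is released until unblock().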
I don't think the semantic changes are that big. Actually, you
could argue there are *no* semantic changes, as block() is an
indication that message sending will block; here we're just
saying it will block some time in the (near) future.
+1.