Okay, my comments will be available in book form at Prentice Hall this
Just kidding, here are some comments:
* I don't want to change the entire implementation of FLUSH this
late; 2.4 is overdue for a final release. So option B doesn't look
that appealing to me
o OTOH: if we can resolve the issue, why not...
* A: what if we block only **multicast** messages, but not
**unicast** messages? This would solve issue A, but maybe there
are use cases that it won't solve... We can assume that unicast
messages are always responses to multicasts, so they should be
allowed to complete. If this solution flies, then we have a
quickfix for our problem and can *really* cleanly fix it in the
* B: okay, but if my proposed solution above works, we can do this
* C: this is essentially implementing the flush protocol at the
application level, which is not a bad idea because the app always
has more information than JGroups. However, it is probably a bit
too redundant, and also requires quite a number of changes, which
is also too late for JBC 1.4 (SP?)...
* I might have to add an additional callback blockCompleted() or
unblock() to JGroups, to notify members that the FLUSH phase has
completed and everybody can resume sending messages. I'm currently
investigating this... Downside: an API change, so possibly a new
ExtendedXXX interface which would get merged in JGroups 3.0
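To make the idea in item A concrete, here is a minimal sketch of a gate that holds back multicasts during a flush but lets unicasts through. The class FlushGate and its methods are hypothetical names made up for illustration, not part of JGroups, and the sketch assumes that a null destination means "multicast to the whole group":

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical helper, not a JGroups class: decides whether a down message may pass. */
class FlushGate {
    private final AtomicBoolean flushing = new AtomicBoolean(false);

    /** Called when the flush phase starts (e.g. after block() has returned). */
    void startFlush() { flushing.set(true); }

    /** Called when the flush phase is over. */
    void stopFlush() { flushing.set(false); }

    /**
     * @param dest destination address; null is assumed to mean multicast
     * @return true if the message may be passed down right now
     */
    boolean mayPassDown(Object dest) {
        boolean multicast = (dest == null);
        // Only multicasts are held back while flushing; unicasts
        // (assumed to be responses to earlier multicasts) always pass.
        return !(flushing.get() && multicast);
    }
}
```

With this, the PREPARE response in step 6 below would be a unicast and would still go out while the flush is in progress. Whether every unicast really is a response to a multicast is exactly the assumption that needs checking.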
More comments inline
Brian Stansberry wrote:
Problem is as follows. 2 node REPL_SYNC cluster, A B where A is just
starting up and thus initiates a FLUSH:
1) JBC on B has tx in progress, just starting the 2PC. Sends out the
2) A sends out a START_FLUSH message.
3) A gets START_FLUSH, calls block() on JBC.
4) JBC on A is new, doesn't have much going on, very quickly returns
from block(). A will no longer pass *down* any messages below FLUSH.
Except unicasts... ?
5) A gets the prepare() (no problem, FLUSH doesn't block up messages,
just down messages.)
6) A executes the prepare(), but can't send the response to B because
FLUSH is blocking the channel.
With my solution, it *would* be able to send the PREPARE or
7) B gets the START_FLUSH, calls block() on JBC.
8) JBC B doesn't immediately return from block() as it is giving the
prepare() some time to complete (avoid unnecessary tx rollback). But
prepare() won't complete because A's channel is blocking the RPC
response!! Eventually JBC B's block() impl will have to roll back the
It wouldn't with my proposed solution
B) A solution we discussed, rejected and then came back to this
(please read FLUSH.txt to understand the change we're discussing):
Channel does not block down messages when block() returns. Rather it
just sends out a FLUSH_OK message (see FLUSH.txt). It shouldn't
initiate any new cluster activity (e.g. a prepare()) after sending
FLUSH_OK, but it can respond to RPC calls. When it gets a FLUSH_OK from
all the other members, it then blocks down messages and multicasts a
FLUSH_COMPLETED to the cluster.
Differences from the current FLUSH impl:
1) Node doesn't begin blocking down messages before sending FLUSH_OK.
2) Node begins blocking down messages before sending FLUSH_COMPLETED.
3) Node multicasts FLUSH_COMPLETED, rather than unicasting to the node
that initiated the FLUSH.
4) Nodes regard the FLUSH_COMPLETED as the last message from another
node, rather than the FLUSH_OK.
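The ordering described by these differences can be sketched as a small per-node state tracker. This is only an illustration of the proposal under stated assumptions, not the FLUSH implementation: the class ProposedFlushMember and all of its methods are hypothetical, members are identified by plain strings, and actual message sending is reduced to setting flags.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of one node under the proposed (option B) flush ordering. */
class ProposedFlushMember {
    private final Set<String> otherMembers;
    private final Set<String> flushOkFrom = new HashSet<>();
    private boolean flushOkSent;
    private boolean downBlocked;
    private boolean flushCompletedSent;

    ProposedFlushMember(Collection<String> otherMembers) {
        this.otherMembers = new HashSet<>(otherMembers);
    }

    /** The application returned from block(): send FLUSH_OK, but do NOT block down messages yet. */
    void onBlockReturned() {
        flushOkSent = true;
        maybeComplete();
    }

    /** A FLUSH_OK arrived from another member. */
    void onFlushOk(String sender) {
        if (otherMembers.contains(sender)) {
            flushOkFrom.add(sender);
        }
        maybeComplete();
    }

    private void maybeComplete() {
        if (flushOkSent && !flushCompletedSent && flushOkFrom.containsAll(otherMembers)) {
            downBlocked = true;         // difference 2: block down messages first...
            flushCompletedSent = true;  // ...then multicast FLUSH_COMPLETED (difference 3)
        }
    }

    boolean canSendDown() { return !downBlocked; }
    boolean isFlushCompletedSent() { return flushCompletedSent; }
}
```

Between onBlockReturned() and the last FLUSH_OK, canSendDown() stays true, which is the window in which a node could still answer an RPC such as a prepare().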
A downside of this idea is it changes the semantics of flush and
requires JGroups changes. We'd definitely like input from Bela on this.
Also, since we initially rejected it, we haven't fully thought it
through. (As I'm editing this to send out I see there is no way to tell
JBC after it returns from block() to not let any "new" activity through
-- big hole. I'm back to rejecting this approach.)
Here, we might have to introduce additional callbacks, e.g.
- block(): stop sending messages. FLUSH doesn't block yet though, so if
an app ignores the convention and keeps sending messages it will succeed
- No callback when FLUSH actually does block sending of messages
- unblock(): called when the app can resume sending messages. FLUSH does
not block sending of messages anymore
I don't think the semantic changes are that big; actually you could
argue there are *no* semantic changes, as block() is an indication that
message sending will block. Here we're just saying it will block some
time in the (near) future.
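As a rough sketch of how an application could follow the block()/unblock() convention described above: the interface FlushCallbacks and the class QueueingSender are hypothetical names for illustration only, not an existing JGroups API, and "sending" is modelled as appending to a list.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical callback pair; the real interface name and shape are still open. */
interface FlushCallbacks {
    void block();    // stop sending messages (FLUSH may not be blocking yet)
    void unblock();  // FLUSH is over, sending may resume
}

/** Application-side sender that honours the convention by queueing while blocked. */
class QueueingSender implements FlushCallbacks {
    private final List<String> sent = new ArrayList<>();
    private final List<String> pending = new ArrayList<>();
    private boolean blocked;

    @Override
    public synchronized void block() {
        blocked = true;
    }

    @Override
    public synchronized void unblock() {
        blocked = false;
        // Flush everything that was held back, in the original order.
        sent.addAll(pending);
        pending.clear();
    }

    synchronized void send(String msg) {
        if (blocked) {
            pending.add(msg);
        } else {
            sent.add(msg);
        }
    }

    /** Returns a copy of the messages that were actually sent so far. */
    synchronized List<String> sent() {
        return new ArrayList<>(sent);
    }
}
```

An app that ignores block() and keeps calling its own send path would still succeed until FLUSH really starts blocking, which is the gap the convention relies on.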
Lead JGroups / Manager JBoss Clustering Group
JBoss - a division of Red Hat