[
https://issues.jboss.org/browse/JGRP-1495?page=com.atlassian.jira.plugin....
]
David Hotham updated JGRP-1495:
-------------------------------
Description:
I first raised this, or something very like it, in JGRP-1468; but it got lost among the
other fixes that were made in that issue.
I've just seen a case where after a merge-view the following happens at a node W:
{noformat}
         SEQUENCER             GMS            APPLICATION
             |                  |                  |
B (view)     |                  |                  |
------------>|                  |                  |
             |----------------->|                  |
             |                  |----\             |
C (view)     |                  |     \            |
------------>|                  |      \           |
             |----------------->|       \          |
             |                  |        \         |
C (msg)      |                  |        |         |
------------>|                  |        |         |
             |---------------------------|-------->|
             |                  |        |         |
             |                  |        \         |
             |                  |         \------->|
{noformat}
- Initially the view is {B,W}
- There's a merge view, in which B and C were the old coordinators
- B and C both broadcast the new view {B,C,T,W}. In fact both know the physical address
of the member shown above (or we're using UDP multicast, if you like); so the view is
sent to this node twice.
- B's view arrives on thread Incoming-1, and gets as far as GMS
- C's view arrives on thread Incoming-2, and gets as far as GMS (where it is dropped,
since it is the same view)
- A message from C, prompted by the change in view, arrives on thread Incoming-2
- This overtakes the view on thread Incoming-1, and is delivered to the application
- Only then does thread Incoming-1 get to deliver the view to the application
I.e. from the application's point of view, C broadcast the view and then the message,
whereas at node W the application received the message and then the view.
I think that the fix will simply be to put a lock in SEQUENCER around up_prot.up(evt) in
SEQUENCER.deliver(). That way messages will be delivered to the application in the same
order as they arrive at SEQUENCER.
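To make the proposal concrete, here is a minimal sketch of the idea. It is not JGroups code: LockedDeliverer, DeliveryDemo and the Consumer standing in for up_prot.up(evt) are hypothetical names, and the sleep merely imitates the view being slow on its way up through GMS.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Consumer;

/** Hypothetical stand-in for SEQUENCER's delivery path; not the real class. */
class LockedDeliverer<E> {
    private final ReentrantLock deliveryLock = new ReentrantLock();
    private final Consumer<E> upProt; // stands in for up_prot.up(evt)

    LockedDeliverer(Consumer<E> upProt) {
        this.upProt = upProt;
    }

    /** Passes the event up while holding the lock, so deliveries cannot interleave. */
    void deliver(E evt) {
        deliveryLock.lock();
        try {
            upProt.accept(evt);
        } finally {
            deliveryLock.unlock();
        }
    }
}

/** Replays the scenario: a slow view on Incoming-1, then a message on Incoming-2. */
class DeliveryDemo {
    static List<String> run() {
        List<String> delivered = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch viewInProgress = new CountDownLatch(1);

        LockedDeliverer<String> sequencer = new LockedDeliverer<>(evt -> {
            if (evt.equals("view")) {
                viewInProgress.countDown(); // the view is now above SEQUENCER
                sleepQuietly(100);          // imitate a slow trip through GMS
            }
            delivered.add(evt);             // "application" receives the event
        });

        Thread incoming1 = new Thread(() -> sequencer.deliver("view"), "Incoming-1");
        Thread incoming2 = new Thread(() -> sequencer.deliver("msg"), "Incoming-2");
        try {
            incoming1.start();
            viewInProgress.await();         // the message arrives after the view
            incoming2.start();
            incoming1.join();
            incoming2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return delivered;
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

In this sketch Incoming-2 blocks in deliver() until Incoming-1 has finished handing the view to the application, so DeliveryDemo.run() yields the view before the message. Without the lock, the message would be added first while the view is still sleeping. The cost is that all deliveries above SEQUENCER become serialized.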
Edit: updated description for clarity
was:
I first raised this, or something very like it, in JGRP-1468; but it got lost among the
other fixes that were made in that issue.
I've just seen a case where after a merge-view the following happens:
{noformat}
         SEQUENCER             GMS            APPLICATION
             |                  |                  |
B (view)     |                  |                  |
------------>|                  |                  |
             |----------------->|                  |
             |                  |----\             |
C (view)     |                  |     \            |
------------>|                  |      \           |
             |----------------->|       \          |
             |                  |        \         |
C (msg)      |                  |        |         |
------------>|                  |        |         |
             |---------------------------|-------->|
             |                  |        |         |
             |                  |        \         |
             |                  |         \------->|
{noformat}
- There's a merge view, in which B and C were the old coordinators
- B and C both broadcast the new view. In fact both know the physical address of the
member shown above (or we're using UDP multicast, if you like); so the view is sent to
this node twice.
- B's view arrives on thread Incoming-1, and gets as far as GMS
- C's view arrives on thread Incoming-2, and gets as far as GMS (where it is dropped,
since it is the same view)
- A message from C, prompted by the change in view, arrives on thread Incoming-2
- This overtakes the view on thread Incoming-1, and is delivered to the application
- Only then does thread Incoming-1 get to deliver the view to the application
I.e. from the application's point of view, C broadcast the view and then the message,
whereas at the node shown above the application received the message and then the view.
I think that the fix will simply be to put a lock in SEQUENCER around up_prot.up(evt) in
SEQUENCER.deliver(). That way messages will be delivered to the application in the same
order as they arrive at SEQUENCER.
SEQUENCER needs a lock for delivery of messages
-----------------------------------------------
Key: JGRP-1495
URL: https://issues.jboss.org/browse/JGRP-1495
Project: JGroups
Issue Type: Feature Request
Affects Versions: 3.1
Reporter: David Hotham
Assignee: Bela Ban