[jboss-jira] [JBoss JIRA] (JGRP-1484) SEQUENCER and merge-views broken

Bela Ban (JIRA) jira-events at lists.jboss.org
Mon Sep 3 12:08:33 EDT 2012


    [ https://issues.jboss.org/browse/JGRP-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12715457#comment-12715457 ] 

Bela Ban commented on JGRP-1484:
--------------------------------

Hmm. I assume the reason you have SEQUENCER *below* GMS is so that the order of views and messages is the same everywhere.

I've thought about how to keep this property in the case of a merge, but so far haven't come up with a good solution.

In your suggested solution, when a merge participant gets a (unicast) INSTALL_MERGE_VIEW, it installs the MergeView directly. However, this destroys the property above, as the order of the view installation with respect to messages is not the same across all members.

I experimented with marking the MergeView multicast as NO_TOTAL_ORDER, i.e. bypassing SEQUENCER for the MergeView installation: this would allow a merge participant/coordinator to install a new MergeView. However, the consequence is the same as with your solution: we'd lose the ordering between MergeViews and messages.
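
To illustrate the ordering property at stake, here is a minimal sketch of a toy model (not JGroups code; `deliver`, `deliver_with_direct_view` and the event names are made up for illustration). It assumes a sequencer stamps every multicast, views included, with a global sequence number, and compares that with a view installed out-of-band:

```python
def deliver(events):
    """Delivery log of a member that applies events in sequence-number order."""
    return [name for _, name in sorted(events)]


def deliver_with_direct_view(messages, installed_after):
    """Deliver sequenced messages, but install the view out-of-band,
    after `installed_after` messages have already been delivered."""
    log = [name for _, name in sorted(messages)]
    log.insert(installed_after, "VIEW")
    return log


# Case 1: the view goes through the (hypothetical) sequencer like any message.
sequenced = [(1, "m1"), (2, "m2"), (3, "VIEW"), (4, "m3")]
log_a = deliver(sequenced)
log_b = deliver(list(reversed(sequenced)))  # different arrival order
same_position = log_a.index("VIEW") == log_b.index("VIEW")

# Case 2: the view bypasses the sequencer (unicast install or NO_TOTAL_ORDER),
# so each member installs it whenever it happens to arrive.
messages = [(1, "m1"), (2, "m2"), (3, "m3")]
log_c = deliver_with_direct_view(messages, 1)  # view arrives early
log_d = deliver_with_direct_view(messages, 3)  # view arrives late
```

In the first case both members see the view at the same position in their logs; in the second, `log_c` and `log_d` disagree on which messages were delivered before the view.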

The reason why we can't simply multicast the MergeView is that the coordinator might be in a different partition, and we'd discard any multicasts coming from a member not in our (pre-MergeView) view.

E.g. {A,B} and {C,D}, where A and C are the merge coordinators, would work: A and C would each multicast the MergeView locally, and the new coordinator (e.g. A) would then be visible to all members.

However, the case below wouldn't work:
A: {D,A}
B,C,D: {D,B,C}

If the coordinator in A's partition is D, then the INSTALL_MERGE_VIEW would trigger a multicast to disseminate the new MergeView, and A (the merge participant) would forward it to D. Say the new coordinator of the MergeView is C, then C's VIEW multicast would be discarded by A.
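
The two cases above can be checked with a small sketch. This is a toy model, not JGroups code: `accepting` is a hypothetical helper, and the only rule it encodes is the one stated above, that a member discards multicasts from senders outside its current (pre-MergeView) view.

```python
def accepting(views, sender):
    """Members whose current view contains `sender`, i.e. who would
    not discard a multicast coming from it."""
    return {member for member, view in views.items() if sender in view}


# Working case: {A,B} and {C,D}; A and C each multicast the MergeView locally.
split_ok = {
    "A": ["A", "B"], "B": ["A", "B"],
    "C": ["C", "D"], "D": ["C", "D"],
}
reached = accepting(split_ok, "A") | accepting(split_ok, "C")

# Failing case: A has {D,A}; B, C and D have {D,B,C}.
# The new coordinator C multicasts the view.
split_bad = {
    "A": ["D", "A"],
    "B": ["D", "B", "C"], "C": ["D", "B", "C"], "D": ["D", "B", "C"],
}
dropped = set(split_bad) - accepting(split_bad, "C")
```

In the working case `reached` covers all four members; in the failing case `dropped` contains exactly A, since C is not in A's view.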
                
> SEQUENCER and merge-views broken
> --------------------------------
>
>                 Key: JGRP-1484
>                 URL: https://issues.jboss.org/browse/JGRP-1484
>             Project: JGroups
>          Issue Type: Bug
>    Affects Versions: 3.0.10
>            Reporter: David Hotham
>            Assignee: Bela Ban
>             Fix For: 3.2
>
>
> Here's a new way in which putting SEQUENCER below GMS is broken.
> Start with A having view B|4 [B,A], while B, C and D all have view B|7 [B,C,D].
> Now we start a merge, in which B is coordinator.  B creates the view C|8 [C, D, A, B].
> (I've opened a pull request saying that B surely shouldn't issue a view where the ViewID says that C was the creator.  But I think that this is incidental, and not key to the bug that I'm reporting here).
> Now B sends INSTALL_MERGE_VIEW to B (a coordinator) and A (a merge participant, per Util.determineMergeParticipants).
> B gets this first and broadcasts the new view to [B, C, D].  In particular, B is now not a coordinator.
> Then A gets the INSTALL_MERGE_VIEW, and it too tries broadcasting the new view.  SEQUENCER gets involved, and forwards the broadcast to B (as the coordinator in the old view).  B discards this; it's no longer a coordinator.
> So the new view is not installed at A.  All future broadcasts from A are forwarded to B, who discards them.  The group is fractured, and none of A's broadcasts are delivered.
> I'm not sure what the right fix would be.  I wonder whether things should be arranged so that in a merge:
> -  coordinators behave as today, broadcasting the new view to their own sub-groups
> -  but mere participants do not do this: they should just have the new view installed on them.
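
The sequence of events in the report above can be replayed in a toy model. This is only a sketch under stated assumptions, not the JGroups implementation: `Member` and `broadcast` are hypothetical, the first member of a view is taken to be its coordinator, and a broadcast is forwarded to the coordinator of the sender's current view, which drops it unless it still considers itself coordinator.

```python
class Member:
    def __init__(self, name, view):
        self.name = name
        self.view = list(view)  # first element is the coordinator

    def coordinator(self):
        return self.view[0]

    def is_coordinator(self):
        return self.coordinator() == self.name


def broadcast(sender, members, new_view):
    """Toy SEQUENCER: forward the view to the sender's coordinator.
    Returns True if the view was delivered, False if it was discarded."""
    coord = members[sender.coordinator()]
    if not coord.is_coordinator():
        return False  # forwarded broadcast is discarded
    for name in list(coord.view):  # deliver to the coordinator's old view
        members[name].view = list(new_view)
    return True


members = {
    "A": Member("A", ["B", "A"]),
    "B": Member("B", ["B", "C", "D"]),
    "C": Member("C", ["B", "C", "D"]),
    "D": Member("D", ["B", "C", "D"]),
}
merge_view = ["C", "D", "A", "B"]

first = broadcast(members["B"], members, merge_view)   # B handles INSTALL_MERGE_VIEW
second = broadcast(members["A"], members, merge_view)  # then A does
```

After the first call B, C and D have the merged view and B is no longer coordinator, so the second call is discarded and A stays on its old view.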


