[jboss-jira] [JBoss JIRA] (JGRP-1484) SEQUENCER and merge-views broken

Bela Ban (JIRA) jira-events at lists.jboss.org
Tue Sep 4 07:08:33 EDT 2012


    [ https://issues.jboss.org/browse/JGRP-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12715686#comment-12715686 ] 

Bela Ban commented on JGRP-1484:
--------------------------------

First, we're installing the MergeView via a unicast to the participants and a broadcast in the coordinators' partitions. If regular messages are being broadcast at the same time, the ordering between MergeViews and messages is undefined, as there is no ordering between unicasts and multicasts. Therefore, with both your solution and mine, ordering between the MergeView and regular messages is going to be temporarily broken.
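
To illustrate the ordering point, here is a minimal sketch in plain Java (not JGroups code). It assumes each channel is FIFO on its own and that nothing orders one channel relative to the other. The class name InterleaveSketch and the message names m1 and m2 are made up for the example. It enumerates every delivery order a receiver could legally see.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model: two independent FIFO channels with no ordering between them. */
public class InterleaveSketch {

    /** Collects every delivery order that preserves FIFO order within each channel. */
    public static List<List<String>> interleavings(List<String> unicasts, List<String> multicasts) {
        List<List<String>> out = new ArrayList<>();
        merge(unicasts, 0, multicasts, 0, new ArrayList<>(), out);
        return out;
    }

    private static void merge(List<String> u, int i, List<String> m, int j,
                              List<String> prefix, List<List<String>> out) {
        if (i == u.size() && j == m.size()) {
            out.add(new ArrayList<>(prefix)); // one complete delivery order
            return;
        }
        if (i < u.size()) {                   // deliver the next unicast
            prefix.add(u.get(i));
            merge(u, i + 1, m, j, prefix, out);
            prefix.remove(prefix.size() - 1);
        }
        if (j < m.size()) {                   // deliver the next multicast
            prefix.add(m.get(j));
            merge(u, i, m, j + 1, prefix, out);
            prefix.remove(prefix.size() - 1);
        }
    }

    public static void main(String[] args) {
        List<String> unicasts = Arrays.asList("MergeView");
        List<String> multicasts = Arrays.asList("m1", "m2");
        for (List<String> order : interleavings(unicasts, multicasts)) {
            System.out.println(order);
        }
    }
}
```

With one unicast MergeView and two multicasts, the MergeView can land before, between, or after the regular messages, which is the undefined ordering described above.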

Second, I don't like unicasting the MergeView to all participants. Imagine we have hundreds or thousands of nodes: usually, we have only a few merge coordinators, but many participants. Under your proposal, this would mean sending hundreds of INSTALL_MERGE_VIEW unicasts, one to each participant. I don't like this.

OK, I committed my changes for JGRP-1484; why don't you pull them and see if your system still works?
Unless you bring more arguments in favor of installing MergeViews in participants directly, I'm going to go with the NO_TOTAL_ORDER solution.
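
The comment does not spell out how NO_TOTAL_ORDER works, so the following is only a rough sketch under an assumption: a message carrying that flag skips the forward-to-coordinator step of a sequencer-style layer. The enum names, the route method, and the class NoTotalOrderSketch are hypothetical and are not the JGroups API.

```java
import java.util.EnumSet;
import java.util.Set;

/** Toy model of a sequencer layer that lets flagged messages bypass total ordering. */
public class NoTotalOrderSketch {

    /** Hypothetical flag; it only mirrors the name used in the comment. */
    public enum Flag { NO_TOTAL_ORDER }

    /** Where the toy sequencer layer sends an outgoing broadcast. */
    public enum Route { FORWARD_TO_COORDINATOR, SEND_DIRECTLY }

    /**
     * Assumed behaviour: a flagged message is never forwarded to the coordinator.
     * Unflagged messages are forwarded unless the sender is itself the coordinator.
     */
    public static Route route(Set<Flag> flags, boolean senderIsCoordinator) {
        if (flags.contains(Flag.NO_TOTAL_ORDER)) {
            return Route.SEND_DIRECTLY;
        }
        return senderIsCoordinator ? Route.SEND_DIRECTLY : Route.FORWARD_TO_COORDINATOR;
    }

    public static void main(String[] args) {
        // A regular broadcast from a non-coordinator goes through the coordinator.
        System.out.println(route(EnumSet.noneOf(Flag.class), false));
        // A flagged view message from the same node bypasses the coordinator.
        System.out.println(route(EnumSet.of(Flag.NO_TOTAL_ORDER), false));
    }
}
```

Under that assumption, a view broadcast from a merge participant would no longer depend on a node that has already stopped being coordinator.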

                
> SEQUENCER and merge-views broken
> --------------------------------
>
>                 Key: JGRP-1484
>                 URL: https://issues.jboss.org/browse/JGRP-1484
>             Project: JGroups
>          Issue Type: Bug
>    Affects Versions: 3.0.10
>            Reporter: David Hotham
>            Assignee: Bela Ban
>             Fix For: 3.2
>
>
> Here's a new way in which putting SEQUENCER below GMS is broken.
> Start with A having view B|4 [B,A], while B, C and D all have view B|7 [B,C,D].
> Now we start a merge, in which B is coordinator.  B creates the view C|8 [C, D, A, B].
> (I've opened a pull request saying that B surely shouldn't issue a view where the ViewID says that C was the creator.  But I think that this is incidental, and not key to the bug that I'm reporting here).
> Now B sends INSTALL_MERGE_VIEW to B (a coordinator) and A (a merge participant, per Util.determineMergeParticipants).
> B gets this first and broadcasts the new view to [B, C, D].  In particular, B is now not a coordinator.
> Then A gets the INSTALL_MERGE_VIEW, and it too tries broadcasting the new view.  SEQUENCER gets involved, and forwards the broadcast to B (as the coordinator in the old view).  B discards this; it's no longer a coordinator.
> So the new view is not installed at A.  All future broadcasts from A are forwarded to B, who discards them.  The group is fractured, and none of A's broadcasts are delivered.
> I'm not sure what the right fix would be.  I wonder whether things should be arranged so that in a merge:
> -  coordinators behave as today, broadcasting the new view to their own sub-groups
> -  but mere participants do not do this: they should just have the new view installed on them.
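
The failure sequence in the report can be replayed with a small toy model. This is a sketch in plain Java, not JGroups code. The Node class, its fields, and the run method are invented for illustration, and the model only tracks who each node believes is coordinator and which view it has installed. It compares today's behaviour (A broadcasts the view through its old coordinator B) with the reporter's suggestion (the view is installed on A directly).

```java
import java.util.ArrayList;
import java.util.List;

/** Toy replay of the JGRP-1484 scenario; names and structure are illustrative only. */
public class MergeViewSketch {

    /** Hypothetical node: tracks its coordinator, its view, and discarded broadcasts. */
    static final class Node {
        final String name;
        String coordinator;
        String view;
        final List<String> discarded = new ArrayList<>();

        Node(String name, String coordinator, String view) {
            this.name = name;
            this.coordinator = coordinator;
            this.view = view;
        }

        boolean isCoordinator() {
            return name.equals(coordinator);
        }

        /** Sequencer-like rule: only a coordinator accepts forwarded broadcasts. */
        boolean handleForwarded(String payload) {
            if (!isCoordinator()) {
                discarded.add(payload);
                return false;
            }
            return true;
        }

        void install(String newView, String newCoordinator) {
            this.view = newView;
            this.coordinator = newCoordinator;
        }
    }

    /** Returns the view that A ends up with. */
    public static String run(boolean participantsInstallDirectly) {
        Node a = new Node("A", "B", "B|4 [B,A]");
        Node b = new Node("B", "B", "B|7 [B,C,D]");
        String merged = "C|8 [C,D,A,B]";

        // B handles INSTALL_MERGE_VIEW first; afterwards it is no longer coordinator.
        b.install(merged, "C");

        if (participantsInstallDirectly) {
            // Reporter's suggestion: the view is simply installed on the participant.
            a.install(merged, "C");
        } else {
            // Current behaviour: A broadcasts, and the broadcast is forwarded to the
            // coordinator of A's old view (B), which now discards it.
            if (b.handleForwarded("VIEW " + merged)) {
                a.install(merged, "C");
            }
        }
        return a.view;
    }

    public static void main(String[] args) {
        System.out.println("broadcast via old coordinator: A has " + run(false));
        System.out.println("direct install at participant: A has " + run(true));
    }
}
```

In the first case A stays on B|4 [B,A]; in the second it ends up with C|8 [C,D,A,B].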
