[jboss-jira] [JBoss JIRA] (JGRP-1452) SEQUENCER goes wrong when members fail simultaneously

David Hotham (JIRA) jira-events at lists.jboss.org
Mon May 14 11:25:17 EDT 2012


    [ https://issues.jboss.org/browse/JGRP-1452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12692699#comment-12692699 ] 

David Hotham commented on JGRP-1452:
------------------------------------

I seem to have lost the trace that goes with this issue, but I think the answer is that we never get a VIEW_CHANGE.  That would happen on receipt of the broadcast view, but the view is never broadcast: it is just forwarded to a dead member.

(Possibly the unusual thing here is that I have SEQUENCER below GMS, per JGRP-1426.  If that were not the case, then I think the view would be broadcast and we'd be OK.)
                
> SEQUENCER goes wrong when members fail simultaneously
> -----------------------------------------------------
>
>                 Key: JGRP-1452
>                 URL: https://issues.jboss.org/browse/JGRP-1452
>             Project: JGroups
>          Issue Type: Bug
>    Affects Versions: 3.0.9
>            Reporter: David Hotham
>            Assignee: Bela Ban
>             Fix For: 3.1
>
>
> Consider the case where the current view is [A, B, C, D], and A and B both die more or less simultaneously.
> C will now try to broadcast the new view [C, D].  But if SEQUENCER is in the stack this goes wrong: SEQUENCER on C doesn't yet know that it is coordinator and tries to forward to either A or B.  The change of view gets stuck.
> The problem looks to be in handleSuspect().  This assumes that there is at most one suspect, removes that from the list of members, and figures that whoever is left will be the new coordinator.  But this fails in the case just described.
> IMHO it's a mistake for SEQUENCER to try to duplicate the work that the GMS layer does in computing the new view.  I'm currently trying a fix that removes handleSuspect() from SEQUENCER altogether, and instead pays attention to TMP_VIEW events.  This seems to be working so far.
> David
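
The failure described in the quoted report can be sketched in plain Java without any JGroups classes. This is only an illustration of the reasoning, not the actual SEQUENCER code: the class and method names (SuspectSketch, coordAfterSingleSuspect, coordFromView) are hypothetical, members are modelled as strings, and the only JGroups fact assumed is that the coordinator is the first member of a view.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names exist in JGroups.
class SuspectSketch {

    // Models the single-suspect assumption from the report: remove the one
    // suspected member and treat the first remaining member as coordinator.
    static String coordAfterSingleSuspect(List<String> members, String suspect) {
        List<String> remaining = new ArrayList<>(members);
        remaining.remove(suspect);
        return remaining.isEmpty() ? null : remaining.get(0);
    }

    // Models the proposed alternative: take the coordinator from the view
    // that the membership layer has already computed (first member of the view).
    static String coordFromView(List<String> view) {
        return view.isEmpty() ? null : view.get(0);
    }

    public static void main(String[] args) {
        List<String> members = List.of("A", "B", "C", "D");

        // A and B died together, but only the suspicion of A is handled:
        // the sketch picks B, which is also dead, so forwarding gets stuck.
        System.out.println(coordAfterSingleSuspect(members, "A")); // prints "B"

        // Using the new view [C, D] picks a live member instead.
        System.out.println(coordFromView(List.of("C", "D")));      // prints "C"
    }
}
```

With the view [A, B, C, D] and both A and B gone, the single-suspect path selects a dead member as coordinator, whereas reading the coordinator from the already-computed view [C, D] selects C.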
