[jboss-jira] [JBoss JIRA] (JGRP-1452) SEQUENCER goes wrong when members fail simultaneously

David Hotham (JIRA) jira-events at lists.jboss.org
Sun Apr 15 15:33:17 EDT 2012


David Hotham created JGRP-1452:
----------------------------------

             Summary: SEQUENCER goes wrong when members fail simultaneously
                 Key: JGRP-1452
                 URL: https://issues.jboss.org/browse/JGRP-1452
             Project: JGroups
          Issue Type: Bug
    Affects Versions: 3.0.9
            Reporter: David Hotham
            Assignee: Bela Ban


Consider the case where the current view is [A, B, C, D], and A and B both die more or less simultaneously.

C will now try to broadcast the new view [C, D].  But if SEQUENCER is in the stack, this goes wrong: SEQUENCER on C doesn't yet know that it is the coordinator and tries to forward the broadcast to either A or B, both of which are dead.  The view change gets stuck.

The problem looks to be in handleSuspect().  It assumes that there is at most one suspect, removes that member from the list of members, and picks the new coordinator from whoever is left.  That fails in the case just described, because the member it picks may itself be dead.
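
To make the failure concrete, here is a minimal sketch of the single-suspect logic as described above. It is not the actual SEQUENCER code: the class name, the method name and the use of plain strings for member addresses are all assumptions for illustration, as is the choice of the first remaining member as coordinator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the behaviour described in this report,
// not the real JGroups implementation.
class SingleSuspectSketch {

    // Drop the one suspected member from a copy of the membership and
    // treat the first remaining member as the new coordinator (assumption).
    static String coordinatorAfterSuspect(List<String> members, String suspect) {
        List<String> remaining = new ArrayList<>(members);
        remaining.remove(suspect);
        return remaining.isEmpty() ? null : remaining.get(0);
    }

    public static void main(String[] args) {
        List<String> view = Arrays.asList("A", "B", "C", "D");

        // A and B are both dead, but each suspicion is handled on its own
        // against the old membership [A, B, C, D].
        System.out.println(coordinatorAfterSuspect(view, "A")); // prints "B"
        System.out.println(coordinatorAfterSuspect(view, "B")); // prints "A"
    }
}
```

Either way C ends up pointing at a dead member, which would explain why the forwarded view change never gets anywhere.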

IMHO it's a mistake for SEQUENCER to try to duplicate the work that the GMS layer does in computing the new view.  I'm currently trying a fix that removes handleSuspect() from SEQUENCER altogether and instead pays attention to TMP_VIEW events.  So far this seems to be working.
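
Roughly, the idea is to take the coordinator from the view that GMS has already computed instead of guessing it from suspicions. The sketch below is only an illustration under assumptions, not the patch itself: the class and method names are made up, members are plain strings, and it assumes the coordinator is the first member of the view.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of deriving the coordinator from a view;
// it does not use the real JGroups event or view classes.
class ViewCoordinatorSketch {
    private final String localAddress;
    private String coordinator;
    private boolean isCoordinator;

    ViewCoordinatorSketch(String localAddress) {
        this.localAddress = localAddress;
    }

    // Would be called for each temporary or installed view passed up or
    // down the stack. Assumes the first member of the view is coordinator.
    void handleView(List<String> viewMembers) {
        if (viewMembers.isEmpty()) {
            return;
        }
        coordinator = viewMembers.get(0);
        isCoordinator = localAddress.equals(coordinator);
    }

    String getCoordinator() {
        return coordinator;
    }

    boolean isCoordinator() {
        return isCoordinator;
    }

    public static void main(String[] args) {
        ViewCoordinatorSketch c = new ViewCoordinatorSketch("C");
        c.handleView(Arrays.asList("A", "B", "C", "D"));
        System.out.println(c.isCoordinator()); // prints "false"

        // GMS has already dealt with both failures in one go.
        c.handleView(Arrays.asList("C", "D"));
        System.out.println(c.isCoordinator()); // prints "true"
    }
}
```

With this approach C learns that it is coordinator from the same view it is about to broadcast, so there is no window in which it forwards to A or B.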

David
