[
https://issues.jboss.org/browse/JGRP-1452?page=com.atlassian.jira.plugin....
]
David Hotham commented on JGRP-1452:
------------------------------------
I seem to have lost the trace to go with this issue, but I think that the answer is that
we never get a VIEW_CHANGE. That would happen on receipt of the broadcast view, but the
view is never broadcast; it is just forwarded to a dead member.
(Possibly the unusual thing here is that I have SEQUENCER below GMS, per JGRP-1426. If
that were not the case then I think that the view would be broadcast and we'd be OK.)
SEQUENCER goes wrong when members fail simultaneously
-----------------------------------------------------
Key: JGRP-1452
URL: https://issues.jboss.org/browse/JGRP-1452
Project: JGroups
Issue Type: Bug
Affects Versions: 3.0.9
Reporter: David Hotham
Assignee: Bela Ban
Fix For: 3.1
Consider the case where current view is [A, B, C, D], and A and B both die more or less
simultaneously.
C will now try to broadcast the new view [C, D]. But if SEQUENCER is in the stack this
goes wrong: SEQUENCER on C doesn't yet know that it is coordinator and tries to
forward to either A or B. The change of view gets stuck.
The problem looks to be in handleSuspect(). This assumes that there is at most one
suspect, removes that from the list of members, and figures that whoever is left will be
the new coordinator. But this fails in the case just described.
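To make the failure concrete, here is a minimal sketch of the single-suspect assumption described above. It is plain Java, not the actual SEQUENCER code: the class name SuspectSketch, the method coordinatorAfterSuspect() and the use of strings as addresses are all made up for illustration, and it assumes the coordinator is taken to be the first remaining member.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is not the JGroups SEQUENCER source.
class SuspectSketch {

    // Remove exactly one suspected member from a copy of the membership and
    // treat the first remaining member as the new coordinator (assumption).
    static String coordinatorAfterSuspect(List<String> members, String suspect) {
        List<String> remaining = new ArrayList<>(members);
        remaining.remove(suspect);
        return remaining.isEmpty() ? null : remaining.get(0);
    }

    public static void main(String[] args) {
        List<String> view = Arrays.asList("A", "B", "C", "D");
        // A and B die together, but each suspicion is handled on its own:
        System.out.println(coordinatorAfterSuspect(view, "A")); // prints B (dead)
        System.out.println(coordinatorAfterSuspect(view, "B")); // prints A (dead)
    }
}
```

Under these assumptions C never comes out as coordinator, so anything it forwards goes to a member that is already gone.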
IMHO it's a mistake for SEQUENCER to try to duplicate the work that the GMS layer
already does for the new view. I'm currently trying a fix that removes handleSuspect() from
SEQUENCER altogether, and instead pays attention to TMP_VIEW events. This seems to be
working.
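Roughly, the idea is to let the view decide who the coordinator is rather than guessing from suspicions. The sketch below is plain Java and not the real patch: the class ViewDrivenSketch, the method onView() and the string addresses are hypothetical, and it assumes the first member of a view is the coordinator.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is not the actual fix to SEQUENCER.
class ViewDrivenSketch {
    private final String localAddress;
    private String coordinator;
    private boolean isCoordinator;

    ViewDrivenSketch(String localAddress) {
        this.localAddress = localAddress;
    }

    // Called for every (temporary or final) view passed through the stack.
    // The coordinator is simply the first member of that view (assumption).
    void onView(List<String> members) {
        if (members.isEmpty())
            return;
        coordinator = members.get(0);
        isCoordinator = localAddress.equals(coordinator);
    }

    String coordinator() { return coordinator; }

    boolean isCoordinator() { return isCoordinator; }

    public static void main(String[] args) {
        ViewDrivenSketch c = new ViewDrivenSketch("C");
        c.onView(Arrays.asList("A", "B", "C", "D"));
        System.out.println(c.isCoordinator()); // prints false
        c.onView(Arrays.asList("C", "D"));     // A and B gone in one step
        System.out.println(c.isCoordinator()); // prints true
    }
}
```

Because the new view already excludes both A and B, C learns that it is coordinator in one step, however many members failed at once.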
David