[jboss-jira] [JBoss JIRA] (JGRP-1481) SEQUENCER can try forwarding to null coordinator
Bela Ban (JIRA)
jira-events at lists.jboss.org
Thu Jun 21 09:54:12 EDT 2012
[ https://issues.jboss.org/browse/JGRP-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12702703#comment-12702703 ]
Bela Ban commented on JGRP-1481:
--------------------------------
OK, I agree on the first part: setCoord() should not be called when coord_changed is false. This was done because originally setCoord() also set the 'view' field, but since this is not the case anymore, I removed setCoord().
I don't think that setting the coordinator *before* flush() will work: in flush(), we wait until all in-flight sending threads have completed and only *then* set coord, or else existing threads might forward their messages to the new coordinator *before* the old messages have been sent. This would be incorrect.
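The ordering described above can be illustrated with a minimal sketch. This is not the SEQUENCER source: the class `CoordSwitchSketch` and its methods `beginSend()`, `endSend()` and `getCoord()` are hypothetical names, a String stands in for a member address, and making new senders wait while a switch is pending is an assumption of the sketch. It only shows the coordinator being replaced after every in-flight sender has finished.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical illustration only; not the JGroups SEQUENCER implementation. */
public class CoordSwitchSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition changed = lock.newCondition();
    private int inFlight;      // senders that picked a target but have not finished sending
    private boolean flushing;  // true while a coordinator switch is pending
    private String coord;      // a String stands in for a real member address

    public CoordSwitchSketch(String initialCoord) {
        this.coord = initialCoord;
    }

    /** Registers a sender and returns the coordinator it should forward to. */
    public String beginSend() {
        lock.lock();
        try {
            // Assumption of this sketch: new senders wait while a switch is pending.
            while (flushing) changed.awaitUninterruptibly();
            inFlight++;
            return coord;
        } finally {
            lock.unlock();
        }
    }

    /** Marks one sender as finished and wakes a waiting flush, if any. */
    public void endSend() {
        lock.lock();
        try {
            inFlight--;
            changed.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /** Waits for all in-flight senders to finish and only then sets the new coordinator. */
    public void flush(String newCoord) {
        lock.lock();
        try {
            flushing = true;
            while (inFlight > 0) changed.awaitUninterruptibly();
            coord = newCoord;  // safe: no sender still holds the old target
            flushing = false;
            changed.signalAll();
        } finally {
            lock.unlock();
        }
    }

    public String getCoord() {
        lock.lock();
        try {
            return coord;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        CoordSwitchSketch s = new CoordSwitchSketch("A");
        String target = s.beginSend();            // one sender is now in flight
        Thread flusher = new Thread(() -> s.flush("B"));
        flusher.start();
        System.out.println(target + " " + s.getCoord()); // prints "A A"
        s.endSend();                              // the sender finishes
        flusher.join();
        System.out.println(s.getCoord());         // prints "B"
    }
}
```

If the coordinator were set before waiting, a sender starting during the wait could read the new coordinator while an older message was still on its way to the previous one, which is the reordering described above.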
BTW: what did you mean when you said "a group go wrong"?
I committed my changes, please take a look.
> SEQUENCER can try forwarding to null coordinator
> ------------------------------------------------
>
> Key: JGRP-1481
> URL: https://issues.jboss.org/browse/JGRP-1481
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 3.0.10
> Reporter: David Hotham
> Assignee: Bela Ban
> Fix For: 3.1
>
>
> I've just seen a group go wrong and I think that the first sign of trouble was this line of trace:
> {noformat}
> 2012-06-15 08:55:54.690 [ForkJoinPool-1-worker-0] TRACE org.jgroups.protocols.SEQUENCER - [CFS-A-four]: forwarding CFS-A-four::1 to coord null
> {noformat}
> ... showing a member trying to forward to a null coordinator.
> This comes shortly after a new view was installed, so SEQUENCER should have seen the VIEW_CHANGE. I think that the suspect code is in handleViewChange():
> {noformat}
> if(!coord_changed) {
>     setCoord(new_coord);
>     return;
> }
> {noformat}
> Surely the intention isn't to call setCoord only if the coordinator hasn't changed?
> I see that setCoord will get called in due course by the flusher. But perhaps it would be safer to do something like:
> {noformat}
> if(!coord_changed) {
>     return;
> }
> stopFlusher();
> setCoord(new_coord);
> startFlusher(new_coord); // needs to be done in the background, to prevent blocking if down() would block
> {noformat}
> ... and possibly the call to setCoord in flush() can then be removed?