[jboss-jira] [JBoss JIRA] (JGRP-1481) SEQUENCER can try forwarding to null coordinator

Bela Ban (JIRA) jira-events at lists.jboss.org
Thu Jun 21 09:54:12 EDT 2012


    [ https://issues.jboss.org/browse/JGRP-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12702703#comment-12702703 ] 

Bela Ban commented on JGRP-1481:
--------------------------------

OK, I agree on the first part: setCoord() should not be called when coord_changed is false. The call was there because setCoord() originally also set the 'view' field; since that is no longer the case, I removed the call to setCoord() from that branch.

I don't think that setting the coordinator *before* flush() will work: in flush(), we wait until all in-flight sending threads have completed and only *then* set coord, or else existing threads might forward their messages to the new coordinator *before* the old messages have been sent. This would be incorrect.
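
To illustrate the ordering argument, here is a minimal single-threaded sketch. It is not the SEQUENCER code: the class CoordSwitchSketch, its fields and its method names are all hypothetical, and the "concurrent" sender is modelled as a callback that runs at the point where a racing thread could get in. It assumes that old, unacknowledged messages are resent to the new coordinator as part of the flush.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-threaded model of a coordinator switch; not JGroups code.
class CoordSwitchSketch {
    private String coord;                                    // coordinator that senders currently see
    private boolean flushing;                                // true while a flush is in progress
    private final List<String> unacked = new ArrayList<>();  // sent to the old coordinator, never acknowledged
    private final List<String> parked = new ArrayList<>();   // sends held back during a flush
    private final List<String> log = new ArrayList<>();      // "msg->coord" in arrival order

    CoordSwitchSketch(String coord, List<String> unacked) {
        this.coord = coord;
        this.unacked.addAll(unacked);
    }

    // A sender forwards a message to whatever coordinator it currently sees.
    void forward(String msg) {
        if (flushing) {          // a real sender thread would block here instead
            parked.add(msg);
            return;
        }
        log.add(msg + "->" + coord);
    }

    // Ordering described in the comment: resend old messages first, set coord last.
    void flushThenSetCoord(String newCoord, Runnable racingSender) {
        flushing = true;
        racingSender.run();                              // a racing send is parked
        for (String m : unacked) log.add(m + "->" + newCoord);
        unacked.clear();
        coord = newCoord;
        flushing = false;
        List<String> toSend = new ArrayList<>(parked);
        parked.clear();
        for (String m : toSend) forward(m);              // parked sends go out after the old ones
    }

    // Ordering from the proposal: set coord first, flush afterwards.
    void setCoordThenFlush(String newCoord, Runnable racingSender) {
        coord = newCoord;
        racingSender.run();                              // a racing send already sees the new coordinator
        for (String m : unacked) log.add(m + "->" + newCoord);
        unacked.clear();
    }

    List<String> log() { return log; }

    public static void main(String[] args) {
        CoordSwitchSketch safe = new CoordSwitchSketch("A", List.of("m1", "m2"));
        safe.flushThenSetCoord("B", () -> safe.forward("m3"));
        System.out.println(safe.log());   // [m1->B, m2->B, m3->B]

        CoordSwitchSketch racy = new CoordSwitchSketch("A", List.of("m1", "m2"));
        racy.setCoordThenFlush("B", () -> racy.forward("m3"));
        System.out.println(racy.log());   // [m3->B, m1->B, m2->B]
    }
}
```

In the second variant m3 reaches the new coordinator ahead of m1 and m2, which is the reordering described above.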

BTW: what did you mean when you said "a group go wrong"?

I committed my changes, please take a look.
                
> SEQUENCER can try forwarding to null coordinator
> ------------------------------------------------
>
>                 Key: JGRP-1481
>                 URL: https://issues.jboss.org/browse/JGRP-1481
>             Project: JGroups
>          Issue Type: Bug
>    Affects Versions: 3.0.10
>            Reporter: David Hotham
>            Assignee: Bela Ban
>             Fix For: 3.1
>
>
> I've just seen a group go wrong and I think that the first sign of trouble was this line of trace:
> {noformat}
> 2012-06-15 08:55:54.690 [ForkJoinPool-1-worker-0] TRACE org.jgroups.protocols.SEQUENCER - [CFS-A-four]: forwarding CFS-A-four::1 to coord null
> {noformat}
> ... showing a member trying to forward to a null coordinator.
> This comes shortly after a new view was installed so SEQUENCER should have seen the VIEW_CHANGE.  I think that the suspect code is in handleViewChange()
> {noformat}
>         if(!coord_changed) {
>             setCoord(new_coord);
>             return;
>         }
> {noformat}
> Surely the intention isn't to call setCoord only if the coordinator hasn't changed?
> I see that setCoord will get called in due course by the flusher.  But perhaps it would be safer to do something like:
> {noformat}
>         if(!coord_changed) {
>             return;
>         }
>         stopFlusher();
>         setCoord(new_coord);
>         startFlusher(new_coord); // needs to be done in the background, to prevent blocking if down() would block
> {noformat}
> ... and possibly the call to setCoord in flush() can then be removed?
