[jboss-jira] [JBoss JIRA] Commented: (JGRP-981) Wrong view with concurrent startup

Bela Ban (JIRA) jira-events at lists.jboss.org
Mon May 18 16:46:05 EDT 2009


    [ https://jira.jboss.org/jira/browse/JGRP-981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12468082#action_12468082 ] 

Bela Ban commented on JGRP-981:
-------------------------------

OK, I was able to reproduce this here with a simple Draw demo. What happens is:

  1. Start A, becomes coordinator. V1={A}
  2. Start B and C concurrently
  3. A processes B's JOIN request and sends back V2={A,B} to B
  4. B is in the process of installing this view but hasn't yet changed
     to Participant (from Client)
  5. A processes C's JOIN request and sends back V3={A,B,C} to C
  6. A multicasts V3={A,B,C} to the group
  7. Since B is still a Client (ClientGmsImpl), it discards V3
  8. Therefore B is stuck with V2 and won't install V3 until prodded by
     VIEW_SYNC
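
The sequence above can be replayed in a few lines. This is only a toy model in Python, not JGroups code: the Member class, its method names and the (view_id, members) tuples are made up for illustration. It just shows why B ends up on V2 while A and C are on V3.

```python
class Member:
    """Toy stand-in for a group member; not a JGroups class."""

    def __init__(self, name):
        self.name = name
        self.role = "client"   # switches to "participant" once a view is installed
        self.view = None       # (view_id, members) once installed
        self.pending = None    # join response received but not yet installed

    def receive_join_response(self, view_id, members):
        # The coordinator's reply to JOIN arrives; installation has only started.
        self.pending = (view_id, tuple(members))

    def finish_install(self):
        # Installation completes: the view is set and the role changes.
        self.view = self.pending
        self.pending = None
        self.role = "participant"

    def receive_multicast_view(self, view_id, members):
        if self.role == "client":
            return False       # step 7: a client discards multicast views
        if self.view is None or view_id > self.view[0]:
            self.view = (view_id, tuple(members))
        return True


def run_race():
    a, b, c = Member("A"), Member("B"), Member("C")
    a.role, a.view = "participant", (1, ("A",))      # step 1: V1={A}
    b.receive_join_response(2, ["A", "B"])           # step 3: V2 sent to B
    # step 4: B has not called finish_install() yet, so it is still a client
    c.receive_join_response(3, ["A", "B", "C"])      # step 5: V3 sent to C
    c.finish_install()
    v3 = (3, ["A", "B", "C"])
    accepted_by_b = b.receive_multicast_view(*v3)    # steps 6-7: B drops V3
    a.receive_multicast_view(*v3)
    b.finish_install()                               # B becomes participant too late
    return a.view, b.view, c.view, accepted_by_b


if __name__ == "__main__":
    a_view, b_view, c_view, accepted_by_b = run_race()
    print("A:", a_view)
    print("B:", b_view, "(accepted V3: %s)" % accepted_by_b)
    print("C:", c_view)
```

In this model nothing ever re-sends V3 to B, which corresponds to step 8: B stays on V2 until something like VIEW_SYNC delivers the newer view.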

This is fixed in 2.6 and later: the coordinator waits until it has received acks from all members in which a new view is being installed. E.g. in step #4 we would wait until the Client becomes a Participant, and therefore B would not drop V3.

In case we exceed the view_ack_collection_timeout, VIEW_SYNC will install the correct view.
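
The idea of collecting acks with a timeout can be sketched roughly as follows. This is a Python illustration under my own assumptions, not the actual GMS code: AckCollector, install_view and the ack_delays argument are hypothetical names, and only the notion of a view ack collection timeout comes from the comment above.

```python
import threading


class AckCollector:
    """Hypothetical helper: blocks until every expected member has acked."""

    def __init__(self, expected):
        self._missing = set(expected)
        self._cond = threading.Condition()

    def ack(self, member):
        # Called when a member reports that it has installed the view.
        with self._cond:
            self._missing.discard(member)
            if not self._missing:
                self._cond.notify_all()

    def wait_for_all(self, timeout_s):
        # True if all acks arrived in time, False if the timeout expired first.
        with self._cond:
            return self._cond.wait_for(lambda: not self._missing,
                                       timeout=timeout_s)


def install_view(members, ack_delays, timeout_s):
    """Simulate sending a view and waiting for acks.

    ack_delays maps a member name to the seconds it takes to ack;
    members missing from the map never ack.
    """
    collector = AckCollector(members)
    for member in members:
        delay = ack_delays.get(member)
        if delay is not None:
            threading.Timer(delay, collector.ack, args=(member,)).start()
    return collector.wait_for_all(timeout_s)


if __name__ == "__main__":
    # B acks a little late, but within the timeout: safe to proceed to V3.
    print(install_view(["A", "B"], {"A": 0.0, "B": 0.05}, timeout_s=1.0))
    # B never acks: the wait times out and a later sync has to fix the view.
    print(install_view(["A", "B"], {"A": 0.0}, timeout_s=0.1))
```

The point of the design is that the coordinator does not start on the next view while a joiner may still be a Client; the timeout only bounds how long it is willing to wait.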

From 2.8 on, the new merge ([1]) will replace VIEW_SYNC and shunning altogether.

A clear WORKAROUND is to stagger the startup of multiple nodes; all that's needed is a few hundred ms.
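
A launcher that staggers startup could look something like this. It is a generic Python sketch: start_staggered and its parameters are made up here, the 0.3 s gap is just one reading of "a few hundred ms", and the start functions stand for whatever actually launches each node.

```python
import time


def start_staggered(start_fns, gap_s=0.3, sleep=time.sleep):
    """Call each start function in order, pausing gap_s seconds between them.

    start_fns: callables that each start one node (hypothetical placeholders).
    sleep:     injectable so the pause can be replaced in tests.
    """
    for index, start in enumerate(start_fns):
        if index > 0:
            sleep(gap_s)   # give the previous node time to finish joining
        start()


if __name__ == "__main__":
    started = []
    start_staggered(
        [lambda: started.append("A"),
         lambda: started.append("B"),
         lambda: started.append("C")],
        gap_s=0.3,
    )
    print(started)
```

The gap only makes the race unlikely; it does not remove it, so this is a workaround rather than a fix.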


[1] https://jira.jboss.org/jira/browse/JGRP-937



> Wrong view with concurrent startup
> ----------------------------------
>
>                 Key: JGRP-981
>                 URL: https://jira.jboss.org/jira/browse/JGRP-981
>             Project: JGroups
>          Issue Type: Bug
>    Affects Versions: 2.4.5
>            Reporter: Dennis Reed
>            Assignee: Bela Ban
>             Fix For: 2.4.7
>
>
> When starting multiple nodes simultaneously, one node sometimes retains an old view.
> Other than the obvious issues, this prevents STABLE from working, and NAKACK will eventually use up the heap.
