[jboss-jira] [JBoss JIRA] (JGRP-1523) FD_ALL does not unsuspect on heartbeat

Bela Ban (JIRA) jira-events at lists.jboss.org
Fri Oct 19 02:50:02 EDT 2012


    [ https://issues.jboss.org/browse/JGRP-1523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727900#comment-12727900 ] 

Bela Ban edited comment on JGRP-1523 at 10/19/12 2:48 AM:
----------------------------------------------------------

Actually, the above idea of only checking the suspected members in suspect() won't work: an UNSUSPECT event has to be sent up and down the stack, because GMS also keeps a list of suspected members.

SOLUTION:
- On reception of a heartbeat from P, or of a message from P if msg_counts_as_heartbeat=true: if P is in suspected_mbrs, remove P from suspected_mbrs and send an UNSUSPECT event up and down the stack

- We could maintain a volatile variable has_suspected_mbrs, which is set to true when suspected_mbrs is non-empty and false otherwise, so the cost of the check would be minimal
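
The two points above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual FD_ALL code: the class UnsuspectSketch, its method names, the use of plain strings for members, and the events list standing in for "send an UNSUSPECT event up and down the stack" are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the failure detector; not the real FD_ALL. */
class UnsuspectSketch {
    // Members currently suspected by this node (mirrors suspected_mbrs).
    private final Set<String> suspectedMbrs = ConcurrentHashMap.newKeySet();
    // Cheap flag so the common case (nobody suspected) skips the set lookup
    // (mirrors the proposed has_suspected_mbrs).
    private volatile boolean hasSuspectedMbrs = false;
    // Mirrors msg_counts_as_heartbeat.
    private final boolean msgCountsAsHeartbeat;
    // Assumption: records the events a real protocol would pass up and down.
    final List<String> events = new ArrayList<>();

    UnsuspectSketch(boolean msgCountsAsHeartbeat) {
        this.msgCountsAsHeartbeat = msgCountsAsHeartbeat;
    }

    /** Marks a member as suspected, e.g. after missing heartbeats. */
    synchronized void suspect(String member) {
        suspectedMbrs.add(member);
        hasSuspectedMbrs = true;
    }

    /** Called when a heartbeat from sender arrives. */
    void onHeartbeat(String sender) {
        unsuspectIfNeeded(sender);
    }

    /** Called when a regular message from sender arrives. */
    void onMessage(String sender) {
        if (msgCountsAsHeartbeat)
            unsuspectIfNeeded(sender);
    }

    private void unsuspectIfNeeded(String sender) {
        // Fast path: a single volatile read when nobody is suspected.
        if (!hasSuspectedMbrs)
            return;
        synchronized (this) {
            if (suspectedMbrs.remove(sender)) {
                hasSuspectedMbrs = !suspectedMbrs.isEmpty();
                events.add("UNSUSPECT up: " + sender);
                events.add("UNSUSPECT down: " + sender);
            }
        }
    }

    boolean isSuspected(String member) { return suspectedMbrs.contains(member); }

    boolean hasSuspects() { return hasSuspectedMbrs; }
}
```

In this sketch a heartbeat that races with suspect() may see the flag as false and be ignored; the next heartbeat from the same member would then clear the suspicion.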
                
      was (Author: belaban):
    We also need to send an UNSUSPECT up and down the stack, as GMS also keeps a list of suspected members.
                  
> FD_ALL does not unsuspect on heartbeat
> --------------------------------------
>
>                 Key: JGRP-1523
>                 URL: https://issues.jboss.org/browse/JGRP-1523
>             Project: JGroups
>          Issue Type: Quality Risk
>    Affects Versions: 3.0.14
>            Reporter: Jan Boehm
>            Assignee: Bela Ban
>             Fix For: 3.2
>
>
> FD_ALL stores suspected nodes in "suspected_mbrs" when it receives no heartbeats. If it does not predict the local node as the new coordinator, it does not pass this suspicion upwards. Since UNSUSPECT from above is the only event that removes a node from suspected_mbrs, the set retains all nodes except those that were newly suspected when the local node became the potential new coordinator.
> This seems wasteful and wrong (it leads to wrong results if there are "stale" suspects that would be preferred as new coordinators). The timestamps for nodes in suspected_mbrs should be rechecked in FD_ALL.suspect before adding the new nodes.


