[jboss-jira] [JBoss JIRA] (JGRP-1523) FD_ALL does not unsuspect on heartbeat
Jan Boehm (JIRA)
jira-events at lists.jboss.org
Wed Oct 17 07:35:01 EDT 2012
[ https://issues.jboss.org/browse/JGRP-1523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727176#comment-12727176 ]
Jan Boehm commented on JGRP-1523:
---------------------------------
More explanation (from IRC):
- suppose a cluster view containing nodes [A B C]
- the node that we look at is C
- first it suspects A of being faulty (in my use case this is due to an Infinispan bug that causes OOB messages, in this case heartbeats, to be dropped because no threads are available to handle them)
- since B is in the member list before C, C will not propagate the suspicion
- now the load on C decreases, and new heartbeats from A arrive, but C still keeps it on the suspect list
- now if B becomes suspect (for the same reasons as A before), C takes action while A is still on its suspect list
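
The sequence above can be replayed with a small toy model. This is only a sketch of the behaviour as described in this comment, not the JGroups FD_ALL code: the class ToyDetector, its methods, and the rule "the predicted coordinator is the first non-suspected member of the view" are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class StaleSuspectDemo {

    // Hypothetical stand-in for the failure detector on one node (not the real FD_ALL).
    static class ToyDetector {
        final String local;
        final List<String> view;
        // Plays the role of suspected_mbrs.
        final Set<String> suspected = new LinkedHashSet<>();

        ToyDetector(String local, List<String> view) {
            this.local = local;
            this.view = view;
        }

        // Called when heartbeats from the given members have timed out.
        // Returns the suspects that would be passed up, or an empty list if nothing is propagated.
        List<String> suspect(String... timedOut) {
            suspected.addAll(Arrays.asList(timedOut));
            for (String member : view) {
                if (!suspected.contains(member)) {
                    // Assumption: the first non-suspected member is the predicted coordinator,
                    // and only that member propagates the suspicion.
                    return member.equals(local)
                            ? new ArrayList<>(suspected)
                            : Collections.<String>emptyList();
                }
            }
            return Collections.emptyList();
        }

        // In the behaviour described here, a heartbeat does not clear an existing suspicion.
        void heartbeat(String from) {
            // intentionally leaves 'suspected' untouched
        }
    }

    // Replays the [A B C] scenario from the point of view of C.
    static List<List<String>> run() {
        ToyDetector c = new ToyDetector("C", Arrays.asList("A", "B", "C"));
        List<List<String>> propagated = new ArrayList<>();
        propagated.add(c.suspect("A")); // B precedes C, so nothing is propagated
        c.heartbeat("A");               // A is alive again, but stays suspected
        propagated.add(c.suspect("B")); // C now acts, and A is still in the list
        return propagated;
    }

    public static void main(String[] args) {
        for (List<String> suspects : run()) {
            System.out.println(suspects);
        }
        // prints:
        // []
        // [A, B]
    }
}
```

The second line of output shows the stale entry: A is reported together with B even though its heartbeats have resumed.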
> FD_ALL does not unsuspect on heartbeat
> --------------------------------------
>
> Key: JGRP-1523
> URL: https://issues.jboss.org/browse/JGRP-1523
> Project: JGroups
> Issue Type: Quality Risk
> Affects Versions: 3.0.14
> Reporter: Jan Boehm
> Assignee: Bela Ban
> Fix For: 3.0.15, 3.2
>
>
> FD_ALL stores suspected nodes in "suspected_mbrs" when it receives no heartbeats from them. If it does not predict the local node to be the new coordinator, it does not pass this suspicion upwards. Since an UNSUSPECT event from above is the only event that removes a node from suspected_mbrs, the set retains all nodes except those that were newly suspected when the local node became the potential new coordinator.
> This seems wasteful and wrong (it leads to wrong results if there are "stale" suspects that would be preferred as new coordinators). The timestamps for nodes in suspected_mbrs should be rechecked in FD_ALL.suspect before adding the new nodes.
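
A minimal sketch of the recheck proposed in the description, assuming a map of last-heartbeat timestamps per member and a fixed timeout. The method signature, the map, and the parameter names are hypothetical and do not reflect the actual FD_ALL.suspect code.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class RecheckSketch {

    // Hypothetical variant of a suspect step: first drop every suspect whose last
    // heartbeat is recent enough, then add the members that have just timed out.
    static Set<String> suspect(Set<String> suspected,
                               Map<String, Long> lastHeartbeat,
                               Collection<String> newlyTimedOut,
                               long now,
                               long timeoutMs) {
        suspected.removeIf(member -> {
            Long ts = lastHeartbeat.get(member);
            // A member with a heartbeat inside the timeout window is no longer suspect.
            return ts != null && now - ts < timeoutMs;
        });
        suspected.addAll(newlyTimedOut);
        return suspected;
    }

    // A was suspected earlier but has sent a heartbeat since; B has just timed out.
    static Set<String> demo() {
        Set<String> suspected = new LinkedHashSet<>(Arrays.asList("A"));
        Map<String, Long> lastHeartbeat = new HashMap<>();
        lastHeartbeat.put("A", 9_500L); // recent heartbeat from A
        lastHeartbeat.put("B", 1_000L); // B has been silent for longer than the timeout
        return suspect(suspected, lastHeartbeat, Arrays.asList("B"), 10_000L, 5_000L);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [B]
    }
}
```

With the recheck in place, only B remains in the set, so a stale entry for A can no longer influence which node is predicted as the new coordinator.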
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira