[
https://issues.jboss.org/browse/JGRP-1523?page=com.atlassian.jira.plugin....
]
Bela Ban edited comment on JGRP-1523 at 10/19/12 2:48 AM:
----------------------------------------------------------
Actually, the above idea to only check the suspected members on suspect() won't cut
it, as an UNSUSPECT event has to be sent up and down the stack, because GMS also keeps a
list of suspected members.
SOLUTION:
- On reception of a message (if msg_counts_as_heartbeat=true) or a heartbeat from P, if P
is in suspected_mbrs: remove P from suspected_mbrs and send an UNSUSPECT event up and down
the stack
- We could maintain a volatile variable has_suspected_mbrs which is set to true when
suspected_mbrs is non-empty, and false otherwise, so the cost of the check would be minimal
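A minimal sketch of the two points above, assuming a simplified stand-in class rather than the real FD_ALL code. The class name UnsuspectSketch, the String member type and the unsuspectEvents list are hypothetical and only illustrate the idea; in the protocol itself the recorded event would be an UNSUSPECT sent up and down the stack.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the failure detector; not the actual FD_ALL implementation.
class UnsuspectSketch {
    private final Set<String> suspectedMbrs = new HashSet<>();
    // Cheap flag: when nobody is suspected, the heartbeat path costs one volatile read.
    private volatile boolean hasSuspectedMbrs = false;
    // Stands in for "send an UNSUSPECT event up and down the stack".
    final List<String> unsuspectEvents = new ArrayList<>();

    synchronized void suspect(String member) {
        suspectedMbrs.add(member);
        hasSuspectedMbrs = true;
    }

    // Called for a heartbeat from sender, or for any message from sender
    // if msg_counts_as_heartbeat=true.
    void onHeartbeat(String sender) {
        if (!hasSuspectedMbrs)
            return; // common case: nothing to unsuspect
        synchronized (this) {
            if (suspectedMbrs.remove(sender)) {
                unsuspectEvents.add(sender);
                hasSuspectedMbrs = !suspectedMbrs.isEmpty();
            }
        }
    }

    synchronized boolean isSuspected(String member) {
        return suspectedMbrs.contains(member);
    }
}
```

The flag is only an optimization: the set remains the source of truth, and the flag is updated under the same lock that guards the set.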
was (Author: belaban):
We also need to send an UNSUSPECT up and down the stack, as GMS also keeps a list of
suspected members.
FD_ALL does not unsuspect on heartbeat
--------------------------------------
Key: JGRP-1523
URL:
https://issues.jboss.org/browse/JGRP-1523
Project: JGroups
Issue Type: Quality Risk
Affects Versions: 3.0.14
Reporter: Jan Boehm
Assignee: Bela Ban
Fix For: 3.2
FD_ALL stores suspected nodes in "suspected_mbrs" when it receives no
heartbeats from them. If it does not predict the local node as the new coordinator, it does
not pass this suspicion upwards. Since an UNSUSPECT event from above is the only event that
removes a node from suspected_mbrs, the set retains all nodes except those that were newly
suspected when the local node became the potential new coordinator.
This seems wasteful and wrong (it leads to wrong results if there are "stale"
suspects that would be preferred as new coordinators). The timestamps for nodes in
suspected_mbrs should be rechecked in FD_ALL.suspect before adding the new nodes.
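The recheck proposed in the description could look roughly like the following sketch. It assumes a map of last-heartbeat timestamps per member and a fixed timeout; the class SuspectRecheckSketch, its method signatures and the explicit "now" parameter are hypothetical and chosen only to keep the example self-contained and testable, not taken from FD_ALL.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration of rechecking timestamps before adding new suspects.
class SuspectRecheckSketch {
    private final Map<String, Long> timestamps = new HashMap<>(); // last heartbeat per member
    private final Set<String> suspectedMbrs = new HashSet<>();
    private final long timeoutMs;

    SuspectRecheckSketch(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    void heartbeat(String member, long now) {
        timestamps.put(member, now);
    }

    // Drops stale suspects (members heard from within the timeout),
    // then adds the newly suspected members and returns the current set.
    Set<String> suspect(Collection<String> newlySuspected, long now) {
        suspectedMbrs.removeIf(m -> {
            Long ts = timestamps.get(m);
            return ts != null && now - ts <= timeoutMs;
        });
        suspectedMbrs.addAll(newlySuspected);
        return new TreeSet<>(suspectedMbrs);
    }
}
```

With this recheck alone, a stale suspect is only cleared the next time suspect() runs, and no UNSUSPECT event is emitted for it.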