[jboss-jira] [JBoss JIRA] (JGRP-2451) FD_ALL2: improvements
Bela Ban (Jira)
issues at jboss.org
Fri Feb 14 08:39:00 EST 2020
[ https://issues.redhat.com/browse/JGRP-2451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bela Ban updated JGRP-2451:
---------------------------
Description:
Improvements to {{FD_ALL2}}.
* Messages should count as heartbeats ({{msg_counts_as_heartbeat}} should be *default*, and as such, deprecated).
* When a multicast message is sent before {{interval}} elapsed, we suppress sending a heartbeat
* There's a map associating members with booleans. True means a heartbeat was received since the last check, false means it wasn't. On a check, the booleans are all set to false.
It is crucial that setting the boolean in the map is quick (not like in {{FD_ALL}}, where we fetch the current time from the time service), especially since this is done on every message.
The advantage is that we only send heartbeats when there is no (multicast) traffic, and we don't suspect a member P when heartbeats have been missing despite receiving traffic from P.
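A minimal sketch of the member-to-boolean map described above, as an illustration only and not the actual {{FD_ALL2}} code. The class name {{HeartbeatFlags}}, its method names, the use of strings as member addresses and the choice to start new members at true are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch; members are plain strings rather than JGroups addresses. */
final class HeartbeatFlags {
    // true = a heartbeat or message was received from this member since the last check
    private final Map<String, Boolean> flags = new ConcurrentHashMap<>();

    /** Assumption: a new member starts at true so it is not suspected on the very first check. */
    public void addMember(String member) {
        flags.put(member, Boolean.TRUE);
    }

    /** Called for every received message or heartbeat: one map write, no clock read. */
    public void received(String sender) {
        flags.replace(sender, Boolean.TRUE); // no-op for unknown senders
    }

    /** Periodic check: returns members whose flag was false, and resets all flags to false. */
    public List<String> check() {
        List<String> suspects = new ArrayList<>();
        for (String member : flags.keySet()) {
            // Atomically flip true -> false; if that fails, the flag was already false.
            if (!flags.replace(member, Boolean.TRUE, Boolean.FALSE))
                suspects.add(member);
        }
        return suspects;
    }
}
```

The receive path only overwrites a value in a hash map, which is the cheap operation the description asks for; the cost of scanning and resetting is paid once per check.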
We need to think about whether to consider unicast messages, too, on the sender side. We could populate a bitmap recording which members we have sent messages to: on a unicast message to P, P's bit would be set; on a multicast message, all bits would be set. Then, we could selectively send heartbeats only to members whose bits are 0.
However, this is only feasible with transports that send a multicast as N-1 unicasts (e.g. TCP); for UDP we don't have such an 'anycast' available.
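The sender-side idea could look roughly like the sketch below. It is a hypothetical illustration of the proposal, not an existing JGroups class: {{SentBitmap}}, its method names and the string member list (assumed to exclude the local member) are invented for the example.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical sketch of the sender-side bitmap; bit i belongs to members.get(i). */
final class SentBitmap {
    private final List<String> members; // assumed to exclude the local member
    private final BitSet sent;          // bit set = something was sent to that member this interval

    public SentBitmap(List<String> members) {
        this.members = new ArrayList<>(members);
        this.sent = new BitSet(members.size());
    }

    /** A unicast to dest sets only dest's bit. */
    public synchronized void unicastSent(String dest) {
        int idx = members.indexOf(dest);
        if (idx >= 0)
            sent.set(idx);
    }

    /** A multicast reaches everybody, so all bits are set. */
    public synchronized void multicastSent() {
        sent.set(0, members.size());
    }

    /** Run once per interval: members with a 0 bit still need a heartbeat; then clear all bits. */
    public synchronized List<String> heartbeatTargets() {
        List<String> targets = new ArrayList<>();
        for (int i = sent.nextClearBit(0); i < members.size(); i = sent.nextClearBit(i + 1))
            targets.add(members.get(i));
        sent.clear();
        return targets;
    }
}
```

After a multicast within the interval, {{heartbeatTargets()}} returns an empty list, which corresponds to suppressing the heartbeat entirely. A non-empty partial list is only useful where heartbeats can be sent to individual members, as noted above for TCP.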
was:
New heartbeat protocol, based on {{FD_ALL2}}.
* Messages should count as heartbeats ({{msg_counts_as_heartbeat}} should be default, and as such, deprecated).
* When a multicast message is sent before {{interval}} elapsed, we suppress sending a heartbeat
* A simple bitmap represents members: bits set to {{1}} mean we received a heartbeat since the last check (at {{timeout}} ms), bits set to {{0}} lead to suspicions
* The bitmap is reset on each check
It is crucial that setting a bit in the bitmap is quick (not like in {{FD_ALL}}, where we fetch the current time from the time service), especially since this is done on every message.
The advantage of {{FD_ALL3}} over previous failure detection protocols is that we only send heartbeats when there is no (multicast) traffic, and we don't suspect a member P when heartbeats have been missing despite receiving traffic from P.
We need to think about whether to consider unicast messages, too, on the sender side: we could populate a bitmap with messages sent to members: on a unicast message to P, P's bit would be set in the bitmap. On a multicast message, all bits would be set. Then, we could selectively send heartbeats only to members with bits set to 0.
However, this is only feasible with sending a message N-1 times (e.g. TCP); for UDP we don't have such an 'anycast' available.
> FD_ALL2: improvements
> ---------------------
>
> Key: JGRP-2451
> URL: https://issues.redhat.com/browse/JGRP-2451
> Project: JGroups
> Issue Type: Feature Request
> Reporter: Bela Ban
> Assignee: Bela Ban
> Priority: Major
> Fix For: 5.0, 4.2.0
--
This message was sent by Atlassian Jira
(v7.13.8#713008)