[jboss-jira] [JBoss JIRA] (JGRP-2451) FD_ALL3: improvements over FD_ALL

Dan Berindei (Jira) issues at jboss.org
Wed Mar 4 07:43:21 EST 2020


    [ https://issues.redhat.com/browse/JGRP-2451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13988718#comment-13988718 ] 

Dan Berindei commented on JGRP-2451:
------------------------------------

I propose something like {{FD_ALL2}}, but with {{N=timeout/interval}} bits per member, one of which is the current bit:
# Every time the timer task runs:
## If all N bits are 0, suspect the member
## Increment the bit index
## Set the current bit to 0
# Every time a message is received:
## Set the current bit to 1
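
The steps above could be sketched roughly as follows. This is only an illustration of the idea, not code from JGroups: the class name {{HeartbeatWindow}} and its methods are made up, and {{synchronized}} is used purely to keep the sketch short (a real implementation would probably want something cheaper, since the receive path runs on every message).

```java
import java.util.BitSet;

/**
 * Hypothetical per-member heartbeat window with N = timeout / interval bits.
 * Names are illustrative only and are not part of the JGroups API.
 */
final class HeartbeatWindow {
    private final int n;        // number of bits, N = timeout / interval
    private final BitSet bits;  // bit i is set if traffic was seen while i was current
    private int index;          // position of the current bit

    HeartbeatWindow(long timeoutMs, long intervalMs) {
        if (intervalMs <= 0 || timeoutMs < intervalMs)
            throw new IllegalArgumentException("need 0 < interval <= timeout");
        this.n = (int) (timeoutMs / intervalMs);
        this.bits = new BitSet(n);
        bits.set(0, n);         // assumption: start with all bits set
    }

    /** Called for every message (or heartbeat) received from the member. */
    synchronized void messageReceived() {
        bits.set(index);        // set the current bit to 1
    }

    /**
     * Called by the timer task once per interval.
     * @return true if the member should be suspected
     */
    synchronized boolean timerTick() {
        boolean suspect = bits.isEmpty(); // all N bits are 0
        index = (index + 1) % n;          // increment the bit index
        bits.clear(index);                // set the current bit to 0
        return suspect;
    }
}
```

One window would be kept per member; the receive path only ever touches {{messageReceived()}}, and the timer task calls {{timerTick()}} for each member and suspects those for which it returns true.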

E.g. with timeout=60s and interval=20s, N=3 bits:
# Start at time T with all bits set
# At time T+20s, all bits are set, and the timer sets bit(1) = 0
# At any time between T+20 and T+40, a message arrives, and it sets bit(1) = 1
# At time T+40s, the timer sets bit(2) = 0
# At time T+60s, the timer sets bit(0) = 0
# At time T+80s, the timer sets bit(1) = 0
# At time T+100s, the timer finally sees that all bits are 0 and suspects the node
Thus the time between the last heartbeat and the node being suspected is between 60s and 80s.
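
To double-check the timeline, here is a small replay of it, assuming timeout=60s (which is what N=3 with interval=20s implies) and a single message from the member. The class and method names are hypothetical. A message that arrives exactly on a timer run is treated as arriving just after it.

```java
import java.util.Arrays;

/**
 * Replays the timer/message sequence for one member and reports when the
 * member is suspected. Illustrative only; names are hypothetical.
 */
final class SuspicionTimeline {
    /**
     * @param timeoutSec  failure detection timeout in seconds
     * @param intervalSec timer interval in seconds (N = timeoutSec / intervalSec)
     * @param lastMsgSec  offset from T, in seconds, of the only message received
     * @return offset from T, in seconds, of the timer run that suspects the member
     */
    static int suspectedAt(int timeoutSec, int intervalSec, int lastMsgSec) {
        int n = timeoutSec / intervalSec;
        boolean[] bits = new boolean[n];
        Arrays.fill(bits, true);          // start at time T with all bits set
        int index = 0;
        boolean delivered = false;

        for (int now = intervalSec; ; now += intervalSec) {
            // Deliver the message if it arrived before this timer run.
            if (!delivered && lastMsgSec < now) {
                bits[index] = true;       // message sets the current bit
                delivered = true;
            }
            boolean allZero = true;
            for (boolean b : bits) {
                if (b) { allZero = false; break; }
            }
            if (allZero)
                return now;               // timer sees all bits at 0: suspect
            index = (index + 1) % n;      // increment the bit index
            bits[index] = false;          // clear the new current bit
        }
    }
}
```

{{SuspicionTimeline.suspectedAt(60, 20, 30)}} returns 100, matching the steps above. More generally, in this model the gap between the last message and the suspicion falls between timeout and timeout+interval.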

> FD_ALL3: improvements over FD_ALL
> ---------------------------------
>
>                 Key: JGRP-2451
>                 URL: https://issues.redhat.com/browse/JGRP-2451
>             Project: JGroups
>          Issue Type: Feature Request
>            Reporter: Bela Ban
>            Assignee: Bela Ban
>            Priority: Major
>             Fix For: 5.0, 4.2.1
>
>
> Improvements to {{FD_ALL}}. 
> * Messages should count as heartbeats ({{msg_counts_as_heartbeat}} should be *default*, and as such, deprecated/removed).
> * When a multicast message is sent before {{interval}} has elapsed, we suppress sending a heartbeat.
> It is crucial that setting the timestamp in the map is quick, especially since this is done on every message. This should not be an issue, as we fetch the current time from the time service, which does *not* call {{System.nanoTime()}} or {{System.currentTimeMillis()}} every time.
> The advantage is that we only send heartbeats when there is no (multicast) traffic, and we don't suspect a member P when heartbeats have been missing despite receiving traffic from P.
> We need to think about whether to consider unicast messages, too, on the sender side. We could populate a bit map with messages sent to members: on a unicast message to P, P's bit would be set in the bit map; on a multicast message, all bits would be set. Then, we could selectively send heartbeats only to members with bits set to 0.
> However, this is only feasible with transports that send a message N-1 times (e.g. TCP); for UDP we don't have such an 'anycast' available.
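
For illustration, the sender-side bit map from the issue description could look roughly like this. It is only a sketch under assumptions: members are identified by their index in the view, the local member is not treated specially, and the class and method names are invented rather than taken from JGroups.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/**
 * Hypothetical sender-side tracker: remembers which members were sent a
 * message since the last heartbeat round, so that heartbeats can be limited
 * to the remaining members. Not part of the JGroups API.
 */
final class SentTracker {
    private final int members;  // number of members in the view
    private final BitSet sent;  // bit i is set if member i was sent a message

    SentTracker(int members) {
        this.members = members;
        this.sent = new BitSet(members);
    }

    /** A unicast message to one member sets only that member's bit. */
    synchronized void unicastSent(int member) {
        sent.set(member);
    }

    /** A multicast message sets all bits. */
    synchronized void multicastSent() {
        sent.set(0, members);
    }

    /** Returns the members whose bit is 0, then resets all bits for the next round. */
    synchronized List<Integer> heartbeatTargets() {
        List<Integer> targets = new ArrayList<>();
        for (int i = sent.nextClearBit(0); i < members; i = sent.nextClearBit(i + 1))
            targets.add(i);
        sent.clear();
        return targets;
    }
}
```

The heartbeat task would call {{heartbeatTargets()}} once per interval and send a heartbeat to each returned member, which only makes sense for transports that send to each member individually.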


