[jboss-jira] [JBoss JIRA] Created: (JGRP-1243) FD_SOCK: reduce number of messages sent on a suspicion

Bela Ban (JIRA) jira-events at lists.jboss.org
Thu Sep 30 03:11:39 EDT 2010


FD_SOCK: reduce number of messages sent on a suspicion
------------------------------------------------------

                 Key: JGRP-1243
                 URL: https://jira.jboss.org/browse/JGRP-1243
             Project: JGroups
          Issue Type: Feature Request
            Reporter: Bela Ban
            Assignee: Bela Ban
             Fix For: 2.10.1, 2.11


- When B suspects C, B multicasts a SUSPECT(C) message
- Everyone receives the SUSPECT(C) message and passes it up and down the stack as a SUSPECT(C) event
- VERIFY_SUSPECT *on every member* sends one (or more) ARE_YOU_DEAD messages to C
- C replies to the sender with an I_AM_NOT_DEAD message, or does not reply if it has crashed
- However, only the coordinator (or next-in-line) actually processes the SUSPECT(C) event in GMS!
--> All of the VERIFY_SUSPECT processing is superfluous unless the member is the coord or next-in-line!

The number of messages used for a false suspicion is (1 SUSPECT mcast) + ((N-1) ARE_YOU_DEAD unicasts) + ((N-1) I_AM_NOT_DEAD unicasts), i.e. 2N-1 messages in a cluster of N members!
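
To make the arithmetic concrete, here is a minimal sketch of the message count. The class and method names are hypothetical and not part of JGroups. The second method assumes that only a fixed number of members (for example the coord and next-in-line) run VERIFY_SUSPECT and that the suspected member is not one of them.

```java
// Hypothetical helper for illustration only; not a JGroups class.
final class SuspicionCost {

    /** Current behaviour: 1 SUSPECT mcast + (N-1) ARE_YOU_DEAD + (N-1) I_AM_NOT_DEAD. */
    static int allMembersVerify(int n) {
        if (n < 2) throw new IllegalArgumentException("need at least 2 members");
        return 1 + (n - 1) + (n - 1);
    }

    /**
     * Assumed variant: SUSPECT is still multicast once, but only `verifiers`
     * members send ARE_YOU_DEAD and receive I_AM_NOT_DEAD.
     */
    static int onlyVerifiers(int n, int verifiers) {
        if (n < 2) throw new IllegalArgumentException("need at least 2 members");
        if (verifiers < 0) throw new IllegalArgumentException("verifiers must be >= 0");
        int v = Math.min(verifiers, n - 1); // the suspected member never verifies itself
        return 1 + v + v;
    }

    public static void main(String[] args) {
        int n = 10;
        System.out.println("all members verify: " + allMembersVerify(n));
        System.out.println("two verifiers only: " + onlyVerifiers(n, 2));
    }
}
```

Under these assumptions the cost of a false suspicion stops growing with the cluster size once verification is limited to a fixed number of members.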

SOLUTION:
- The SUSPECT(C) message could be sent as a unicast only to the coordinator and the next-in-line member. Maybe we could use max_rank=2 for this, similar to the suggested solution for FD_ALL? This would be good for non-multicast transports, e.g. TCP
- Alternatively, the SUSPECT(C) message is still multicast to everyone, but only the coord and next-in-line start the VERIFY_SUSPECT processing
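
One possible way to pick the recipients for the unicast variant is sketched below. It assumes the view is an ordered member list with the coordinator first and skips members that are already suspected. The class name, the method and the way max_rank is passed in are illustrative assumptions, not existing FD_SOCK code.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical helper for illustration only; not a JGroups class.
final class SuspectTargets {

    /**
     * Returns the first maxRank members of the view that are not suspected,
     * i.e. the presumed coord and next-in-line for maxRank=2.
     * The result may include the caller itself, which would then handle
     * the suspicion locally instead of sending a message.
     */
    static <T> List<T> pick(List<T> view, Collection<T> suspected, int maxRank) {
        List<T> targets = new ArrayList<>();
        for (T member : view) {
            if (targets.size() >= maxRank) break;      // enough recipients found
            if (suspected.contains(member)) continue;  // exclude suspected members
            targets.add(member);
        }
        return targets;
    }

    public static void main(String[] args) {
        List<String> view = List.of("A", "B", "C", "D", "E");
        System.out.println(pick(view, List.of("A"), 2));
        System.out.println(pick(view, List.of("A", "B"), 2));
    }
}
```

Because suspected members are skipped, the recipient list moves down the view as more members are suspected, so the message eventually reaches a live member.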

Issue: if we have {A,B,C,D,E}, what happens if A, B and C crash at the same time?
- E's connection to A closes: E sends a SUSPECT(A) to B and C (excluding suspected A)
--> B and C are dead and won't process the message !
- Then E suspects B and sends a SUSPECT(A,B) to C and D (excluding suspected A and B)
- C adds A and B to its suspect list and finds out it is the next-in-line
- C then runs the VERIFY_SUSPECT protocol
- C passes the SUSPECT(A,B) event up the stack
- C becomes the new coord

