[jboss-jira] [JBoss JIRA] (JGRP-1393) Optimization of concurrent joining to a non-existing cluster

Bela Ban (Created) (JIRA) jira-events at lists.jboss.org
Fri Nov 25 07:51:40 EST 2011


Optimization of concurrent joining to a non-existing cluster
------------------------------------------------------------

                 Key: JGRP-1393
                 URL: https://issues.jboss.org/browse/JGRP-1393
             Project: JGroups
          Issue Type: Enhancement
            Reporter: Bela Ban
            Assignee: Bela Ban
             Fix For: 3.1


When no members are running yet, and A, B, C and D join a cluster at almost exactly the same time, the following can happen:
- A starts, sends a discovery request. B and C reply. A returns after N seconds with responses from A, B and C.
- B starts, sends a discovery request. A and C reply. B returns after N seconds with responses from A, B and C.
- C starts, sends a discovery request. A and B reply. C returns after N seconds with responses from A, B and C.
- D starts, sends a discovery request. A, B and C reply. D returns after N seconds with responses from A, B, C and D.

Responses are:
A: ABC
B: ABC
C: ABC
D: ABCD

Note that A, B and C don't have D's response.

The algorithm now has every member sort all of the responses, and pick the first as new coordinator. Say we have the following sorted lists:

A: BAC
B: BAC
C: BAC
D: DBAC

The issue is now that B *and* D will both become coordinator (A, B and C pick B, while D picks itself), and a merge is needed to establish the correct cluster membership.
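
The split described above can be reproduced with a small sketch. This is not JGroups code: the class and method names are made up for illustration, members are plain strings, and the sort order (D, B, A, C) is simply taken from the sorted lists in the example.

```java
import java.util.*;

public class CoordinatorElection {
    // Assumed sort order, copied from the example: D < B < A < C
    static final List<String> ORDER = List.of("D", "B", "A", "C");

    // Every member sorts its responses and picks the first one
    static String pickCoordinator(Collection<String> responses) {
        return responses.stream()
                .min(Comparator.comparingInt((String m) -> ORDER.indexOf(m)))
                .orElseThrow();
    }

    // Members that picked themselves, i.e. that will act as coordinator
    static Set<String> coordinators(Map<String, Set<String>> views) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : views.entrySet()) {
            if (pickCoordinator(e.getValue()).equals(e.getKey()))
                result.add(e.getKey());
        }
        return result;
    }

    // Discovery responses per member, as listed in the example
    static Map<String, Set<String>> exampleViews() {
        Map<String, Set<String>> views = new LinkedHashMap<>();
        views.put("A", Set.of("A", "B", "C"));
        views.put("B", Set.of("A", "B", "C"));
        views.put("C", Set.of("A", "B", "C"));
        views.put("D", Set.of("A", "B", "C", "D"));
        return views;
    }

    public static void main(String[] args) {
        System.out.println(coordinators(exampleViews())); // prints [B, D]
    }
}
```

Two members end up in the result, which is the situation that later has to be repaired by a merge.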

The reason is that - apparently - A, B and C started a bit (we're talking 1-2 ms) sooner than D, and so D didn't get their discovery requests, and thus didn't send back a discovery response.

Even though D started a bit after A, B and C, the latter will still receive D's discovery *request* (but not its response). We can take advantage of this: when a member receives D's discovery *request*, it simply adds D's address to its own set of discovery responses, just as it would on receiving D's *response*.
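
A minimal sketch of the proposed change, under the same assumptions as before: this is not the actual JGroups discovery code, the class, method and flag names are hypothetical, and members are plain strings. The only point is that the sender of a discovery request is recorded the same way as the sender of a discovery response.

```java
import java.util.*;

public class DiscoverySketch {
    // Addresses collected during the discovery phase (includes ourselves)
    final Set<String> responses = new TreeSet<>();
    // Hypothetical switch: false = old behavior, true = proposed behavior
    final boolean addRequestSenders;

    DiscoverySketch(String self, boolean addRequestSenders) {
        this.addRequestSenders = addRequestSenders;
        responses.add(self);
    }

    // A discovery response always adds its sender
    void onDiscoveryResponse(String sender) {
        responses.add(sender);
    }

    // Proposed: a discovery request adds its sender as well
    void onDiscoveryRequest(String sender) {
        if (addRequestSenders)
            responses.add(sender);
    }

    // Member A from the example: responses from B and C, request from D
    static Set<String> viewOfA(boolean addRequestSenders) {
        DiscoverySketch a = new DiscoverySketch("A", addRequestSenders);
        a.onDiscoveryResponse("B");
        a.onDiscoveryResponse("C");
        a.onDiscoveryRequest("D");
        return a.responses;
    }

    public static void main(String[] args) {
        System.out.println(viewOfA(false)); // prints [A, B, C]
        System.out.println(viewOfA(true));  // prints [A, B, C, D]
    }
}
```

With the change, A (and likewise B and C) sees the same four members as D, so in the example all of them would sort D first and pick the same coordinator.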

This will greatly reduce the chances of a merge having to be done as a result of concurrent startup.
