[
https://issues.jboss.org/browse/JGRP-1379?page=com.atlassian.jira.plugin....
]
Bela Ban edited comment on JGRP-1379 at 11/2/11 10:57 AM:
----------------------------------------------------------
We can reduce the number of merge responses using the algorithm below.
On a merge request from P:
#1 If we're the coordinator --> send a response
OR
#2 If P is the coordinator of our sub-cluster --> send a response
Else suppress the response
E.g. if we have sub-clusters {A,B,C,D} and {D,E,F,G}:
- A multicasts a discovery request
- B, C, and D see that the request's sender is A (the coord), so they reply
- D replies, too, because it is the coord of its own subcluster
- However, E, F and G don't reply, as A is not their coord and they are not coords themselves.
This is important if we have large clusters and many subclusters, because it keeps the number of responses small.
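
The two rules above can be sketched as a single predicate. This is only an illustration: MergeResponsePolicy and shouldRespond are hypothetical names, not part of the JGroups API, and addresses are modelled as plain strings. The usage note assumes, for simplicity, that A coordinates {A,B,C} and D coordinates {D,E,F,G}.

```java
import java.util.Objects;

/** Hypothetical helper for illustration only; not an actual JGroups class. */
final class MergeResponsePolicy {
    private MergeResponsePolicy() {}

    /**
     * Decides whether to answer a merge (discovery) request.
     *
     * @param self        our own address
     * @param coordinator the coordinator of our current sub-cluster
     * @param sender      P, the member that sent the request
     * @return true if a response should be sent, false if it should be suppressed
     */
    static boolean shouldRespond(String self, String coordinator, String sender) {
        // Rule #1: we are the coordinator of our sub-cluster
        boolean weAreCoordinator = Objects.equals(self, coordinator);
        // Rule #2: P is the coordinator of our sub-cluster
        boolean senderIsOurCoordinator = Objects.equals(sender, coordinator);
        return weAreCoordinator || senderIsOurCoordinator;
    }
}
```

With A as the sender, shouldRespond("B", "A", "A") and shouldRespond("D", "D", "A") are true, while shouldRespond("E", "D", "A") is false, so only A's own members and the other coordinator answer.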
Make merging more scalable / robust
-----------------------------------
Key: JGRP-1379
URL: https://issues.jboss.org/browse/JGRP-1379
Project: JGroups
Issue Type: Task
Reporter: Bela Ban
Assignee: Bela Ban
Fix For: 3.0
Make the MERGE2/GMS/Merger code more robust and make it scale better in large clusters.
- While a merge is going on, stop sending out discovery requests. This reduces
unnecessary traffic, especially in large clusters where discovery responses include the
entire view of a sub-cluster.
- If we start a merge, or receive a MERGE-REQUEST, start a timer which cancels the merge
after <merge_timeout * 2> milliseconds. This is similar to the MergeKiller code, and
prevents stale merges, e.g. those caused by a crashed merge leader.
- If we have merge participants A,B,C,D,E but A only receives merge responses from
itself, B and D, then don't cancel the merge, but instead proceed with merging A, B
and D. Currently this is not done: a merge is cancelled when we don't get
responses from every participant.
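
The three points above could look roughly like the sketch below. It is not the actual MERGE2/GMS/Merger code: MergeSketch and all of its methods are hypothetical names, members are plain strings, and mergeTimeoutMs stands in for merge_timeout.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch for illustration only; not an actual JGroups class. */
final class MergeSketch {
    // Daemon thread so that a pending cancel task never keeps the JVM alive
    private final ScheduledExecutorService timer =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "merge-canceller");
            t.setDaemon(true);
            return t;
        });
    private final long mergeTimeoutMs;   // assumed to mirror merge_timeout
    private ScheduledFuture<?> canceller; // guarded by 'this'
    private boolean mergeInProgress;      // guarded by 'this'

    MergeSketch(long mergeTimeoutMs) {
        this.mergeTimeoutMs = mergeTimeoutMs;
    }

    /** Call when we start a merge or receive a MERGE-REQUEST. */
    synchronized void mergeStarted() {
        mergeInProgress = true;
        if (canceller != null)
            canceller.cancel(false);
        // Cancel a stale merge after merge_timeout * 2 milliseconds
        canceller = timer.schedule(this::cancelMerge, mergeTimeoutMs * 2, TimeUnit.MILLISECONDS);
    }

    /** Call when the merge finished normally; stops the pending cancel task. */
    synchronized void mergeCompleted() {
        if (canceller != null) {
            canceller.cancel(false);
            canceller = null;
        }
        mergeInProgress = false;
    }

    /** Invoked by the timer (or manually) to abandon the current merge. */
    synchronized void cancelMerge() {
        mergeInProgress = false;
        canceller = null;
    }

    synchronized boolean isMergeInProgress() {
        return mergeInProgress;
    }

    /** Discovery requests are suppressed while a merge is going on. */
    synchronized boolean maySendDiscoveryRequest() {
        return !mergeInProgress;
    }

    /** Proceeds with the participants that actually responded instead of cancelling. */
    static Set<String> membersToMerge(Set<String> participants, Set<String> responders) {
        Set<String> result = new LinkedHashSet<>(participants);
        result.retainAll(responders);
        return result;
    }
}
```

For participants A,B,C,D,E and responses from A, B and D, membersToMerge yields A, B and D. A real implementation would also have to decide what to do when too few members respond. That policy is not covered here.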