[jboss-jira] [JBoss JIRA] (JGRP-1379) Make merging more scalable / robust

Bela Ban (Commented) (JIRA) jira-events at lists.jboss.org
Tue Oct 25 08:54:45 EDT 2011


    [ https://issues.jboss.org/browse/JGRP-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637149#comment-12637149 ] 

Bela Ban commented on JGRP-1379:
--------------------------------

MERGE2.merge_fast compounds the problem in a large cluster that has split into many subpartitions, e.g. after a switch has been powered off and then on again.
In the worst case, merge_fast causes N * N discovery requests to be sent in a cluster of N nodes.
We should probably turn merge_fast off in large clusters, or set it dynamically (ergonomics) based on the discovery traffic, i.e. the rate of discovery requests and responses.
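
A minimal sketch of what "set merge_fast dynamically based on the discovery traffic" could look like. MergeFastGate, its threshold and its window length are hypothetical names and values chosen for illustration; none of this is existing JGroups code. It counts discovery requests and responses in a sliding time window and only allows merge_fast while the count stays at or below a threshold:

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper (not part of JGroups): gates merge_fast on observed discovery traffic. */
class MergeFastGate {
    private final int maxMessagesPerWindow;   // assumed threshold, would need tuning
    private final long windowMillis;          // length of the sliding window
    private final Deque<Long> timestamps = new ArrayDeque<>();

    MergeFastGate(int maxMessagesPerWindow, long windowMillis) {
        this.maxMessagesPerWindow = maxMessagesPerWindow;
        this.windowMillis = windowMillis;
    }

    /** Records one discovery request or response seen at the given time. */
    synchronized void recordDiscoveryMessage(long nowMillis) {
        timestamps.addLast(nowMillis);
        evictOld(nowMillis);
    }

    /** True while the number of discovery messages in the window is at or below the threshold. */
    synchronized boolean mergeFastAllowed(long nowMillis) {
        evictOld(nowMillis);
        return timestamps.size() <= maxMessagesPerWindow;
    }

    /** Drops timestamps that have fallen out of the sliding window. */
    private void evictOld(long nowMillis) {
        while (!timestamps.isEmpty() && nowMillis - timestamps.peekFirst() >= windowMillis)
            timestamps.removeFirst();
    }

    /** Worst case mentioned above: N * N discovery requests in a cluster of N nodes. */
    static long worstCaseDiscoveryRequests(int clusterSize) {
        return (long) clusterSize * clusterSize;
    }
}
```

The caller passes in the current time, which keeps the sketch easy to test; a real protocol would more likely read a clock itself and hook into wherever discovery messages are handled.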
                
> Make merging more scalable / robust
> -----------------------------------
>
>                 Key: JGRP-1379
>                 URL: https://issues.jboss.org/browse/JGRP-1379
>             Project: JGroups
>          Issue Type: Task
>            Reporter: Bela Ban
>            Assignee: Bela Ban
>             Fix For: 3.0
>
>
> Make the MERGE2/GMS/Merger code more robust and scale better in large clusters.
> - While a merge is going on, stop sending out discovery requests. This reduces unnecessary traffic, especially in large clusters where discovery responses include the entire view of a sub-cluster.
> - If we start a merge, or receive a MERGE-REQUEST, start a timer which cancels the merge after <merge_timeout * 2> milliseconds. This is similar to the MergeKiller code, and prevents stale merges, e.g. ones left behind by a crashed merge leader.
> - If we have merge participants A,B,C,D,E but A only receives merge responses from itself, B and D, then don't cancel the merge, but instead proceed with merging A, B and D. This is not done currently: a merge is cancelled when we don't get responses from every participant.
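
The three points above can be sketched as a small state object. MergeSession and its method names are hypothetical and only illustrate the proposed behaviour; this is not the actual GMS/Merger code. Members are represented as plain strings, and the current time is passed in rather than read from a timer:

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical illustration of the proposed merge behaviour (not JGroups code). */
class MergeSession {
    private final Set<String> participants;
    private final Set<String> responders = new LinkedHashSet<>();
    private final long deadlineMillis;   // start + merge_timeout * 2
    private boolean finished;

    MergeSession(Collection<String> participants, long startMillis, long mergeTimeoutMillis) {
        this.participants = new LinkedHashSet<>(participants);
        this.deadlineMillis = startMillis + 2 * mergeTimeoutMillis;
    }

    /** Records a merge response; responses from non-participants are ignored. */
    void onMergeResponse(String sender) {
        if (participants.contains(sender))
            responders.add(sender);
    }

    /** A merge older than merge_timeout * 2 is considered stale and should be cancelled. */
    boolean isStale(long nowMillis) {
        return nowMillis >= deadlineMillis;
    }

    /** Marks the merge as completed. */
    void finish() {
        finished = true;
    }

    /** Discovery requests are suppressed while a merge is still in progress. */
    boolean discoveryAllowed(long nowMillis) {
        return finished || isStale(nowMillis);
    }

    /** Proceed with whoever responded instead of cancelling when someone is missing. */
    Set<String> membersToMerge() {
        return new LinkedHashSet<>(responders);
    }
}
```

With participants A,B,C,D,E and responses from A, B and D only, membersToMerge() yields A, B and D, which matches the example in the last point.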

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.jboss.org/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

