[jboss-jira] [JBoss JIRA] Updated: (JGRP-937) MERGE4: get rid of shunning and use only merging (getting rid of shunfests)
Bela Ban (JIRA)
jira-events at lists.jboss.org
Wed Mar 25 12:26:22 EDT 2009
[ https://jira.jboss.org/jira/browse/JGRP-937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bela Ban updated JGRP-937:
--------------------------
Attachment: udp-2.6.xml
To reproduce:
- Start 20 nodes on different machines
- Use udp-2.6.xml (attached). It contains the DELAY protocol, configured with a 500 ms delay for sending messages
- Pull some network plugs, kill the switch, etc.
- The cluster will never merge again
(This actually does work with FD)
> MERGE4: get rid of shunning and use only merging (getting rid of shunfests)
> ---------------------------------------------------------------------------
>
> Key: JGRP-937
> URL: https://jira.jboss.org/jira/browse/JGRP-937
> Project: JGroups
> Issue Type: Feature Request
> Reporter: Bela Ban
> Assignee: Bela Ban
> Fix For: 2.6.9
>
> Attachments: udp-2.6.xml
>
>
> If we have FD_ALL plus shunning, the following scenario can happen:
> - A network partition with subgroups {A} and {B,C,D,E,F,G}
> - The partition heals
> - A gets heartbeats from all members of the 2nd subgroup
> - A's FD_ALL (with shun enabled) will shun all members of the 2nd subgroup!
> - And vice versa. This leads to a shunfest, and large clusters might never merge back again
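
The scenario above can be reproduced with a toy model. This is only an illustrative sketch, not JGroups code: the class ShunfestSketch and its methods are hypothetical, and it assumes the simplified rule that a member shuns any heartbeat sender that is not in its current view.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model of the shunfest scenario; hypothetical names, not JGroups code.
class ShunfestSketch {

    /** Views after the partition: A sees only itself, B..G see each other. */
    static Map<String, Set<String>> partitionedViews() {
        Set<String> small = new TreeSet<>(Arrays.asList("A"));
        Set<String> large = new TreeSet<>(Arrays.asList("B", "C", "D", "E", "F", "G"));
        Map<String, Set<String>> views = new TreeMap<>();
        for (String member : small) views.put(member, small);
        for (String member : large) views.put(member, large);
        return views;
    }

    /** For each member, returns the set of members that would shun it. */
    static Map<String, Set<String>> shunners(Map<String, Set<String>> views) {
        Map<String, Set<String>> result = new TreeMap<>();
        for (String member : views.keySet()) result.put(member, new TreeSet<>());
        for (Map.Entry<String, Set<String>> receiver : views.entrySet()) {
            for (String sender : views.keySet()) {
                // Assumed rule: a heartbeat from outside the receiver's view triggers a shun.
                if (!receiver.getValue().contains(sender)) {
                    result.get(sender).add(receiver.getKey());
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Once the partition heals, every member hears heartbeats from every other member.
        shunners(partitionedViews())
            .forEach((member, by) -> System.out.println(member + " is shunned by " + by));
    }
}
```

In this model every one of the seven members ends up shunned by at least one other member, which is the mutual shunning described above.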
> SOLUTION:
> - Get rid of shunning (GMS.shun is false by default anyway; now also set FD.shun and FD_ALL.shun to false)
> - MERGE4 periodically compares discovery results to its own view
>   (this might be done a few times)
> - Then MERGE4 initiates a merge between all members that have differing views
> - Make sure digests get merged correctly (min/max)
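
A minimal sketch of the two core steps of the proposal, under stated assumptions. The class MergeSketch and its method names are hypothetical and not part of JGroups. Discovery responses are modeled as a map from member name to the view that member reports, and a digest is modeled as a per-sender {low, high} sequence-number pair. Reading "min/max" as "take the minimum of the low seqnos and the maximum of the high seqnos" is an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch of the proposed MERGE4 steps; not JGroups code.
class MergeSketch {

    /** Returns the members whose reported view differs from the local view. */
    static Set<String> membersWithDifferingViews(List<String> localView,
                                                 Map<String, List<String>> discovered) {
        Set<String> differing = new TreeSet<>();
        for (Map.Entry<String, List<String>> response : discovered.entrySet()) {
            if (!response.getValue().equals(localView)) {
                differing.add(response.getKey());
            }
        }
        return differing;
    }

    /** Folds one digest into the accumulator: low = min of lows, high = max of highs. */
    private static void mergeInto(Map<String, long[]> merged, Map<String, long[]> digest) {
        for (Map.Entry<String, long[]> entry : digest.entrySet()) {
            long[] current = merged.get(entry.getKey());
            long[] incoming = entry.getValue();
            if (current == null) {
                merged.put(entry.getKey(), incoming.clone());
            } else {
                merged.put(entry.getKey(), new long[] {
                    Math.min(current[0], incoming[0]),
                    Math.max(current[1], incoming[1])
                });
            }
        }
    }

    /** Merges two digests, each mapping a sender to its {low, high} seqnos. */
    static Map<String, long[]> mergeDigests(Map<String, long[]> first, Map<String, long[]> second) {
        Map<String, long[]> merged = new TreeMap<>();
        mergeInto(merged, first);
        mergeInto(merged, second);
        return merged;
    }

    public static void main(String[] args) {
        List<String> localView = Arrays.asList("A");
        Map<String, List<String>> discovered = new TreeMap<>();
        discovered.put("A", Arrays.asList("A"));
        discovered.put("B", Arrays.asList("B", "C"));
        discovered.put("C", Arrays.asList("B", "C"));
        // Members reporting a different view are candidates for the merge.
        System.out.println(membersWithDifferingViews(localView, discovered));
    }
}
```

Because the comparison relies only on discovery responses and views, it needs no shunning at all: a member that reports a different view is simply included in the next merge attempt.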