[jboss-jira] [JBoss JIRA] (JGRP-1910) MERGE3: Do not lose any members from view during a series of merges

Matej Čimbora (JIRA) issues at jboss.org
Mon May 11 08:42:19 EDT 2015


    [ https://issues.jboss.org/browse/JGRP-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066865#comment-13066865 ] 

Matej Čimbora commented on JGRP-1910:
-------------------------------------

[~belaban], [~rvansa]: Thanks for sharing your thoughts on this. As the issue is on_qa again, we need to agree on the outcome. Obviously, everything works as expected on a reliable network. Because some messages may get dropped on unreliable networks, it looks like we currently can't avoid parallel merges (the chance of them occurring can only be mitigated), so some members can be lost in consecutive merges. Bela, can you please confirm that this is the final state of the implementation, or are there any plans to change the algorithm as Radim suggested?

> MERGE3: Do not lose any members from view during a series of merges
> -------------------------------------------------------------------
>
>                 Key: JGRP-1910
>                 URL: https://issues.jboss.org/browse/JGRP-1910
>             Project: JGroups
>          Issue Type: Bug
>            Reporter: Radim Vansa
>            Assignee: Bela Ban
>             Fix For: 3.6.3
>
>         Attachments: SplitMergeFailFastTest.java, SplitMergeTest.java
>
>
> When the connection between nodes is re-established, MERGE3 should merge the cluster back together. This often involves not a single MergeView but a series of such events. The problematic property of this protocol is that some of those views can lack certain members, even though these are reachable.
> This causes a problem in Infinispan, since the cache cannot be fully rebalanced before another merge arrives, and all owners of a certain segment can be gradually removed from (and added back to) the view. This is detected not as a partition but as crashed nodes, and losing all owners means data loss.
> Removing members from the view should be the role of the FDx protocols, not MERGEx.
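
To make the failure mode in the description concrete, here is a minimal sketch in plain Java. It does not use the JGroups API: views are modelled as sets of member names, and the class name, method names, and the three-partition scenario are all hypothetical, chosen only for illustration. It shows how a second, parallel merge that does not cover all subgroups can drop members that an earlier merge had already added.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only: views are plain sets of member names,
// not JGroups View/MergeView objects.
public class MergeViewCheck {

    // Members that were in the previous view but are missing from the next one.
    static Set<String> droppedMembers(Set<String> previous, Set<String> next) {
        Set<String> dropped = new TreeSet<>(previous);
        dropped.removeAll(next);
        return dropped;
    }

    // The membership a merge would have if it lost nobody:
    // the union of all subgroup views taking part in it.
    static Set<String> unionOfSubgroups(List<Set<String>> subgroups) {
        Set<String> all = new TreeSet<>();
        for (Set<String> subgroup : subgroups) {
            all.addAll(subgroup);
        }
        return all;
    }

    static Set<String> view(String... members) {
        return new TreeSet<>(Arrays.asList(members));
    }

    public static void main(String[] args) {
        // Assumed scenario: the cluster split into three partitions.
        Set<String> p1 = view("A", "B");
        Set<String> p2 = view("C", "D");
        Set<String> p3 = view("E");

        // Two merges run in parallel, each seeing only part of the cluster.
        Set<String> firstMerge = unionOfSubgroups(Arrays.asList(p1, p2));
        Set<String> secondMerge = unionOfSubgroups(Arrays.asList(p1, p3));

        // If the second merge view is installed after the first, the
        // reachable members C and D disappear from the view again.
        System.out.println("first merge:  " + firstMerge);
        System.out.println("second merge: " + secondMerge);
        System.out.println("dropped:      " + droppedMembers(firstMerge, secondMerge));
    }
}
```

A check along the lines of `droppedMembers` could be used in a test to assert that no view in a series of merges removes a member that is still reachable; whether the attached tests do this is not stated here.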



