[jboss-jira] [JBoss JIRA] (JGRP-2276) MERGE3: a dead member as merge leader will never trigger a merge

Bela Ban (JIRA) issues at jboss.org
Tue Jun 12 11:42:00 EDT 2018


    [ https://issues.jboss.org/browse/JGRP-2276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13590397#comment-13590397 ] 

Bela Ban commented on JGRP-2276:
--------------------------------

OK, first things first: I fixed the way merge works.

Rather than {{MERGE3}} deciding who is supposed to be the merge leader, this decision is now made by {{Merger}} (in {{GMS}}). {{Merger}} has to do this anyway and has more information (namely the full views) at its disposal, so it can remove subgroup coordinators that are unresponsive (or dead) from the merge.
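
To illustrate the idea only (this is not the actual {{Merger}} code): coordinators that did not respond are dropped before a leader is picked. The class name, the use of plain integers in place of member addresses, and the {{responsive}} set are all assumptions made for this sketch.

```java
import java.util.Collection;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

public class LeaderSelectionSketch {

    // Hypothetical helper: members are plain ints here, standing in for JGroups addresses.
    // Drops coordinators that did not respond, then picks the lowest remaining one.
    static Optional<Integer> pickMergeLeader(Collection<Integer> coordinators,
                                             Set<Integer> responsive) {
        TreeSet<Integer> candidates = new TreeSet<>(coordinators);
        candidates.retainAll(responsive);   // remove unresponsive (or dead) coordinators
        return candidates.isEmpty() ? Optional.empty() : Optional.of(candidates.first());
    }

    public static void main(String[] args) {
        // Coordinators 2 and 3; member 2 has left the cluster and no longer responds.
        System.out.println(pickMergeLeader(Set.of(2, 3), Set.of(3, 4, 5, 6, 7)));
    }
}
```

With the dead coordinator filtered out, member 3 ends up as the leader in this toy setup, even though 2 sorts first.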

This potentially means a bit more traffic ({{MERGE3}} has to fetch the full views from the subgroup coordinators), but this is done only when {{MERGE3}} finds differing ViewIds, i.e. when a partition has occurred. Also, different members run the consistency check at different times, so if a member is about to start checking its views but a merge has already been started and completed by a different member, no additional merge is triggered.
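
The "only when the ViewIds differ" condition can be pictured roughly as follows. This is a simplified sketch, not {{MERGE3}} itself: view IDs are plain strings such as "3|6", members are integers, and the class and method names are made up for illustration.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class ViewConsistencySketch {

    // Hypothetical check: true if the members report more than one distinct view ID,
    // which is the situation in which the full views would need to be fetched.
    static boolean viewsDiffer(Map<Integer, String> viewIdByMember) {
        Set<String> distinct = new HashSet<>(viewIdByMember.values());
        return distinct.size() > 1;
    }

    public static void main(String[] args) {
        // Member 7 is stuck on an old view while everybody else has moved on.
        Map<Integer, String> partitioned = Map.of(3, "3|6", 4, "3|6", 5, "3|6", 6, "3|6", 7, "2|5");
        // All members agree, e.g. after a merge completed elsewhere (view ID chosen arbitrarily).
        Map<Integer, String> consistent = Map.of(3, "3|7", 4, "3|7", 5, "3|7", 6, "3|7", 7, "3|7");

        System.out.println(viewsDiffer(partitioned)); // differing ViewIds: fetch views, merge
        System.out.println(viewsDiffer(consistent));  // nothing to do, no extra traffic
    }
}
```

A member whose check runs after another member's merge has completed would see the second case and do nothing.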

> MERGE3: a dead member as merge leader will never trigger a merge
> ----------------------------------------------------------------
>
>                 Key: JGRP-2276
>                 URL: https://issues.jboss.org/browse/JGRP-2276
>             Project: JGroups
>          Issue Type: Bug
>            Reporter: Bela Ban
>            Assignee: Bela Ban
>             Fix For: 4.0.12
>
>
> When one or more members have a view whose coordinator is dead, and that dead member becomes one of the subgroup coordinators and _happens to be chosen as merge leader_, a merge will never ensue.
> Example:
> * Member 2 (with view 2|5) was the previous coordinator and left the cluster, installing view 3|6 before stopping
> ** View 2|5=\{2,3,4,5,6,7\}; view 3|6=\{3,4,5,6,7\}
> * Member 7 didn't get view 3|6 and still has view 2|5
> * Everybody else has view 3|6
> * MERGE3 gets the following views:
> ** 2|5: 7 // member 7 has this view
> ** 3|6: 3,4,5,6 // members 3,4,5 and 6 have this view
> * 2 and 3 are added to a _sorted set_ and the first member of the set (2) is chosen as merge leader. 3 doesn't take any action, as it notices it won't be the merge leader
> ** The reason 2 was first in the sorted set is that (possibly by coincidence) its UUID is *lower* than that of 3. If this wasn't the case, 3 would be merge leader and start (and successfully complete) a merge. However, with dead member 2 being picked as merge leader, a merge will never be triggered!
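
The scenario quoted above can be replayed with a small sketch of the old rule (sort the coordinators, first one leads). Integers stand in for UUID ordering, and the class and method names are hypothetical, not taken from the JGroups sources.

```java
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

public class DeadLeaderSketch {

    // Old rule as described in the issue: a member starts the merge only if it is
    // the first element of the sorted set of subgroup coordinators.
    // Assumes coordinators is non-empty (TreeSet.first() throws otherwise).
    static boolean startsMerge(int self, Collection<Integer> coordinators) {
        return new TreeSet<>(coordinators).first() == self;
    }

    // True if at least one live member would start the merge.
    static boolean anyoneStartsMerge(Collection<Integer> alive, Collection<Integer> coordinators) {
        return alive.stream().anyMatch(m -> startsMerge(m, coordinators));
    }

    public static void main(String[] args) {
        List<Integer> alive = List.of(3, 4, 5, 6, 7);   // member 2 has left
        List<Integer> coordinators = List.of(2, 3);     // coordinators of views 2|5 and 3|6

        // 2 sorts first but is dead, and 3 defers to it, so nobody starts the merge.
        System.out.println(anyoneStartsMerge(alive, coordinators)); // prints "false"
    }
}
```

Had 3 sorted before 2, the same call would return true, which matches the "possibly by coincidence" remark in the description.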



--
This message was sent by Atlassian JIRA
(v7.5.0#75005)

