[jboss-jira] [JBoss JIRA] (JGRP-1876) MERGE3 : Strange number and content of subgroups

Radim Vansa (JIRA) issues at jboss.org
Fri Jan 23 11:10:49 EST 2015


    [ https://issues.jboss.org/browse/JGRP-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034670#comment-13034670 ] 

Radim Vansa commented on JGRP-1876:
-----------------------------------

I haven't followed the full discussion, but I also consider losing a member that can respond, even temporarily, an unfortunate property of the merge algorithm. Given that all nodes are alive (responding), I think they should see a monotonic (with respect to the subset relation) sequence of views.
Also, the merge algorithm should have a guaranteed (and rather easily computable) upper bound on completion time, given some bounds on response time. In the ideal case, this value would be the input property of the protocol and all other properties would be computed from it.
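
To make the monotonicity property concrete, here is a minimal sketch of one possible reading: while every node keeps responding, each view a node installs should contain all members of the view before it. The class ViewMonotonicity and the method isMonotonic are hypothetical names used only for illustration; they are not part of JGroups, and members are represented as plain strings rather than JGroups addresses.

```java
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of JGroups: checks one reading of the
// "monotonic in the subset relation" property for a sequence of views.
public class ViewMonotonicity {

    // Returns true if every view contains all members of the preceding view,
    // i.e. no member is ever dropped along the sequence.
    public static boolean isMonotonic(List<Set<String>> views) {
        for (int i = 1; i < views.size(); i++) {
            if (!views.get(i).containsAll(views.get(i - 1))) {
                return false; // a member of the earlier view went missing
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The isolated member C merges back: views only grow.
        List<Set<String>> growing = List.of(
                Set.of("C"),
                Set.of("B", "C"),
                Set.of("A", "B", "C"));

        // C is dropped temporarily and re-added later: not monotonic.
        List<Set<String>> dropping = List.of(
                Set.of("A", "B", "C"),
                Set.of("A", "B"),
                Set.of("A", "B", "C"));

        System.out.println(isMonotonic(growing));
        System.out.println(isMonotonic(dropping));
    }
}
```

A check like this could be run over the views recorded by a membership listener in a test, assuming no member actually crashes during the run.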

> MERGE3 : Strange number and content of subgroups
> ------------------------------------------------
>
>                 Key: JGRP-1876
>                 URL: https://issues.jboss.org/browse/JGRP-1876
>             Project: JGroups
>          Issue Type: Bug
>    Affects Versions: 3.4.2
>            Reporter: Karim AMMOUS
>            Assignee: Bela Ban
>             Fix For: 3.5.1, 3.6, 3.6.2
>
>         Attachments: 4Subgroups.zip, ChannelCreator.java, DkeJgrpAddress.java, JGRP-1876-1.pdf, karim-logs-files.zip, MergeTest4.java, MergeViewWith210Subgroups.log, SplitMergeTest.java, views.txt
>
>
> Using JGroups 3.4.2, a split occurred and a merge was processed successfully, but the number of subgroups is wrong (210 instead of 2).
> The final mergeView is correct and contains 210 members.
> Here is an extract of subviews: 
> {code}
> INFO | Incoming-18,cluster,term-ETJ101697729-31726:host:192.168.56.6:1:CL(GROUP01)[F] | [MyMembershipListener.java:126] | (middleware) | MergeView view ID = [serv-ZM2BU35940-58033:vt-14:192.168.55.55:1:CL(GROUP01)[F]|172]
> 210 subgroups 
> [....
> [term-ETJ100691812-36873:host:192.168.56.16:1:CL(GROUP01)[F]|170] (1) [term-ETJ104215245-11092:host:192.168.56.72:1:CL(GROUP01)[F]]
> [term-ETJ100691812-36873:host:192.168.56.16:1:CL(GROUP01)[F]|170] (1) [serv-ZM2BU38960-6907:asb:192.168.55.52:1:CL(GROUP01)[F]]
> [term-ETJ101697729-31726:host:192.168.56.6:1:CL(GROUP01)[F]|171] (1) [term-ETJ101697729-31726:host:192.168.56.6:1:CL(GROUP01)[F]]
> [term-ETJ100691812-36873:host:192.168.56.16:1:CL(GROUP01)[F]|170] (1) [serv-ZM2BU47533-55240:vt-14:192.168.55.57:1:CL(GROUP01)[F]]
> [term-ETJ100691812-36873:host:192.168.56.16:1:CL(GROUP01)[F]|170] (1) [serv-ZM2BU35943-49435:asb:192.168.55.51:1:CL(GROUP01)[F]]
> ....]
> {code}
> I wasn't able to reproduce that with a simple program. But I observed that the merge was preceded by an ifdown/ifup on host 192.168.56.6. That member lost all other members, but it is still present in their view.
> Example:  
> {code}
> {A, B, C} => {A, B, C} and {C} => {A, B, C}
> {code}




