[jboss-jira] [JBoss JIRA] (JGRP-2293) Graceful concurrent leaving of coordinator(s) leaves the cluster with stale views

Wed Feb 6 06:28:02 EST 2019

    [ https://issues.jboss.org/browse/JGRP-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13691734#comment-13691734 ] 

Bela Ban commented on JGRP-2293:
--------------------------------

Hi [~dan.berindei]
you can run the test via ant: {{ant single-test -Dtest-class=LeaveTest}}.

I guess you have to make sure there's a route from the bind_addr to the mcast address. In [1], for example, if you use a 230.x.x.x mcast address, you should bind to a *non-loopback* address. Conversely, if you bind to a loopback address (127.0.0.x), you should use a 224.x.x.x mcast address.

BTW: what's {{230.0.5.6.7}}? It has five octets, so I assume that's a typo?

[1] https://github.com/belaban/JGroups/wiki/Multicast-routing-on-Mac-OS
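
As a rough illustration of the rule of thumb above, here is a minimal Java sketch that checks a bind address / mcast address pair before running the test. The class {{McastBindCheck}}, its methods and the result strings are hypothetical names made up for this example; none of this is JGroups API, and it only encodes the heuristic from this comment rather than inspecting the actual routing table.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper (not part of JGroups): encodes the rule of thumb
// "loopback bind_addr -> use a 224.x.x.x mcast address".
final class McastBindCheck {

    // True only for dotted-quad IPv4 literals: exactly four decimal octets in 0..255.
    static boolean isIpv4Literal(String s) {
        String[] parts = s.split("\\.", -1);
        if (parts.length != 4) return false;
        for (String p : parts) {
            if (p.isEmpty() || p.length() > 3) return false;
            for (char c : p.toCharArray()) {
                if (c < '0' || c > '9') return false;
            }
            if (Integer.parseInt(p) > 255) return false;
        }
        return true;
    }

    // Returns "INVALID", "NOT_MULTICAST", "CHECK_ROUTE" or "OK".
    static String check(String bindAddr, String mcastAddr) {
        if (!isIpv4Literal(bindAddr) || !isIpv4Literal(mcastAddr)) return "INVALID";
        try {
            // For IP literals getByName() only parses the text, no DNS lookup is done.
            InetAddress bind = InetAddress.getByName(bindAddr);
            InetAddress mcast = InetAddress.getByName(mcastAddr);
            if (!mcast.isMulticastAddress()) return "NOT_MULTICAST";
            int firstOctet = mcast.getAddress()[0] & 0xFF;
            // Heuristic from the comment: with a loopback bind address,
            // a non-224.x.x.x group may not be routable.
            if (bind.isLoopbackAddress() && firstOctet != 224) return "CHECK_ROUTE";
            return "OK";
        } catch (UnknownHostException e) {
            return "INVALID";
        }
    }

    public static void main(String[] args) {
        System.out.println(check("127.0.0.1", "230.1.2.3"));
        System.out.println(check("127.0.0.1", "224.0.0.5"));
        System.out.println(check("192.168.1.10", "230.1.2.3"));
        System.out.println(check("127.0.0.1", "230.0.5.6.7"));
    }
}
```

A "CHECK_ROUTE" result only means the pair matches the problematic combination described above; whether multicast actually works still depends on the routes configured on the host.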

> Graceful concurrent leaving of coordinator(s) leaves the cluster with stale views
> ---------------------------------------------------------------------------------
>
>                 Key: JGRP-2293
>                 URL: https://issues.jboss.org/browse/JGRP-2293
>             Project: JGroups
>          Issue Type: Bug
>    Affects Versions: 4.0.14
>            Reporter: Radoslav Husar
>            Assignee: Bela Ban
>            Priority: Critical
>             Fix For: 4.0.17
>
>         Attachments: IMG_20190123_124154.jpg
>
>
> JGroups does not handle concurrent leaving of nodes correctly. This is a typical use case in cloud environments, where a cluster is scaled down by an autoscaler or manually, and we need to handle it.
> A simple test can be devised which fails the first n (where n>1) nodes of a cluster; reproducer PR: https://github.com/belaban/JGroups/pull/397
