[ https://issues.jboss.org/browse/JGRP-2293?page=com.atlassian.jira.plugin.... ]
Dan Berindei edited comment on JGRP-2293 at 2/6/19 7:06 AM:
------------------------------------------------------------
[~belaban] I ran our test suite a few times without reproducing the failure. Then I
repeated the offending test 100 times and got it to fail with both JGroups
4.0.15.Final and 4.0.17-SNAPSHOT. After analyzing the logs, I think the problem is in
the test itself, so the fix is good for me.
I still haven't managed to run {{LeaveTest}} successfully from the command line,
though: the nodes never form a cluster because they don't see each other's MPING
requests. If I run without {{<jvmarg value="-Djava.net.preferIPv4Stack=true"/>}}, I get
a sendto error (see below); with it I get no error message, but the nodes still don't
see each other. I'd say it's a problem with my environment, but the same test using
the same mcast address (230.5.6.7) passes when run from the IDE.
{noformat}
12:55:16,177 ERROR (main:[]) [MPING] JGRP000200: failed sending discovery request
java.io.IOException: Invalid argument (sendto failed)
at java.net.PlainDatagramSocketImpl.send(Native Method) ~[?:1.8.0_171]
at java.net.DatagramSocket.send(DatagramSocket.java:693) ~[?:1.8.0_171]
at org.jgroups.protocols.MPING.sendMcastDiscoveryRequest(MPING.java:306) [classes/:?]
at org.jgroups.protocols.PING.sendDiscoveryRequest(PING.java:64) [classes/:?]
at org.jgroups.protocols.PING.findMembers(PING.java:32) [classes/:?]
{noformat}
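For anyone hitting the same symptom, a small stand-alone check along these lines might help narrow down whether the JVM started from the command line sees the same IPv4/multicast setup as the one started from the IDE. This is only a sketch using plain {{java.net}} calls; the class name {{McastCheck}} and its helper methods are made up for illustration and are not part of JGroups or of {{LeaveTest}}.

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical diagnostic helper, not part of JGroups.
public class McastCheck {

    // True if the literal parses as an IPv4 address in the multicast range.
    static boolean isIpv4Multicast(String literal) throws UnknownHostException {
        InetAddress addr = InetAddress.getByName(literal);
        return addr instanceof Inet4Address && addr.isMulticastAddress();
    }

    // One line per interface that is up and reports multicast support.
    static List<String> multicastInterfaces() throws SocketException {
        List<String> lines = new ArrayList<>();
        Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
        if (nics == null) {
            return lines; // no interfaces visible to this JVM
        }
        for (NetworkInterface nic : Collections.list(nics)) {
            if (nic.isUp() && nic.supportsMulticast()) {
                lines.add(nic.getName()
                        + " loopback=" + nic.isLoopback()
                        + " addrs=" + Collections.list(nic.getInetAddresses()));
            }
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        // Prints null when the property was not passed to this JVM.
        System.out.println("java.net.preferIPv4Stack="
                + System.getProperty("java.net.preferIPv4Stack"));
        System.out.println("230.5.6.7 is IPv4 multicast: " + isIpv4Multicast("230.5.6.7"));
        multicastInterfaces().forEach(System.out::println);
    }
}
```

Running it once with and once without {{-Djava.net.preferIPv4Stack=true}}, from both the command line and the IDE, and comparing the printed interfaces and addresses would show whether the two environments differ.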
Graceful concurrent leaving of coordinator(s) leaves the cluster with stale views
---------------------------------------------------------------------------------
Key: JGRP-2293
URL: https://issues.jboss.org/browse/JGRP-2293
Project: JGroups
Issue Type: Bug
Affects Versions: 4.0.14
Reporter: Radoslav Husar
Assignee: Bela Ban
Priority: Critical
Fix For: 4.0.17
Attachments: IMG_20190123_124154.jpg
JGroups does not handle concurrent leaving of nodes correctly. This is a typical use case
in cloud environments, where a cluster is scaled down by an autoscaler or manually, and
we need to handle it.
A simple test can be devised that takes down the first n (where n > 1) nodes of a
cluster at the same time. Reproducer PR:
https://github.com/belaban/JGroups/pull/397