[
https://issues.jboss.org/browse/JGRP-2372?page=com.atlassian.jira.plugin....
]
Bela Ban edited comment on JGRP-2372 at 9/4/19 7:43 AM:
--------------------------------------------------------
An attempt to fix this issue was to get rid of GMS.leave_timeout and have a member simply
wait (on a graceful leave) until a LEAVE response was received from the coord, or until
stop() was called.
However, this introduces the following problem:
* In ASYM_ENCRYPT, if we have \{A,B,C\} and C is excluded by A and B, then we have view
\{A,B\} in A and B, and view \{A,B,C\} in C
* C keeps sending the LEAVE request to A, but A discards it, as C is not a member
* MERGE3 will not be able to help, as A and C won't be able to decrypt each
other's messages because view \{A,B\} installed a new shared group key
* C will therefore block forever in {{JChannel.disconnect()}}!
Perhaps we should add GMS.leave_timeout back!
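To illustrate what a bounded wait buys here, the following is a minimal sketch and not the GMS code. The class {{LeaveWaiter}} and its methods are hypothetical names invented for this example. It only shows the general idea: a leaving member waits for the LEAVE response, but gives up after a timeout, so an excluded member such as C would return from the wait instead of blocking forever.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not part of JGroups): bounded wait for a LEAVE response. */
final class LeaveWaiter {
    // Released once, when the coordinator's LEAVE response arrives or stop() is called
    private final CountDownLatch leaveAcked = new CountDownLatch(1);

    /** To be invoked when the LEAVE response is received (assumed callback). */
    void leaveResponseReceived() {
        leaveAcked.countDown();
    }

    /**
     * Waits for the LEAVE response for at most timeoutMs milliseconds.
     * Returns true if the response arrived, false on timeout or interruption.
     */
    boolean awaitLeaveResponse(long timeoutMs) {
        try {
            return leaveAcked.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }
}
```

With such a wait, a member whose request is always discarded gets {{false}} after the timeout and can proceed with the disconnect anyway; a member that does get a response returns as soon as it arrives.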
LeaveTest fails frequently
--------------------------
Key: JGRP-2372
URL:
https://issues.jboss.org/browse/JGRP-2372
Project: JGroups
Issue Type: Task
Reporter: Bela Ban
Assignee: Bela Ban
Priority: Major
Fix For: 4.1.5
Ditto for ASYM_ENCRYPT_LeaveTest and ASYM_ENCRYPT_LeaveTestKeyExchange. Multiple members
leaving seems to leave some members behind; the view is never correct.
This happens only when running the entire test suite; running a test individually, or
running all encryption tests ({{ant encrypt}}) almost never reproduces the errors.
This is possibly caused by the high load of running a lot of tests concurrently, and the
subsequent delays resulting from it. Nevertheless, these tests should not fail.
Error message:
{noformat}
Timeout 30000 kicked in, views are: 9: [7|15] (4) [7, 8, 9, 10] 10: [7|15] (4) [7, 8, 9, 10]
java.util.concurrent.TimeoutException
at org.jgroups.util.Util.waitUntilAllChannelsHaveSameView(Util.java:293)
at org.jgroups.tests.BaseLeaveTest.testConcurrentLeaves(BaseLeaveTest.java:189)
at org.jgroups.tests.BaseLeaveTest.testLeaveOfFirstNMembers(BaseLeaveTest.java:214)
at org.jgroups.tests.BaseLeaveTest.testLeaveOfCoordAndNext8(BaseLeaveTest.java:146)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:124)
at org.testng.internal.Invoker.invokeMethod(Invoker.java:583)
at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:719)
at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:989)
at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:125)
at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:109)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
{noformat}
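For readers unfamiliar with this kind of check, here is a rough, self-contained sketch of a "poll until all views agree, or time out" loop. It is not the code of {{Util.waitUntilAllChannelsHaveSameView}}; the class {{ViewWaiter}}, the method {{viewsConverged}}, and the representation of a view as a list of member names are assumptions made for the example. It also assumes that a correct view must contain exactly the remaining members, which would explain why two survivors that agree on a 4-member view still hit the timeout above.

```java
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical polling helper (not JGroups code): waits for all members to agree on a view. */
final class ViewWaiter {
    /**
     * Polls every intervalMs until all suppliers return the same view and that view
     * has as many entries as there are members, or until timeoutMs has elapsed.
     * Returns true on convergence, false on timeout or interruption.
     */
    static boolean viewsConverged(long timeoutMs, long intervalMs,
                                  List<Supplier<List<String>>> members) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (allAgree(members))
                return true;
            if (System.nanoTime() >= deadline)
                return false;
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }

    // True if every member reports the same view and the view size matches the member count
    private static boolean allAgree(List<Supplier<List<String>>> members) {
        if (members.isEmpty())
            return true;
        List<String> first = members.get(0).get();
        if (first == null || first.size() != members.size())
            return false;
        for (Supplier<List<String>> member : members) {
            if (!first.equals(member.get()))
                return false;
        }
        return true;
    }
}
```

Under these assumptions, two members that both still report the stale view [7, 8, 9, 10] never converge and the call returns {{false}} once the timeout expires.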