[jboss-jira] [JBoss JIRA] (JGRP-1846) RELAY2: delay shutting down bridge
Bela Ban (JIRA)
issues at jboss.org
Tue Jun 10 09:09:16 EDT 2014
[ https://issues.jboss.org/browse/JGRP-1846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974894#comment-12974894 ]
Bela Ban edited comment on JGRP-1846 at 6/10/14 9:08 AM:
---------------------------------------------------------
Please create a separate JIRA when you can reproduce the second issue. Concerning the first issue:
* {{Relay2Test}} has {{RELAY2.async_relay_creation == true}}. This means that when you shut down A, B starts creating the bridge channels in the background, but you immediately shut B down, too, so B's shutdown takes a little longer. This is expected, as B needs to clean up resources that were in a half-started state.
To remedy this:
* Set {{async_relay_creation}} to false
* After shutting down A, wait for the bridge view X,*B*:
{noformat}
public void testCoordinatorShutdown() throws Exception {
    a=createNode(LON, "A", LON_CLUSTER, null);
    b=createNode(LON, "B", LON_CLUSTER, null);
    x=createNode(SFO, "X", SFO_CLUSTER, null);
    y=createNode(SFO, "Y", SFO_CLUSTER, null);
    Util.waitUntilAllChannelsHaveSameSize(10000, 100, a, b);
    Util.waitUntilAllChannelsHaveSameSize(10000, 100, x, y);
    waitForBridgeView(2, 20000, 100, a, x); // A and X are site masters
    a.close();
    Util.waitUntilAllChannelsHaveSameSize(10000, 100, b);
    waitForBridgeView(2, 20000, 100, b, x); // <----- B and X are now site masters
    b.close();
    waitForBridgeView(1, 20000, 100, x);
    Util.close(x,y);
}
{noformat}
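The {{waitForBridgeView}} helper used above is not shown in this message. As a rough illustration only, the poll-until-size-matches pattern it appears to follow (expected size, timeout, interval, then the channels) could be sketched as below. The class name {{BridgeViewWait}}, the method {{waitForSize}} and the use of {{IntSupplier}} in place of real channels are assumptions made for this sketch, not the actual test code; in the real test each supplier would presumably return the size of a channel's bridge view.

```java
import java.util.function.IntSupplier;

/** Hypothetical stand-in for a waitForBridgeView-style helper; not the JGroups test code. */
public final class BridgeViewWait {
    private BridgeViewWait() {}

    /**
     * Polls every intervalMs until all suppliers report expectedSize or timeoutMs elapses.
     * Returns true if all sizes matched in time, false on timeout.
     */
    public static boolean waitForSize(int expectedSize, long timeoutMs, long intervalMs,
                                      IntSupplier... viewSizes) throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            boolean allMatch = true;
            for (IntSupplier size : viewSizes) {
                if (size.getAsInt() != expectedSize) {
                    allMatch = false;
                    break;
                }
            }
            if (allMatch)
                return true;
            if (System.nanoTime() - deadline >= 0)
                return false; // timed out before all views reached the expected size
            Thread.sleep(intervalMs);
        }
    }
}
```

The point of the extra wait before {{b.close()}} is that B is only shut down once its view has reached the expected size, rather than while it is still half-started.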
> RELAY2: delay shutting down bridge
> ----------------------------------
>
> Key: JGRP-1846
> URL: https://issues.jboss.org/browse/JGRP-1846
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 3.5
> Reporter: Dan Berindei
> Assignee: Bela Ban
> Fix For: 3.5
>
>
> A simple test that starts 2 sites with 2 nodes each and shuts them down in order shows a 1-second delay when shutting down the last node in the first site (B):
> {code:java}
> public void testCoordinatorShutdown() throws Exception {
>     a=createNode(LON, "A", LON_CLUSTER, null);
>     b=createNode(LON, "B", LON_CLUSTER, null);
>     x=createNode(SFO, "X", SFO_CLUSTER, null);
>     y=createNode(SFO, "Y", SFO_CLUSTER, null);
>     Util.waitUntilAllChannelsHaveSameSize(10000, 100, a, b);
>     Util.waitUntilAllChannelsHaveSameSize(10000, 100, x, y);
>     waitForBridgeView(2, 20000, 100, a, x);
>     a.close();
>     Util.waitUntilAllChannelsHaveSameSize(10000, 100, b);
>     b.close();
>     waitForBridgeView(1, 20000, 100, x);
>     x.close();
>     y.close();
> }
> {code}
> And the relevant logs:
> {noformat}
> 13:51:30,017 DEBUG (Timer-2,sfo-cluster,X:) [GMS] _X:sfo: installing view [_A:lon|1] (2) [_A:lon, _X:sfo]
> 13:51:30,028 DEBUG (Incoming-2,global,_X:sfo:) [GMS] _X:sfo: installing view [_X:sfo|2] (1) [_X:sfo]
> 13:51:30,046 TRACE (Timer-2,lon-cluster,B:) [SHARED_LOOPBACK] _B:lon: sending msg to _X:sfo, src=_B:lon, headers are GMS: GmsHeader[JOIN_REQ]: mbr=_B:lon, UNICAST3: DATA, seqno=1, first, SHARED_LOOPBACK: [cluster_name=global]
> 13:51:31,046 TRACE (Timer-2,global,_B:lon:) [SHARED_LOOPBACK] _B:lon: sending msg to _X:sfo, src=_B:lon, headers are GMS: GmsHeader[JOIN_REQ]: mbr=_B:lon, UNICAST3: DATA, seqno=1, first, SHARED_LOOPBACK: [cluster_name=global]
> 13:51:31,099 DEBUG (Incoming-2,global,_X:sfo:) [GMS] _X:sfo: installing view [_X:sfo|3] (2) [_X:sfo, _B:lon]
> {noformat}
> Note that while this happens on a background timer thread, the shutdown is delayed nonetheless because {{TP.destroy()}} waits at least 500 ms for all the timer threads to finish ({{TimeScheduler3.stopRunning()}}). Perhaps that should change as well, so that timer threads are interrupted and finish immediately.
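To illustrate the difference the description hints at (waiting for a busy timer thread versus interrupting it), here is a minimal sketch using plain {{java.util.concurrent}}. It is not the {{TimeScheduler3}} or {{TP}} code; the class {{TimerShutdownSketch}} and the method {{stopMillis}} are hypothetical names, and a single-threaded {{ExecutorService}} stands in for the timer.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration: graceful shutdown vs. interrupting a busy worker thread. */
public final class TimerShutdownSketch {
    private TimerShutdownSketch() {}

    /**
     * Runs one task that sleeps for taskMs, then stops the pool and returns how many
     * milliseconds the stop took. With interrupt=true the sleeping task is interrupted.
     */
    public static long stopMillis(boolean interrupt, long taskMs) throws InterruptedException {
        ExecutorService timer = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        timer.execute(() -> {
            started.countDown();
            try {
                Thread.sleep(taskMs); // stands in for a task blocked in the background
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // finish immediately when interrupted
            }
        });
        started.await(); // make sure the task is running before stopping the pool
        long begin = System.nanoTime();
        if (interrupt)
            timer.shutdownNow(); // interrupts the running task
        else
            timer.shutdown();    // lets the running task finish
        timer.awaitTermination(5, TimeUnit.SECONDS);
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - begin);
    }
}
```

With a 500 ms task, the graceful variant blocks for roughly the remaining sleep time, while the interrupting variant returns almost at once, provided the task reacts to interruption as it does here.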