[jboss-jira] [JBoss JIRA] (JGRP-1846) RELAY2: delay shutting down bridge

Bela Ban (JIRA) issues at jboss.org
Tue Jun 10 09:09:18 EDT 2014


    [ https://issues.jboss.org/browse/JGRP-1846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974894#comment-12974894 ] 

Bela Ban edited comment on JGRP-1846 at 6/10/14 9:09 AM:
---------------------------------------------------------

Please create a separate JIRA when you can reproduce the second issue. Concerning the first issue:
* {{Relay2Test}} sets {{RELAY2.async_relay_creation}} to true. This means that when you shut down A, B starts creating the bridge channel asynchronously, but you immediately shut B down as well, so B's shutdown takes a little longer. This is expected, as B needs to clean up resources that were in a half-started state.

To remedy this:
* Set {{async_relay_creation}} to false
* After shutting down A, wait for the bridge view X,*B* to be installed before shutting down B:
{noformat}
    public void testCoordinatorShutdown() throws Exception {
        a=createNode(LON, "A", LON_CLUSTER, null);
        b=createNode(LON, "B", LON_CLUSTER, null);
        x=createNode(SFO, "X", SFO_CLUSTER, null);
        y=createNode(SFO, "Y", SFO_CLUSTER, null);
        Util.waitUntilAllChannelsHaveSameSize(10000, 100, a, b);
        Util.waitUntilAllChannelsHaveSameSize(10000, 100, x, y);
        waitForBridgeView(2, 20000, 100, a, x); // A and X are site masters

        a.close();
        Util.waitUntilAllChannelsHaveSameSize(10000, 100, b);
        waitForBridgeView(2, 20000, 100, b, x); // B and X are now site masters

        b.close();
        waitForBridgeView(1, 20000, 100, x);
        Util.close(x,y);
    }
{noformat}
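
The waits in the test above all follow the same poll-until-condition pattern: check a condition, sleep for an interval, and give up after a timeout. The following is a minimal sketch of that pattern using only the JDK. {{PollingWait}}, {{waitUntil}} and the {{bridgeViewSize}} counter are hypothetical names made up for this illustration; this is not the code of {{waitForBridgeView}} in {{Relay2Test}}, and a real test would read the size from the bridge channel's view instead of a counter.

```java
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

/** Hypothetical helper for illustration only; not part of JGroups or Relay2Test. */
final class PollingWait {

    /** Polls the condition every intervalMs until it holds, or fails after timeoutMs. */
    static void waitUntil(long timeoutMs, long intervalMs, BooleanSupplier condition)
            throws InterruptedException, TimeoutException {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0)
                throw new TimeoutException("condition not met within " + timeoutMs + " ms");
            Thread.sleep(intervalMs);
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the bridge view size (assumption: a plain counter, not a JGroups view)
        AtomicInteger bridgeViewSize = new AtomicInteger(1);

        // Simulates B joining the bridge some time after A has been closed
        Thread joiner = new Thread(() -> {
            try {
                Thread.sleep(300);
            } catch (InterruptedException e) {
                return;
            }
            bridgeViewSize.set(2);
        });
        joiner.start();

        // Same shape as waitForBridgeView(2, 20000, 100, ...): expected size, timeout, interval
        waitUntil(20000, 100, () -> bridgeViewSize.get() == 2);
        System.out.println("bridge view size: " + bridgeViewSize.get());
    }
}
```

Waiting for the condition before calling {{b.close()}} is what keeps B from being shut down while its bridge channel is still half-started.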



> RELAY2: delay shutting down bridge
> ----------------------------------
>
>                 Key: JGRP-1846
>                 URL: https://issues.jboss.org/browse/JGRP-1846
>             Project: JGroups
>          Issue Type: Bug
>    Affects Versions: 3.5
>            Reporter: Dan Berindei
>            Assignee: Bela Ban
>             Fix For: 3.5
>
>
> A simple test that starts 2 sites x 2 nodes each and shuts them down in order shows a 1 second delay when shutting down the last node in the first site (B):
> {code:java}
>     public void testCoordinatorShutdown() throws Exception {
>        a=createNode(LON, "A", LON_CLUSTER, null);
>        b=createNode(LON, "B", LON_CLUSTER, null);
>        x=createNode(SFO, "X", SFO_CLUSTER, null);
>        y=createNode(SFO, "Y", SFO_CLUSTER, null);
>        Util.waitUntilAllChannelsHaveSameSize(10000, 100, a, b);
>        Util.waitUntilAllChannelsHaveSameSize(10000, 100, x, y);
>        waitForBridgeView(2, 20000, 100, a, x);
>        a.close();
>        Util.waitUntilAllChannelsHaveSameSize(10000, 100, b);
>        b.close();
>        waitForBridgeView(1, 20000, 100, x);
>        x.close();
>        y.close();
>     }
> {code}
> And the relevant logs:
> {noformat}
> 13:51:30,017 DEBUG (Timer-2,sfo-cluster,X:) [GMS] _X:sfo: installing view [_A:lon|1] (2) [_A:lon, _X:sfo]
> 13:51:30,028 DEBUG (Incoming-2,global,_X:sfo:) [GMS] _X:sfo: installing view [_X:sfo|2] (1) [_X:sfo]
> 13:51:30,046 TRACE (Timer-2,lon-cluster,B:) [SHARED_LOOPBACK] _B:lon: sending msg to _X:sfo, src=_B:lon, headers are GMS: GmsHeader[JOIN_REQ]: mbr=_B:lon, UNICAST3: DATA, seqno=1, first, SHARED_LOOPBACK: [cluster_name=global]
> 13:51:31,046 TRACE (Timer-2,global,_B:lon:) [SHARED_LOOPBACK] _B:lon: sending msg to _X:sfo, src=_B:lon, headers are GMS: GmsHeader[JOIN_REQ]: mbr=_B:lon, UNICAST3: DATA, seqno=1, first, SHARED_LOOPBACK: [cluster_name=global]
> 13:51:31,099 DEBUG (Incoming-2,global,_X:sfo:) [GMS] _X:sfo: installing view [_X:sfo|3] (2) [_X:sfo, _B:lon]
> {noformat}
> Note that while this happens on a background timer thread, the shutdown is delayed nonetheless because {{TP.destroy()}} waits at least 500ms for all the timer threads to finish ({{TimeScheduler3.stopRunning()}}). Perhaps that should change as well, so that timer threads are interrupted and finish immediately.
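
The difference between waiting for timer threads and interrupting them can be illustrated with a plain {{java.util.concurrent}} scheduler. This is only a sketch of the general idea: it is not how {{TP.destroy()}} or {{TimeScheduler3}} are implemented, and {{TimerShutdownSketch}} and its methods are hypothetical names. The 500 ms wait mirrors the figure mentioned in the description, and the sleeping task stands in for a timer task that is blocked, such as the retried JOIN_REQ in the logs.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Illustration only: uses a JDK scheduler as a stand-in, not JGroups' TimeScheduler3. */
final class TimerShutdownSketch {

    /** Starts a single-threaded timer whose one task is already sleeping for sleepMs. */
    static ScheduledExecutorService timerWithSleepingTask(long sleepMs) throws InterruptedException {
        ScheduledExecutorService timer = Executors.newScheduledThreadPool(1);
        CountDownLatch started = new CountDownLatch(1);
        timer.execute(() -> {
            started.countDown();
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // finish immediately when interrupted
            }
        });
        started.await(); // make sure the task is running before we return
        return timer;
    }

    /** Stops accepting tasks, then waits up to waitMs for running tasks. Returns elapsed ms. */
    static long stopGracefully(ScheduledExecutorService timer, long waitMs) throws InterruptedException {
        long start = System.nanoTime();
        timer.shutdown();
        timer.awaitTermination(waitMs, TimeUnit.MILLISECONDS);
        return (System.nanoTime() - start) / 1_000_000L;
    }

    /** Interrupts running tasks first, then waits up to waitMs. Returns elapsed ms. */
    static long stopWithInterrupt(ScheduledExecutorService timer, long waitMs) throws InterruptedException {
        long start = System.nanoTime();
        timer.shutdownNow();
        timer.awaitTermination(waitMs, TimeUnit.MILLISECONDS);
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) throws Exception {
        long graceful = stopGracefully(timerWithSleepingTask(1000), 500);
        long interrupted = stopWithInterrupt(timerWithSleepingTask(1000), 500);
        System.out.println("graceful stop took " + graceful + " ms");
        System.out.println("interrupting stop took " + interrupted + " ms");
    }
}
```

In this sketch the graceful variant uses up the full wait whenever a task is still blocked, while the interrupting variant returns as soon as the task reacts to the interrupt. Interrupting only helps if the tasks themselves handle interruption promptly.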


