[jboss-jira] [JBoss JIRA] (JGRP-2030) GMS: view_ack_collection_timeout delay when last 2 members leave concurrently

Thu Apr 14 08:43:00 EDT 2016

     [ https://issues.jboss.org/browse/JGRP-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bela Ban updated JGRP-2030:
---------------------------
    Fix Version/s: 3.6.10
                       (was: 3.6.9)


> GMS: view_ack_collection_timeout delay when last 2 members leave concurrently
> -----------------------------------------------------------------------------
>
>                 Key: JGRP-2030
>                 URL: https://issues.jboss.org/browse/JGRP-2030
>             Project: JGroups
>          Issue Type: Bug
>    Affects Versions: 3.6.8
>            Reporter: Dan Berindei
>            Assignee: Bela Ban
>            Priority: Minor
>             Fix For: 3.6.10, 4.0
>
>
> When the coordinator ({{NodeE}}) leaves, it tries to install a new view on behalf of the new coordinator ({{NodeG}}, the last member).
> {noformat}
> 21:33:26,844 TRACE (ViewHandler,InitialClusterSizeTest-NodeE-42422:) [GMS] InitialClusterSizeTest-NodeE-42422: mcasting view [InitialClusterSizeTest-NodeG-30521|3] (1) [InitialClusterSizeTest-NodeG-30521] (1 mbrs)
> 21:33:26,844 TRACE (ViewHandler,InitialClusterSizeTest-NodeE-42422:) [TCP_NIO2] InitialClusterSizeTest-NodeE-42422: sending msg to null, src=InitialClusterSizeTest-NodeE-42422, headers are GMS: GmsHeader[VIEW], NAKACK2: [MSG, seqno=1], TP: [cluster_name=ISPN]
> {noformat}
> The message is actually sent later by the bundler, but {{NodeG}} is also sending its {{LEAVE_REQ}} message at the same time. Both nodes try to create a connection to each other, and only {{NodeG}} succeeds:
> {noformat}
> 21:33:26,844 TRACE (ForkThread-2,InitialClusterSizeTest:) [TCP_NIO2] InitialClusterSizeTest-NodeG-30521: sending msg to InitialClusterSizeTest-NodeE-42422, src=InitialClusterSizeTest-NodeG-30521, headers are GMS: GmsHeader[LEAVE_REQ]: mbr=InitialClusterSizeTest-NodeG-30521, UNICAST3: DATA, seqno=1, conn_id=1, first, TP: [cluster_name=ISPN]
> 21:33:26,865 TRACE (Timer-2,InitialClusterSizeTest-NodeG-30521:) [TCP_NIO2] InitialClusterSizeTest-NodeG-30521: sending 1 msgs (83 bytes (0.27% of max_bundle_size) to 1 dests(s): [ISPN:InitialClusterSizeTest-NodeE-42422]
> 21:33:26,865 TRACE (Timer-2,InitialClusterSizeTest-NodeE-42422:) [TCP_NIO2] InitialClusterSizeTest-NodeE-42422: sending 1 msgs (91 bytes (0.29% of max_bundle_size) to 1 dests(s): [ISPN]
> 21:33:26,865 TRACE (Timer-2,InitialClusterSizeTest-NodeG-30521:) [TCP_NIO2] dest=127.0.0.1:7900 (86 bytes)
> 21:33:26,865 TRACE (Timer-2,InitialClusterSizeTest-NodeE-42422:) [TCP_NIO2] dest=127.0.0.1:7920 (94 bytes)
> 21:33:26,865 TRACE (Timer-2,InitialClusterSizeTest-NodeE-42422:) [TCP_NIO2] 127.0.0.1:7900: connecting to 127.0.0.1:7920
> 21:33:26,865 TRACE (Timer-2,InitialClusterSizeTest-NodeG-30521:) [TCP_NIO2] 127.0.0.1:7920: connecting to 127.0.0.1:7900
> 21:33:26,866 TRACE (NioConnection.Reader [null],InitialClusterSizeTest-NodeG-30521:) [TCP_NIO2] 127.0.0.1:7920: rejected connection from 127.0.0.1:7900  (connection existed and my address won as it's higher)
> 21:33:26,867 TRACE (OOB-1,InitialClusterSizeTest-NodeE-42422:) [TCP_NIO2] InitialClusterSizeTest-NodeE-42422: received [dst: InitialClusterSizeTest-NodeE-42422, src: InitialClusterSizeTest-NodeG-30521 (3 headers), size=0 bytes, flags=OOB], headers are GMS: GmsHeader[LEAVE_REQ]: mbr=InitialClusterSizeTest-NodeG-30521, UNICAST3: DATA, seqno=1, conn_id=1, first, TP: [cluster_name=ISPN]
> {noformat}
> I'm guessing {{NodeE}} would need a {{STABLE}} round in order to retransmit the {{VIEW}} message, but I'm not sure if the stable round would work, since it already (partially?) installed the new view with {{NodeG}} as the only member. However, I think it should be possible for {{NodeE}} to remove {{NodeG}} from it's {{AckCollector}} once it receives its {{LEAVE_REQ}}, and stop blocking.
> This is a minor annoyance a few the Infinispan tests - most of them shut down the nodes serially, so they don't see this delay.
> The question is whether the concurrent connection setup can have an impact for other messages as well - e.g. during startup, when there aren't a lot of messages being sent around to trigger retransmission. Could the node that failed to open its connection retry immediately on the connection opened by the other node?


--
This message was sent by Atlassian JIRA
(v6.4.11#64026)