[
https://issues.jboss.org/browse/JGRP-1977?page=com.atlassian.jira.plugin....
]
Osamu Nagano commented on JGRP-1977:
------------------------------------
OK, so I can tune the timing of {{MERGE}} to minimize the period during which several
coordinators exist. If any good workaround at the discovery level comes to mind, it would
be really appreciated.
More redundant initial join logic to avoid becoming a fake coordinator
----------------------------------------------------------------------
Key: JGRP-1977
URL: https://issues.jboss.org/browse/JGRP-1977
Project: JGroups
Issue Type: Enhancement
Reporter: Osamu Nagano
Assignee: Bela Ban
Fix For: 3.6.7
If the very first JGroups discovery packet is lost, the current GMS join logic never
recovers from it. The node becomes a standalone coordinator and merges with the rest of
the cluster only after several minutes.
This can happen if a new node resides in another network segment and a switch between the
segments requires some time to establish a new multicast route. Currently, there is not
enough time between the IGMP join (by {{MulticastSocket#joinGroup()}}) and the JGroups
discovery packet, so the latter is lost in such a network environment. Because the number
of nodes can be very large, configuring a static route in the switch is not reasonable.
Specifically, in method {{org.jgroups.protocols.pbcast.ClientGmsImpl#joinInternal()}},
the {{gms.getDownProtocol().down(Event.FIND_INITIAL_MBRS_EVT)}} call is outside of the
retry loop governed by {{GMS.max_join_attempts}} and {{GMS.join_timeout}}.
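
To illustrate the requested behaviour, here is a minimal, self-contained sketch in which discovery is repeated on every join attempt instead of being issued once before the loop. It is not the JGroups code: {{JoinRetrySketch}}, the {{Discovery}} interface and {{findCoordinator()}} are hypothetical names made up for this example, and the parameters only mirror the roles of {{GMS.max_join_attempts}} and {{GMS.join_timeout}}.

```java
import java.util.Collections;
import java.util.List;

public class JoinRetrySketch {

    /** Hypothetical stand-in for one discovery round; returns the members that answered. */
    interface Discovery {
        List<String> findInitialMembers(long timeoutMs);
    }

    /**
     * Runs discovery inside the retry loop, so a lost first discovery packet
     * is retried on the next attempt.
     *
     * @return the first responder (treated here as the coordinator), or null if
     *         every attempt came back empty and the node would have to start
     *         as a standalone coordinator.
     */
    static String findCoordinator(Discovery discovery, int maxJoinAttempts, long joinTimeoutMs) {
        for (int attempt = 1; attempt <= maxJoinAttempts; attempt++) {
            // Assumption: each round waits up to joinTimeoutMs for responses.
            List<String> responses = discovery.findInitialMembers(joinTimeoutMs);
            if (!responses.isEmpty()) {
                return responses.get(0);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Simulated network: the first discovery packet is lost, the second is answered.
        int[] rounds = {0};
        Discovery lossy = timeoutMs -> ++rounds[0] == 1
                ? Collections.<String>emptyList()
                : Collections.singletonList("node-A");

        String coordinator = findCoordinator(lossy, 3, 3000L);
        System.out.println("coordinator=" + coordinator + ", discovery rounds=" + rounds[0]);
    }
}
```

In this simulation the node finds {{node-A}} on the second round rather than declaring itself coordinator after the first empty result. The sketch ignores details such as how the real join request is sent or how an unlimited number of attempts would be configured.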