[JBoss JIRA] (JGRP-2195) [JGRP00012] discarded message from different cluster with JGroups Upgrade

Thursday, 15 June 2017

[ https://issues.jboss.org/browse/JGRP-2195?page=com.atlassian.jira.plugin.... ]

Bela Ban commented on JGRP-2195:
--------------------------------

Why didn't you upgrade to a more recent version of JGroups, e.g. 4.0 or at least
3.6.x?

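Anyway, you should leave more space between the ports of different clusters: bind_ports of
34060, 34061 and 34062 with a port_range of 1 are likely to overlap, because discovery in
one cluster then also probes the neighbouring port, which belongs to a different cluster.
Here is a better config: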

ClusterA:
TCP bind_port=30000, 30001, 30002 // 3 members
TCPPING: initial_hosts=xxx[30000] port_range=2

ClusterB:
TCP bind_port=40000, 40001, 40002 // 3 members
TCPPING: initial_hosts=xxx[40000] port_range=2

ClusterC:
TCP bind_port=50000, 50001, 50002 // 3 members
TCPPING: initial_hosts=xxx[50000] port_range=2
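
The spacing argument can be checked mechanically. The following is a minimal Python sketch, not part of JGroups; `probed_ports` and `cross_cluster_hits` are hypothetical helper names. It assumes that TCPPING with `port_range=N` probes each listed port plus the next N ports, which should be confirmed against the documentation of the JGroups version in use.

```python
def probed_ports(initial_ports, port_range):
    """Return the set of ports probed for the given initial_hosts ports.

    Assumption: each listed port P is probed together with P+1 .. P+port_range.
    """
    return {port + offset
            for port in initial_ports
            for offset in range(port_range + 1)}


def cross_cluster_hits(clusters):
    """Map (probing cluster, probed cluster) to the foreign bind ports reached."""
    hits = {}
    for name, cfg in clusters.items():
        probes = probed_ports(cfg["initial_ports"], cfg["port_range"])
        for other, other_cfg in clusters.items():
            if other == name:
                continue
            overlap = sorted(probes & set(other_cfg["bind_ports"]))
            if overlap:
                hits[(name, other)] = overlap
    return hits


# Layout from the comment above: three members per cluster, port_range=2.
suggested = {
    "ClusterA": {"bind_ports": [30000, 30001, 30002],
                 "initial_ports": [30000], "port_range": 2},
    "ClusterB": {"bind_ports": [40000, 40001, 40002],
                 "initial_ports": [40000], "port_range": 2},
    "ClusterC": {"bind_ports": [50000, 50001, 50002],
                 "initial_ports": [50000], "port_range": 2},
}

print(cross_cluster_hits(suggested))  # prints {}
```

An empty result means no cluster's discovery reaches a port owned by another cluster. With adjacent bind ports and port_range=1, the same check reports a hit.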

...
 [JGRP00012] discarded message from different cluster with JGroups Upgrade
 -------------------------------------------------------------------------

                 Key: JGRP-2195
                 URL: https://issues.jboss.org/browse/JGRP-2195
             Project: JGroups
          Issue Type: Bug
    Affects Versions: 3.4
         Environment: All OS(Linux, AIX, Windows, Solaris)
            Reporter: Swathi Kumar
            Assignee: Bela Ban

 Greetings Team.
 We recently upgraded the jgroups jars from version 2_5_2/jgroups-all.jar to
3_4_0/jgroups-3.4.0.Alpha2.jar.
 Since the upgrade we are seeing *[JGRP00012] discarded message from different cluster*
messages every other second on all the nodes in the cluster.
 Note that the issue started to occur only when we switched the transport protocol from
UDP to TCP. If we switch back to UDP, we no longer see these *WARN* messages.
 However, we no longer support UDP in our application, so going back to UDP is not an
option.
 We have several hundred customers in the field who are using our product with this
upgraded jgroups jar, and they have started to raise tickets against our product.
 We do not understand why the upgrade produces such a large number of WARN messages. Is
there an issue with this version of the jgroups jar?
 Sample WARN messages are shown below:
 [2017-06-13 11:56:38.117] ALL 000000000000 GLOBAL_SCOPE 141694 [OOB-1,Sterling_NodeInfo_group,dublr005vm-24633] WARN org.jgroups.protocols.TCP  - [JGRP00012] discarded message from different cluster Sterling_NodeInfo_group_WFC (our cluster is Sterling_NodeInfo_group). Sender was dublr005vm-2060
 [2017-06-13 11:56:41.72] ALL 000000000000 GLOBAL_SCOPE 145297 [OOB-1,Sterling_NodeInfo_group_WFC,dublr005vm-2060] WARN org.jgroups.protocols.TCP  - [JGRP00012] discarded message from different cluster Sterling_NodeInfo_group (our cluster is Sterling_NodeInfo_group_WFC). Sender was dublr005vm-24633
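
To see which clusters are colliding and how often, the warnings can be tallied from a log file. This is a rough Python sketch that assumes the message layout matches the two samples above; `tally_discards` and `WARN_RE` are hypothetical names, and the pattern may need adjusting for other logging layouts.

```python
import re
from collections import Counter

# Pattern modelled on the sample warnings above (an assumption, not a JGroups API).
WARN_RE = re.compile(
    r"\[JGRP00012\] discarded message from different cluster (?P<foreign>\S+) "
    r"\(our cluster is (?P<own>[^)\s]+)\)\. Sender was (?P<sender>\S+)"
)


def tally_discards(log_text):
    """Count JGRP00012 warnings per (own cluster, foreign cluster, sender)."""
    flat = " ".join(log_text.split())  # undo line wrapping and repeated spaces
    return Counter(
        (m.group("own"), m.group("foreign"), m.group("sender"))
        for m in WARN_RE.finditer(flat)
    )


sample = """
[OOB-1,Sterling_NodeInfo_group,dublr005vm-24633] WARN org.jgroups.protocols.TCP  -
[JGRP00012] discarded message from different cluster Sterling_NodeInfo_group_WFC (our
cluster is Sterling_NodeInfo_group). Sender was dublr005vm-2060
[OOB-1,Sterling_NodeInfo_group_WFC,dublr005vm-2060] WARN org.jgroups.protocols.TCP  -
[JGRP00012] discarded message from different cluster Sterling_NodeInfo_group (our cluster
is Sterling_NodeInfo_group_WFC). Sender was dublr005vm-24633
"""

for (own, foreign, sender), count in tally_discards(sample).items():
    print(f"{own} received from {foreign} (sender {sender}): {count}")
```

Each distinct (own cluster, foreign cluster, sender) triple identifies one pair of channels whose ports are being probed across cluster boundaries.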
 Our application uses the jgroups config properties below to create 3 channels (for
security reasons, a dummy host name is used here):

jgroups_cluster.property_string=TCP(bind_addr=host_name_A;bind_port=34061):TCPPING(initial_hosts=host_name_A[34061],host_name_A[44061],host_name_A[54061];port_range=1;timeout=5000;num_initial_members=2):MERGE2(min_interval=3000;max_interval=5000):FD_ALL(interval=5000;timeout=20000):FD(timeout=5000;max_tries=48):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=100,200,300,600,1200,2400,4800;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(print_local_addr=true;join_timeout=5000)
jgroups_cluster.distribution_property_string=TCP(bind_port=34060;thread_pool_rejection_policy=run):TCPPING(initial_hosts=host_name_A[34060],host_name_A[44060],host_name_A[54060];port_range=1;timeout=5000;num_initial_members=2):MERGE2(min_interval=3000;max_interval=5000):FD_SOCK:FD(timeout=5000;max_tries=48):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=3000;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(join_timeout=5000;print_local_addr=true)

jgroups_cluster.lock.protocolStack=TCP(bind_addr=host_name_A;bind_port=34062;):TCPPING(initial_hosts=host_name_A[34062],host_name_A[44062],host_name_A[54062];port_range=1;timeout=5000;num_initial_members=2):MERGE2(min_interval=3000;max_interval=5000):FD_ALL(interval=5000;timeout=20000):FD(timeout=5000;max_tries=48):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=100,200,300,600,1200,2400,4800;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(print_local_addr=true;join_timeout=5000)
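
The ports each of these stacks would probe can be listed with a small script. This is a hedged Python sketch: `parse_stack` and `tcpping_probes` are hypothetical helpers, the stack string is abbreviated from the first property above, and it is assumed that TCPPING with `port_range=N` probes each listed port plus the next N ports.

```python
import re


def parse_stack(stack):
    """Split 'PROT(k=v;k=v):PROT2' into a list of (name, properties) pairs."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(stack):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == ":" and depth == 0:  # protocol separator outside parentheses
            parts.append(stack[start:i])
            start = i + 1
    parts.append(stack[start:])

    protocols = []
    for part in parts:
        name, _, body = part.partition("(")
        props = {}
        for item in body.rstrip(")").split(";"):
            if item:  # skips the empty entry left by a trailing ';'
                key, _, value = item.partition("=")
                props[key.strip()] = value.strip()
        protocols.append((name.strip(), props))
    return protocols


def tcpping_probes(stack):
    """Ports TCPPING would probe, under the port_range assumption stated above."""
    props = dict(parse_stack(stack))["TCPPING"]
    port_range = int(props["port_range"])
    ports = [int(p) for p in re.findall(r"\[(\d+)\]", props["initial_hosts"])]
    return sorted({p + i for p in ports for i in range(port_range + 1)})


# Abbreviated copy of jgroups_cluster.property_string from the report.
stack = (
    "TCP(bind_addr=host_name_A;bind_port=34061):"
    "TCPPING(initial_hosts=host_name_A[34061],host_name_A[44061],"
    "host_name_A[54061];port_range=1;timeout=5000;num_initial_members=2):"
    "MERGE2(min_interval=3000;max_interval=5000)"
)

print(tcpping_probes(stack))
# prints [34061, 34062, 44061, 44062, 54061, 54062]
```

Under that assumption, the probe list for the 34061 stack includes 34062, which is the bind_port of the lock stack, so discovery for one channel would reach a channel that belongs to a different cluster.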
 Test considerations:
 1. For in-house testing, I have created a 3-node cluster.
 2. All 3 nodes reside on the same box.
 If you need any further information, please let me know.
 Regards
 Swathi BN 

--
This message was sent by Atlassian JIRA
(v7.2.3#72005)
