[jboss-jira] [JBoss JIRA] (JGRP-2206) Property strings are correct but JGROUPS is not recognizing other nodes

Swathi Kumar (JIRA) issues at jboss.org
Mon Jul 24 12:57:00 EDT 2017


Swathi Kumar created JGRP-2206:
----------------------------------

             Summary: Property strings are correct but JGROUPS is not recognizing other nodes
                 Key: JGRP-2206
                 URL: https://issues.jboss.org/browse/JGRP-2206
             Project: JGroups
          Issue Type: Bug
    Affects Versions: 3.4
         Environment: With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP, Data Mining and Real Application Testing options
OS: Windows Server 2008 R2 6.1,amd64
Java version: 1.7.0,pwa6470sr9fp10-20150708_01 (SR9 FP10),IBM Corporation

            Reporter: Swathi Kumar
            Assignee: Bela Ban
            Priority: Blocker
         Attachments: VisibilityIssue.zip

Our customer has a four node cluster which we believe is correctly defined yet the nodes are not communicating with each other.

All nodes are on VMWare. None of the hostnames are virtual (in that they are all directly attached to an IP and are not managed by load balancers, etc).
   
The nodes are located in separate data centers (2 in each) and jgroups is operating over tcp, rather than udp multicast.

NOTE:  The issue occurs only in the customer's environment (we are not able to reproduce this issue in our lab).

We are attaching our logs (noapp.log.<timestamp>) with JGROUPS debugging enabled.

*Node1 Property strings*:
[2017-07-24 21:58:30.867] ALL 000000000000 GLOBAL_SCOPE Initializing jgroups_cluster.property_string. Receivied this property: TCP(bind_addr=10.38.46.27;bind_port=5061;level=ERROR):TCPPING(initial_hosts=10.38.46.27[5061],10.38.46.28[5061],10.38.175.30[5061],10.38.175.32[5061];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_ALL(interval=5000;timeout=20000):FD(timeout=5000;max_tries=110;):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=100,200,300,600,1200,2400,4800;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(print_local_addr=true;join_timeout=5000)

[2017-07-24 21:58:30.867] ALL 000000000000 GLOBAL_SCOPE Done initializing jgroups_cluster.property_string. Using this property: TCP(bind_addr=10.38.46.27;bind_port=5061;level=ERROR):TCPPING(initial_hosts=10.38.46.27[5061],10.38.46.28[5061],10.38.175.30[5061],10.38.175.32[5061];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_ALL(interval=5000;timeout=20000):FD(timeout=5000;max_tries=110):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=100,200,300,600,1200,2400,4800;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(print_local_addr=true;join_timeout=5000)

[2017-07-24 21:58:30.867] ALL 000000000000 GLOBAL_SCOPE Initializing jgroups_cluster.distributed_property_string. Receivied this property: TCP(bind_port=5060;thread_pool_rejection_policy=run;level=ERROR):TCPPING(initial_hosts=10.38.46.27[5060],10.38.46.28[5060],10.38.175.30[5060],10.38.175.32[5060];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_SOCK:FD(timeout=5000;max_tries=48;):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=3000;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(join_timeout=5000;print_local_addr=true)

[2017-07-24 21:58:30.867] ALL 000000000000 GLOBAL_SCOPE Done initializing jgroups_cluster.distributed_property_string. Using this property: TCP(bind_port=5060;thread_pool_rejection_policy=run;level=ERROR):TCPPING(initial_hosts=10.38.46.27[5060],10.38.46.28[5060],10.38.175.30[5060],10.38.175.32[5060];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_SOCK:FD(timeout=5000;max_tries=48):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=3000;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(join_timeout=5000;print_local_addr=true)



*Node2 Property strings*:
[2017-07-24 22:01:01.666] ALL 000000000000 GLOBAL_SCOPE Initializing jgroups_cluster.property_string. Receivied this property: TCP(bind_addr=10.38.46.28;bind_port=5061;level=ERROR):TCPPING(initial_hosts=10.38.46.28[5061],10.38.46.27[5061],10.38.175.30[5061],10.38.175.32[5061];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_ALL(interval=5000;timeout=20000):FD(timeout=5000;max_tries=110;):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=100,200,300,600,1200,2400,4800;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(print_local_addr=true;join_timeout=5000)

[2017-07-24 22:01:01.666] ALL 000000000000 GLOBAL_SCOPE Done initializing jgroups_cluster.property_string. Using this property: TCP(bind_addr=10.38.46.28;bind_port=5061;level=ERROR):TCPPING(initial_hosts=10.38.46.28[5061],10.38.46.27[5061],10.38.175.30[5061],10.38.175.32[5061];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_ALL(interval=5000;timeout=20000):FD(timeout=5000;max_tries=110):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=100,200,300,600,1200,2400,4800;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(print_local_addr=true;join_timeout=5000)

[2017-07-24 22:01:01.666] ALL 000000000000 GLOBAL_SCOPE Initializing jgroups_cluster.distributed_property_string. Receivied this property: TCP(bind_port=5060;thread_pool_rejection_policy=run;level=ERROR):TCPPING(initial_hosts=10.38.46.28[5060],10.38.46.27[5060],10.38.175.30[5060],10.38.175.32[5060];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_SOCK:FD(timeout=5000;max_tries=48;):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=3000;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(join_timeout=5000;print_local_addr=true)

[2017-07-24 22:01:01.666] ALL 000000000000 GLOBAL_SCOPE Done initializing jgroups_cluster.distributed_property_string. Using this property: TCP(bind_port=5060;thread_pool_rejection_policy=run;level=ERROR):TCPPING(initial_hosts=10.38.46.28[5060],10.38.46.27[5060],10.38.175.30[5060],10.38.175.32[5060];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_SOCK:FD(timeout=5000;max_tries=48):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=3000;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(join_timeout=5000;print_local_addr=true)

*Node3 Property strings*:
[2017-07-24 22:02:01.411] ALL 000000000000 GLOBAL_SCOPE Initializing jgroups_cluster.property_string. Receivied this property: TCP(bind_addr=10.38.175.30;bind_port=5061;level=ERROR):TCPPING(initial_hosts=10.38.175.30[5061],10.38.46.27[5061],10.38.46.28[5061],10.38.175.32[5061];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_ALL(interval=5000;timeout=20000):FD(timeout=5000;max_tries=110;):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=100,200,300,600,1200,2400,4800;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(print_local_addr=true;join_timeout=5000)

[2017-07-24 22:02:01.411] ALL 000000000000 GLOBAL_SCOPE Done initializing jgroups_cluster.property_string. Using this property: TCP(bind_addr=10.38.175.30;bind_port=5061;level=ERROR):TCPPING(initial_hosts=10.38.175.30[5061],10.38.46.27[5061],10.38.46.28[5061],10.38.175.32[5061];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_ALL(interval=5000;timeout=20000):FD(timeout=5000;max_tries=110):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=100,200,300,600,1200,2400,4800;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(print_local_addr=true;join_timeout=5000)

[2017-07-24 22:02:01.411] ALL 000000000000 GLOBAL_SCOPE Initializing jgroups_cluster.distributed_property_string. Receivied this property: TCP(bind_port=5060;thread_pool_rejection_policy=run;level=ERROR):TCPPING(initial_hosts=10.38.175.30[5060],10.38.46.27[5060],10.38.46.28[5060],10.38.175.32[5060];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_SOCK:FD(timeout=5000;max_tries=48;):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=3000;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(join_timeout=5000;print_local_addr=true)

[2017-07-24 22:02:01.411] ALL 000000000000 GLOBAL_SCOPE Done initializing jgroups_cluster.distributed_property_string. Using this property: TCP(bind_port=5060;thread_pool_rejection_policy=run;level=ERROR):TCPPING(initial_hosts=10.38.175.30[5060],10.38.46.27[5060],10.38.46.28[5060],10.38.175.32[5060];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_SOCK:FD(timeout=5000;max_tries=48):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=3000;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(join_timeout=5000;print_local_addr=true)

*Node4 Property strings*:
[2017-07-24 22:01:14.365] ALL 000000000000 GLOBAL_SCOPE Initializing jgroups_cluster.property_string. Receivied this property: TCP(bind_addr=10.38.175.32;bind_port=5061;level=ERROR):TCPPING(initial_hosts=10.38.175.32[5061],10.38.46.27[5061],10.38.46.28[5061],10.38.175.30[5061];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_ALL(interval=5000;timeout=20000):FD(timeout=5000;max_tries=110;):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=100,200,300,600,1200,2400,4800;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(print_local_addr=true;join_timeout=5000)

[2017-07-24 22:01:14.365] ALL 000000000000 GLOBAL_SCOPE Done initializing jgroups_cluster.property_string. Using this property: TCP(bind_addr=10.38.175.32;bind_port=5061;level=ERROR):TCPPING(initial_hosts=10.38.175.32[5061],10.38.46.27[5061],10.38.46.28[5061],10.38.175.30[5061];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_ALL(interval=5000;timeout=20000):FD(timeout=5000;max_tries=110):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=100,200,300,600,1200,2400,4800;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(print_local_addr=true;join_timeout=5000)

[2017-07-24 22:01:14.365] ALL 000000000000 GLOBAL_SCOPE Initializing jgroups_cluster.distributed_property_string. Receivied this property: TCP(bind_port=5060;thread_pool_rejection_policy=run;level=ERROR):TCPPING(initial_hosts=10.38.175.32[5060],10.38.46.27[5060],10.38.46.28[5060],10.38.175.30[5060];port_range=1;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_SOCK:FD(timeout=5000;max_tries=48;):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=3000;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(join_timeout=5000;print_local_addr=true)

[2017-07-24 22:01:14.365] ALL 000000000000 GLOBAL_SCOPE Done initializing jgroups_cluster.distributed_property_string. Using this property: TCP(bind_port=5060;thread_pool_rejection_policy=run;level=ERROR):TCPPING(initial_hosts=10.38.175.32[5060],10.38.46.27[5060],10.38.46.28[5060],10.38.175.30[5060];port_range=1;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_SOCK:FD(timeout=5000;max_tries=48):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=3000;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(join_timeout=5000;print_local_addr=true)

  






--
This message was sent by Atlassian JIRA
(v7.2.3#72005)


More information about the jboss-jira mailing list