[jboss-jira] [JBoss JIRA] (JGRP-2206) Property strings are correct but JGROUPS is not recognizing other nodes

Swathi Kumar (JIRA) issues at jboss.org
Wed Jul 26 11:44:00 EDT 2017


    [ https://issues.jboss.org/browse/JGRP-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440686#comment-13440686 ] 

Swathi Kumar commented on JGRP-2206:
------------------------------------

Hi Bela,

The property string above shows our customer *IS* setting the bind address:  TCP(bind_addr=10.38.46.27;bind_port=5061;level=ERROR) so something else in his vmware environment seems to be leading to this issue.

Our customer can ping between his nodes using the addresses above.  We have asked him to execute ipconfig from a command window on node1.

Here is what they see:

         C:\Users>ipconfig

         Windows IP Configuration

         Ethernet adapter Local Area Connection 5:

            Connection-specific DNS Suffix  . : s3.chp.cba
            Link-local IPv6 Address . . . . . : fe80::1dda:27b9:ca17:40d5%11
            IPv4 Address. . . . . . . . . . . : 10.38.46.27
            Subnet Mask . . . . . . . . . . . : 255.255.255.0
            Default Gateway . . . . . . . . . : 10.38.46.254

         Tunnel adapter isatap.s3.chp.cba:

            Media State . . . . . . . . . . . : Media disconnected
            Connection-specific DNS Suffix  . :


Do you have insight (or troubleshooting experience) into vmware configurations that might explain why the localhost value is actually bound and not the "bind_addr=10.38.46.27" above?

I appreciate any help you be able to provide,

Jeff




> Property strings are correct but JGROUPS is not recognizing other nodes
> -----------------------------------------------------------------------
>
>                 Key: JGRP-2206
>                 URL: https://issues.jboss.org/browse/JGRP-2206
>             Project: JGroups
>          Issue Type: Bug
>    Affects Versions: 3.4
>         Environment: With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP, Data Mining and Real Application Testing options
> OS: Windows Server 2008 R2 6.1,amd64
> Java version: 1.7.0,pwa6470sr9fp10-20150708_01 (SR9 FP10),IBM Corporation
>            Reporter: Swathi Kumar
>            Assignee: Bela Ban
>            Priority: Blocker
>         Attachments: VisibilityIssue.zip
>
>
> Our customer has a four node cluster which we believe is correctly defined yet the nodes are not communicating with each other.
> All nodes are on VMWare. None of the hostnames are virtual (in that they are all directly attached to an IP and are not managed by load balancers, etc).
>    
> The nodes are located in separate data centers (2 in each) and jgroups is operating over tcp, rather than udp multicast.
> NOTE:  The issue occurs only in the customer's environment (we are not able to reproduce this issue in our lab).
> We are attaching our logs (noapp.log.<timestamp>) with JGROUPS debugging enabled.
> *Node1 Property strings*:
> [2017-07-24 21:58:30.867] ALL 000000000000 GLOBAL_SCOPE Initializing jgroups_cluster.property_string. Receivied this property: TCP(bind_addr=10.38.46.27;bind_port=5061;level=ERROR):TCPPING(initial_hosts=10.38.46.27[5061],10.38.46.28[5061],10.38.175.30[5061],10.38.175.32[5061];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_ALL(interval=5000;timeout=20000):FD(timeout=5000;max_tries=110;):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=100,200,300,600,1200,2400,4800;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(print_local_addr=true;join_timeout=5000)
> [2017-07-24 21:58:30.867] ALL 000000000000 GLOBAL_SCOPE Done initializing jgroups_cluster.property_string. Using this property: TCP(bind_addr=10.38.46.27;bind_port=5061;level=ERROR):TCPPING(initial_hosts=10.38.46.27[5061],10.38.46.28[5061],10.38.175.30[5061],10.38.175.32[5061];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_ALL(interval=5000;timeout=20000):FD(timeout=5000;max_tries=110):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=100,200,300,600,1200,2400,4800;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(print_local_addr=true;join_timeout=5000)
> [2017-07-24 21:58:30.867] ALL 000000000000 GLOBAL_SCOPE Initializing jgroups_cluster.distributed_property_string. Receivied this property: TCP(bind_port=5060;thread_pool_rejection_policy=run;level=ERROR):TCPPING(initial_hosts=10.38.46.27[5060],10.38.46.28[5060],10.38.175.30[5060],10.38.175.32[5060];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_SOCK:FD(timeout=5000;max_tries=48;):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=3000;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(join_timeout=5000;print_local_addr=true)
> [2017-07-24 21:58:30.867] ALL 000000000000 GLOBAL_SCOPE Done initializing jgroups_cluster.distributed_property_string. Using this property: TCP(bind_port=5060;thread_pool_rejection_policy=run;level=ERROR):TCPPING(initial_hosts=10.38.46.27[5060],10.38.46.28[5060],10.38.175.30[5060],10.38.175.32[5060];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_SOCK:FD(timeout=5000;max_tries=48):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=3000;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(join_timeout=5000;print_local_addr=true)
> *Node2 Property strings*:
> [2017-07-24 22:01:01.666] ALL 000000000000 GLOBAL_SCOPE Initializing jgroups_cluster.property_string. Receivied this property: TCP(bind_addr=10.38.46.28;bind_port=5061;level=ERROR):TCPPING(initial_hosts=10.38.46.28[5061],10.38.46.27[5061],10.38.175.30[5061],10.38.175.32[5061];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_ALL(interval=5000;timeout=20000):FD(timeout=5000;max_tries=110;):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=100,200,300,600,1200,2400,4800;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(print_local_addr=true;join_timeout=5000)
> [2017-07-24 22:01:01.666] ALL 000000000000 GLOBAL_SCOPE Done initializing jgroups_cluster.property_string. Using this property: TCP(bind_addr=10.38.46.28;bind_port=5061;level=ERROR):TCPPING(initial_hosts=10.38.46.28[5061],10.38.46.27[5061],10.38.175.30[5061],10.38.175.32[5061];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_ALL(interval=5000;timeout=20000):FD(timeout=5000;max_tries=110):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=100,200,300,600,1200,2400,4800;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(print_local_addr=true;join_timeout=5000)
> [2017-07-24 22:01:01.666] ALL 000000000000 GLOBAL_SCOPE Initializing jgroups_cluster.distributed_property_string. Receivied this property: TCP(bind_port=5060;thread_pool_rejection_policy=run;level=ERROR):TCPPING(initial_hosts=10.38.46.28[5060],10.38.46.27[5060],10.38.175.30[5060],10.38.175.32[5060];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_SOCK:FD(timeout=5000;max_tries=48;):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=3000;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(join_timeout=5000;print_local_addr=true)
> [2017-07-24 22:01:01.666] ALL 000000000000 GLOBAL_SCOPE Done initializing jgroups_cluster.distributed_property_string. Using this property: TCP(bind_port=5060;thread_pool_rejection_policy=run;level=ERROR):TCPPING(initial_hosts=10.38.46.28[5060],10.38.46.27[5060],10.38.175.30[5060],10.38.175.32[5060];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_SOCK:FD(timeout=5000;max_tries=48):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=3000;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(join_timeout=5000;print_local_addr=true)
> *Node3 Property strings*:
> [2017-07-24 22:02:01.411] ALL 000000000000 GLOBAL_SCOPE Initializing jgroups_cluster.property_string. Receivied this property: TCP(bind_addr=10.38.175.30;bind_port=5061;level=ERROR):TCPPING(initial_hosts=10.38.175.30[5061],10.38.46.27[5061],10.38.46.28[5061],10.38.175.32[5061];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_ALL(interval=5000;timeout=20000):FD(timeout=5000;max_tries=110;):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=100,200,300,600,1200,2400,4800;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(print_local_addr=true;join_timeout=5000)
> [2017-07-24 22:02:01.411] ALL 000000000000 GLOBAL_SCOPE Done initializing jgroups_cluster.property_string. Using this property: TCP(bind_addr=10.38.175.30;bind_port=5061;level=ERROR):TCPPING(initial_hosts=10.38.175.30[5061],10.38.46.27[5061],10.38.46.28[5061],10.38.175.32[5061];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_ALL(interval=5000;timeout=20000):FD(timeout=5000;max_tries=110):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=100,200,300,600,1200,2400,4800;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(print_local_addr=true;join_timeout=5000)
> [2017-07-24 22:02:01.411] ALL 000000000000 GLOBAL_SCOPE Initializing jgroups_cluster.distributed_property_string. Receivied this property: TCP(bind_port=5060;thread_pool_rejection_policy=run;level=ERROR):TCPPING(initial_hosts=10.38.175.30[5060],10.38.46.27[5060],10.38.46.28[5060],10.38.175.32[5060];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_SOCK:FD(timeout=5000;max_tries=48;):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=3000;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(join_timeout=5000;print_local_addr=true)
> [2017-07-24 22:02:01.411] ALL 000000000000 GLOBAL_SCOPE Done initializing jgroups_cluster.distributed_property_string. Using this property: TCP(bind_port=5060;thread_pool_rejection_policy=run;level=ERROR):TCPPING(initial_hosts=10.38.175.30[5060],10.38.46.27[5060],10.38.46.28[5060],10.38.175.32[5060];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_SOCK:FD(timeout=5000;max_tries=48):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=3000;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(join_timeout=5000;print_local_addr=true)
> *Node4 Property strings*:
> [2017-07-24 22:01:14.365] ALL 000000000000 GLOBAL_SCOPE Initializing jgroups_cluster.property_string. Receivied this property: TCP(bind_addr=10.38.175.32;bind_port=5061;level=ERROR):TCPPING(initial_hosts=10.38.175.32[5061],10.38.46.27[5061],10.38.46.28[5061],10.38.175.30[5061];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_ALL(interval=5000;timeout=20000):FD(timeout=5000;max_tries=110;):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=100,200,300,600,1200,2400,4800;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(print_local_addr=true;join_timeout=5000)
> [2017-07-24 22:01:14.365] ALL 000000000000 GLOBAL_SCOPE Done initializing jgroups_cluster.property_string. Using this property: TCP(bind_addr=10.38.175.32;bind_port=5061;level=ERROR):TCPPING(initial_hosts=10.38.175.32[5061],10.38.46.27[5061],10.38.46.28[5061],10.38.175.30[5061];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_ALL(interval=5000;timeout=20000):FD(timeout=5000;max_tries=110):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=100,200,300,600,1200,2400,4800;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(print_local_addr=true;join_timeout=5000)
> [2017-07-24 22:01:14.365] ALL 000000000000 GLOBAL_SCOPE Initializing jgroups_cluster.distributed_property_string. Receivied this property: TCP(bind_port=5060;thread_pool_rejection_policy=run;level=ERROR):TCPPING(initial_hosts=10.38.175.32[5060],10.38.46.27[5060],10.38.46.28[5060],10.38.175.30[5060];port_range=1;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_SOCK:FD(timeout=5000;max_tries=48;):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=3000;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(join_timeout=5000;print_local_addr=true)
> [2017-07-24 22:01:14.365] ALL 000000000000 GLOBAL_SCOPE Done initializing jgroups_cluster.distributed_property_string. Using this property: TCP(bind_port=5060;thread_pool_rejection_policy=run;level=ERROR):TCPPING(initial_hosts=10.38.175.32[5060],10.38.46.27[5060],10.38.46.28[5060],10.38.175.30[5060];port_range=1;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_SOCK:FD(timeout=5000;max_tries=48):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=3000;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(join_timeout=5000;print_local_addr=true)
>   



--
This message was sent by Atlassian JIRA
(v7.2.3#72005)


More information about the jboss-jira mailing list