]
Swathi Kumar commented on JGRP-2206:
------------------------------------
Hi Bela,
The property string above shows that our customer *IS* setting the bind address:
TCP(bind_addr=10.38.46.27;bind_port=5061;level=ERROR), so something else in his VMware
environment seems to be causing this issue.
Our customer can ping between his nodes using the addresses above. We asked him to
run ipconfig from a command window on node1.
Here is the output:
C:\Users>ipconfig
Windows IP Configuration
Ethernet adapter Local Area Connection 5:
Connection-specific DNS Suffix . : s3.chp.cba
Link-local IPv6 Address . . . . . : fe80::1dda:27b9:ca17:40d5%11
IPv4 Address. . . . . . . . . . . : 10.38.46.27
Subnet Mask . . . . . . . . . . . : 255.255.255.0
Default Gateway . . . . . . . . . : 10.38.46.254
Tunnel adapter isatap.s3.chp.cba:
Media State . . . . . . . . . . . : Media disconnected
Connection-specific DNS Suffix . :
Do you have insight (or troubleshooting experience) into VMware configurations that might
explain why the localhost address is actually bound rather than the
"bind_addr=10.38.46.27" above?
I appreciate any help you may be able to provide,
Jeff
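One way to approach the question above independently of JGroups is to check whether the operating system allows a TCP socket to be bound to the configured address at all. The following is only a minimal sketch, not part of JGroups or of our product: the helper name can_bind is hypothetical, and it tests IPv4 only.

```python
import socket


def can_bind(address, port=0):
    """Return True if a TCP socket can be bound to (address, port) on this host.

    port=0 asks the OS for any free port, so this checks the address only.
    Hypothetical helper for troubleshooting; IPv4 only.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((address, port))
        except OSError:
            # Raised, for example, when the address is not assigned to any
            # local interface or the port is already in use.
            return False
        return True


# Example (run on node1): can_bind("10.38.46.27")
```

If this returned False on node1 for 10.38.46.27, it would suggest that the address is not usable by local processes, even though ipconfig lists it.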
Property strings are correct but JGROUPS is not recognizing other nodes
-----------------------------------------------------------------------
Key: JGRP-2206
URL: https://issues.jboss.org/browse/JGRP-2206
Project: JGroups
Issue Type: Bug
Affects Versions: 3.4
Environment: With the Partitioning, Real Application Clusters, Automatic Storage
Management, OLAP, Data Mining and Real Application Testing options
OS: Windows Server 2008 R2 6.1,amd64
Java version: 1.7.0,pwa6470sr9fp10-20150708_01 (SR9 FP10),IBM Corporation
Reporter: Swathi Kumar
Assignee: Bela Ban
Priority: Blocker
Attachments: VisibilityIssue.zip
Our customer has a four-node cluster which we believe is correctly defined, yet the nodes
are not communicating with each other.
All nodes are on VMware. None of the hostnames are virtual (they are all directly
attached to an IP address and are not managed by load balancers, etc.).
The nodes are located in two data centers (two nodes in each), and JGroups is operating over
TCP rather than UDP multicast.
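Because the cluster runs over TCP between data centers, a successful ping does not by itself show that the JGroups ports are reachable. The sketch below is a hedged illustration of a plain TCP connect test; the helper name can_connect and the 5-second default timeout are assumptions, not anything taken from JGroups.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout.

    Hypothetical troubleshooting helper: it only opens and closes a
    connection, it does not speak the JGroups wire protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


# Example (run on node1 while node3 is up): can_connect("10.38.175.30", 5061)
```

Running the check from each node against the bind_port of every other node (5061 and 5060 in the property strings in this report) would show whether a firewall between the data centers is dropping the traffic.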
NOTE: The issue occurs only in the customer's environment (we are not able to
reproduce this issue in our lab).
We are attaching our logs (noapp.log.<timestamp>) with JGROUPS debugging enabled.
*Node1 Property strings*:
[2017-07-24 21:58:30.867] ALL 000000000000 GLOBAL_SCOPE Initializing
jgroups_cluster.property_string. Receivied this property:
TCP(bind_addr=10.38.46.27;bind_port=5061;level=ERROR):TCPPING(initial_hosts=10.38.46.27[5061],10.38.46.28[5061],10.38.175.30[5061],10.38.175.32[5061];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_ALL(interval=5000;timeout=20000):FD(timeout=5000;max_tries=110;):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=100,200,300,600,1200,2400,4800;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(print_local_addr=true;join_timeout=5000)
[2017-07-24 21:58:30.867] ALL 000000000000 GLOBAL_SCOPE Done initializing
jgroups_cluster.property_string. Using this property:
TCP(bind_addr=10.38.46.27;bind_port=5061;level=ERROR):TCPPING(initial_hosts=10.38.46.27[5061],10.38.46.28[5061],10.38.175.30[5061],10.38.175.32[5061];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_ALL(interval=5000;timeout=20000):FD(timeout=5000;max_tries=110):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=100,200,300,600,1200,2400,4800;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(print_local_addr=true;join_timeout=5000)
[2017-07-24 21:58:30.867] ALL 000000000000 GLOBAL_SCOPE Initializing
jgroups_cluster.distributed_property_string. Receivied this property:
TCP(bind_port=5060;thread_pool_rejection_policy=run;level=ERROR):TCPPING(initial_hosts=10.38.46.27[5060],10.38.46.28[5060],10.38.175.30[5060],10.38.175.32[5060];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_SOCK:FD(timeout=5000;max_tries=48;):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=3000;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(join_timeout=5000;print_local_addr=true)
[2017-07-24 21:58:30.867] ALL 000000000000 GLOBAL_SCOPE Done initializing
jgroups_cluster.distributed_property_string. Using this property:
TCP(bind_port=5060;thread_pool_rejection_policy=run;level=ERROR):TCPPING(initial_hosts=10.38.46.27[5060],10.38.46.28[5060],10.38.175.30[5060],10.38.175.32[5060];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_SOCK:FD(timeout=5000;max_tries=48):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=3000;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(join_timeout=5000;print_local_addr=true)
*Node2 Property strings*:
[2017-07-24 22:01:01.666] ALL 000000000000 GLOBAL_SCOPE Initializing
jgroups_cluster.property_string. Receivied this property:
TCP(bind_addr=10.38.46.28;bind_port=5061;level=ERROR):TCPPING(initial_hosts=10.38.46.28[5061],10.38.46.27[5061],10.38.175.30[5061],10.38.175.32[5061];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_ALL(interval=5000;timeout=20000):FD(timeout=5000;max_tries=110;):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=100,200,300,600,1200,2400,4800;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(print_local_addr=true;join_timeout=5000)
[2017-07-24 22:01:01.666] ALL 000000000000 GLOBAL_SCOPE Done initializing
jgroups_cluster.property_string. Using this property:
TCP(bind_addr=10.38.46.28;bind_port=5061;level=ERROR):TCPPING(initial_hosts=10.38.46.28[5061],10.38.46.27[5061],10.38.175.30[5061],10.38.175.32[5061];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_ALL(interval=5000;timeout=20000):FD(timeout=5000;max_tries=110):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=100,200,300,600,1200,2400,4800;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(print_local_addr=true;join_timeout=5000)
[2017-07-24 22:01:01.666] ALL 000000000000 GLOBAL_SCOPE Initializing
jgroups_cluster.distributed_property_string. Receivied this property:
TCP(bind_port=5060;thread_pool_rejection_policy=run;level=ERROR):TCPPING(initial_hosts=10.38.46.28[5060],10.38.46.27[5060],10.38.175.30[5060],10.38.175.32[5060];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_SOCK:FD(timeout=5000;max_tries=48;):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=3000;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(join_timeout=5000;print_local_addr=true)
[2017-07-24 22:01:01.666] ALL 000000000000 GLOBAL_SCOPE Done initializing
jgroups_cluster.distributed_property_string. Using this property:
TCP(bind_port=5060;thread_pool_rejection_policy=run;level=ERROR):TCPPING(initial_hosts=10.38.46.28[5060],10.38.46.27[5060],10.38.175.30[5060],10.38.175.32[5060];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_SOCK:FD(timeout=5000;max_tries=48):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=3000;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(join_timeout=5000;print_local_addr=true)
*Node3 Property strings*:
[2017-07-24 22:02:01.411] ALL 000000000000 GLOBAL_SCOPE Initializing
jgroups_cluster.property_string. Receivied this property:
TCP(bind_addr=10.38.175.30;bind_port=5061;level=ERROR):TCPPING(initial_hosts=10.38.175.30[5061],10.38.46.27[5061],10.38.46.28[5061],10.38.175.32[5061];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_ALL(interval=5000;timeout=20000):FD(timeout=5000;max_tries=110;):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=100,200,300,600,1200,2400,4800;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(print_local_addr=true;join_timeout=5000)
[2017-07-24 22:02:01.411] ALL 000000000000 GLOBAL_SCOPE Done initializing
jgroups_cluster.property_string. Using this property:
TCP(bind_addr=10.38.175.30;bind_port=5061;level=ERROR):TCPPING(initial_hosts=10.38.175.30[5061],10.38.46.27[5061],10.38.46.28[5061],10.38.175.32[5061];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_ALL(interval=5000;timeout=20000):FD(timeout=5000;max_tries=110):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=100,200,300,600,1200,2400,4800;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(print_local_addr=true;join_timeout=5000)
[2017-07-24 22:02:01.411] ALL 000000000000 GLOBAL_SCOPE Initializing
jgroups_cluster.distributed_property_string. Receivied this property:
TCP(bind_port=5060;thread_pool_rejection_policy=run;level=ERROR):TCPPING(initial_hosts=10.38.175.30[5060],10.38.46.27[5060],10.38.46.28[5060],10.38.175.32[5060];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_SOCK:FD(timeout=5000;max_tries=48;):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=3000;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(join_timeout=5000;print_local_addr=true)
[2017-07-24 22:02:01.411] ALL 000000000000 GLOBAL_SCOPE Done initializing
jgroups_cluster.distributed_property_string. Using this property:
TCP(bind_port=5060;thread_pool_rejection_policy=run;level=ERROR):TCPPING(initial_hosts=10.38.175.30[5060],10.38.46.27[5060],10.38.46.28[5060],10.38.175.32[5060];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_SOCK:FD(timeout=5000;max_tries=48):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=3000;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(join_timeout=5000;print_local_addr=true)
*Node4 Property strings*:
[2017-07-24 22:01:14.365] ALL 000000000000 GLOBAL_SCOPE Initializing
jgroups_cluster.property_string. Receivied this property:
TCP(bind_addr=10.38.175.32;bind_port=5061;level=ERROR):TCPPING(initial_hosts=10.38.175.32[5061],10.38.46.27[5061],10.38.46.28[5061],10.38.175.30[5061];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_ALL(interval=5000;timeout=20000):FD(timeout=5000;max_tries=110;):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=100,200,300,600,1200,2400,4800;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(print_local_addr=true;join_timeout=5000)
[2017-07-24 22:01:14.365] ALL 000000000000 GLOBAL_SCOPE Done initializing
jgroups_cluster.property_string. Using this property:
TCP(bind_addr=10.38.175.32;bind_port=5061;level=ERROR):TCPPING(initial_hosts=10.38.175.32[5061],10.38.46.27[5061],10.38.46.28[5061],10.38.175.30[5061];port_range=0;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_ALL(interval=5000;timeout=20000):FD(timeout=5000;max_tries=110):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=100,200,300,600,1200,2400,4800;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(print_local_addr=true;join_timeout=5000)
[2017-07-24 22:01:14.365] ALL 000000000000 GLOBAL_SCOPE Initializing
jgroups_cluster.distributed_property_string. Receivied this property:
TCP(bind_port=5060;thread_pool_rejection_policy=run;level=ERROR):TCPPING(initial_hosts=10.38.175.32[5060],10.38.46.27[5060],10.38.46.28[5060],10.38.175.30[5060];port_range=1;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_SOCK:FD(timeout=5000;max_tries=48;):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=3000;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(join_timeout=5000;print_local_addr=true)
[2017-07-24 22:01:14.365] ALL 000000000000 GLOBAL_SCOPE Done initializing
jgroups_cluster.distributed_property_string. Using this property:
TCP(bind_port=5060;thread_pool_rejection_policy=run;level=ERROR):TCPPING(initial_hosts=10.38.175.32[5060],10.38.46.27[5060],10.38.46.28[5060],10.38.175.30[5060];port_range=1;timeout=5000;num_initial_members=4):MERGE2(min_interval=3000;max_interval=5000):FD_SOCK:FD(timeout=5000;max_tries=48):VERIFY_SUSPECT(timeout=1500):pbcast.NAKACK(retransmit_timeout=3000;discard_delivered_msgs=true):pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;max_bytes=0):pbcast.GMS(join_timeout=5000;print_local_addr=true)
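The property strings above are long enough that differences between nodes are easy to miss by eye. The following sketch parses a property string and reports settings that differ between two nodes. The function names are hypothetical, and the parser covers only the syntax that appears in these logs, not the full JGroups configuration grammar. The sample strings are abbreviated excerpts of the Node1 and Node4 distributed_property_string values.

```python
def parse_property_string(prop):
    """Split a property string into a list of (protocol, {param: value})."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(prop):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == ":" and depth == 0:
            # A colon outside parentheses separates two protocols.
            parts.append(prop[start:i])
            start = i + 1
    parts.append(prop[start:])

    result = []
    for part in parts:
        name, _, body = part.partition("(")
        if body.endswith(")"):
            body = body[:-1]
        params = {}
        for item in body.split(";"):
            if item:  # skips the empty item left by a trailing ';'
                key, _, value = item.partition("=")
                params[key] = value
        result.append((name, params))
    return result


def diff_params(prop_a, prop_b, ignore=("bind_addr", "initial_hosts")):
    """Return (protocol, key, value_a, value_b) for settings that differ.

    bind_addr and initial_hosts are ignored by default because they are
    expected to vary per node.
    """
    a = dict(parse_property_string(prop_a))
    b = dict(parse_property_string(prop_b))
    differences = []
    for protocol in sorted(set(a) | set(b)):
        params_a = a.get(protocol, {})
        params_b = b.get(protocol, {})
        for key in sorted(set(params_a) | set(params_b)):
            if key in ignore:
                continue
            if params_a.get(key) != params_b.get(key):
                differences.append(
                    (protocol, key, params_a.get(key), params_b.get(key))
                )
    return differences


# Abbreviated excerpts of the distributed_property_string on Node1 and Node4.
node1 = "TCP(bind_port=5060;level=ERROR):TCPPING(port_range=0;timeout=5000):FD_SOCK"
node4 = "TCP(bind_port=5060;level=ERROR):TCPPING(port_range=1;timeout=5000):FD_SOCK"

print(diff_params(node1, node4))
print("bind_addr" in dict(parse_property_string(node1))["TCP"])
```

On these excerpts the comparison reports the TCPPING port_range setting (0 on Node1, 1 on Node4), and the second check shows that the TCP protocol in the distributed_property_string carries no bind_addr, unlike the property_string. Whether either observation is related to the problem is not established here.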