[JBoss JIRA] Created: (JGRP-364) When using TCP_NIO, starting two nodes at the same time causes one of the nodes not to join group
by Matthew Todd (JIRA)
When using TCP_NIO, starting two nodes at the same time causes one of the nodes not to join group
-------------------------------------------------------------------------------------------------
Key: JGRP-364
URL: http://jira.jboss.com/jira/browse/JGRP-364
Project: JGroups
Issue Type: Bug
Affects Versions: 2.4
Environment: linux 2.6 kernel x86_64 running java 1.5.0_06
Reporter: Matthew Todd
Assigned To: Bela Ban
I am testing a jgroups tcp_nio configuration using the draw demo.If I start up my 3 nodes one by one then everything works fine. However if I start up node 1, then attempt to start node 2 and 3 in parallel then only node 2 will work. Node 3 will be isolated and not see the other nodes and logs the following message:
org.jgroups.protocols.pbcast.ClientGmsImpl join
WARNING: join(192.158.70.200:7802) sent to 192.158.70.200:7800 timed out, retrying
I am starting the draw demo like this;
java -cp jgroups-all.jar:commons-logging.jar:concurrent.jar:jmxri.jar org.jgroups.demos.Draw -props test.xml
Here is the configuration for one of my nodes:
<config>
<TCP_NIO
bind_addr="192.158.70.200"
recv_buf_size="20000000"
send_buf_size="640000"
loopback="false"
discard_incompatible_packets="true"
max_bundle_size="64000"
max_bundle_timeout="30"
use_incoming_packet_handler="true"
use_outgoing_packet_handler="true"
down_thread="false" up_thread="false"
enable_bundling="true"
start_port="7800"
end_port="7800"
use_send_queues="false"
sock_conn_timeout="300" skip_suspected_members="true"
/>
<MPING timeout="2000" num_initial_members="3" mcast_addr="229.6.7.8"
bind_addr="192.158.70.200" down_thread="false" up_thread="false"/>
<MERGE2 max_interval="100000"
down_thread="false" up_thread="false" min_interval="20000"/>
<FD_SOCK down_thread="false" up_thread="false"/>
<VERIFY_SUSPECT timeout="1500" down_thread="false" up_thread="false"/>
<pbcast.NAKACK max_xmit_size="60000"
use_mcast_xmit="false" gc_lag="0"
retransmit_timeout="300,600,1200,2400,4800"
down_thread="true" up_thread="true"
discard_delivered_msgs="true"/>
<pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
down_thread="false" up_thread="false"
max_bytes="400000"/>
<pbcast.GMS print_local_addr="true" join_timeout="3000"
down_thread="true" up_thread="true"
join_retry_timeout="2000" shun="true"
view_bundling="true"/>
<!-- <FC max_credits="2000000" down_thread="false" up_thread="false"
min_threshold="0.10"/>
<FRAG2 frag_size="60000" down_thread="false" up_thread="false"/> -->
<pbcast.STATE_TRANSFER/>
<!-- <pbcast.FLUSH down_thread="false" up_thread="false"/>-->
</config>
Node 2 and 3 have the same configuration except the port they bind to has been changed
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://jira.jboss.com/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
12 years, 4 months