[jboss-jira] [JBoss JIRA] Resolved: (JGRP-364) When using TCP_NIO, starting two nodes at the same time causes one of the nodes not to join group

Friday, 21 September 2007

     [ http://jira.jboss.com/jira/browse/JGRP-364?page=all ]

Bela Ban resolved JGRP-364.
---------------------------

    Resolution: Cannot Reproduce Bug

Works perfectly for me. Could not reproduce this on Fedora 7 with the 2.6.22.6 kernel.

...
 When using TCP_NIO, starting two nodes at the same time causes one of
the nodes not to join group

-------------------------------------------------------------------------------------------------

                 Key: JGRP-364
                 URL: http://jira.jboss.com/jira/browse/JGRP-364
             Project: JGroups
          Issue Type: Bug
    Affects Versions: 2.4
         Environment: linux 2.6 kernel x86_64 running java 1.5.0_06
            Reporter: Matthew Todd
         Assigned To: Scott Marlow
             Fix For: 2.6

         Attachments: test.xml, test1.bat, test2.bat, test2.xml, test3.bat, test3.xml

 I am testing a JGroups TCP_NIO configuration using the Draw demo. If I start up my 3 nodes
one by one, everything works fine. However, if I start node 1 and then attempt to start
nodes 2 and 3 in parallel, only node 2 works. Node 3 is isolated, does not see
the other nodes, and logs the following message:
 org.jgroups.protocols.pbcast.ClientGmsImpl join
 WARNING: join(192.158.70.200:7802) sent to 192.158.70.200:7800 timed out, retrying
 I am starting the Draw demo like this:
 java -cp jgroups-all.jar:commons-logging.jar:concurrent.jar:jmxri.jar org.jgroups.demos.Draw -props test.xml
 Here is the configuration for one of my nodes:
 <config>
     <TCP_NIO bind_addr="192.158.70.200"
              recv_buf_size="20000000"
              send_buf_size="640000"
              loopback="false"
              discard_incompatible_packets="true"
              max_bundle_size="64000"
              max_bundle_timeout="30"
              use_incoming_packet_handler="true"
              use_outgoing_packet_handler="true"
              down_thread="false" up_thread="false"
              enable_bundling="true"
              start_port="7800"
              end_port="7800"
              use_send_queues="false"
              sock_conn_timeout="300" skip_suspected_members="true"/>

     <MPING timeout="2000" num_initial_members="3"
            mcast_addr="229.6.7.8"
            bind_addr="192.158.70.200"
            down_thread="false" up_thread="false"/>

     <MERGE2 max_interval="100000"
             down_thread="false" up_thread="false"
             min_interval="20000"/>
     <FD_SOCK down_thread="false" up_thread="false"/>
     <VERIFY_SUSPECT timeout="1500"
                     down_thread="false" up_thread="false"/>
     <pbcast.NAKACK max_xmit_size="60000"
                    use_mcast_xmit="false" gc_lag="0"
                    retransmit_timeout="300,600,1200,2400,4800"
                    down_thread="true" up_thread="true"
                    discard_delivered_msgs="true"/>
     <pbcast.STABLE stability_delay="1000"
                    desired_avg_gossip="50000"
                    down_thread="false" up_thread="false"
                    max_bytes="400000"/>
     <pbcast.GMS print_local_addr="true" join_timeout="3000"
                 down_thread="true" up_thread="true"
                 join_retry_timeout="2000" shun="true"
                 view_bundling="true"/>
     <!-- <FC max_credits="2000000"
              down_thread="false" up_thread="false"
              min_threshold="0.10"/>
     <FRAG2 frag_size="60000"
            down_thread="false" up_thread="false"/> -->
     <pbcast.STATE_TRANSFER/>
     <!-- <pbcast.FLUSH down_thread="false" up_thread="false"/> -->
 </config>
 Nodes 2 and 3 have the same configuration, except that the port they bind to has been changed.
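
 For illustration only, the per-node difference might look like the fragment below. This is
a sketch, not the attached test3.xml, which is not reproduced in this message. It assumes the
port is changed through start_port and end_port, and it takes 7802 as node 3's port because
that is the address shown in the join warning above. Every other attribute and protocol is
assumed to match the node 1 configuration.

```xml
<!-- Hypothetical node 3 transport: only the attributes assumed to differ are shown. -->
<!-- All remaining TCP_NIO attributes are assumed identical to the node 1 config. -->
<TCP_NIO bind_addr="192.158.70.200"
         start_port="7802"
         end_port="7802"/>
```

 Setting start_port and end_port to the same value pins the node to one port, as in the node 1
configuration, so the 7802 in the warning can be attributed to a single node.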

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://jira.jboss.com/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
