[jboss-user] [JBossCache] - TCP clustering problem

Tue Feb 26 15:14:47 EST 2008

I am having difficulty getting a four-node, TCP-based JGroups (2.4.1) cluster to operate properly.
The cluster functions as expected until the coordinator dies or is gracefully shut down. At that point the three remaining nodes do not elect a new coordinator and wait indefinitely for the old coordinator to come back online.
When I then try re-introducing the previous coordinator into the cluster, it hangs trying to rejoin through the coordinator, which the other nodes still believe to be that same node.
I've tested this same scenario using UDP multicast communication and it works fine. However, TCP is the only option we have in our target production environment.

Any help would be great. Here is a snippet of my cluster configuration:
++++++++++++++
                <TCP            loopback="true"
                                start_port="6006"
                                bind_addr="10.10.21.73"/>
                <TCPPING        initial_hosts="vhcertrh01[6006],vhcertrh01[6106],vhcertrh02[6006],vhcertrh02[6106]"
                                port_range="10"
                                timeout="3000"
                                num_initial_members="2"/>
                <pbcast.NAKACK  gc_lag="50"
                                retransmit_timeout="600,1200,2400,4800"
                                max_xmit_size="8192"
                                up_thread="false"
                                down_thread="false"/>
                <UNICAST        timeout="600,1200,2400"
                                window_size="100"
                                min_threshold="10"
                                down_thread="false"/>
                <pbcast.STABLE  desired_avg_gossip="20000"
                                up_thread="false"
                                down_thread="false"/>
                <FRAG           frag_size="8192"
                                down_thread="false"
                                up_thread="false"/>
                <pbcast.GMS     join_timeout="5000"
                                join_retry_timeout="2000"
                                shun="true"
                                print_local_addr="true"/>
                <pbcast.STATE_TRANSFER
                                up_thread="true"
                                down_thread="true"/>
                <FD             timeout="2500"
                                max_tries="3"
                                shun="true"/>
                <FD_SOCK/>
++++++++++++++++++++++++++
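
To make the TCPPING line above easier to read: initial_hosts is a comma-separated list of host[port] entries, and here it names two ports on each of two machines. Below is a minimal sketch that splits such a string into endpoints. The class name InitialHosts and its parse method are hypothetical helpers written for illustration only; they are not part of JGroups, and the sketch does not model port_range or any discovery behaviour.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a JGroups class): splits a TCPPING-style
// initial_hosts value such as "hostA[6006],hostB[6106]" into endpoints.
class InitialHosts {

    static List<InetSocketAddress> parse(String initialHosts) {
        List<InetSocketAddress> endpoints = new ArrayList<>();
        for (String entry : initialHosts.split(",")) {
            String trimmed = entry.trim();
            int open = trimmed.indexOf('[');
            int close = trimmed.indexOf(']');
            // Each entry must look like host[port], with ']' as the last character.
            if (open <= 0 || close != trimmed.length() - 1) {
                throw new IllegalArgumentException("Expected host[port], got: " + trimmed);
            }
            String host = trimmed.substring(0, open);
            int port = Integer.parseInt(trimmed.substring(open + 1, close));
            // createUnresolved avoids a DNS lookup; only the names are recorded.
            endpoints.add(InetSocketAddress.createUnresolved(host, port));
        }
        return endpoints;
    }

    public static void main(String[] args) {
        String hosts = "vhcertrh01[6006],vhcertrh01[6106],vhcertrh02[6006],vhcertrh02[6106]";
        for (InetSocketAddress endpoint : parse(hosts)) {
            System.out.println(endpoint.getHostString() + ":" + endpoint.getPort());
        }
    }
}
```

Run against the value from my configuration, this lists four endpoints, one per cluster node, which can be compared with num_initial_members="2" on the same element.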
Thanks,
Tyke

View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4132303#4132303
