I am having difficulty getting a four-node, TCP-based JGroups (2.4.1) cluster to operate
properly.
The cluster will function as expected until the coordinator dies or is gracefully shut
down. At that point the three remaining nodes do not 'elect' a new coordinator and
are forever waiting for the old coordinator to come back online.
When I reintroduce the previous coordinator into the cluster, it hangs while trying to
rejoin: the other nodes still list it as the coordinator, so it is in effect trying to join through itself.
I've tested this same scenario using UDP multicast communication and it works fine.
However, TCP is the only option we have in our target production environment.
Any help would be great. Here is a snippet of my cluster configuration:
++++++++++++++
<TCP loopback="true"
     start_port="6006"
     bind_addr="10.10.21.73"/>
<TCPPING initial_hosts="vhcertrh01[6006],vhcertrh01[6106],vhcertrh02[6006],vhcertrh02[6106]"
         port_range="10"
         timeout="3000"
         num_initial_members="2"/>
<pbcast.NAKACK gc_lag="50"
               retransmit_timeout="600,1200,2400,4800"
               max_xmit_size="8192"
               up_thread="false"
               down_thread="false"/>
<UNICAST timeout="600,1200,2400"
         window_size="100"
         min_threshold="10"
         down_thread="false"/>
<pbcast.STABLE desired_avg_gossip="20000"
               up_thread="false"
               down_thread="false"/>
<FRAG frag_size="8192"
      down_thread="false"
      up_thread="false"/>
<pbcast.GMS join_timeout="5000"
            join_retry_timeout="2000"
            shun="true"
            print_local_addr="true"/>
<pbcast.STATE_TRANSFER up_thread="true"
                       down_thread="true"/>
<FD timeout="2500"
    max_tries="3"
    shun="true"/>
<FD_SOCK/>
++++++++++++++++++++++++++
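For comparison, here is a sketch of the same protocols with only the order changed. A JGroups XML configuration lists the stack from bottom to top, and the stock tcp.xml shipped with JGroups places FD_SOCK and FD directly above the discovery protocol, below pbcast.NAKACK and pbcast.GMS, whereas the snippet above lists them last. Whether that ordering explains the missing coordinator election is an assumption and untested; all attribute values are copied unchanged from the snippet above.

```xml
<!-- Sketch only: same protocols and attributes as above, reordered so that
     failure detection sits below NAKACK and GMS (assumption, not a confirmed fix). -->
<TCP loopback="true"
     start_port="6006"
     bind_addr="10.10.21.73"/>
<TCPPING initial_hosts="vhcertrh01[6006],vhcertrh01[6106],vhcertrh02[6006],vhcertrh02[6106]"
         port_range="10"
         timeout="3000"
         num_initial_members="2"/>
<!-- Failure detection moved here from the end of the stack -->
<FD_SOCK/>
<FD timeout="2500"
    max_tries="3"
    shun="true"/>
<pbcast.NAKACK gc_lag="50"
               retransmit_timeout="600,1200,2400,4800"
               max_xmit_size="8192"
               up_thread="false"
               down_thread="false"/>
<UNICAST timeout="600,1200,2400"
         window_size="100"
         min_threshold="10"
         down_thread="false"/>
<pbcast.STABLE desired_avg_gossip="20000"
               up_thread="false"
               down_thread="false"/>
<FRAG frag_size="8192"
      down_thread="false"
      up_thread="false"/>
<pbcast.GMS join_timeout="5000"
            join_retry_timeout="2000"
            shun="true"
            print_local_addr="true"/>
<pbcast.STATE_TRANSFER up_thread="true"
                       down_thread="true"/>
```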
Thanks,
Tyke