[jboss-user] [Clustering/JBoss] - Trouble getting clustering to work in a certain setup

heineson do-not-reply at jboss.com
Wed Feb 14 07:23:06 EST 2007


Hi,

I have a problem getting clustering to work in a test environment that uses identical JBoss configurations (4.0.4.GA with JGroups 2.2.9.3) on two different machines: one running Windows XP and one running Ubuntu Linux (kernel 2.6.15).

Everything works fine when I start JBoss on Windows first and then JBoss on Linux. If I start them in the opposite order, the Windows JBoss won't join the cluster and prints these messages:

2007-02-14 12:25:14,801 INFO  [org.jboss.system.server.Server] JBoss (MX MicroKernel) [4.0.4.GA (build: CVSTag=JBoss_4_0_4_GA date=200605151000)] Started in 26s:794ms
2007-02-14 12:25:17,536 INFO  [org.jboss.cache.TreeCache] viewAccepted(): [172.30.153.65:34429|2] [172.30.153.65:34429, 172.30.153.46:1473, 172.30.153.46:1484]
2007-02-14 12:25:17,536 INFO  [org.jboss.cache.TreeCache] received the state (size=1024 bytes)
2007-02-14 12:25:19,676 ERROR [org.jgroups.protocols.UNICAST] window_size is deprecated and will be ignored
2007-02-14 12:25:19,676 ERROR [org.jgroups.protocols.UNICAST] min_threshold is deprecated and will be ignored
2007-02-14 12:25:19,692 INFO  [STDOUT] 
-------------------------------------------------------
GMS: address is 172.30.153.46:1490
-------------------------------------------------------
2007-02-14 12:25:26,707 WARN  [org.jgroups.protocols.pbcast.GMS] join(172.30.153.46:1490) sent to 172.30.153.65:34435 timed out, retrying
2007-02-14 12:25:27,066 ERROR [org.jgroups.protocols.UNICAST] window_size is deprecated and will be ignored
2007-02-14 12:25:27,066 ERROR [org.jgroups.protocols.UNICAST] min_threshold is deprecated and will be ignored
2007-02-14 12:25:27,082 INFO  [STDOUT] 
-------------------------------------------------------
GMS: address is 172.30.153.46:1493
-------------------------------------------------------
2007-02-14 12:25:30,706 INFO  [org.jboss.cache.TreeCache] viewAccepted(): [172.30.153.65:34435|2] [172.30.153.65:34435, 172.30.153.46:1486, 172.30.153.46:1490]
2007-02-14 12:25:30,706 INFO  [org.jboss.cache.TreeCache] received the state (size=1024 bytes)
2007-02-14 12:25:34,112 WARN  [org.jgroups.protocols.pbcast.GMS] join(172.30.153.46:1493) sent to 172.30.153.65:34429 timed out, retrying
2007-02-14 12:25:40,221 ERROR [org.jgroups.protocols.UNICAST] window_size is deprecated and will be ignored
2007-02-14 12:25:40,221 ERROR [org.jgroups.protocols.UNICAST] min_threshold is deprecated and will be ignored
2007-02-14 12:25:40,237 INFO  [STDOUT] 
-------------------------------------------------------
GMS: address is 172.30.153.46:1496
-------------------------------------------------------
2007-02-14 12:25:43,127 WARN  [org.jgroups.protocols.pbcast.GMS] join(172.30.153.46:1493) sent to 172.30.153.65:34429 timed out, retrying
2007-02-14 12:25:47,267 WARN  [org.jgroups.protocols.pbcast.GMS] join(172.30.153.46:1496) sent to 172.30.153.65:34435 timed out, retrying
2007-02-14 12:25:52,142 WARN  [org.jgroups.protocols.pbcast.GMS] join(172.30.153.46:1493) sent to 172.30.153.65:34429 timed out, retrying
2007-02-14 12:25:56,282 WARN  [org.jgroups.protocols.pbcast.GMS] join(172.30.153.46:1496) sent to 172.30.153.65:34435 timed out, retrying
2007-02-14 12:25:57,032 INFO  [org.jboss.cache.TreeCache] viewAccepted(): [172.30.153.65:34429|4] [172.30.153.65:34429, 172.30.153.46:1473, 172.30.153.46:1484, 172.30.153.46:1493]
2007-02-14 12:25:57,032 INFO  [org.jboss.cache.TreeCache] received the state (size=1024 bytes)
2007-02-14 12:26:05,297 WARN  [org.jgroups.protocols.pbcast.GMS] join(172.30.153.46:1496) sent to 172.30.153.65:34435 timed out, retrying
2007-02-14 12:26:06,546 ERROR [org.jgroups.protocols.UNICAST] window_size is deprecated and will be ignored
2007-02-14 12:26:06,546 ERROR [org.jgroups.protocols.UNICAST] min_threshold is deprecated and will be ignored
2007-02-14 12:26:06,562 INFO  [STDOUT] 
-------------------------------------------------------
GMS: address is 172.30.153.46:1501
-------------------------------------------------------
2007-02-14 12:26:10,171 INFO  [org.jboss.cache.TreeCache] viewAccepted(): [172.30.153.65:34435|4] [172.30.153.65:34435, 172.30.153.46:1486, 172.30.153.46:1490, 172.30.153.46:1496]
2007-02-14 12:26:10,171 INFO  [org.jboss.cache.TreeCache] received the state (size=1024 bytes)
2007-02-14 12:26:13,577 WARN  [org.jgroups.protocols.pbcast.GMS] join(172.30.153.46:1501) sent to 172.30.153.65:34429 timed out, retrying
2007-02-14 12:26:19,686 ERROR [org.jgroups.protocols.UNICAST] window_size is deprecated and will be ignored
2007-02-14 12:26:19,686 ERROR [org.jgroups.protocols.UNICAST] min_threshold is deprecated and will be ignored
2007-02-14 12:26:19,701 INFO  [STDOUT] 
-------------------------------------------------------
GMS: address is 172.30.153.46:1504
-------------------------------------------------------
2007-02-14 12:26:22,592 WARN  [org.jgroups.protocols.pbcast.GMS] join(172.30.153.46:1501) sent to 172.30.153.65:34429 timed out, retrying
2007-02-14 12:26:26,701 WARN  [org.jgroups.protocols.pbcast.GMS] join(172.30.153.46:1504) sent to 172.30.153.65:34435 timed out, retrying
2007-02-14 12:26:31,607 WARN  [org.jgroups.protocols.pbcast.GMS] join(172.30.153.46:1501) sent to 172.30.153.65:34429 timed out, retrying
2007-02-14 12:26:35,716 WARN  [org.jgroups.protocols.pbcast.GMS] join(172.30.153.46:1504) sent to 172.30.153.65:34435 timed out, retrying
2007-02-14 12:26:40,606 WARN  [org.jgroups.protocols.pbcast.GMS] join(172.30.153.46:1501) sent to 172.30.153.65:34429 timed out, retrying
2007-02-14 12:26:44,715 WARN  [org.jgroups.protocols.pbcast.GMS] join(172.30.153.46:1504) sent to 172.30.153.65:34435 timed out, retrying
2007-02-14 12:26:49,636 WARN  [org.jgroups.protocols.pbcast.GMS] join(172.30.153.46:1501) sent to 172.30.153.65:34429 timed out, retrying
2007-02-14 12:26:53,745 WARN  [org.jgroups.protocols.pbcast.GMS] join(172.30.153.46:1504) sent to 172.30.153.65:34435 timed out, retrying
2007-02-14 12:26:58,651 WARN  [org.jgroups.protocols.pbcast.GMS] join(172.30.153.46:1501) sent to 172.30.153.65:34429 timed out, retrying
2007-02-14 12:27:02,760 WARN  [org.jgroups.protocols.pbcast.GMS] join(172.30.153.46:1504) sent to 172.30.153.65:34435 timed out, retrying
2007-02-14 12:27:07,666 WARN  [org.jgroups.protocols.pbcast.GMS] join(172.30.153.46:1501) sent to 172.30.153.65:34429 timed out, retrying
2007-02-14 12:27:11,775 WARN  [org.jgroups.protocols.pbcast.GMS] join(172.30.153.46:1504) sent to 172.30.153.65:34435 timed out, retrying
2007-02-14 12:27:16,696 WARN  [org.jgroups.protocols.pbcast.GMS] join(172.30.153.46:1501) sent to 172.30.153.65:34429 timed out, retrying
2007-02-14 12:27:20,696 INFO  [org.jboss.cache.TreeCache] viewAccepted(): [172.30.153.65:34429|8] [172.30.153.65:34429, 172.30.153.46:1473, 172.30.153.46:1484, 172.30.153.46:1493, 172.30.153.46:1501]
2007-02-14 12:27:20,696 INFO  [org.jboss.cache.TreeCache] received the state (size=1024 bytes)
2007-02-14 12:27:20,774 WARN  [org.jgroups.protocols.pbcast.GMS] join(172.30.153.46:1504) sent to 172.30.153.65:34435 timed out, retrying
2007-02-14 12:27:29,789 WARN  [org.jgroups.protocols.pbcast.GMS] join(172.30.153.46:1504) sent to 172.30.153.65:34435 timed out, retrying
2007-02-14 12:27:30,226 ERROR [org.jgroups.protocols.UNICAST] window_size is deprecated and will be ignored
2007-02-14 12:27:30,226 ERROR [org.jgroups.protocols.UNICAST] min_threshold is deprecated and will be ignored
2007-02-14 12:27:30,242 INFO  [STDOUT] 
-------------------------------------------------------
GMS: address is 172.30.153.46:1512
-------------------------------------------------------

This goes on and on, and the Linux JBoss prints a bunch of messages like these:

2007-02-14 12:23:13,735 WARN  [org.jgroups.protocols.pbcast.GMS] failed to collect all ACKs (5) for view [172.30.153.65:34435|8] after 20000ms, missing ACKs from [172.30.153.65:34435, 172.30.153.46:1486, 172.30.153.46:1490, 172.30.153.46:1496, 172.30.153.46:1504] (received=[]), local_addr=172.30.153.65:34435
2007-02-14 12:23:13,736 WARN  [org.jgroups.protocols.pbcast.Digest] entry for 172.30.153.46:1504 was overwritten with low=0, high=0, highest seen=-1

What could be wrong?
Also, when we try to connect a third machine (Win XP, identical JBoss config), that fails as well: we get viewAccepted messages and can see all nodes listed (on all machines), but the new machine is still not connected to the cluster.
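
To make the pattern in the Windows log above easier to see, the join timeouts can be tallied per target address. The following is only a minimal sketch: the function name `tally_join_timeouts` and the regular expression are made up for illustration, and the sample is four WARN lines copied from the log above. Two different target addresses show up on the Linux host (ports 34429 and 34435). They may belong to the two JGroups channels whose configurations follow, but that is an assumption, since the log does not say so.

```python
import re
from collections import Counter

# Matches the GMS warning as it appears in the log above (pattern is an assumption
# derived from those lines, not from JGroups documentation).
JOIN_TIMEOUT = re.compile(
    r"join\((?P<joiner>[\d.]+:\d+)\) sent to (?P<coord>[\d.]+:\d+) timed out"
)


def tally_join_timeouts(lines):
    """Count join timeouts per target (coordinator) and per joining address."""
    per_coord = Counter()
    per_joiner = Counter()
    for line in lines:
        match = JOIN_TIMEOUT.search(line)
        if match:
            per_coord[match["coord"]] += 1
            per_joiner[match["joiner"]] += 1
    return per_coord, per_joiner


# Four lines copied verbatim from the Windows log above.
SAMPLE = """\
2007-02-14 12:25:26,707 WARN  [org.jgroups.protocols.pbcast.GMS] join(172.30.153.46:1490) sent to 172.30.153.65:34435 timed out, retrying
2007-02-14 12:25:34,112 WARN  [org.jgroups.protocols.pbcast.GMS] join(172.30.153.46:1493) sent to 172.30.153.65:34429 timed out, retrying
2007-02-14 12:25:43,127 WARN  [org.jgroups.protocols.pbcast.GMS] join(172.30.153.46:1493) sent to 172.30.153.65:34429 timed out, retrying
2007-02-14 12:25:47,267 WARN  [org.jgroups.protocols.pbcast.GMS] join(172.30.153.46:1496) sent to 172.30.153.65:34435 timed out, retrying
""".splitlines()

per_coord, per_joiner = tally_join_timeouts(SAMPLE)
for coord, count in sorted(per_coord.items()):
    print(f"timeouts towards {coord}: {count}")
for joiner, count in sorted(per_joiner.items()):
    print(f"timeouts from {joiner}: {count}")
```

Running the same function over the complete server.log (for example `tally_join_timeouts(open("server.log"))`, where the file name is a placeholder) would show whether the timeouts are spread over both target addresses or concentrated on one.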

cluster-service.xml:

            <UDP mcast_addr="${jboss.partition.udpGroup:228.1.2.3}" mcast_port="45588"
               ip_ttl="8" ip_mcast="true"
               mcast_send_buf_size="800000" mcast_recv_buf_size="150000"
               ucast_send_buf_size="800000" ucast_recv_buf_size="150000"
               loopback="true"/>
            <PING timeout="2000" num_initial_members="3"
               up_thread="true" down_thread="true"/>
            <MERGE2 min_interval="10000" max_interval="20000"/>
            <FD shun="true" up_thread="true" down_thread="true"
               timeout="2500" max_tries="5"/>
            <VERIFY_SUSPECT timeout="3000" num_msgs="3"
               up_thread="true" down_thread="true"/>
            <pbcast.NAKACK gc_lag="50" retransmit_timeout="300,600,1200,2400,4800"
               max_xmit_size="8192"
               up_thread="true" down_thread="true"/>
            <UNICAST timeout="300,600,1200,2400,4800" window_size="100" min_threshold="10"
               down_thread="true"/>
            <pbcast.STABLE desired_avg_gossip="20000"
               up_thread="true" down_thread="true"/>
            <FRAG frag_size="8192"
               down_thread="true" up_thread="true"/>
            <pbcast.GMS join_timeout="5000" join_retry_timeout="2000"
               shun="true" print_local_addr="true"/>
            <pbcast.STATE_TRANSFER up_thread="true" down_thread="true"/>
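
The `mcast_addr` above uses a `${name:default}` placeholder, so the group actually used depends on the `-Djboss.partition.udpGroup` option given at the end of this post. The sketch below imitates that substitution to show the effective values. It is not JBoss code: `resolve` and `PLACEHOLDER` are hypothetical names, and the rule "use the system property if set, otherwise the default after the colon" is my understanding of the placeholder syntax.

```python
import re

# ${name} or ${name:default}; an imitation of the placeholder syntax, for illustration only.
PLACEHOLDER = re.compile(r"\$\{([^:}]+)(?::([^}]*))?\}")


def resolve(value, props):
    """Expand ${name:default} placeholders using props, falling back to the default."""
    def substitute(match):
        name, default = match.group(1), match.group(2)
        if name in props:
            return props[name]
        if default is not None:
            return default
        return match.group(0)  # leave unknown placeholders untouched
    return PLACEHOLDER.sub(substitute, value)


# System properties as passed on the command line in this post.
props = {
    "jboss.partition.name": "MatsProdPartition",
    "jboss.partition.udpGroup": "228.1.2.4",
}

with_flag = resolve("${jboss.partition.udpGroup:228.1.2.3}", props)
without_flag = resolve("${jboss.partition.udpGroup:228.1.2.3}", {})
print(with_flag)     # 228.1.2.4
print(without_flag)  # 228.1.2.3
```

Note that the tc5-cluster configuration further down hard-codes `mcast_addr="230.1.2.7"`, so the `-D` option does not change the group used by that channel.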
         

tc5-cluster.sar/META-INF/jboss-service.xml:

                <UDP mcast_addr="230.1.2.7" 
                     mcast_port="45599"
                     ucast_recv_buf_size="20000000"
                     ucast_send_buf_size="640000"
                     mcast_recv_buf_size="25000000" 
                     mcast_send_buf_size="640000" 
                     loopback="true" 
                     max_bundle_size="64000" 
                     max_bundle_timeout="30" 
                     use_incoming_packet_handler="true" 
                     use_outgoing_packet_handler="true" 
                     ip_ttl="2" 
                     down_thread="false" up_thread="false"
                     enable_bundling="true"/>
                <PING timeout="2000"
                      down_thread="false" up_thread="false" num_initial_members="3"/>
                <MERGE2 max_interval="100000"
                        down_thread="false" up_thread="false" min_interval="20000"/>
                <FD shun="true" up_thread="false" down_thread="false"
                        timeout="2500" max_tries="5"/>
                <VERIFY_SUSPECT timeout="1500"
                        up_thread="false" down_thread="false"/>
                <pbcast.NAKACK max_xmit_size="60000"
                               use_mcast_xmit="false" gc_lag="50" 
                               retransmit_timeout="100,200,300,600,1200,2400,4800" 
                               down_thread="false" up_thread="false"
                               discard_delivered_msgs="true"/>
                <UNICAST timeout="300,600,1200,2400,3600" 
                         down_thread="false" up_thread="false"/>
                <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000" 
                               down_thread="false" up_thread="false"
                               max_bytes="2100000"/>
                <pbcast.GMS print_local_addr="true" join_timeout="3000" 
                            down_thread="false" up_thread="false"
                            join_retry_timeout="2000" shun="true"/>
                <!-- If your CacheMode is set to REPL_SYNC we recommend you
                     comment out the FC (flow control) protocol -->
                <FC max_credits="10000000" down_thread="false" up_thread="false"
                    min_threshold="0.20"/>
                <FRAG2 frag_size="60000" down_thread="false" up_thread="false"/>
                <pbcast.STATE_TRANSFER down_thread="false" up_thread="false"/>
           

JBoss is started with -Djboss.partition.name=MatsProdPartition -Djboss.partition.udpGroup=228.1.2.4
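
Both stacks rely on IP multicast (228.1.2.4:45588 for the partition with the option above, 230.1.2.7:45599 for the Tomcat cache). One way to rule the network in or out is to test, independently of JBoss, whether multicast datagrams pass between the two machines in both directions. The following is a minimal sketch using plain sockets. The function names are hypothetical, nothing here comes from JBoss or JGroups, and it is best run while JBoss is stopped so that the probe does not reach a live channel.

```python
import socket


def membership_request(group, iface="0.0.0.0"):
    """Build the 8-byte ip_mreq value (group address + local interface address)."""
    return socket.inet_aton(group) + socket.inet_aton(iface)


def open_receiver(group, port, iface="0.0.0.0", timeout=10.0):
    """Open a UDP socket that has joined the given multicast group on the given port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(
        socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership_request(group, iface)
    )
    sock.settimeout(timeout)  # recvfrom() raises socket.timeout if nothing arrives
    return sock


def send_probe(group, port, payload=b"probe", ttl=8):
    """Send one datagram to the multicast group; ttl=8 mirrors ip_ttl in cluster-service.xml."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
        sock.sendto(payload, (group, port))
    finally:
        sock.close()
```

A possible way to use it: on the Linux machine call `open_receiver("228.1.2.4", 45588).recvfrom(1024)`, then on the Windows machine call `send_probe("228.1.2.4", 45588)`, and afterwards swap the roles. If the datagram arrives in only one direction, the cause is more likely on the network side (firewall, interface selection, multicast routing) than in the JBoss configuration, although this test alone cannot prove that.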


Regards
Jonas Heineson

View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4016333#4016333
