[jboss-user] [JBossCache] - slow JBoss startup in a 4 machine cluster

mnewcomb do-not-reply at jboss.com
Thu Oct 26 09:39:30 EDT 2006


When running JBoss in clustered mode there is a relatively long pause when the TreeCache starts up. After it prints the GMS information for the cache, it sits there for 15+ seconds (for the first machine) then finishes starting up.

Then, when for each other cluster node I start, it pauses at the same place but for increasing times. 

Member 1: 17 seconds 
Member 2: 14 seconds 
Member 3: 64 seconds 
Member 4: 74 seconds 

It is the standard JBoss 4.0.4 installation, but I know the cache uses JGroups and it's pausing inside of the cache initialization.

The CPU/network are all not being hit, so, I'm assuming that it could be something with the JGroups configuration?

Thanks for any help you can provide, 
Michael 


Below are the cache portions of the JBoss logs for each member. I've placed '**********' around the slow parts of each log. For brevity, I only pasted the whole cache section of member # 1 but I pasted the slow portions of each other member.

Member 1:

09:33:47,253 INFO  [TreeCache] setting cluster properties from xml to: UD(down_thread=false;enable_bundling=true;ip_ttl=2;loopback=true;max_bundle_size=64000;max_bundle_timeout=30;mcast_addr=230.1.2.7;mcast_port=45577;mcast_recv_buf_size=25000000;mcast_send_buf_size=640000;ucast_recv_buf_size=20000000;ucast_send_buf_size=640000;up_thread=false;use_incoming_packet_handler=true;use_outgoing_packet_handler=true):PING(down_thread=false;num_initial_members=3;timeout=2000;up_thread=false):MERGE(down_thread=false;max_interval=100000;min_interval=20000;up_thread=false):FD(down_thread=false;max_tries=5;shun=true;timeout=2500;up_thread=false):VERIFY_SUSPECT(down_thread=false;timeout=1500;up_thread=false):pbcast.NAKACK(discard_delivered_msgs=true;down_thread=false;gc_lag=50;max_xmit_size=60000;retransmit_timeout=100,200,300,600,1200,2400,4800;up_thread=false;use_mcast_xmit=false):UNICAST(down_thread=false;timeout=300,600,1200,2400,3600;up_thread=false):pbcast.STABLE(desired_avg_gossip=50000;down_thread=false;max_bytes=2100000;stability_delay=1000;up_thread=false):pbcast.GMS(down_thread=false;join_retry_timeout=2000;join_timeout=3000;print_local_addr=true;shun=true;up_thread=false):FC(down_thread=false;max_credits=10000000;min_threshold=0.20;up_threa
d=false):FRAG2(down_thread=false;frag_size=60000;up_thread=false):pbcast.STATE_TRANSFER(down_thread=false;up_thread=false)
09:33:47,534 INFO  [WebService] Using RMI server codebase: http://tcc-srvr-04:8083/
09:33:47,658 INFO  [TreeCache] setting cluster properties from xml to: UDP(ip_mcast=true;ip_ttl=64;loopback=true;mcast_addr=228.1.2.3;mcast_port=45551;mcast_recv_buf_size=80000;mcast_send_buf_size=150000;ucast_recv_buf_size=80000;ucast_send_buf_size=150000):PING(down_thread=false;num_initial_members=3;timeout=2000;up_thread=false):MERGE2(max_interval=20000;min_interval=10000):FD(down_thread=true;shun=true;up_thread=true):VERIFY_SUSPECT(down_thread=false;timeout=1500;up_thread=false):pbcast.NAKACK(down_thread=false;gc_lag=50;max_xmit_size=8192;retransmit_timeout=600,1200,2400,4800;up_thread=false):UNICAST(down_thread=false;min_threshold=10;timeout=600,1200,2400;window_size=100):pbcast.STABLE(
desired_avg_gossip=20000;down_thread=false;up_thread=false):FRAG(down_thread=false;frag_size=8192;up_thread=false):pbcast.GMS(join_retry_timeout=2000;join_timeout=5000;print_local_addr=true;shun=true):pbcast.STATE_TRANSFER(down_thread=false;up_thread=false)
09:33:47,674 INFO  [TreeCache] setEvictionPolicyConfig(): [config: null]
09:33:47,674 WARN  [TreeCache] No transaction manager lookup class has been defined. Transactions cannot be used
09:33:47,690 INFO  [TreeCache] interceptor chain is:
class org.jboss.cache.interceptors.CallInterceptor
class org.jboss.cache.interceptors.PessimisticLockInterceptor
class org.jboss.cache.interceptors.CacheLoaderInterceptor
class org.jboss.cache.interceptors.UnlockInterceptor
class org.jboss.cache.interceptors.ReplicationInterceptor
class org.jboss.cache.interceptors.CacheStoreInterceptor
09:33:47,705 INFO  [TreeCache] cache mode is REPL_SYNC
09:33:47,877 ERROR [UNICAST] window_size is deprecated and will be ignored
09:33:47,877 ERROR [UNICAST] min_threshold is deprecated and will be ignored
09:33:47,939 INFO  [STDOUT]
-------------------------------------------------------
GMS: address is 192.168.200.74:1540
-------------------------------------------------------
09:33:49,950 INFO  [TreeCache] viewAccepted(): [192.168.200.74:1540|0] [192.168.200.74:1540]
09:33:49,950 INFO  [TreeCache] my local address is 192.168.200.74:1540
09:33:49,950 INFO  [TreeCache] new cache is null (may be first member in cluster)
09:33:49,950 INFO  [TreeCache] state could not be retrieved (must be first member in group)
09:33:49,950 INFO  [LRUPolicy] Starting eviction policy using the provider: org.jboss.ejb3.cache.tree.StatefulEvictionPolicy
09:33:49,950 INFO  [LRUPolicy] Starting a eviction timer with wake up interval of (secs) 1
09:33:49,950 INFO  [TreeCache] Cache is started!!
09:33:50,028 INFO  [TreeCache] setting cluster properties from xml to: UDP(ip_mcast=true;ip_ttl=2;loopback=true;mcast_addr=228.1.2.3;mcast_p
ort=43333;mcast_recv_buf_size=80000;mcast_send_buf_size=150000;ucast_recv_buf_size=80000;ucast_send_buf_size=150000):PING(down_thread=false;num_initial_members=3;timeout=2000;up_thread=false):MERGE2(max_interval=20000;min_interval=10000):FD(down_thread=true;shun=true;up_thread=true):VERIFY_SUSPECT(down_thread=false;timeout=1500;up_thread=false):pbcast.NAKACK(down_thread=false;gc_lag=50;max_xmit_size=8192;retransmit_t
imeout=600,1200,2400,4800;up_thread=false):UNICAST(down_thread=false;min_threshold=10;timeout=600,1200,2400;window_size=100):pbcast.STABLE(desired_avg_gossip=20000;down_thread=false;up_thread=false):FRAG(down_thread=false;frag_size=8192;up_thread=false):pbcast.GMS(join_retry_timeout=2000;join_timeout=5000;print_local_addr=true;shun=true):pbcast.STATE_TRANSFER(down_thread=false;up_thread=false)
09:33:50,028 INFO  [TreeCache] setEvictionPolicyConfig(): [config: null]
09:33:50,698 INFO  [MailService] Mail Service bound to java:/Mail
09:33:50,760 ERROR [UNICAST] window_size is deprecated and will be ignored
09:33:50,760 ERROR [UNICAST] min_threshold is deprecated and will be ignored
09:33:50,776 INFO  [DefaultPartition] Initializing
09:33:50,838 INFO  [TreeCache] interceptor chain is:
class org.jboss.cache.interceptors.CallInterceptor
class org.jboss.cache.interceptors.PessimisticLockInterceptor
class org.jboss.cache.interceptors.UnlockInterceptor
class org.jboss.cache.interceptors.ReplicationInterceptor
09:33:50,838 INFO  [TreeCache] cache mode is REPL_ASYNC
09:33:50,854 INFO  [TreeCache] interceptor chain is:
class org.jboss.cache.interceptors.CallInterceptor
class org.jboss.cache.interceptors.PessimisticLockInterceptor
class org.jboss.cache.interceptors.UnlockInterceptor
class org.jboss.cache.interceptors.ReplicationInterceptor
09:33:50,854 INFO  [TreeCache] cache mode is REPL_SYNC
09:33:50,869 ERROR [UNICAST] window_size is deprecated and will be ignored
09:33:50,869 ERROR [UNICAST] min_threshold is deprecated and will be ignored
09:33:50,916 INFO  [NamingService] JNDI bootstrap JNP=/0.0.0.0:1099, RMI=/0.0.0.0:1098, backlog=50, no client SocketFactory, Server SocketFa
ctory=class org.jboss.net.sockets.DefaultSocketFactory
09:33:50,932 INFO  [SubscriptionManager] Bound event dispatcher to java:/EventDispatcher

***********
09:33:50,932 INFO  [STDOUT]
-------------------------------------------------------
GMS: address is 192.168.200.74:1545 (additional data: 19 bytes)
-------------------------------------------------------
09:34:07,920 INFO  [DefaultPartition] Number of cluster members: 1
*********** 
+17 seconds in between logs! 

09:34:07,920 INFO  [DefaultPartition] Other members: 0
09:34:07,920 INFO  [DefaultPartition] New cluster view for partition DefaultPartition (id: 0, delta: 0) : [192.168.100.74:1099]
09:34:07,920 INFO  [DefaultPartition] Fetching state (will wait for 30000 milliseconds):
09:34:07,920 INFO  [DefaultPartition] I am (192.168.100.74:1099) received membershipChanged event:
09:34:07,920 INFO  [DefaultPartition] Dead members: 0 ([])
09:34:07,920 INFO  [DefaultPartition] New Members : 0 ([])
09:34:07,920 INFO  [DefaultPartition] All Members : 1 ([192.168.100.74:1099])
09:34:07,936 INFO  [HANamingService] Started ha-jndi bootstrap jnpPort=1100, backlog=50, bindAddress=/0.0.0.0
09:34:07,951 INFO  [DetachedHANamingService$AutomaticDiscovery] Listening on /0.0.0.0:1102, group=230.0.0.4, HA-JNDI address=192.168.100.74:
1100
09:34:08,388 INFO  [STDOUT]
-------------------------------------------------------
GMS: address is 192.168.200.74:1547
-------------------------------------------------------
09:34:10,398 INFO  [TreeCache] viewAccepted(): [192.168.200.74:1547|0] [192.168.200.74:1547]
09:34:10,398 INFO  [TreeCache] my local address is 192.168.200.74:1547
09:34:10,398 INFO  [TreeCache] new cache is null (may be first member in cluster)
09:34:10,398 INFO  [TreeCache] state could not be retrieved (must be first member in group)
09:34:10,398 INFO  [TreeCache] Cache is started!!
09:34:10,414 INFO  [STDOUT]
-------------------------------------------------------
GMS: address is 192.168.200.74:1549
-------------------------------------------------------
09:34:12,425 INFO  [TreeCache] viewAccepted(): [192.168.200.74:1549|0] [192.168.200.74:1549]
09:34:12,425 INFO  [TreeCache] my local address is 192.168.200.74:1549
09:34:12,425 INFO  [TreeCache] new cache is null (may be first member in cluster)
09:34:12,425 INFO  [TreeCache] state could not be retrieved (must be first member in group)
09:34:12,425 INFO  [LRUPolicy] Starting eviction policy using the provider: org.jboss.cache.eviction.LRUPolicy
09:34:12,425 INFO  [LRUPolicy] Starting a eviction timer with wake up interval of (secs) 5
09:34:12,425 INFO  [TreeCache] Cache is started!!



Member 2: 

09:39:48,828 INFO  [TreeCache] cache mode is REPL_SYNC 
09:39:48,828 ERROR [UNICAST] window_size is deprecated and will be ignored 
09:39:48,828 ERROR [UNICAST] min_threshold is deprecated and will be ignored 
09:39:48,875 INFO  [NamingService] JNDI bootstrap JNP=/0.0.0.0:1099, RMI=/0.0.0.0:1098, backlog=50, no client SocketFactory, Server SocketFactory=class org.jboss.net.sockets.DefaultSocketFactory 
09:39:48,890 INFO  [SubscriptionManager] Bound event dispatcher to java:/EventDispatcher 

*********** 
09:39:48,906 INFO  [STDOUT] 
------------------------------------------------------- 
GMS: address is 192.168.200.77:2781 (additional data: 19 bytes) 
------------------------------------------------------- 
09:40:02,937 INFO  [DefaultPartition] Number of cluster members: 2 
*********** 
+14 seconds in between logs! 

09:40:02,937 INFO  [DefaultPartition] Other members: 1 



Member 3: 

09:42:04,323 INFO  [TreeCache] cache mode is REPL_SYNC 
09:42:04,339 ERROR [UNICAST] window_size is deprecated and will be ignored 
09:42:04,339 ERROR [UNICAST] min_threshold is deprecated and will be ignored 
09:42:04,386 INFO  [NamingService] JNDI bootstrap JNP=/0.0.0.0:1099, RMI=/0.0.0.0:1098, backlog=50, no client SocketFactory, Server SocketFactory=class org.jboss.net.sockets.DefaultSocketFactory 
09:42:04,402 INFO  [SubscriptionManager] Bound event dispatcher to java:/EventDispatcher 

*********** 
09:42:04,418 INFO  [STDOUT] 
------------------------------------------------------- 
GMS: address is 192.168.200.76:2871 (additional data: 19 bytes) 
------------------------------------------------------- 
09:43:08,487 INFO  [DefaultPartition] Number of cluster members: 3 
*********** 
+64 seconds in between logs! 

09:43:08,487 INFO  [DefaultPartition] Other members: 2 



Member 4: 

09:44:06,472 INFO  [TreeCache] cache mode is REPL_SYNC 
09:44:06,472 ERROR [UNICAST] window_size is deprecated and will be ignored 
09:44:06,472 ERROR [UNICAST] min_threshold is deprecated and will be ignored 
09:44:06,517 INFO  [NamingService] JNDI bootstrap JNP=/0.0.0.0:1099, RMI=/0.0.0.0:1098, backlog=50, no client SocketFactory, Server SocketFactory=class org.jboss.net.sockets.DefaultSocketFactory 
09:44:06,533 INFO  [SubscriptionManager] Bound event dispatcher to java:/EventDispatcher 

*********** 
09:44:06,548 INFO  [STDOUT] 
------------------------------------------------------- 
GMS: address is 192.168.200.75:3123 (additional data: 19 bytes) 
------------------------------------------------------- 
09:45:20,729 INFO  [DefaultPartition] Number of cluster members: 4 
*********** 
+74 seconds in between logs! 

09:45:20,729 INFO  [DefaultPartition] Other members: 3 



View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3981007#3981007

Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3981007




More information about the jboss-user mailing list