[jboss-user] [JBossCache] - Retrieving state problem

kkrivopustov do-not-reply at jboss.com
Fri Mar 2 11:45:50 EST 2007


Hello,
We have a cluster with 2 nodes and need to restart one server while the other keeps running. Sometimes, after restarting each node in turn several times, the TreeCache on the starting node doesn't retrieve the state on startup, because it doesn't see the other node. After some time it finds the other node, but its state is still not updated. This problem is very serious for us: we store user sessions in the cache, so if the new node doesn't receive the state from the existing node, all user requests to the new node fail.

Here is a log excerpt from the starting node:

  | 2007-03-02 19:45:08,367 DEBUG [org.jboss.cache.TreeCache] Starting jboss.cache:service=GearSessionsTreeCache
  | 2007-03-02 19:45:11,878 INFO  [org.jboss.cache.TreeCache] viewAccepted(): [192.168.3.71:7810|0] [192.168.3.71:7810]
  | 2007-03-02 19:45:11,878 INFO  [org.jboss.cache.TreeCache] TreeCache local address is 192.168.3.71:7810
  | 2007-03-02 19:45:11,878 DEBUG [org.jboss.cache.TreeCache] transferred state is null (may be first member in cluster)
  | 2007-03-02 19:45:11,894 INFO  [org.jboss.cache.TreeCache] State could not be retrieved (we are the first member in group)
  | ...
  | 2007-03-02 19:45:24,192 INFO  [org.jboss.cache.TreeCache] viewAccepted(): MergeView::[192.168.3.65:7810|5] [192.168.3.65:7810, 192.168.3.71:7810], subgroups=[[192.168.3.71:7810|4] [192.168.3.65:7810], [192.168.3.71:7810|0] [192.168.3.71:7810]]
  | 
We use the JGroups TCP stack:

  |           <TCP bind_addr="192.168.3.71" start_port="7810" loopback="true"/>
  |           <TCPPING initial_hosts="192.168.3.65[7810]"
  |                    port_range="3"
  |                    timeout="3500"
  |                    num_initial_members="3"
  |                    up_thread="true"
  |                    down_thread="true"/>
  |           <MERGE2 min_interval="5000" max_interval="10000"/>
  |           <FD shun="true" timeout="2500" max_tries="5" up_thread="true" down_thread="true" />
  |           <VERIFY_SUSPECT timeout="1500" down_thread="false" up_thread="false" />
  |           <pbcast.NAKACK down_thread="true" up_thread="true" gc_lag="100" retransmit_timeout="3000" />
  |           <pbcast.STABLE desired_avg_gossip="20000" down_thread="false" up_thread="false" />
  |           <pbcast.GMS join_timeout="5000"
  |                       join_retry_timeout="2000"
  |                       shun="false"
  |                       print_local_addr="false"
  |                       down_thread="true"
  |                       up_thread="true"/>
  |           <pbcast.STATE_TRANSFER up_thread="true" down_thread="true" />
  | 
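One thing that stands out in the log: the gap between "Starting" and the first viewAccepted() is about 3.5 seconds, which is close to the TCPPING timeout="3500" above. That may mean discovery timed out without an answer from 192.168.3.65 and the node started as a single-member group; the MergeView only arrives about 12 seconds later, after the state transfer attempt has already finished. This is only a reading of the timestamps, not a confirmed diagnosis. A minimal Java sketch of the arithmetic (the class and method names are made up for illustration and are not part of JBossCache or JGroups):

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

class DiscoveryGap {
    // Timestamp layout used in the log excerpt (comma before the milliseconds).
    private static final DateTimeFormatter LOG_TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    // Milliseconds elapsed between two log timestamps.
    static long millisBetween(String from, String to) {
        LocalDateTime a = LocalDateTime.parse(from, LOG_TS);
        LocalDateTime b = LocalDateTime.parse(to, LOG_TS);
        return Duration.between(a, b).toMillis();
    }

    public static void main(String[] args) {
        long tcppingTimeout = 3500; // timeout attribute of TCPPING in the config above

        // Cache start -> first (single-member) view
        long startToView = millisBetween("2007-03-02 19:45:08,367",
                                         "2007-03-02 19:45:11,878");
        // First view -> MergeView that finally contains both nodes
        long viewToMerge = millisBetween("2007-03-02 19:45:11,878",
                                         "2007-03-02 19:45:24,192");

        System.out.println("start -> first view: " + startToView + " ms"
                + " (TCPPING timeout: " + tcppingTimeout + " ms)");
        System.out.println("first view -> merge: " + viewToMerge + " ms");
    }
}
```

If this reading is right, the state request is made while the node is still alone in its view, and the later merge does not trigger another state transfer.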
Any help with this would be much appreciated.

View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4024692#4024692
