I am running into an issue when something goes wrong with several nodes in the cluster:
the surviving nodes somehow do not evict the troublesome members and start accumulating
messages.
The current config looks like this:
| <property name="isolationLevel"
value="REPEATABLE_READ" />
| <property name="cacheMode" value="REPL_ASYNC"
/>
| <property name="clusterName"
value="${treeCache.clusterName}" />
| <property name="useReplQueue" value="false"
/>
| <property name="replQueueInterval" value="0"
/>
| <property name="replQueueMaxElements" value="0"
/>
| <property name="fetchInMemoryState"
value="true" />
| <property name="initialStateRetrievalTimeout"
value="20000" />
| <property name="syncReplTimeout" value="20000"
/>
| <property name="lockAcquisitionTimeout"
value="5000" />
| <property name="useRegionBasedMarshalling"
value="false" />
| <property name="clusterProperties"
| value="${treeCache.clusterProperties}" />
| <property name="serviceName">
| <bean class="javax.management.ObjectName">
| <constructor-arg
value="jboss.cache:service=${treeCache.clusterName},name=${treeCache.instanceName}"/>
| </bean>
| </property>
| <property name="evictionPolicyClass"
value="org.jboss.cache.eviction.LRUPolicy"/>
| <property name="maxAgeSeconds"
value="${treeCache.eviction.maxAgeSeconds}"/>
| <property name="maxNodes"
value="${treeCache.eviction.maxNodes}"/>
| <property name="timeToLiveSeconds"
value="${treeCache.eviction.timeToLiveSeconds}"/>
|
The jgroups stack is this:
| treeCache.clusterProperties=UDP(ip_mcast=true;ip_ttl=64;loopback=false;mcast_addr=${treeCache.mcastAddress};mcast_port=${treeCache.mcastPort};mcast_recv_buf_size=80000;mcast_send_buf_size=150000;ucast_recv_buf_size=80000;ucast_send_buf_size=150000;bind_addr=${treeCache.bind_addr}):\
| PING(down_thread=false;num_initial_members=3;timeout=2000;up_thread=false):\
| MERGE2(max_interval=20000;min_interval=10000):\
| FD_SOCK(down_thread=false;up_thread=false):\
| VERIFY_SUSPECT(down_thread=false;timeout=1500;up_thread=false):\
| pbcast.NAKACK(down_thread=false;gc_lag=50;retransmit_timeout=600,1200,2400,4800;up_thread=false):\
| pbcast.STABLE(desired_avg_gossip=20000;down_thread=false;up_thread=false):\
| UNICAST(down_thread=false;timeout=600,1200,2400):\
| FRAG(down_thread=false;frag_size=8192;up_thread=false):\
| pbcast.GMS(join_retry_timeout=2000;join_timeout=5000;print_local_addr=true;shun=true):\
| pbcast.STATE_TRANSFER(down_thread=true;up_thread=true)
|
The cluster has 12 nodes, and I had this situation occur when 3 of the nodes failed, which
provoked the ops team into restarting 9 of them. The remaining 3 all went OOM quickly.
Analysing the heap dump post-mortem, I see this:
org.jgroups.protocols.pbcast.NAKACK retained size=245MB
My first step is to add FD to the stack to address the issue of failure detection not
working properly in some cases. Then I would like to limit the size of the NAKACK
structure (even if this means losing consistency across the cluster): is this possible at
all? What are your suggestions?
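For what it's worth, here is a sketch of the change I have in mind. The placement of FD
between FD_SOCK and VERIFY_SUSPECT, the timeout/max_tries values, and the
max_xmit_buf_size attribute on NAKACK (which, as I read the JGroups 2.x docs, bounds the
retransmit buffer at the cost of possibly dropping messages) are assumptions I have not
tested yet:
| FD_SOCK(down_thread=false;up_thread=false):\
| FD(timeout=10000;max_tries=5;shun=true;down_thread=false;up_thread=false):\
| VERIFY_SUSPECT(down_thread=false;timeout=1500;up_thread=false):\
| pbcast.NAKACK(down_thread=false;gc_lag=50;max_xmit_buf_size=50000;retransmit_timeout=600,1200,2400,4800;up_thread=false):\
Would that be the right direction, or is there a better way to cap NAKACK's memory use
(for example, max_bytes on pbcast.STABLE to force stability rounds based on the amount of
data received)?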