[jboss-user] [JBossCache] - Increasing threads handling replicate() messages for a clust

hsaha do-not-reply at jboss.com
Wed May 16 16:23:49 EDT 2007


Hi,
Under load, a replicate() message takes too long to arrive at the other node in the cluster. The log snippets below show the delay.
Note: the clocks on both nodes are synchronized.


  | Node1:
  | 2007-05-16 23:37:59,583 DEBUG [app-nbdvjlljy7o3|10.253.205.12-1503236585|5fbb8cb142088f50] invoking method _put; id:3(null, /ipunity/mgcpstack/sentCommandCache/48400013, item, com.ipunity.ri.jain.protocol.ip.mgcp.SelfRetransmitTask@7b24c2, true), members=[10.253.205.16:53498, 10.253.205.15:50987], mode=REPL_SYNC, exclude_self=true, timeout=10000
  | 2007-05-16 23:37:59,583 DEBUG [app-nbdvjlljy7o3|10.253.205.12-1503236585|5fbb8cb142088f50] Broadcasting call _put; id:3(null, /ipunity/mgcpstack/sentCommandCache/48400013, item, com.ipunity.ri.jain.protocol.ip.mgcp.SelfRetransmitTask@7b24c2, true) to recipient list null
  | 2007-05-16 23:37:59,583 DEBUG [app-nbdvjlljy7o3|10.253.205.12-1503236585|5fbb8cb142088f50] callRemoteMethods(): valid members are [10.253.205.15:50987] method: _replicate; id:13(_put; id:3(null, /ipunity/mgcpstack/sentCommandCache/48400013, item, com.ipunity.ri.jain.protocol.ip.mgcp.SelfRetransmitTask@7b24c2, true))
  | 2007-05-16 23:37:59,583 DEBUG [app-nbdvjlljy7o3|10.253.205.12-1503236585|5fbb8cb142088f50] Marshalling object _replicate; id:13(_put; id:3(null, /ipunity/mgcpstack/sentCommandCache/48400013, item, com.ipunity.ri.jain.protocol.ip.mgcp.SelfRetransmitTask@7b24c2, true))
  | 2007-05-16 23:37:59,583 DEBUG [app-nbdvjlljy7o3|10.253.205.12-1503236585|5fbb8cb142088f50] Warning: using object serialization for class com.ipunity.ri.jain.protocol.ip.mgcp.SelfRetransmitTask
  | 
  | Node2:
  | 2007-05-16 23:38:12,726 DEBUG [] 10.253.205.15:50987 received call _put; id:3(null, /ipunity/mgcpstack/sentCommandCache/48400013, item, com.ipunity.ri.jain.protocol.ip.mgcp.SelfRetransmitTask@1d38cb3, true)
  | 2007-05-16 23:38:12,726 DEBUG [] (10.253.205.15:50987) call on method [_put; id:3(null, /ipunity/mgcpstack/sentCommandCache/48400013, item, com.ipunity.ri.jain.protocol.ip.mgcp.SelfRetransmitTask@1d38cb3, true)]
  | 2007-05-16 23:38:12,726 DEBUG [] PessimisticLockInterceptor invoked for method _put; id:3(null, /ipunity/mgcpstack/sentCommandCache/48400013, item, com.ipunity.ri.jain.protocol.ip.mgcp.SelfRetransmitTask@1d38cb3, true)
  | 2007-05-16 23:38:12,726 DEBUG [] Attempting to lock node /ipunity/mgcpstack/sentCommandCache/48400013 for owner Thread[UpHandler (STATE_TRANSFER),5,Pooled Threads]
  | 

Note that the replicate() message shows up on Node2 only about 13 seconds after Node1 sent it (23:37:59,583 versus 23:38:12,726). My replication timeout is 10 seconds, so I get replication exceptions from Node2. I would rather not increase the replication timeout, because that would cause performance issues in my application.
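
To make the gap concrete, here is a minimal sketch that computes the delay from the two log timestamps above and compares it with the configured SyncReplTimeout. The class and method names are made up for illustration, and it assumes a modern JDK (java.time, Java 8 or later) used offline for log analysis, not the runtime the cache is deployed on.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for offline log analysis; not part of JBossCache or JGroups.
class ReplicationDelayCheck {
    // Timestamp layout used in the log lines, e.g. "2007-05-16 23:37:59,583".
    private static final DateTimeFormatter LOG_TIME =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    /** Milliseconds between the sender's log timestamp and the receiver's. */
    static long delayMillis(String sent, String received) {
        LocalDateTime s = LocalDateTime.parse(sent, LOG_TIME);
        LocalDateTime r = LocalDateTime.parse(received, LOG_TIME);
        return Duration.between(s, r).toMillis();
    }

    public static void main(String[] args) {
        // Node1 "invoking method _put" versus Node2 "received call _put".
        long delay = delayMillis("2007-05-16 23:37:59,583", "2007-05-16 23:38:12,726");
        long syncReplTimeout = 10000; // SyncReplTimeout from treecache.xml, in ms
        // prints: delay=13143 ms, exceeds timeout: true
        System.out.println("delay=" + delay + " ms, exceeds timeout: " + (delay > syncReplTimeout));
    }
}
```

The comparison is only meaningful because the node clocks are synchronized, as noted above.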

My guess is that all the JGroups receive threads were busy handling other messages in the cluster. Is there a way to specify the thread pool size, or to turn off thread pooling so that messages are handled as they arrive?
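
As a rough illustration of this guess (a toy queueing model, not JGroups code), the sketch below computes how long the last message in a backlog waits before a handler thread picks it up. The backlog size, the per-message handling time and the thread counts are invented numbers chosen only to show the effect. They are not measurements from my cluster.

```java
import java.util.PriorityQueue;

// Toy model only: it does not reflect how JGroups schedules its threads internally.
class HandlerQueueModel {
    /**
     * All messages are assumed to arrive at t=0 and each takes serviceMillis to handle.
     * Returns the time (ms) at which the last message starts being handled.
     */
    static long startOfLastMessage(int messages, long serviceMillis, int threads) {
        if (messages < 1 || threads < 1) {
            throw new IllegalArgumentException("messages and threads must be >= 1");
        }
        // Each entry is the time at which one handler thread becomes free.
        PriorityQueue<Long> freeAt = new PriorityQueue<>();
        for (int i = 0; i < threads; i++) {
            freeAt.add(0L);
        }
        long start = 0;
        for (int i = 0; i < messages; i++) {
            start = freeAt.poll();            // earliest available thread takes the message
            freeAt.add(start + serviceMillis); // and is busy until it finishes
        }
        return start;
    }

    public static void main(String[] args) {
        // Hypothetical backlog: 132 messages, 100 ms each.
        System.out.println(startOfLastMessage(132, 100, 1)); // prints 13100
        System.out.println(startOfLastMessage(132, 100, 4)); // prints 3200
    }
}
```

In this model a single handler thread pushes the wait past the 10 second timeout, while four threads keep it well under. Whether the real stack behaves this way is exactly what I am asking.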

My treecache.xml is as follows:

  | <?xml version="1.0" encoding="UTF-8"?>
  | 
  | <!-- ===================================================================== -->
  | <!--                                                                       -->
  | <!--  Sample TreeCache Service Configuration                               -->
  | <!--                                                                       -->
  | <!-- ===================================================================== -->
  | 
  | <server>
  | 
  |     <classpath codebase="./lib" archives="jboss-cache.jar, jgroups.jar"/>
  | 
  | 
  |     <!-- ==================================================================== -->
  |     <!-- Defines TreeCache configuration                                      -->
  |     <!-- ==================================================================== -->
  | 
  |     <mbean code="org.jboss.cache.TreeCache"
  |         name="jboss.cache:service=TreeCache">
  | 
  |         <depends>jboss:service=Naming</depends>
  |         <depends>jboss:service=TransactionManager</depends>
  | 
  |         <!--
  |         Configure the TransactionManager
  |     -->
  |         <attribute name="TransactionManagerLookupClass">com.ipunity.common.cache.WeblogicTransactionManagerLookup</attribute>
  | 
  |         <!--
  |             Isolation level : SERIALIZABLE
  |                               REPEATABLE_READ (default)
  |                               READ_COMMITTED
  |                               READ_UNCOMMITTED
  |                               NONE
  |         -->
  |         <attribute name="IsolationLevel">READ_COMMITTED</attribute>
  | 
  |         <!--
  |              Valid modes are LOCAL, REPL_ASYNC and REPL_SYNC
  |         -->
  |         <attribute name="CacheMode">REPL_SYNC</attribute>
  | 
  |         <!--
  |         Just used for async repl: use a replication queue
  |         -->
  |         <attribute name="UseReplQueue">false</attribute>
  | 
  |         <!--
  |             Replication interval for replication queue (in ms)
  |         -->
  |         <attribute name="ReplQueueInterval">0</attribute>
  | 
  |         <!--
  |             Max number of elements which trigger replication
  |         -->
  |         <attribute name="ReplQueueMaxElements">0</attribute>
  | 
  |         <!-- Name of cluster. Needs to be the same for all clusters, in order
  |              to find each other
  |         -->
  |         <attribute name="ClusterName">IPUnity-Cluster-2</attribute>
  | 
  |         <!-- JGroups protocol stack properties. Can also be a URL,
  |              e.g. file:/home/bela/default.xml
  |            <attribute name="ClusterProperties"></attribute>
  |         -->
  | 
  |         <attribute name="ClusterConfig">
  |             <config>
  |                 <!-- UDP: if you have a multihomed machine,
  |                 set the bind_addr attribute to the appropriate NIC IP address, e.g bind_addr="192.168.0.2"
  |                 -->
  |                 <!-- UDP: On Windows machines, because of the media sense feature
  |                  being broken with multicast (even after disabling media sense)
  |                  set the loopback attribute to true -->
  |                 <UDP mcast_addr="224.10.10.16" mcast_port="45568"
  |                     ip_ttl="64" ip_mcast="true" 
  |                     mcast_send_buf_size="150000" mcast_recv_buf_size="80000"
  |                     ucast_send_buf_size="150000" ucast_recv_buf_size="80000"
  |                     loopback="false" bind_addr="10.253.205.16"/>
  |                 <PING timeout="2000" num_initial_members="3"
  |                     up_thread="false" down_thread="false"/>
  |                 <MERGE2 min_interval="10000" max_interval="20000"/>
  |                 <!--        <FD shun="true" up_thread="true" down_thread="true" />-->
  |                 <FD_SOCK/>
  |                 <VERIFY_SUSPECT timeout="1500"
  |                     up_thread="false" down_thread="false"/>
  |                 <pbcast.NAKACK gc_lag="50" retransmit_timeout="600,1200,2400,4800"
  |                     max_xmit_size="8192" up_thread="false" down_thread="false"/>
  |                 <UNICAST timeout="600,1200,2400" window_size="100" min_threshold="10"
  |                     down_thread="false"/>
  |                 <pbcast.STABLE desired_avg_gossip="20000"
  |                     up_thread="false" down_thread="false"/>
  |                 <FRAG frag_size="8192"
  |                     down_thread="false" up_thread="false"/>
  |                 <pbcast.GMS join_timeout="5000" join_retry_timeout="2000"
  |                     shun="true" print_local_addr="true"/>
  |                 <pbcast.STATE_TRANSFER up_thread="true" down_thread="true"/>
  |             </config>
  |         </attribute>
  | 
  | 
  |         <!--
  |         Whether or not to fetch state on joining a cluster
  |        -->
  |         <attribute name="FetchStateOnStartup">true</attribute>
  | 
  |         <!--
  |             The max amount of time (in milliseconds) we wait until the
  |             initial state (ie. the contents of the cache) are retrieved from
  |             existing members in a clustered environment
  |         -->
  |         <attribute name="InitialStateRetrievalTimeout">5000</attribute>
  | 
  |         <!--
  |             Number of milliseconds to wait until all responses for a
  |             synchronous call have been received.
  |         -->
  |         <attribute name="SyncReplTimeout">10000</attribute>
  | 
  |         <!-- Max number of milliseconds to wait for a lock acquisition -->
  |         <attribute name="LockAcquisitionTimeout">15000</attribute>
  | 
  |         <!-- Name of the eviction policy class. Not supported now. -->
  |         <attribute name="EvictionPolicyClass"></attribute>
  | 
  |        <!--
  |        <attribute name="CacheLoaderClass">org.jboss.cache.loader.bdbje.BdbjeCacheLoader</attribute>
  |        <attribute name="CacheLoaderConfig">c:\tmp\bdbje</attribute>
  |        <attribute name="CacheLoaderShared">true</attribute>
  |        <attribute name="CacheLoaderPreload">/</attribute>
  |        -->
  | 	
  | <!--
  |        <attribute name="CacheLoaderClass">org.jboss.cache.loader.FileCacheLoader</attribute>
  |        <attribute name="CacheLoaderConfig">/tmp</attribute>
  |        <attribute name="CacheLoaderShared">true</attribute>
  |        <attribute name="CacheLoaderPreload">/</attribute>
  | -->
  | 
  | 
  |     </mbean>
  | 
  | 
  |    <!--  Uncomment to get a graphical view of the TreeCache MBean above -->
  |    <!--   <mbean code="org.jboss.cache.TreeCacheView" name="jboss.cache:service=TreeCacheView">-->
  |    <!--      <depends>jboss.cache:service=TreeCache</depends>-->
  |    <!--      <attribute name="CacheService">jboss.cache:service=TreeCache</attribute>-->
  |    <!--   </mbean>-->
  | 
  | 
  | </server>
  | 
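
For reference, a small sketch that pulls the two timeout attributes out of a configuration like the one above and compares them. It uses only the JDK's built-in XML parser. The class and method names are hypothetical, and the inline XML is a trimmed copy of my file kept short for the example. With these values, LockAcquisitionTimeout (15000 ms) is larger than SyncReplTimeout (10000 ms), so in principle a remote node could wait for a lock longer than the caller waits for the reply. I have not confirmed that this plays any role in the delay shown in the log.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for inspecting a treecache.xml; not part of JBossCache.
class TreeCacheTimeouts {
    /** Maps each <attribute name="..."> element to its trimmed text content. */
    static Map<String, String> simpleAttributes(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> values = new LinkedHashMap<>();
            NodeList nodes = doc.getElementsByTagName("attribute");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                values.put(e.getAttribute("name"), e.getTextContent().trim());
            }
            return values;
        } catch (Exception ex) {
            throw new IllegalStateException("could not parse configuration", ex);
        }
    }

    /** True if a lock wait on the receiver may last longer than the sender waits. */
    static boolean lockWaitCanOutlastSyncReply(Map<String, String> cfg) {
        long sync = Long.parseLong(cfg.get("SyncReplTimeout"));
        long lock = Long.parseLong(cfg.get("LockAcquisitionTimeout"));
        return lock > sync;
    }

    public static void main(String[] args) {
        String xml = "<server><mbean code=\"org.jboss.cache.TreeCache\""
                + " name=\"jboss.cache:service=TreeCache\">"
                + "<attribute name=\"CacheMode\">REPL_SYNC</attribute>"
                + "<attribute name=\"SyncReplTimeout\">10000</attribute>"
                + "<attribute name=\"LockAcquisitionTimeout\">15000</attribute>"
                + "</mbean></server>";
        Map<String, String> cfg = simpleAttributes(xml);
        System.out.println(lockWaitCanOutlastSyncReply(cfg)); // prints true
    }
}
```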

Version details:
JBC version - 1.4.1SP3
Application server - WebLogic

Any help would be appreciated.

Regards,
Himadri

View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4046298#4046298



