We are seeing many replication timeout exceptions and have experimented extensively with the
different isolation levels and locking schemes, with little success. Everything works fine
with a single-node cluster. Once we add a second node to the cluster and attempt concurrent
writes to the same node in the tree cache, we see many timeout exceptions. I believe we
need SERIALIZABLE as the IsolationLevel, since we need to ensure global synchronization,
but it does not seem to be locking the nodes appropriately. The environment is JBoss AS
4.2.2.GA and JBoss Cache 2.0.0.GA. A few questions about locking and transactions:
- With the SERIALIZABLE IsolationLevel, shouldn't reads of any node touched in the
cache be blocked until the transaction commits? When is the lock acquired?
- Can you recommend an appropriate configuration for a reasonably high-transaction
environment? Basically, we are looking for the ability to synchronize across the entire
boundaries of a transaction. In general, a transaction takes 10 seconds or less.
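To make the contention pattern concrete, here is a minimal sketch in plain Java. It is not JBoss Cache code and does not use its API: the class `LockTimeoutSketch`, the method `secondWriterGetsLock`, and the timings are all hypothetical, and a `ReentrantLock` merely stands in for a pessimistic write lock on a single cache node. The assumption being illustrated is that one transaction holds the node's lock until it commits while a second writer waits with a bounded acquisition timeout.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class LockTimeoutSketch {

    /**
     * Simulates two writers contending for one node.
     * Returns true if the second writer got the lock within acquireTimeoutMs.
     */
    public static boolean secondWriterGetsLock(long holdMs, long acquireTimeoutMs)
            throws InterruptedException {
        // Stand-in for the write lock on a single cache node (assumption, not JBoss Cache internals).
        ReentrantLock nodeLock = new ReentrantLock();
        CountDownLatch firstHasLock = new CountDownLatch(1);

        // First "transaction": takes the lock and holds it until its simulated commit.
        Thread firstTxn = new Thread(() -> {
            nodeLock.lock();
            try {
                firstHasLock.countDown();
                Thread.sleep(holdMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                nodeLock.unlock();
            }
        });
        firstTxn.start();
        firstHasLock.await(); // make sure the first writer really owns the lock

        // Second writer: waits at most acquireTimeoutMs, then gives up (the "timeout" case).
        boolean acquired = nodeLock.tryLock(acquireTimeoutMs, TimeUnit.MILLISECONDS);
        if (acquired) {
            nodeLock.unlock();
        }
        firstTxn.join();
        return acquired;
    }

    public static void main(String[] args) throws InterruptedException {
        // Lock held longer than the waiter is willing to wait.
        System.out.println("long hold, short timeout: " + secondWriterGetsLock(500, 100));
        // Lock released well within the waiter's timeout.
        System.out.println("short hold, long timeout: " + secondWriterGetsLock(100, 2000));
    }
}
```

In this sketch the second writer fails whenever the lock is held longer than its acquisition timeout, which is roughly what we suspect is happening when two transactions write to the same tree cache node, though we have not confirmed that this is the actual cause.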
Here is our existing config:
| <?xml version="1.0" encoding="UTF-8"?>
| <!-- ===================================================================== -->
| <!-- -->
| <!-- Sample TreeCache Service Configuration -->
| <!-- -->
| <!-- ===================================================================== -->
| <server>
| <!-- ==================================================================== -->
| <!-- Defines TreeCache configuration -->
| <!-- ==================================================================== -->
| <mbean code="org.jboss.cache.pojo.jmx.PojoCacheJmxWrapper"
| name="jboss.cache:service=TreeCache">
| <depends>jboss:service=Naming</depends>
| <depends>jboss:service=TransactionManager</depends>
| <!--
| Configure the TransactionManager
| -->
| <attribute
| <!--
| Isolation level : SERIALIZABLE
| -->
| <attribute name="IsolationLevel">SERIALIZABLE</attribute>
| <!--
| Valid modes are LOCAL
| -->
| <attribute name="CacheMode">REPL_SYNC</attribute>
| <!--
| Node locking scheme:
| PESSIMISTIC (default)
| -->
| <attribute name="NodeLockingScheme">PESSIMISTIC</attribute>
| <!--
| Just used for async repl: use a replication queue
| -->
| <attribute name="UseReplQueue">false</attribute>
| <!--
| Replication interval for replication queue (in ms)
| -->
| <attribute name="ReplQueueInterval">0</attribute>
| <!--
| Max number of elements which trigger replication
| -->
| <attribute name="ReplQueueMaxElements">0</attribute>
| <!-- Name of cluster. Needs to be the same for all TreeCache nodes in a
| cluster in order to find each other. Needs to be different in order to
| separate caches
| -->
| <attribute
| <!--Uncomment next three statements to enable JGroups multiplexer.
| This configuration is dependent on the JGroups multiplexer being
| registered in an MBean server such as JBossAS. -->
| <!--
| <depends>jgroups.mux:name=Multiplexer</depends>
| <attribute
| <attribute
| -->
| <!-- JGroups protocol stack properties.
| ClusterConfig isn't used if the multiplexer is enabled and successfully
| -->
| <attribute name="ClusterConfig">
| <config>
| <UDP mcast_addr=""
| mcast_port="50008"
| tos="8"
| ucast_recv_buf_size="20000000"
| ucast_send_buf_size="640000"
| mcast_recv_buf_size="25000000"
| mcast_send_buf_size="640000"
| loopback="false"
| discard_incompatible_packets="true"
| max_bundle_size="64000"
| max_bundle_timeout="30"
| use_incoming_packet_handler="true"
| ip_ttl="2"
| enable_bundling="false"
| enable_diagnostics="true"
| use_concurrent_stack="true"
| thread_naming_pattern="pl"
| thread_pool.enabled="true"
| thread_pool.min_threads="1"
| thread_pool.max_threads="25"
| thread_pool.keep_alive_time="30000"
| thread_pool.queue_enabled="true"
| thread_pool.queue_max_size="10"
| thread_pool.rejection_policy="Run"
| oob_thread_pool.enabled="true"
| oob_thread_pool.min_threads="1"
| oob_thread_pool.max_threads="4"
| oob_thread_pool.keep_alive_time="10000"
| oob_thread_pool.queue_enabled="true"
| oob_thread_pool.queue_max_size="10"
| oob_thread_pool.rejection_policy="Run"/>
| <PING timeout="2000" num_initial_members="3"/>
| <MERGE2 max_interval="30000"
| <FD_SOCK/>
| <FD timeout="10000" max_tries="5"
| <VERIFY_SUSPECT timeout="1500"/>
| <pbcast.NAKACK max_xmit_size="60000"
| use_mcast_xmit="false" gc_lag="0"
| retransmit_timeout="300,600,1200,2400,4800"
| discard_delivered_msgs="true"/>
| <UNICAST timeout="300,600,1200,2400,3600"/>
| <pbcast.STABLE stability_delay="1000"
| max_bytes="400000"/>
| <AUTH auth_class="org.jgroups.auth.MD5Token"
| auth_value="desktone"
| token_hash="MD5"/>
| <pbcast.GMS print_local_addr="true"
| join_retry_timeout="2000" shun="false"
| view_bundling="true"
| <FRAG2 frag_size="60000"/>
| <!-- <pbcast.STATE_TRANSFER/> -->
| <pbcast.FLUSH timeout="0"/>
| </config>
| </attribute>
| <!--
| Whether or not to fetch state on joining a cluster
| NOTE this used to be called FetchStateOnStartup and has been renamed to be more
| -->
| <attribute name="FetchInMemoryState">false</attribute>
| <!--
| The max amount of time (in milliseconds) we wait until the
| state (ie. the contents of the cache) are retrieved from
| existing members in a clustered environment
| -->
| <attribute name="StateRetrievalTimeout">15000</attribute>
| <!--
| Number of milliseconds to wait until all responses for a
| synchronous call have been received.
| -->
| <attribute name="SyncReplTimeout">15000</attribute>
| <!-- Max number of milliseconds to wait for a lock acquisition -->
| <attribute
| <!--
| Indicate whether to use region based marshalling or not. Set this to true if
| you are running under a scoped class loader, e.g., inside an application
| server. Default is
| -->
| <attribute
| <!-- Cache Loader configuration block -->
| <attribute name="CacheLoaderConfig">
| <config>
| <!-- if passivation is true, only the first cache loader is used; the rest
| are ignored -->
| <passivation>false</passivation>
| <preload>/</preload>
| <shared>true</shared>
| <!-- we can now have multiple cache loaders, which get chained -->
| <cacheloader>
| <class>org.jboss.cache.loader.JDBCCacheLoader</class>
| <properties>
| cache.jdbc.table.name=dht
| cache.jdbc.table.primarykey=dht_pk
| cache.jdbc.table.create=true
| cache.jdbc.table.drop=false
| cache.jdbc.fqn.column=fqn
| cache.jdbc.fqn.type=varchar(255)
| cache.jdbc.node.column=value
| cache.jdbc.node.type=LONGBLOB
| cache.jdbc.parent.column=parent_fqn
| cache.jdbc.datasource=java:/jdbc/FabricDS
| cache.jdbc.sql-concat=concat(1,2)
| </properties>
| <!-- whether the cache loader writes are asynchronous -->
| <async>false</async>
| <!-- only one cache loader in the chain may set fetchPersistentState to true.
| An exception is thrown if more than one cache loader sets this to true. -->
| <fetchPersistentState>false</fetchPersistentState>
| <!-- determines whether this cache loader ignores writes - defaults to
| false. -->
| <ignoreModifications>false</ignoreModifications>
| <purgeOnStartup>false</purgeOnStartup>
| </cacheloader>
| </config>
| </attribute>
| <!-- Buddy Replication config -->
| <attribute name="BuddyReplicationConfig">
| <config>
| <!-- Enables buddy replication. This is the ONLY mandatory
| configuration element here. -->
| <buddyReplicationEnabled>false</buddyReplicationEnabled>
| <!-- These are the default values anyway -->
| <!-- numBuddies is the number of backup nodes each node maintains.
| ignoreColocatedBuddies means that each node will *try* to select a buddy
| on a different physical host. If not able to do so though, it will fall
| back to colocated nodes. -->
| <buddyLocatorProperties>
| numBuddies = 1
| ignoreColocatedBuddies = true
| </buddyLocatorProperties>
| <!-- A way to specify a preferred replication group. If specified,
| we try to pick a buddy who shares the same pool name (falling back to
| other buddies if not available). This allows the sysadmin to hint at how
| backup buddies are picked, so for example, nodes may be hinted to pick
| buddies on a different physical rack or power supply for added fault
| tolerance. Note: to override this value, use system property
| desktone.cache.buddyName -->
| <buddyPoolName>myBuddyPoolReplicationGroup</buddyPoolName>
| <!-- Communication timeout for inter-buddy group organisation
| messages (such as assigning to and removing from groups); defaults to 1000. -->
| <!-- Whether data is removed from old owners when gravitated to a
| new owner. Defaults to true. -->
| <!-- Whether backup nodes can respond to data gravitation requests,
| or only the data owner is supposed to respond. Defaults to true. -->
| <!-- Whether all cache misses result in a data gravitation request.
| Defaults to false, requiring callers to enable data gravitation on a
| per-invocation basis using the Options API. -->
| <autoDataGravitation>false</autoDataGravitation>
| </config>
| </attribute>
| </mbean>
| <!-- Uncomment to get a graphical view of the TreeCache MBean above -->
| <!-- <mbean code="org.jboss.cache.TreeCacheView"
| <!-- <depends>jboss.cache:service=TreeCache</depends>-->
| <!-- <attribute
| <!-- </mbean>-->
| </server>