[jboss-user] [JBoss Cache: Core Edition] - Transactions Created On Reads Holding Up Writers ???

Thu Jul 23 15:23:36 EDT 2009

We have a cache that is read by hundreds of threads per second.
All of the threads read the same node.

The first thread that finds the node missing loads the data, which takes 5-10 seconds; the others wait (with a timeout) for it to be populated.
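For readers unfamiliar with the pattern, here is a minimal JDK-only sketch of "first thread loads, the others wait with a timeout". It is not our production code and does not touch JBoss Cache; the class name SingleLoader, the slowLoad method, the 200 ms delay and the key "node" are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SingleLoader {
    // One FutureTask per key: whoever registers it first does the load.
    private final ConcurrentMap<String, FutureTask<String>> inFlight =
            new ConcurrentHashMap<String, FutureTask<String>>();
    final AtomicInteger loads = new AtomicInteger();

    // Hypothetical stand-in for the real 5-10 second load.
    private String slowLoad(String key) throws InterruptedException {
        loads.incrementAndGet();
        Thread.sleep(200);
        return "value-for-" + key;
    }

    public String get(final String key, long timeoutMs) throws Exception {
        FutureTask<String> task = inFlight.get(key);
        if (task == null) {
            Callable<String> loader = () -> slowLoad(key);
            FutureTask<String> mine = new FutureTask<String>(loader);
            task = inFlight.putIfAbsent(key, mine);
            if (task == null) {
                // We won the race, so we run the load in this thread.
                task = mine;
                mine.run();
            }
        }
        // Everyone else blocks here until the value arrives or the timeout expires.
        return task.get(timeoutMs, TimeUnit.MILLISECONDS);
    }

    // Starts `readers` threads against one key and returns how many loads ran.
    public static int demo(int readers) {
        final SingleLoader loader = new SingleLoader();
        ExecutorService pool = Executors.newFixedThreadPool(readers);
        try {
            List<Future<String>> results = new ArrayList<Future<String>>();
            for (int i = 0; i < readers; i++) {
                Callable<String> read = () -> loader.get("node", 5000);
                results.add(pool.submit(read));
            }
            for (Future<String> r : results) {
                if (!"value-for-node".equals(r.get())) {
                    throw new IllegalStateException("unexpected value");
                }
            }
            return loader.loads.get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("loads = " + demo(8));
    }
}
```

In this sketch the waiting threads block on a plain Future and hold no lock on the cache node. The completed task is never removed from the map, so the value stays cached for the life of the object.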

If we dump the stack traces, one of the threads is waiting here:
  sun.misc.Unsafe.park(Native Method)
  java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
  java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireNanos(AbstractQueuedSynchronizer.java:841)
  java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireNanos(AbstractQueuedSynchronizer.java:1160)
  org.jboss.cache.util.concurrent.locks.OwnableReentrantLock.tryLock(OwnableReentrantLock.java:100)
  org.jboss.cache.util.concurrent.locks.AbstractSharedLockContainer.acquireLock(AbstractSharedLockContainer.java:94)
  org.jboss.cache.lock.MVCCLockManager.lockAndRecord(MVCCLockManager.java:132)
  org.jboss.cache.mvcc.MVCCNodeHelper.acquireLock(MVCCNodeHelper.java:155)
  org.jboss.cache.mvcc.MVCCNodeHelper.wrapNodeForWriting(MVCCNodeHelper.java:235)
  org.jboss.cache.mvcc.MVCCNodeHelper.wrapNodeForWriting(MVCCNodeHelper.java:184)
  org.jboss.cache.interceptors.MVCCLockingInterceptor.handlePutKeyValueCommand(MVCCLockingInterceptor.java:101)

All of the other threads are waiting in our code for that thread to complete.

If we set the timeout in our code (the one used while waiting for the thread above to complete) lower than the JBoss Cache timeout, the writer thread above completes ONLY once our waiting threads have timed out.

If we set our internal timeout higher than the JBoss Cache timeout, the writer thread throws timeout errors.
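The two timeout cases above behave as if the waiting threads were holding a lock on the node for as long as they wait. The sketch below models only that symptom with a plain ReentrantLock. It is an assumption for illustration, not a statement about what JBoss Cache does internally; the class TimeoutInterplay and its parameters are made up.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class TimeoutInterplay {
    /**
     * Models one waiter and one writer contending for a single node lock.
     * Returns true if the writer got the lock within lockTimeoutMs.
     */
    public static boolean writerSucceeds(final long waiterTimeoutMs, long lockTimeoutMs) {
        final ReentrantLock nodeLock = new ReentrantLock();           // stand-in for the node lock
        final CountDownLatch populated = new CountDownLatch(1);       // "node has been populated"
        final CountDownLatch waiterHoldsLock = new CountDownLatch(1);

        // ASSUMPTION: the waiter keeps the node lock for the whole of its wait.
        Thread waiter = new Thread(() -> {
            nodeLock.lock();
            try {
                waiterHoldsLock.countDown();
                populated.await(waiterTimeoutMs, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                nodeLock.unlock();
            }
        });
        waiter.start();

        try {
            waiterHoldsLock.await();
            // The writer behaves like the put in the stack trace: a timed tryLock.
            boolean acquired = nodeLock.tryLock(lockTimeoutMs, TimeUnit.MILLISECONDS);
            if (acquired) {
                try {
                    populated.countDown();
                } finally {
                    nodeLock.unlock();
                }
            }
            waiter.join();
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("waiter 100 ms, lock 500 ms: " + writerSucceeds(100, 500));
        System.out.println("waiter 500 ms, lock 100 ms: " + writerSucceeds(500, 100));
    }
}
```

With the waiter timeout below the lock timeout, the writer gets the lock only after the waiter gives up. With the waiter timeout above it, the writer's tryLock times out first. That matches what we see, but it does not show who actually holds the lock in the real cache.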

We can reproduce this every time using just 2 threads.

We have tried different settings, but can't get this to work correctly.

The only thing we've found that doesn't lock the cache up like this is the DummyTransactionManager. However, 1) that shouldn't be used in production, and 2) the cache doesn't replicate with that transaction manager.

So is JBoss Cache tying in to the current EJB transaction when a simple read is done?

Here are the cache settings (not using the "Dummy" txn manager):
<?xml version="1.0" encoding="UTF-8"?>
<jbosscache xmlns="urn:jboss:jbosscache-core:config:3.1">
  <locking isolationLevel="READ_COMMITTED" lockAcquisitionTimeout="15000"/>

  <transaction
    transactionManagerLookupClass="org.jboss.cache.transaction.GenericTransactionManagerLookup"
  />

  <clustering mode="replication" clusterName="SystemCache-Cluster">
    <!--jmxStatistics exposeManagementStatistics="true"/-->
    <sync replTimeout="20000"/>
    <jgroupsConfig>
      <TCP
        bind_addr="devA"
        loopback="true"
        start_port="7855"
        enable_bundling="false"
      />
      <TCPPING
        down_thread="true"
        initial_hosts="devB[7855]"
        num_initial_members="2"
        port_range="1"
        timeout="3500"
      />
      <MERGE2 max_interval="10000" min_interval="5000"/>
      <FD_SOCK/>
      <FD max_tries="5" shun="false" timeout="2500" />
      <VERIFY_SUSPECT timeout="1500" />
      <pbcast.NAKACK
        use_mcast_xmit="false"
        gc_lag="0"
        retransmit_timeout="300,600,1200,2400,4800"
        discard_delivered_msgs="false"
      />
      <pbcast.STABLE
        desired_avg_gossip="50000"
        max_bytes="2100000"
        stability_delay="1000"
      />
      <pbcast.GMS
        join_retry_timeout="2000"
        join_timeout="5000"
        print_local_addr="true"
        shun="false"
        view_bundling="true"
      />
      <pbcast.STREAMING_STATE_TRANSFER/>
    </jgroupsConfig>
  </clustering>
  <eviction wakeUpInterval="600000">
    <default algorithmClass="org.jboss.cache.eviction.LRUAlgorithm">
      <property name="maxNodes" value="10000"/>
      <property name="maxAge" value="-1"/>
      <property name="timeToLive" value="-1"/>
    </default>
  </eviction>
</jbosscache>

Some other notes:
  1) The cache doesn't auto-deploy as in 4.0.1, so we have to load it manually in the constructor, giving it the filename. (A JBoss rep at JavaOne '09 seemed puzzled by this.)
  2) The jmxStatistics node caused a null pointer exception during parsing.
  3) If we comment out the transaction element or use "" for the value of transactionManagerLookupClass, it throws a "class not found 'null'" error.

Any ideas?

View the original post : http://www.jboss.org/index.html?module=bb&op=viewtopic&p=4245670#4245670
